🌱 1. Why This Distinction Matters
One of the biggest challenges in language testing is distinguishing between what a learner can do (their underlying ability) and what they actually do on a given test (their performance).
This issue has been a core dilemma in the field for decades. As Carroll (1968) clearly put it, “we cannot test language competence directly; we can only observe it through performance.” In other words, every time a student speaks, writes, or listens during a test, we see a glimpse of their ability, but never the whole picture.
Spolsky (1973) expanded on this idea, asking a question that still matters today: What does it really mean to know a language — and how can we make someone show that knowledge?
For teachers, this means that tests don’t measure knowledge directly. They measure how knowledge is revealed through language behaviour — and that behaviour can change depending on the context, the topic, the task, and even the student’s mood.
🎯 2. The Risk of Confusing Behaviour with Ability
Many test designers (and sometimes teachers) mistake performance for ability. Upshur (1979) warned that when we interpret test results only as predictions of future behaviour — like saying, “this student will do well in real-life communication” — we risk losing sight of what the test actually measures.
The problem is that behaviour is not the same as the underlying ability.
- Behaviour is what we see (the student’s responses, their fluency, their pronunciation).
- Ability is what we infer (their knowledge, strategies, control of grammar and vocabulary).
When we confuse the two, we limit our interpretation — and our test becomes less valid and less useful.
Messick (1981) called this confusion the “operationist approach”: the assumption that whatever we observe in a test is the construct we want to measure. Cronbach (1988) criticized this view too, arguing that tests should not be equated with the abilities they are meant to represent. Instead, we must look deeper — at the processes behind performance — to design better, fairer assessments.
🔍 3. Why “Direct Tests” Aren’t Always Direct
You might have heard that “direct tests” (like oral interviews or writing tasks) are automatically more valid because they show “real” language use. The truth is that this belief can be misleading.
Yes, a speaking test looks authentic — but as researchers like Cronbach (1988) remind us, appearance is not evidence. A direct test may show performance in a controlled situation, but it still doesn’t give full access to the person’s underlying ability.
So, when we assess a student speaking about familiar topics, we are observing a small slice of their language world — one that depends heavily on test conditions. That’s why we say language tests are always indirect indicators of ability (Bachman & Palmer, 1996). What we see in a test task is a performance — what we need to infer from it is the ability behind it.
⚖️ 4. Why “Face Validity” Isn’t Enough
Many researchers — Carroll (1973), Lado (1975), Bachman (1988a), and others — have criticized the idea that a test is valid simply because it looks right. Stevenson (1985b) called this “the treacherous appearance of validity.” In other words, just because a test seems authentic doesn’t mean it measures what it claims to measure.
For bilingual teachers, this is crucial: a test that “feels communicative” isn’t automatically a good measure of communicative ability. We must go beyond face value and examine content relevance, construct validity, and evidence of reliability.
🌍 5. The Myth of “Real-Life” Authenticity
It’s tempting to think we can design a test that perfectly mirrors “real-life” communication. But language in real life is infinitely variable and context-dependent. As Spolsky (1986) noted, every utterance depends on who’s speaking, to whom, where, why, and under what conditions.
Imagine designing a test for taxi drivers at an international airport (Bachman, 1990). You might think the language is simple — directions, prices, greetings. But those interactions involve bargaining, politeness strategies, cultural expectations, and situational adjustments. There’s no single “correct” sample of this real-life behaviour that can represent all possibilities.
So, even when we aim for authenticity, we must accept that tests can only simulate, not replicate, real communication. The goal is representativeness, not perfect imitation.
🧭 6. What Teachers Can Do
Here’s the empowering takeaway: when designing your own language tests, you can create valid and meaningful assessments if you remember these principles:
- Define the construct clearly — what specific ability are you trying to measure?
- Design tasks that reflect that ability, not just the surface behaviour.
- Interpret results carefully — remember that a test performance is a sample, not a complete portrait.
- Support your interpretation with clear reasoning and evidence (e.g., through consistency, relevance, and alignment with your teaching goals).
- Avoid overreliance on “real-life appearance” — instead, ensure your tasks are relevant, fair, and connected to your learners’ context.
As Cronbach (1988) wisely summarized, we must look beyond the surface: “For understanding poor performance, for remedial purposes, for improving teaching methods, and for carving out more functional domains, process constructs are needed.”
In other words — test the process, not just the product.
📚 References
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford University Press.
Bachman, L. F., & Palmer, A. S. (1996). Language testing in practice. Oxford University Press.
Carroll, J. B. (1968). The psychology of language testing. Cambridge University Press.
Cronbach, L. J. (1988). Five perspectives on validity argument. In H. Wainer & H. Braun (Eds.), Test validity (pp. 3–17). Lawrence Erlbaum.
Messick, S. (1981). Evidence and ethics in the evaluation of tests. Educational Researcher, 10(9), 9–20.
Spolsky, B. (1986). Language testing: Art or science? Language Testing, 3(2), 147–153.
Upshur, J. A. (1979). Functional language testing. Canadian Modern Language Review, 35(2), 233–246.