How research can be misleading

The following is a good case to follow. A study purports to empirically establish something but has serious flaws in its assumptions and premises, methodology, conclusions, statistics, and so forth. I cannot shred studies in the way the person behind The Backseat Linguist has here, so I rely on such careful analyses of studies to help me understand how reliable they are.
This blog can be found at
Here is the review:

Study Reviewed:

Hill, M., & Laufer, B. (2003). Type of task, time-on-task and electronic dictionaries in incidental vocabulary acquisition. International Review of Applied Linguistics, 41(2), 87–106.

I recently came across a reference to a claim related to vocabulary acquisition that struck me as rather unusual: Hill & Laufer (2003) stated that a second language reader would need to read 8,000,000 words in order to acquire a mere 2,000 words. Schmitt (2008) repeats this claim to bolster his argument that developing a vocabulary sufficient to read native-level texts requires a strong dose of direct instruction. It is worth quoting at length from the original article:

Studies on vocabulary acquisition from reading (without any enhancement tasks) show that pick up rates of unfamiliar words range from 1–5 words in a text of over 1,000 words (Zahar et al. 2001; Luppescu and Day 1993; Hulstijn 1992; Knight 1994; Paribakht and Wesche 1993). Similar gains occur during reading books. In Horst et al.’s (1998) experiment, an average of five words were gained from the reading of a simplified version of The Mayor of Casterbridge, a text of 21,000 words. Lahav (1996) conducted a study with students who read four simplified readers, each one of about 20,000 words, and found an average learning rate of 3–4 words per book. At this rate of growth, a second language learner would have to read in excess of eight million words of texts, or about 420 novels to increase their vocabulary by 2,000 words. This would appear to be a daunting and time consuming means of vocabulary development. It is therefore reasonable that L2 learners acquire their vocabulary not only from input, be it reading or listening, but also through word-focused activities. (p. 88, emphasis added)

How did Hill and Laufer arrive at this conclusion, and is their estimate correct?

The authors give the example of Horst et al.’s (1998) study of students reading a simplified reader version of The Mayor of Casterbridge (the Lahav study mentioned is unpublished). In Horst et al., the researchers tested students on a set of 45 words. The 45 words included eight that occurred seven or more times in the text but, because they were not part of a list of high-frequency words students studied in another aspect of their language course, were likely to be unknown to the subjects.  The rest of the words were randomly selected from among low- and medium-frequency words, occurring in the text six or fewer times.

A pretest determined that the students already knew about half of the 45 target words, so the average number of new words students could have acquired on the test that was administered to them was about 23. After reading the novel, students took the post-test, which showed an average gain of around five words out of the 23 or so words that were new to the students. The simplified version of the novel they read contained a little more than 20,000 words. Therefore, Hill and Laufer concluded, you can only expect to pick up around five words for every 20,000 words you read. Based on that data, they arrived at the figure of 8,000,000 words that would need to be read to acquire 2,000 words: if you acquire 5 words for every 20,000 words you read, then you would need to read 400 of these 20,000-word texts (2,000/5 = 400), or 8,000,000 words.
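The extrapolation described above can be reproduced in a few lines. This is only a sketch of the arithmetic as the review reports it; the variable names are illustrative and do not come from either paper.

```python
# Figures as reported in the review; names are illustrative assumptions.
words_gained_per_text = 5        # average post-test gain in Horst et al. (1998)
words_per_text = 20_000          # approximate length of the simplified novel
target_vocabulary_gain = 2_000   # the gain Hill and Laufer extrapolate to

# Number of 20,000-word texts needed at 5 words gained per text
texts_needed = target_vocabulary_gain // words_gained_per_text

# Total reading volume implied by that rate
words_to_read = texts_needed * words_per_text

print(f"Texts needed: {texts_needed}")
print(f"Words to read: {words_to_read:,}")
```

The result depends entirely on treating five words as the total gain from the whole text, which is the assumption the rest of the review questions.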

The error here should be clear: Horst et al. did not find that only five words were acquired by students after reading a 20,000-word book, but rather that subjects got five words correct on the test administered to them. Hill and Laufer appear to have confused the population of all the unknown words in the text with the sample of words that were included in the test. (Krashen (2004) makes a similar point about this study.) In Horst et al., the 45 words on the test were a sample of all of the potentially new words in the text. The researchers estimated that there were about 222 words (technically, word families) that their EFL subjects might not know. They then eliminated any word that appeared only once, which left them with 75 words, and then sampled 45 words from that list to create their test. Overall, their subjects acquired a respectable 22% of the new words they encountered, as measured on an immediate post-test (administered right after the actual reading).
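To make the sample-versus-population distinction concrete, here is a rough sketch using the counts given in the review. The figure of 23 previously unknown test words is the review's approximation ("about 23"), and the variable names are illustrative.

```python
# Counts as reported in the review; names are illustrative assumptions.
candidate_words_in_text = 222    # word families the subjects might not know
words_occurring_more_than_once = 75
tested_words = 45                # sample drawn from the 75
unknown_tested_words = 23        # approximate: about half of 45 were already known
gain_on_test = 5                 # average post-test gain

# Pick-up rate among the tested words that were actually new to the students
pickup_rate = gain_on_test / unknown_tested_words

# Share of all candidate words that the test covered at all
test_coverage = tested_words / candidate_words_in_text

print(f"Pick-up rate on tested words: {pickup_rate:.0%}")
print(f"Share of candidate words tested: {test_coverage:.0%}")
```

The test covered only about a fifth of the candidate words, so a gain of five on the test says little on its own about the total gain from the book.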

Although the researchers felt that words that occur only once in a text were not good candidates for being acquired, later research has found that while the pick-up rate for such words is low, it isn't trivial, either. Pellicer-Sanchez and Schmitt's (2010) subjects recognized, on a post-test, the meaning of 29% of the words that appeared only once in the text used to measure incidental acquisition. Waring and Takaki's (2003) subjects recognized the meaning of 16% of the words occurring once on an immediate post-test similar to Horst et al.'s.

Even if we assume that Horst et al.'s subjects already knew half of the untested words (as they knew half of the 45 tested words), and that the pick-up rate was only 10%, that would leave around nine additional words acquired, nearly tripling the total number of words acquired (222 total words – 45 test words = 177 untested words, divided by 2 = 88.5, multiplied by .10 = 8.85 words).
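The same back-of-the-envelope estimate, written out step by step. Both the 50% known fraction and the 10% pick-up rate are the review's own hypothetical assumptions, not measured values, and the variable names are illustrative.

```python
# Counts from the review; the two "assumed_" values are hypothetical.
total_candidate_words = 222      # word families possibly unknown to subjects
tested_words = 45
measured_gain = 5                # average gain on the 45-word test
assumed_known_fraction = 0.5     # assumption: half of untested words already known
assumed_pickup_rate = 0.10       # assumption: conservative pick-up rate

untested_words = total_candidate_words - tested_words
unknown_untested_words = untested_words * (1 - assumed_known_fraction)
extra_words = unknown_untested_words * assumed_pickup_rate
total_words = measured_gain + extra_words

print(f"Untested words: {untested_words}")
print(f"Unknown untested words: {unknown_untested_words:.1f}")
print(f"Additional words acquired: {extra_words:.2f}")
print(f"Total vs. measured gain: {total_words / measured_gain:.1f}x")
```

The product comes to 8.85, which rounds to the "around nine" additional words mentioned above, for a total a little under three times the five words counted on the test.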

There are other potential problems with the Horst et al. study (again, see Krashen (2004)), and indeed with many attempts at estimating the number of words that can be acquired incidentally through reading. In any case, the study does not provide evidence for Hill and Laufer’s claim that we need to read 8,000,000 words in order to acquire 2,000 words.

An unrelated note: Hill and Laufer has been cited recently as a study comparing explicit vocabulary instruction with incidental acquisition (File & Adams, 2010). It is not. It is instead a study of different methods of dictionary use and post-reading questions; there is no “reading only” comparison included.
