A few months ago, after reading the NMPED’s implementation plan for the Common Core standards, I promised a post about vocabulary development. The implementation plan describes six “shifts that must take place in the next generation of curricula,” including a focus on academic vocabulary. Here is the paragraph describing the curricular shift related to academic vocabulary:
Through reading, discussing, and writing about appropriately complex texts at each grade level, students build the general academic vocabulary they will need to access a wide range of complex texts in college and careers. Students gather as much as they can about the meaning of these words from the context of how the words are being used in the text. Teachers offer support as needed when students are not able to figure out word meanings from the text alone and for students who are still developing high frequency vocabulary (p.32).
Here’s the problem: the shift described in the plan A) isn’t much different than what a lot of folks are already doing and B) runs counter to the last five decades of research. The proposed curricular shift quoted above will NOT improve students’ ability to access, appreciate, and appropriate academic vocabulary.
There are many (many, many!) things in this world about which I am uncertain. This is not one of them.
How can I be so certain? Because when I was preparing to write my dissertation (which, I just discovered, is available through Google Books and other online retailers), I read 60 years’ worth of scholarly articles on vocabulary development. Therefore, I am quite confident when I state: leaving students to “gather” information about vocabulary from context or by asking their teachers will not result in short- or long-term vocabulary growth.
What I know to be true about vocabulary development
(NOTE: if you choose to share this information with others, please respect my time and effort by citing this blog post.)
The short version:
- Vocabulary is vital to success in reading and comprehending (among other things) [too many to cite!].
- Some instruction is better than no instruction (Graves, 2006).
- Instruction which features a combination of definitional and contextual information is more effective than instruction that focuses on a single type of information (Mezynski, 1983; Stahl & Fairbanks, 1986).
- Students need long-term and rich instruction, multiple encounters, and opportunities for active learning in order to learn individual words (Beck et al., 2002).
- Teaching for word-ownership is a more effective approach to vocabulary instruction than teaching students to memorize individual words (Stahl & Nagy, 2006).
- Students who understand how words work (morphology, etymology) within sentences (syntax) are more likely to personalize and grow their own vocabulary knowledge.
The slightly-longer short version (from my dissertation):
Vocabulary is the primary resource within the meaning-making system of language. Teachers must, therefore, encourage students to see vocabulary as a resource for making meaning through effective and responsive instruction. “Virtually all authorities on literacy education agree. . . that vocabulary knowledge is vital to success in reading, in literacy more generally, in school, and in the world outside of school” (Graves, 2006, p. 2). Researchers have demonstrated conclusively that some instruction is better than no instruction (Graves, 2006); and that instruction which features a combination of definitional and contextual information is more effective than instruction that focuses on a single type of information (Mezynski, 1983; Stahl & Fairbanks, 1986). In order for students to learn—and retain—individual words, they need long-term and rich instruction, multiple encounters, and opportunities for active learning (Beck et al., 2002).
Instead of encouraging students to see themselves as agentive, traditional approaches to vocabulary teaching have taught students to see themselves as consumers, whose job it is to reproduce knowledge (Stahl & Nagy, 2006). For example, teaching students to look up, copy, memorize, and then reproduce dictionary definitions (a common strategy throughout the grades) is similar to teaching students to look up telephone numbers in the phonebook (Stahl & Nagy, 2006): “When you look up a phone number, ordinarily you remember it just long enough to dial it and then forget it almost immediately” (p. 64).
On the other hand, comprehensive vocabulary instruction that teaches specific words, immerses students in rich language, and promotes the growth of generative vocabulary knowledge facilitates a sense of ownership, as well as long-term vocabulary growth (Baumann, Edwards, Font, Tereshinski, Kame’enui, & Olejnik, 2002; Blachowicz & Fisher, 2004, 2006; Graves, 2006; Kamil & Hiebert, 2005; Scott & Nagy, 2004; Stahl, 1986; Stahl & Nagy, 2006; Templeton, 2004). Note: Graves (2006) divides the third element into: develop strategies for independent word learning, and raise word consciousness.
And finally, an excerpt from “Do not assume that words are like bricks”: Vocabulary Research in the 20th Century
Throughout the second half of the 20th century, researchers devoted much time and energy to investigating potential sources of vocabulary growth, which can be categorized loosely as incidental and instructional. Of course, there were studies which investigated a third possibility, and I label those mixed.
Incidental. Research in incidental growth of vocabulary reflects the underlying belief that there is no way that instruction alone can account for learning. Considering the size of vocabulary growth which occurs between 3rd and 7th grades, J. R. Jenkins, Stein, and Wysocki (1984) concluded that “there is reason to doubt that direct teaching of words accounts for the vocabulary growth said to occur during the upper elementary years” (p. 768); they tested their hypothesis that incidental learning from context would result in vocabulary growth in a study involving 112 fifth graders. The authors explained how their study filled a gap in the research: it explored factors that might influence incidental learning from context, including the relationship between frequency of exposure and learning; the role of prior experience; and the influence of reading ability.
To test their hypothesis, the authors administered four post-tests (three vocabulary, one reading comprehension) after an intervention during which students participated in word reading practice and received a teacher demonstration; the participants received no explicit word meaning intervention. In their findings, the authors reported that the fifth grade students in the study learned (rather than derived) word meanings from context. However, the learning was neither easy nor large. The researchers found that at least two exposures were necessary to influence learning. They also concluded that prior exposure, or informal teaching, had pronounced effects on learning. With regard to the question about reading ability, they reported that “better readers were more likely to acquire word meanings” (J. R. Jenkins, Stein, & Wysocki, 1984, p. 781). Despite their findings, the authors concluded that they still had not achieved a satisfactory answer to the question of how vocabulary growth occurs during the school years.
McKeown (1985) distinguished her study from previous work by: creating more realistic settings for participants; extending research in word-meaning acquisition by asking how well participants could put their new knowledge to use; and by placing instructional implications at the center of the study. The purpose of McKeown’s (1985) study was to explore “differences in the process of acquiring word meanings from context in learners at different levels” (p. 483). The study included 30 fifth-graders, who completed a meaning-acquisition task created by the author, which led each participant “through a series of contexts containing an artificial word which eventually directed him or her to a designated meaning the word” (McKeown, 1985, p. 484). Participants received scores on seven steps: recognizing and testing of constraints (e.g. considering global and local contexts in accepting or rejecting words as possible definitions for the artificial word in the example sentence); coordination of differing contexts (e.g. the participant used information from divergent example sentences to make choices about possible meanings); exposure to additional contexts in which participants had an opportunity to refine word meaning; a task which asked participants to identify the target word’s meaning; and an evaluation task, in which participants decided whether an example sentence using the target word was good or bad (McKeown, 1985, pp. 486-487).
McKeown (1985) concluded that a child’s ability to work within contextual limits “enables one to extract accurate information about potential word meaning from context” (p. 492), and that the higher ability participants were significantly more able to do this than their lower ability counterparts. She also found evidence of “semantic interference [which] suggests that multiple contexts may impair the ability of low-ability learners to derive information from context regarding word meaning, at least if they are left to do so on their own” (McKeown, 1985, p. 493). Acquiring meaning from context is a complex and multi-staged process, and McKeown (1985) concluded that “even under conditions that seem nearly optimal, successful outcomes may not be forthcoming” (p. 493). McKeown’s (1985) final conclusion, which bore important instructional implications, was that lower-ability students required more than multiple exposures and correct definitions in order to achieve ownership over new vocabulary. One possible remedy she proposed was teacher modeling that would help learners grasp the dynamic nature of word meanings, combined with guided practice in testing word meanings in varying contexts.
Similarly, Nagy, Herman, and Anderson (1985) investigated whether students acquired vocabulary incidentally while reading natural text in a study involving 57 at- or above-grade-level eighth grade students. Participants read either a spy narrative or an expository text about river systems, and then completed a related story memory task. After a brief interval, researchers interviewed students to ascertain degrees of understanding for the target words. Finally, the researchers administered a multiple-choice test measuring degrees of word knowledge. The interview and multiple-choice measures both demonstrated small but significant learning from context. Keeping in mind the estimated number of words students must learn (Nagy & Anderson, 1984), the authors argued that their study provided further evidence of the benefits of wide reading as a way to promote vocabulary growth.
In a larger study, Nagy, Anderson, and Herman (1987) attempted to determine whether students would make gains in vocabulary knowledge as a result of incidental learning during normal reading. The study included 352 students in the 3rd, 5th, and 7th grades. The classroom teachers administered the Anderson-Freebody Checklist Vocabulary Test (Anderson & Freebody, 1983, as cited by Nagy, Anderson, & Herman, 1987) two weeks prior to the first of the two sessions which comprised the study. During the first session, participants read either two expository passages or two narrative passages (all were from grade-level texts). A week later, the researchers made a surprise return to the classroom in order to administer a multiple-choice measure. The results of the study demonstrated “beyond reasonable doubt that incidental learning of word meanings does take place during normal reading” (Nagy et al., 1987, p. 261). The authors concluded that this study supported their earlier contention (Nagy, Herman, & Anderson, 1985) that reading “lots of good texts” (Nagy, Anderson, & Herman, 1987, p. 240) would lead to long-term vocabulary growth.
Herman, Anderson, Pearson, and Nagy (1987) explored the relationship between text features and incidental learning of vocabulary. Specifically, they examined how text features such as macrostructure (titles, topic sentences, and organizational strategies), microstructure (words that clarify temporal and logical relationships between words, phrases, and clauses), and conceptual elaborations (the degree to which concepts are expressed explicitly and concisely) affect readers’ incidental acquisition of vocabulary knowledge. The participants in this study were 309 eighth-grade students. The authors administered a multiple-choice test, which included each of the target words twice, with distractors that corresponded to two levels of difficulty; the Anderson-Freebody Checklist Vocabulary Test (Anderson & Freebody, 1983, as cited by Herman, Anderson, Pearson, & Nagy, 1987), which provided information about students’ prior knowledge; and an essay test. During the intervention, participants read different versions of a text, which had been altered to highlight macrostructure, microstructure, or elaboration of central concepts. The authors’ major finding was that students who received the elaborated concepts text experienced a greater increase in vocabulary growth than students who read the original or other revised versions. Therefore, the authors concluded that revising a text’s surface features would not be sufficient to promote incidental word acquisition. Rather, in support of Anderson and Freebody’s (1981) knowledge hypothesis, the authors concluded that “the concepts in the text must be elaborated so that a more complete body of knowledge is present” (Herman et al., 1987, p. 281).
Instructional. Bear and Odbert (1941) warned against over-reliance on using context to arrive at word meanings so that words did not remain “complete strangers” (p. 754). They also warned that, “The reader who has little insight and who is satisfied with a superficial dependence on context may continue to read at a low level of efficiency and make little vocabulary growth” (R. M. Bear & Odbert, 1941, p. 759). Similarly, Seegers (1946) warned against encouraging students to rely too much on context to determine the meanings of unknown words, and recommended that teachers “provide a classroom and school climate encouraging to the development of ideas and providing opportunities for reading, talking, writing, and thinking about [words and their underlying concepts]” (Seegers, 1946, p. 67). Therefore, researchers who believed that the source of vocabulary growth was related to direct instruction searched for the most effective instructional methods.
Miles (1945) explored the degree to which direct instruction affects students’ vocabulary growth over a two-and-a-half year period. Miles’ (1945) population consisted of sixty students in their second semester of twelfth grade; thirty of those students had received a semester of “intensive direct vocabulary instruction” (p. 285), during the first semester of tenth grade. The other thirty functioned as a control, because they had not received the intensive vocabulary instruction. Students in the vocabulary group received instruction designed to foster appreciation and comprehension of word meanings, with an emphasis on their oral vocabularies; the instruction also included blackboard work, spelling, sentence-writing, study of grammar, and vocabulary notebooks. Miles (1945) measured vocabulary knowledge with a standardized test, and reported a gain in median scores. “The gain by the direct method of teaching vocabulary even for one semester as shown by this experiment seems sufficiently significant to warrant further experimentation and study” (Miles, 1945, p. 286). The problem with Miles’ (1945) study, of course, is that she did not describe the actual methods of instruction, making it less useful to researchers looking to repeat the gains in vocabulary growth with other populations.
In contrast, Jenkins’ (1942) early study provided more information about the types of instruction used during the intervention. Jenkins (1942) investigated the degree to which “systematic vocabulary instruction improves general reading achievement and to determine the relative effectiveness of four methods of vocabulary study” (p. 347). Her study included the students in five seventh grade classes. Four of the classes received fourteen weeks of instruction, each with a different instructional approach. The fifth class operated as the control; they received no specific vocabulary instruction, except when specifically requested by a student. The four treatment groups were as follows: Class E1 worked in workbooks, which were supplemented by individual vocabulary notebooks, dictionary work, and discussion; Class E2 used word-cards, on which students recorded pronunciation and meaning, engaged in contextual discussion, and occasionally completed dictionary work; Class E3 made lists of synonyms, antonyms, and special words (such as vivid verbs, adjectives, and adverbs located in their reading); Class E4 engaged in morphological analysis, kept word study notebooks, and discussed word histories.
To ascertain the most effective approach to vocabulary instruction, Jenkins (1942) compared students’ progress on a standardized reading test, in recreational reading, and on functional use of vocabulary. She concluded that vocabulary instruction improved general reading abilities and influenced recreational reading. Of the methods tested, Jenkins (1942) found the word-card and word lists to be superior methods, but she warned that vocabulary instruction is not a “cure-all” for improving students’ overall achievement.
Gipe (1978-1979) designed a study of techniques for teaching word meanings which corresponded to three prominent views of vocabulary development: the association method; the category method; and the context method (p. 627). She also added a dictionary component, since dictionary work was commonly used as a method of learning new vocabulary. Her hypothesis was that “vocabulary retained by the subjects would differ according to which of the 4 methods they experienced” (Gipe, 1978-1979, p. 627). The participants were 113 students in four 3rd grade classes and 108 students in four 5th grade classes (due to absences, the final number of students in the study decreased to 93 and 78, respectively). In order to determine the target words for the study, Gipe (1978-1979) designed a checklist on which students indicated which of two sentences used an underlined word correctly; the classroom teachers administered the checklists as pre- and post-measures. Gipe (1978-1979) divided the 96 most-missed words into eight lists to be used during the study (one list per week for eight weeks). All participants received the same worksheet-based instruction (designed by the researcher) over the eight-week period, but the order in which each class received a particular method was determined randomly. In the association method, students memorized words paired with a familiar synonym or brief definition; the object of this task was to be able to reproduce the pairs later without looking at the worksheet. The category method required students to add words to a preexisting list of a target word plus three familiar words. Then, students sorted a random list of the provided words without referring to their lists. In the context method, students read short passages containing the target words. After reading, the participants composed responses, based on personal experiences, to questions about the target words.
During the dictionary training, participants were instructed to look up the target words, copy the definition, and write sentences containing the target words. At the end of each week, students completed an evaluation task, in which they filled blanks with the words they had learned that week. Analysis of the data led Gipe (1978-1979) to report a significant difference between the four methods; the context method was consistently more effective than the others in each analysis of the data. Gipe (1978-1979) extrapolated her results to discuss implications for instruction, and suggested that teachers implement the association method, which allows the learner to connect new vocabulary to already known words. It is important to note that a follow-up study (Gipe, 1981, as cited by Beck & McKeown, 1991) did not support the earlier findings.
Direct instruction in deriving word meanings from context. Rather than rely on incidental learning alone, several studies aimed to raise students’ explicit awareness of ways to learn from context in order to leverage their vocabulary growth.
Carnine, Kame’enui, and Coyle (1984) investigated the conditions under which context clues aid readers, and designed two studies, the first of which gathered descriptive data about students’ use of context clues “across levels of clarity, proximity, and learner experience” (p. 190). They reported the following findings: figuring out word meanings was easier when learners encountered the words in context; proximity of context clues aided learners; context clues in the form of synonyms were more useful to learners than other clues; and older students were more able to use context clues. The second study aimed to fill a gap the authors identified in a review of vocabulary research literature: “researchers have been less interested in pursuing intervention studies than they have in studies concerned with describing types of context clues” (Carnine et al., 1984, p. 196). Therefore, the second study compared two methods of vocabulary instruction, “rule plus systematic practice and systematic practice only” (Carnine et al., 1984, p. 197), with a control group which received no intervention. Their findings suggested “that intermediate-grade students cannot be assumed to have adequate contextual analysis skills” (Carnine et al., 1984, p. 200). Therefore, the authors recommended a combination of direct instruction in deriving word meanings from context and intensive, systematic practice, which includes the opportunity to receive timely feedback.
After reviewing the research from the previous ten years, J. R. Jenkins, Matlock, and Slocum (1989) concluded that while students can learn from context, the probability of such learning is quite low. Rather, they argued that the two main sources of vocabulary growth are direct instruction in the meanings of individual words and direct instruction in deriving individual word meanings from context. Their study investigated these two different methods with varying amounts of instruction (J. R. Jenkins et al., 1989). The 135 fifth-grade students who participated came from six intact classrooms, which were randomly assigned to the word meanings instruction treatment (three classes) and the deriving word meanings from context treatment (three classes). Additionally, within each of the classes, students were assigned to low, medium, and high amounts of practice.
The instructional procedures for the individual word meanings groups included definitions and sample sentences for the forty-five target words (which were divided into nine sets of five words). The low exposure group received 9 days of instruction in which they encountered each set of words once. Participants in the medium practice group received 11 days of instruction, with repetition of the first two sets of words. Students in the high exposure group received 20 days of instruction, during which they encountered all of the target words six times. Students in the deriving word meanings treatment groups were taught to use a general strategy, which emphasized the use of external context clues, whenever they encountered unfamiliar words. The strategy included five steps, for which the authors developed the acronym SCANR: “Substitute a word or expression for the unknown word. Check the context for clues that support your idea. Ask if substitution fits all context clues. Need a new idea? Revise your idea to fit the context” (J. R. Jenkins et al., 1989, p. 221, emphasis in original). As with the individual word meanings groups, participants in the deriving word meaning groups received 9, 11, and 20 days of instruction.
The researchers administered two pre-tests, which measured words-in-isolation and words-in-context. These tests included the forty-five target words, twenty of which were taught across all of the groups. Participants completed six post-tests, which included the same two pre-tests, two multiple-choice tests, and two tests which assessed students’ ability to derive meanings of nonsense words from context. Analyses were performed on the twenty words which were common to all treatment groups, and which pre-tests revealed were unknown to participants. The authors concluded that “the individual meanings instruction was superior to the deriving meaning instruction for teaching specific words” (J. R. Jenkins et al., 1989, p. 228) even at the lowest level of practice. Furthermore, more practice led to more learning. However, in the deriving meaning task, the training was deemed effective at helping students to infer meaning from context. Therefore, the researchers concluded that both instructional methods studied could substantially increase students’ vocabularies, because they operate in different ways. While the individual word meanings training added actual items to a student’s vocabulary, the deriving meaning training helped students to sharpen their ability to learn new words independent of direct instruction from a teacher (J. R. Jenkins et al., 1989).
White, Power, and White (1989) were among those who advocated a different approach to teaching students to arrive at word meanings: morphological analysis. Like other researchers of their time, White et al. (1989) were concerned with the mysterious, and for the most part unexplained, exponential growth in children’s vocabularies during the school years. Because direct instruction could not be shown to be responsible for the growth, White et al. (1989) proposed an alternative explanation to the incidental learning hypothesis. Building on the work of Tyler and Nagy (1987), who found that their participants were able to recognize the meaning of derivationally affixed high-frequency stems, as well as Wysocki and Jenkins’ (1987) evidence of the likelihood that morphological generalization made a contribution to students’ process of arriving at word knowledge, White et al. (1989) designed two studies to explore the role of morphological analysis in word learning. The first of White et al.’s (1989) studies examined the characteristics and frequency of affixed words, while the second study looked at characteristics of students. Their goal was to determine how much of vocabulary growth could be attributed to morphological generalization and whether direct instruction in morphological analysis could help students increase their vocabulary knowledge.
Although White et al. (1989) did not actually investigate an instructional model, the results of their linguistic and developmental analyses led them to conclude that close to 80 percent of the words that students were likely to encounter in school were indeed morphologically analyzable, and that students were likely to be able to conduct such analysis due to their knowledge of prefixes, stems, and affixes. Therefore, the authors concluded that their model of morphological analysis would be appropriate and beneficial for students in grades 4 and up.
Meta-analyses. Which of the aforementioned instructional methods was deemed the most effective? To answer that question, several meta-analyses conducted in the last part of the 20th century reviewed studies of instructional approaches to increasing students’ vocabularies.
Stahl (1986) proposed three principles for vocabulary instruction: include both definitional and contextual information; encourage “deep” processing; and provide multiple exposures. A primary goal of vocabulary instruction is to increase the number of words students know. It is also important, however, to attend to the depth and degrees of word knowledge. Not only do learners need to possess definitional information about a given word, they also need contextual information. Definitional information includes “knowledge of the logical relations between a word and other known words, as in a dictionary definition,” while contextual information “can be defined as knowledge of the core concept the word represents and how that core concept is changed in different contexts” (Stahl, 1986, p. 663). To improve comprehension, both types of information need to be present in a balanced manner. Therefore, knowing a word is more complex than simply recognizing it on a recall test.
Similarly, the second principle of effective instruction reflected developments in cognitive research, and encouraged teachers to help students move beyond superficial understandings of word meanings. Encouraging deep processing increased the chances that they would remember and be able to use the words in question. Stahl (1986) presented three increasingly deep levels of processing that enhanced overall reading comprehension. Association processing occurred when a learner created a link between the target word and its synonym, or between a word and a single context; comprehension processing happened when students demonstrated an understanding of the word meaning which went beyond the first-level association; finally, a student engaged in generation processing could extend the comprehension processing to use the word in novel settings. In order to achieve this depth of word knowledge or ownership, students must encounter words on multiple occasions and have opportunities to practice working with them (Stahl, 1986). The third principle, therefore, encouraged teachers to organize instruction so that it would provide students with multiple exposures to words, both of which “significantly improve comprehension” (Stahl, 1986, p. 665).
To decide which words to include in direct instruction (a subject upon which later researchers expanded, cf. Beck et al., 2002), Stahl (1986) recommended that teachers consider whether knowing an unfamiliar word was necessary for overall text comprehension; whether the student would be likely to understand the word’s meaning based on context; and how thoroughly specific words would have to be taught in order for them to become known words in the student’s vocabulary. The research from which Stahl (1986) drew these conclusions was further detailed in Stahl and Fairbanks’ (1986) meta-analysis of the effects of vocabulary instruction on word learning and comprehension.
Citing work from the previous three decades, Stahl and Fairbanks (1986) argued that the differing results of those studies strongly suggested “the possibility that some methods of vocabulary instruction may be more effective than others” (p. 73). To ascertain the effectiveness of instructional methods, they examined 52 studies based on three method classification factors (definitional or contextual information; depth of processing; number of exposures) and two settings factors (individual vs. group instruction; time spent on instruction). They concluded that “vocabulary instruction is a useful adjunct to the natural learning from context” (Stahl & Fairbanks, 1986, p. 100). Specifically, Stahl and Fairbanks (1986) reported that vocabulary instruction produced significant effects on reading comprehension of passages that contained words taught during the instruction, and produced smaller, but still significant effects on comprehension of passages that did not always contain words targeted for instruction. Methods that included both definitional and contextual information influenced comprehension positively, while definition-only methods, or limited exposure methods did not. Finally, Stahl and Fairbanks (1986) explained that “the effects of vocabulary instruction are subtle and complex, but, given their potential effects on comprehension, they are worthy of further investigation” (p. 104).
Kuhn and Stahl (1998) reviewed 14 studies that investigated methods for teaching students to learn words from context. They traced methodological trends, ranging from providing students with explicit taxonomies of context clues, to instruction designed to help students use cognitive strategies flexibly, to more general guidelines for employing context to learn word meanings. Ultimately, Kuhn and Stahl (1998) concluded that incidental word learning occurred as a result of increased practice, rather than due to any particular method: “Given the frequent recommendations that children be taught the use of context clues, the paucity of research evidence is disappointing” (p. 129).
In their meta-analysis of 21 instructional methods designed to increase students’ deliberate use of context to arrive at word meanings, Fukkink and de Glopper (1998) argued that, because the number of words a student needs to learn each year is very high (Nagy et al., 1985), even small increases in word learning could make significant differences for students. Additionally, “regardless of any impact on incidental word learning, students need strategies for coping with unfamiliar words encountered while reading” (Fukkink & de Glopper, 1998, p. 451). Their review considered five types of instruction: recognition and use of context clues; increased attention to context through cloze tests; development of general strategies; assistance with conceptualizing definitions; and practice-only. The results demonstrated that “deliberately deriving word meaning from context is amenable to instruction and the effect of even relatively short instruction is rewarding” (Fukkink & de Glopper, 1998, p. 462). The authors admitted that the outcome was surprising, given contemporary researchers’ extreme caution in approaching the question of whether instruction in deriving meaning from context is effective. Ultimately, they concluded that incidental word learning occurred incrementally over long periods, and that researchers should plan for long-term instruction before expecting to witness any “significant transfer effect from intentional word learning to incidental word learning” (Fukkink & de Glopper, 1998, p. 465). And thus, at the end of the century, questions about effective instructional methods continued to ring, while various members of the literacy research community proposed solutions.
Graves (1986; 1987) reviewed research related to learning and teaching vocabulary up to the mid-1980s. The review was oriented around three basic questions: What do students already know? What do students need to learn? What can be taught? Although researchers had proposed answers to these and other fundamental questions, Graves (1986) explained that there were still some questions to which researchers had few answers, and others for which the answers were still too speculative. Graves (1986) examined extant research related to vocabulary size, effects of vocabulary on comprehension, methods of teaching individual words, and “generative vocabulary instruction” (p. 49), which included research on instruction dealing with context clues and morphological analysis. After an in-depth examination of studies across the century (many of which are included in this paper), Graves (1986) reached several conclusions, the first of which was that “more needs to be learned about year-by-year growth in students’ vocabularies, about the depth of students’ word knowledge, and about the vocabularies of less able and less advantaged students” (p. 79). Graves’ (1986) second conclusion was that, while a variety of instructional methods had been tested, different methods made varying demands on teachers’ and students’ time and were not likely to produce the same results across populations. He argued that researchers needed to consider “just what various methods accomplish and how they fit into the curriculum” (Graves, 1986, p. 79). Students’ abilities to use contextual and morphological clues to access word meaning were still in question when Graves (1986) conducted his review. More information was needed about how and when students drew upon those skills, and how better to teach those forms of analysis.
In conclusion, Graves (1986) argued that two factors had prevented vocabulary research from influencing instruction across the century: the lack (with some notable exceptions, all of whom have already been mentioned in this paper) of a large group of researchers pursuing long-term research; and the lack of a “coherent, fully articulated, long-term plan for vocabulary instruction” (Graves, 1986, p. 80).
Beck, McKeown, and Omanson (1987) described a vocabulary program, which replicated earlier research (Beck et al., 1982), and had at its heart the instructional goal of producing rich and flexible word knowledge. A key feature of their program was rich instruction, which: invited students to move beyond associating words with definitions; provided opportunities for maximum amounts of processing; and encouraged students to “manipulate words in varied and rich ways, for example, by describing how they relate to other words and to their own familiar experiences” (Beck et al., 1987, p. 149). Another feature of this program was that it provided multiple encounters with words over time. The final feature was that the program promoted creative word use (and thus vocabulary learning) outside of the classroom through an activity called Word Wizard, which rewarded students for finding examples of taught words outside the classroom or using those words in other settings. Throughout the program, teachers provided modeling for activities, and students were continually asked to make their understanding and thinking about words and vocabulary explicit.
Answering the question of which words are appropriate to include in direct instruction, Beck et al. (1987) introduced the three-tiered concept of vocabulary. The first tier encompassed basic words such as mother, go, and red, none of which would need to be included in instruction due to widespread familiarity. Also deemed inappropriate for inclusion in general vocabulary instruction were the third-tier words, which included highly specialized words such as nebula, tidal pool, and divertimento. Words that occurred with high frequency in the vocabulary of mature language users, and those deemed of general utility, such as unique, convenient, and procrastinate, belonged to the second tier, and Beck et al. (1987) suggested that instructional efforts be directed at teaching these words. They also emphasized the importance of gaining and raising students’ interest in words, which can further enrich in-class learning environments, as well as entice children to infuse their home environments with their enthusiasm for words. In conclusion, Beck et al. (1987) argued that while extant research had provided useful information about approaches to vocabulary instruction, no simple formula existed which could be applied across all settings. Rather, “the creation of effective vocabulary instruction calls for a careful crafting of experiences in consideration of specific learning goals, the words being taught, and the characteristics of the learners” (Beck et al., 1987, p. 162).
Herman and Dole (1988) reached a similar conclusion: the dilemma of crafting and delivering effective vocabulary instruction could not be addressed with simple solutions. In their overview of research-based approaches to vocabulary instruction, Herman and Dole (1988) concluded that more research was needed to determine effective methods for “teaching students to become independent word learners and efficient ways of helping students develop thorough understandings of important words and concepts” (Herman & Dole, 1988, p. 42). This desire to encourage students to become independent word learners points to a larger question about vocabulary instruction: what do researchers mean when they discuss students’ word knowledge?
Beck and McKeown (1991) addressed this question, and said that clarifying what it means to know a word is a fundamental issue underlying all vocabulary research. How one defines word knowledge reveals often unexamined assumptions. It is especially important to explore and understand the beliefs about word knowledge upon which the century’s vocabulary research was based. I examine some of these underlying assumptions about word knowledge in the next section.
What does it mean to know a word? Examining underlying assumptions about word knowledge
Ultimately, what we know about vocabulary growth depends largely on the questions we ask, which in turn reflect our often unexamined, taken-for-granted beliefs about the nature of vocabulary knowledge itself. Cronbach (1942) was one of the first researchers to raise this issue, and although he is cited by some vocabulary researchers (cf. Beck, McKeown, & Kucan, 2002), his message seems to have been muted as we moved through the century.
Cronbach (1942) faulted previous researchers for using single or simple measures to determine the presence or absence of word knowledge, because of the inherent limitations of measuring knowledge in all-or-nothing terms. Rather, Cronbach (1942; 1943) argued in favor of exploring and assessing degrees of word knowledge in order to better capture a fuller and richer understanding of what a learner knows about words. He argued that more sophisticated tests were needed to accurately assess students’ vocabularies: “Testing should determine the degree to which [a student’s] understanding is complete rather than to say that [the student] ‘knows’ or ‘does not know’ the word” (Cronbach, 1943). To enable more accurate assessment, Cronbach (1942) proposed five degrees of word knowledge: generalization—the ability to provide a definition of a word; application—the recognition of the appropriateness of particular words in differing situations; breadth of meaning—the understanding that words may have multiple meanings; precision of meaning—the ability to recognize that meanings are contextual; and availability—being able to use a word, both orally and in writing. Cronbach’s (1942) aim was to provide teachers with a diagnostic measure which would aid them in providing differentiated instruction, based on students’ strengths and weaknesses. Such instruction might prevent a situation in which a student could provide a definition of a word that did not correspond to a deep understanding or ability to use the word in speech or writing.
In a related article, Cronbach (1943) aimed to develop a measure which would assess a student’s ability to understand and apply new vocabulary in new situations, and transfer that knowledge across diverse settings. To that end, Cronbach (1943) proposed a multiple-item true-false test, in which the target word is followed by a series of words which the student marks true if that concept applies to the target word, or false if the concept does not apply to the target. For example, under the target word element, the student would see: brass, iron, water, sulfur, fire, and oxygen. For each choice, the student would mark true or false, and thus reveal the degree to which s/he had ownership over the target word. “As a diagnostic test, the multiple-item form is superior to the customary form in which only one response per word is obtained, since, when one obtains five or more responses, the student’s knowledge is more reliably measured” (Cronbach, 1943, p. 532). Cronbach (1943) tested his instrument with 209 high school students, and reported that the results were suggestive. He concluded:
It appears important in many situations to determine how precisely a student understands a word rather than whether he can pass a single-item test on the word. In other cases, one may wish to analyze objectively just what meaning a word has for a student, without necessarily implying that one meaning is correct. (Cronbach, 1943, p. 534).
Other researchers addressed the question of what it means to know a word both indirectly and directly, reflecting the understanding that research devoted to increasing the size of students’ vocabulary “assumes that enlargement of vocabulary is in itself a virtue, without questioning the dimensions of the concepts with which words are associated” (Serra, 1953, p. 277). Bear and Odbert (1941) devised a scale on which to measure degrees of familiarity with words, which ranged from “readily recognizable as old acquaintances” to “know but cannot quite place” to “complete strangers” (p. 754). The authors conducted a study of 225 first-year college students to discover whether participants were aware of their own word knowledge, using measures of vocabulary knowledge and reading, as well as a psychological instrument. They concluded that “the average student’s insight into the extent of his word knowledge is faulty” (p. 759). Furthermore, the students who appeared to be most in need of vocabulary growth were often those who had the least insight into the limitations of their own vocabulary knowledge.
Feifel and Lorge (1950) attempted to clarify existing knowledge about stages of word meaning development with their examination of responses provided by 900 children, ages six through fourteen. Echoing Cronbach (1942; 1943), Feifel and Lorge (1950) pointed out that many vocabulary tests failed to distinguish degrees of understanding. Drawing on research from the previous decade, Feifel and Lorge (1950) grounded their study on the assumption that the character of children’s definitions developed as the children aged. Furthermore, they argued that “the character and quality of the word definition given by the individual permitted insight into his thought processes” (Feifel & Lorge, 1950, p. 3). The authors administered a standardized vocabulary measure and classified participant responses according to Feifel’s (1949) previous study of how people respond to vocabulary questions on a standardized measure. The five categories of student responses were: synonym, description, explanation, demonstration, or error. Feifel and Lorge (1950) reported significant differences between younger and older children’s responses: older children provided explanation- and synonym-type definitions, whereas younger children tended to provide description- and demonstration-type responses. The younger children also appeared to perceive words more concretely, and were less likely to generalize.
Building on Feifel and Lorge’s (1950) work, Kruglov (1953) used the same response categories with a test of word recognition, which differed from the test of vocabulary recall used by Feifel and Lorge (1950). Kruglov (1953) argued that traditional vocabulary research had focused on the quantity of words a person could define “while the dimension of the quality of these word definitions has largely been ignored” (p. 229); therefore, she constructed a multiple-choice vocabulary test that corresponded to Feifel and Lorge’s (1950) response categories. The instrument was administered to four classrooms (one each at grades 3, 5, 7, 8; total population was 134) to test the hypothesis that the percentage of synonym and explanation-type responses would increase with age. Kruglov (1953) reported an increase in synonym-responses across the ages, but no significant difference for explanation-type responses. She concluded that “recognition vocabularies, just as recall vocabularies, differ in quality as well as in range from one age or grade level to the next” (Kruglov, 1953, p. 241). Even though Feifel and Lorge (1950) and Kruglov (1953) set out to explore the qualitative dimensions of word knowledge, their use of recall and recognition measures was problematic. Being able to complete a recall or word recognition task does not necessarily indicate a firm grasp of, and ability to use, an underlying concept: “The ability to recognize a word does not ensure complete understanding” (Seegers, 1946, p. 61).
Serra’s (1953) review of research underscored the importance of expanding students’ breadth and precision of word meanings (T. L. Harris, 1969). Serra (1953) reviewed more than thirty studies from the previous three decades, and concluded that concept development was more successful when instruction invited, and honored, the use of students’ experiential knowledge; when teachers engaged in word study to broaden vocabulary and inform word meaning; and when students engaged with words’ multiple meanings (Serra, 1953, pp. 283-284). The importance of establishing rich and appropriate labels (words) for underlying concepts and experiences was (and continues to be) vital, because
Words, after all, are the deposit of experience—the result of what we have done or are thinking. They are the bearers of meaning—the symbols which represent experience. . . Words represent the concepts, the distillate of previous experience. (Dale, 1956, p. 114).
Measuring word knowledge with fixed categories (such as present or absent, known or unknown) ignores the fact that “The essence of language is fluidity, not rigidity” (Dale, 1956, p. 123). Eichholz and Barbe (1961) designed, and then tested, an approach to vocabulary instruction grounded in the belief that “any word in an individual’s vocabulary may be placed at some stage along a continuum whose extreme poles are known and unknown but which has intermediate stages of knowing” (p. 2, emphasis in original). The intermediate stages included: having heard or seen a word, with little effect; a word which motivated the reader to take some kind of action; a word which moved the reader across that threshold of action to engagement (action was characterized as using the dictionary, speaking to someone, or generally adopting an interested stance); a word which was available for use, but whose multiple meanings were not fully understood by the reader; and finally a word that had been embodied by the reader, who was able to use it at will and with flexibility (Eichholz & Barbe, 1961, p. 2). In selecting words to include in the study, Eichholz and Barbe (1961) avoided words that students had not encountered previously. They believed that vocabulary growth, especially developing multiple meanings for words, would come from experience and exposure.
It would be possible, but practically foolish, to teach words they had never even seen before. In order for vocabulary training to be at all valuable and permanent, there must be an opportunity for the individual to use the words . . . learned. (p. 3).
Eichholz and Barbe’s (1961) study involved 105 seventh graders (54 in the experimental group and 51 in the control group). Students in the experimental group received informal lectures, delivered once a week for eight weeks by the experimenters on word history, dictionary use, and other related topics. The goal of these talks was to raise students’ interest in words in general. The experimental class also completed two practice tests as homework, using a self-checking device created by the study authors. The students in the control group neither heard the informal lectures on word histories, nor did they complete practice tests. Results of the final multiple-choice test at the conclusion of the study revealed that the number of encounters a student had with a word influenced retention. The authors attributed the gain to practice, which the experimental group received both through their practice tests, and the use of the self-checking device. Therefore, the authors recommended that teachers adopt this method of instruction, because it did not take too much time away from teaching other content.
Although the researchers spent a considerable amount of their article describing the degrees of word knowledge, they used an instrument to determine retention which did not appear to explore those degrees. Ultimately, Eichholz and Barbe (1961) did not actually report findings related to degrees of knowledge, which was very unfortunate, because “the important fact about a child’s vocabulary may be, not the number of words [she or] he recognizes superficially, but the quality of [her or] his associations with different words” (Russell & Saadeh, 1962, p. 170).
To research the nature of children’s vocabulary, Russell and Saadeh (1962) conducted a study with 257 students in the 3rd, 6th, and 9th grades to determine whether and when a student would choose functional, abstract, or concrete definitions on a multiple-choice vocabulary test. For example, the target word count was followed by four possible responses, representing the three categories and a wrong response. The functional definition of count provided was “to find the number of things in a group”; the concrete definition was “to find how many pennies are in your pocket”; the abstract definition was “to say numbers in order—upward or downward”; and the incorrect response was “to tell numbers one after the other” (Russell & Saadeh, 1962, p. 171). The results indicated that 3rd grade students preferred concrete and functional responses, and that the number of functional and abstract choices increased with age. The authors concluded that children’s vocabulary should be measured for breadth and depth of meaning, as well as by the types of definitions that children choose.
Despite the studies reviewed above, which explored “the depth and complexity of vocabulary issues” (Beck & McKeown, 1991, p. 790), most research in vocabulary has been, and continues to be, based upon a view of word knowledge as receptive knowledge. Recognizing this view, and understanding its implications for research, is an extremely important task to which I now turn.
Implications for current and future research
Looking back at the research I’ve reviewed in this paper, I am struck by two things. First, I see a need to re-examine how researchers have defined what it means to know a word, and the implications those definitions carry for interpreting the research. Second, I am left wondering about the overall purpose of vocabulary research (and instruction).
What does it mean to know a word? Researchers from across the century clearly understood the importance of gaining a more nuanced understanding of what it means to know a word. Beck and McKeown (1991) were neither the first nor the last to declare that “knowing a word is not an all-or-nothing proposition” (p. 791). In the early part of the century, Cronbach (1942; 1943) cautioned against testing for simple (single) word meanings, because such tests contribute little useful information. Dale (1956) argued against thinking of vocabulary knowledge as a process of accumulating “bricks,” because such an orientation denied the essential flexibility and unfixedness of language. Herman et al. (1987) warned that “if researchers are unaware of the incremental nature of vocabulary acquisition and fail to devise tests that are sensitive to partial gains in word knowledge, they may conclude erroneously that incidental acquisition of vocabulary knowledge has not occurred” (p. 264).
Despite these cautions, much of the vocabulary research conducted in the second half of the 20th century employed measures that reflected a dichotomous view of knowledge: the presence or absence of word knowledge. As such, these measures may have failed to capture the degrees of knowledge that so many researchers clearly valued. “If the goal is for students to fully understand and use words, then evaluations based on simple synonym matching or multiple-choice definitions will not tell us if that goal has been reached” (Beck et al., 2002, p. 11). While this fact does not mean we should dismiss all vocabulary research that measures vocabulary knowledge in binary terms, it does complicate my reading of the research. I’m challenged by the question of what we really know about vocabulary knowledge (and if it is even possible to quantify that knowledge). As I look forward to my own research, the goal of exploring the degrees and dimensions of student knowledge calls me, because “information needed by researchers and educators goes well beyond what can be learned from multiple-choice tests” (Beck & McKeown, 1991, p. 796). I believe that the field of vocabulary research will be richer and better able to describe what students know if we consider their knowledge about words as being located along a continuum.
What’s the purpose of encouraging vocabulary growth? Vocabulary growth serves many purposes. One common goal of researchers across the century has been to improve students’ reading comprehension through vocabulary. For those researchers, gains in reading comprehension were the final goal. In fact, Herman and Dole (1988) argue that studies of instructional contextual approaches to vocabulary growth had as their aim improved comprehension, not vocabulary. In a recent review of vocabulary assessment in the 20th century, Pearson, Hiebert, and Kamil (2007) argue vigorously for increased teaching of vocabulary and also for more study of its relationship to comprehension. As Pearson et al. (2007) explain, “the assessment of vocabulary as it pertains to reading comprehension has almost exclusively emphasized the receptive dimension of vocabulary” (p. 284). Receptive vocabulary knowledge is generally classified as that which is activated in reading and listening; to be successful, a reader or listener must know (or at least have passing familiarity with) the words she encounters. On the other hand, expressive vocabulary knowledge relates to one’s ability to be productive with language. Comprehension research has tended to focus on receptive knowledge (Pearson et al., 2007).
I have no argument with using vocabulary instruction as a vehicle for improving reading comprehension, and I do not intend to dismiss decades of research dedicated to exploring the effects of vocabulary instruction on comprehension. I do, however, have a question about what conditions need to be in place, and the kinds of research we need to conduct, in order to explore, and ultimately promote, expressive vocabulary knowledge. I am intrigued and challenged by the need to create opportunities in which researchers (and teachers) can investigate a student’s “ability to distinguish a correct from an almost correct meaning, in order to know the range of situations in which [she or] he can use the term without error” (Cronbach, 1942, p. 208). I, too, want to help students move beyond receptive knowledge to expressive knowledge, wherein they have embodied their understanding and can use it productively to communicate, both in school and in life.
If you would like a copy of the References, please post a comment below.
Even though APA forbids them, I love footnotes. Here are mine:
According to J.R. Jenkins, Stein, and Wysocki (1984), learning occurs without direct instruction, whereas deriving word meanings results from explicit directions to consider unfamiliar words during reading.
The rule in question was that when students encountered an unfamiliar word, they should attempt to discern its meaning from context.
The tests were classified as difficult, because all four of the distractor responses were semantically related, or easy, because only two of the potential responses were related.
Even though this study does not actually test a particular instructional method, the model of morphological analysis proposed by White et al. (1989) is interesting because it resembles the process recommended by D. R. Bear, Invernizzi, Templeton, and Johnston (2004, 2008), in which students learn to decompose affixed words, seek out their meaning-bearing stems, and recompose words in meaningful ways.
This scale resembles Dale’s (1965) four-stage scale for measuring word familiarity and knowledge: never saw it before; heard, but doesn’t know; recognizes within context; and knows it well.
A functional definition includes the function of the word, while an abstract definition lacks reference to a specific function.
Calfee and Drum (1986) expanded Cronbach’s (1942; 1943) work by adding ease of access and appreciation of wordplay, metaphor, and analogy.