
A substantial number of students struggle in learning to read (Fletcher, Lyons, Fuchs, & Barnes, 2018), with 30% or more of fourth graders reading below a basic level (NAEP 2015, 2017). A recent assessment of Tier 2 reading remediation indicated that children experience minimal benefit from such help (Balu et al., 2015). This is consistent with decades of research showing that even with reading intervention, weak readers typically remain weak readers (Jacobson, 1999; Maughan, Hagell, Rutter, & Yule, 1994; Protopapas, Sideridis, Mouzaki, & Simos, 2011; Short, Feagans, McKinney, & Appelbaum, 1986).

Despite these discouraging findings, research indicates that effective prevention and intervention efforts can reduce the percentage of at-risk readers who develop reading problems (Foorman, Francis, Fletcher, Schatschneider, & Mehta, 1998; NICHD, 2000; Shapiro & Solity, 2008; Vellutino et al., 1996). There is evidence that struggling readers may be able to gain and maintain approximately one standard deviation of improvement on normed reading assessments (McGuinness, McGuinness, & McGuinness, 1996; Torgesen et al., 2001; Torgesen, Wagner, Rashotte, Herron, & Lindamood, 2010; Truch, 1994, 2003, 2004). It was strong research outcomes like these (Foorman et al., 1998; NICHD, 2000; Torgesen et al., 2001; Vellutino et al., 1996) that prompted the development of response to intervention and multi-tiered systems of support (RTI/MTSS). There is little evidence, however, that the actual instructional and intervention techniques that yielded the highly effective research results have been incorporated into RTI/MTSS implementation efforts (Balu et al., 2015).

This chapter will examine the most effective prevention and intervention approaches for difficulties with word-level reading. Oakhill, Cain, and Elbro (Chap. 5, this volume) discuss interventions for reading comprehension difficulties not attributable to word reading problems. The goal of this chapter is to present and integrate findings from multiple relevant niche areas within the scientific literature on reading. This is intended to build a deeper understanding of how reading development unfolds and why some remedial approaches might work better than others.

1 Word Learning Research Versus Intervention Research

Empirical reading research is a vast global enterprise conducted by scientists in various branches of psychology, speech pathology, linguistics, education, special education, literacy, medicine, and neuroscience. In the USA, tens of millions of federal dollars are spent each year on such research, with millions more funded by other governments and private foundations. Reading research is reported in scientific journals and is largely unknown outside the community of researchers themselves. Studies of educational professionals consistently demonstrate that there is little familiarity with the findings from the scientific study of reading (Moats, Chap. 3, this volume). This includes K-3 general education teachers (Cunningham, Perry, Stanovich, & Stanovich, 2004), special education teachers (Boardman, Argüelles, Vaughn, Tejero Hughes, & Klingner, 2005), literacy specialists (Moats, 1994, 2009), and school psychologists (Nelson & Machek, 2007).

Every year, several hundred empirical research reports and reviews on reading appear in English-language scientific journals. The field is so vast that it is impossible for researchers to remain current with the entire enterprise. Reading scientists must specialize in one or more of the many niche areas within the field. This may explain a curious observation: The research on word-level reading intervention and the research on word learning (i.e., how we learn and remember written words for later recall) do not overlap in any substantive way. It is extremely rare for either of these specialized areas to cite research from the other area. Miles and Ehri (Chap. 4, this volume) review word learning, more properly referred to as orthographic learning. The present chapter provides an overview of the word reading intervention literature and also integrates it with the orthographic learning literature. The goal is to leverage the findings from both areas to inform best practice in prevention and intervention for reading difficulties and disabilities.

2 Prerequisite Issues

There are many different reading philosophies from which different (and even contradictory) remedial suggestions have arisen. How should one navigate through these possibilities in a manner that will inform best practices? To assist with this, we will examine some prerequisite issues critical to identifying the most effective interventions.

  (1) How to best measure/estimate intervention effectiveness.

  (2) Distinguish between effective instructional principles and effective programs.

  (3) The assumptions behind current approaches to teaching and remediating reading.

  (4) How research on orthographic learning can help interpret the findings from the intervention literature.

  (5) Why some children struggle and thus require intervention in the first place.

3 Determining Instructional or Intervention Effectiveness

There are multiple ways to measure progress in word reading skills. The four most common will be examined below.

3.1 Raw Score Improvements

Raw score improvements demonstrate progress, but they cannot tell us if a student is catching up. A weak second grader may go from 12 words correct per minute (wcpm) on a paragraph reading test to 36 wcpm. However, this tripling of raw scores does not necessarily mean this intervention is effective. During that same time frame, typically developing readers, on average, progressed from 50 to 95 wcpm. The 38-wcpm gap has grown to 59 wcpm. Thus, raw score improvements do not necessarily mean “catching up.”
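The arithmetic in this example can be checked with a short sketch. The Python below is only an illustration: the helper name `gap` is hypothetical, and the numbers are the ones given in the paragraph above.

```python
def gap(typical_wcpm, struggling_wcpm):
    """Return how many words correct per minute the struggling reader trails by."""
    return typical_wcpm - struggling_wcpm

# Values taken from the example above (words correct per minute).
struggling_before, struggling_after = 12, 36
typical_before, typical_after = 50, 95

gap_before = gap(typical_before, struggling_before)
gap_after = gap(typical_after, struggling_after)
growth_ratio = struggling_after / struggling_before

print(f"raw score grew {growth_ratio:.0f}x; gap moved from {gap_before} to {gap_after} wcpm")
```

The raw score triples, yet the distance to typical peers widens, which is the sense in which raw gains alone cannot show "catching up."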

3.2 Statistical Significance

Statistical significance is used to judge the likelihood that a difference between two statistical outcomes is due to chance rather than to the factor under study (e.g., the type of reading intervention). Statistical significance cannot tell us if an intervention is effective. Perhaps both approaches under study are inferior to all other approaches. In that case, statistical significance only means that one intervention was less ineffective than the other. An experimental group may show statistically significant gains compared to a control group while not closing the gap with typical peers. This is not a hypothetical concern. Numerous studies have demonstrated statistically significant differences compared to control groups, despite normative reading assessment gains of only 0 to 4 standard score points (e.g., Christodoulou et al., 2017; Mitchell & Begeny, 2014; Vaughn et al., 2010, 2012).

3.3 Effect Size

Effect size is a common statistic in intervention research. It indicates the magnitude of the difference between an experimental and control group, or between pretest and posttest scores. An effect size of +1.0 means one group made one standard deviation of improvement relative to the comparison group (or relative to the pretest scores). Despite its common use in intervention research, effect size cannot be consistently relied upon to determine intervention effectiveness. The authors of the intervention study that prompted Tier 3 of RTI stated that effect sizes are “misleading in that they do not provide information about the rate of normalization of reading skills. Instead, they describe the advantage in reading growth for children in an experimental condition relative to a control condition” (Torgesen et al., 2001, p. 34). Consider the following examples.

Vaughn et al. (2012) found a +.49 effect size, which represents the equivalent of about a 7.5 standard score point difference. However, the normative standard score gain for the experimental group was 0. This discrepancy resulted from the fact that the control group’s normative performance declined during the intervention period. The +.49 effect size was based on a comparison with the control group, not a normative group.

Christodoulou et al. (2017) reported an impressive +.96 effect size for a summer tutoring program for poor word readers. Yet, a normed word identification posttest yielded a gain of less than one standard score point (.61) by the experimental group. This discrepancy occurred because the experimental group was compared to an untreated control group of poor word readers during the summer break. The control group’s normative performance declined, resulting in the experimental group scoring much higher than the no-treatment control group on the post-intervention word identification test.

These examples illustrate how effect size can potentially make ineffective approaches seem effective. The reverse can be true as well. Torgesen et al. (2010) studied two intervention groups and a control group. The two intervention groups had similar results with a combined average effect size of +.53. Yet, the standard score point outcomes of the intervention groups were 21 and 23 points, gains that rank among the strongest in the intervention literature. The moderate +.53 effect size resulted from the fact that the control group displayed a strong outcome of 14 standard score points. This substantially narrowed the difference between experimental and control groups, yielding a moderate effect size.

All three of the studies used the same test to measure word reading improvement, the word identification subtest from the Woodcock Reading Mastery Test-Revised (WRMT-R). Despite using the same outcome yardstick, they obtained discrepant measurements between effect sizes and standard scores. This is because effect sizes involve comparing an experimental group to a control group. Control groups do not represent a stable baseline across studies. To determine intervention efficacy, it thus seems judicious to supplement effect sizes with normative score progress.
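A minimal sketch may help show how an effect size and a normative gain can diverge. It rests on several assumptions. The scores are invented for illustration and are not data from any cited study. The effect size is computed as a pooled-standard-deviation standardized mean difference (Cohen's d), which may differ from the formulas the cited authors used. The conversion to standard score points assumes the effect size is expressed in units of the norm group's standard deviation, taken here as 15 on a scale with a mean of 100. The function names are hypothetical.

```python
from statistics import mean, stdev

def cohens_d(treatment, control):
    """Standardized mean difference using a pooled standard deviation."""
    n_t, n_c = len(treatment), len(control)
    pooled_var = ((n_t - 1) * stdev(treatment) ** 2 +
                  (n_c - 1) * stdev(control) ** 2) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / pooled_var ** 0.5

def effect_size_to_points(d, norm_sd=15):
    """Express an effect size in standard score points (assumes norm-SD units)."""
    return d * norm_sd

# Hypothetical standard scores; not data from any study.
treatment_pre  = [78, 82, 85, 88, 92]
treatment_post = [78, 82, 85, 88, 92]   # experimental group: no normative gain
control_post   = [73, 77, 80, 83, 87]   # control group: declined 5 points

effect_size = cohens_d(treatment_post, control_post)
normative_gain = mean(treatment_post) - mean(treatment_pre)

print(f"effect size = {effect_size:+.2f}, normative gain = {normative_gain} points")
print(f"+.49 in norm-SD units is about {effect_size_to_points(0.49):.2f} points")
```

In this invented case the experimental group gains nothing against national norms, yet the effect size is large, because the comparison baseline (the control group) moved downward.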

3.4 Normative Standard Score Point Gains

Keenan, Betjemann, and Olson (2008) noted that the inter-correlations among word identification subtests tend to be high (unlike reading comprehension subtests). This suggests that such normed subtests do a suitable job of reflecting and stratifying the skill levels of students in the general population. If that is the case, then nationally normed word identification subtests provide a useful supplement to effect sizes when determining the effectiveness of instructional or intervention approaches. Normative scores can suggest whether an experimental group’s progress allowed them to close the gap relative to a national norm group. “Standard scores are an excellent metric for determining the ‘success’ or ‘failure’ of interventions for children with reading disabilities, because they describe the child’s relative position within the distribution of reading skills in a large standardization sample” (Torgesen, 2005, p. 524).

Despite this strength, normative comparisons have difficulties as well. They do not represent an equal interval scale, and a few items on a subtest can make a larger or smaller impact on the standard score depending on the age of the student. Also, norms represent a slice in time and participants in research studies are typically being compared to an earlier cohort. Furthermore, if normative gains are not accompanied by an effect size comparison with a control group, it is difficult to know whether normative improvements were related to factors going on in that local situation independent of the intervention under study. As a result of these concerns, both effect sizes and standard scores appear to be needed to best determine the efficacy of a given intervention approach or program.

4 Distinguishing Between Principles and Programs

Although commercially available programs have been included within various studies, researchers typically select such programs to illustrate a particular underlying concept, principle, or general approach, not to do a study on that particular program. For example, if researchers want to compare a three-cueing system reading approach with a phonic approach, they typically select a commercially available example of each to address their research questions. This is more efficient than creating an experimenter-designed program in order to study a given principle or practice.

Though some programs, or parts of programs, have been used in research, there exists no Consumer Reports-style body of research evidence that allows educators to compare among existing reading programs. The majority of reading programs on the market have no direct research support reported in scientific journals. This means that educational professionals need to become familiar with the concepts and principles that research has shown to be effective or ineffective. With that knowledge, educators can make more informed decisions when considering various reading programs and intervention approaches.

The What Works Clearinghouse (WWC), bestevidence.org, and similar outlets seem to approximate that Consumer Reports-type of service for educational professionals. But those well-intentioned efforts have been problematic for at least two reasons. First, they rely primarily on effect size to judge program effectiveness, not standard score gains. Second, there is not a substantial pool of program-specific research from which these outlets can draw. Thus, the WWC and similar outlets are no substitute for educational professionals who are well-informed regarding the findings from reading research.

There is a more useful outlet for educators that avoids the inherent difficulties with the WWC, bestevidence.org, and similar sources. The U.S. Department of Education’s Institute of Education Sciences (IES) has developed IES Practice Guides that are useful sources of research information. They focus on findings related to concepts, principles, and approaches rather than on specific programs. Two useful examples, which can be easily accessed via an Internet search, are Foorman et al. (2016) and Gersten et al. (2008).

5 The Assumptions Behind Reading Instruction and Intervention

All of the traditional approaches to teaching reading can be classified into one of four general categories based on their unit of focus. For phonics instruction (Chall & Popp, 1996), the unit of focus is the letter and digraph (e.g., ch, sh, oa, ee). For the linguistic approach (Bloomfield & Barnhart, 1961), the focus is on the rime unit (clip, dip, lip, sip). For the whole word/look–say approach, the focus is the word as a unit (Adams, 1990). For the whole language/balanced literacy/three-cueing approach (Goodman, 1996), the unit of focus is the sentence or paragraph.

Within these approaches, there are variations and teachers generally draw techniques from multiple approaches. Nonetheless, it is useful to be aware of the underlying assumptions of each approach because they drive instruction and intervention. An examination of the assumptions behind them may provide a window into our current intervention efforts and may help us understand why these classic approaches provide limited benefits for struggling readers (Balu et al., 2015; Jacobson, 1999; Maughan et al., 1994; Protopapas et al., 2011; Short et al., 1986).

The Phonics Approach. The goal of phonics instruction is for students to independently read newly encountered words using letter-sound knowledge and phonological blending (Chall & Popp, 1996; Beck & Beck, 2013). Although the phonics approach supports the identification of unfamiliar words, it seems that phonics authorities assume that after multiple successful opportunities of phonetically decoding words, those words eventually are remembered as visual wholes (Chall & Popp, 1996; Beck & Beck, 2013), presumably based on the visual memory hypothesis, described below. This visual memorization is also called upon to address irregular or exception words (Chall & Popp, 1996; Beck & Beck, 2013). Despite these concerns, there is a large and long-standing history of research findings showing that phonics instruction in K-2 yields superior results to the linguistic, whole word, and balanced instruction approaches (Adams, 1990; Anderson, Hiebert, Scott, & Wilkinson, 1985; Bond & Dykstra, 1967; Brady, 2011; NICHD, 2000).

The Linguistic Approach. The linguistic approach (Bloomfield & Barnhart, 1961) is intended to support beginning readers by focusing instruction on onsets and rimes, which generally is easier than phoneme-level processing (Adams, 1990). The assumption seems to be that rimes are learned as visual wholes. Thus, like phonics, this approach presumes that some form of visual memory supports learning to read.

Whole Word Approach. The whole word approach appears to be squarely founded upon the visual memory hypothesis. The visual memory hypothesis assumes that visual memory underpins skilled reading, as readers quickly access familiar words from a visual memory bank of some sort. This has very strong intuitive appeal. When we look at a chair and say “chair,” or we see the printed word chair and say “chair,” it feels like the same process—visual input and verbal output. But this strong intuition does not align with numerous research findings.

The Inadequacy of the Visual Memory Hypothesis. Because the visual memory hypothesis appears to play an essential role in the whole word, phonics, and linguistic approaches, it is important to consider its validity. Multiple findings from independent lines of research have clearly demonstrated that, despite its intuitive appeal, the visual memory hypothesis does not accurately describe how skilled readers store or retrieve printed words. This evidence is briefly summarized below (see Kilpatrick, 2015 for citations).

Some of the reasons we know that reading is not based in any substantial way on visual memory include: (1) There is a very weak correlation between visual memory skills and word-level reading; (2) there are moderate to strong correlations between various phonemic tasks and word-level reading; (3) individuals who are deaf have great difficulty with word-level reading despite their typical visual memory skills; (4) studies using different fonts, cases, and personal handwriting, including mixed case studies (e.g., wOrDs LiKe tHiS), show that it is the sequence of letters that is stored and instantly activated, not the visual appearance of the word; (5) brain imaging studies show differing activation patterns between naming written words that are familiar to us, naming nonsense words, naming faces, and naming visually presented objects (Dehaene & Cohen, 2011; Sand & Bolger, Chap. 10, this volume); (6) anecdotally, we sometimes “block” on people’s names when we see them or even the names of visually presented objects (“hand me that thingy over there”), yet we never fail to recall familiar written words, suggesting that word reading involves more than a simple visual–phonological retrieval process. None of these six factors are explicable based on the intuitive notion that visual memory plays a major role in remembering written words.

Rather than visual memory, readers store words in long-term memory based on orthographic memory (Ehri, 2005, 2014; Ehri & Saltmarsh, 1995; Kilpatrick, 2015; Miles & Ehri, this volume; Rack, Hulme, Snowling, & Wightman, 1994; Share, 1995, 2011). This refers to a memory for a particular letter order, regardless of the visual characteristics of the word (i.e., uppercase, lowercase, varying fonts, or personal handwriting in cursive or script). For example, in the word bear, none of the four uppercase letters looks the same as its lowercase counterpart (BEAR, bear). Yet once a child learns a word in one case, they typically have instant retrieval of that word in the other case, or another font, despite the visual dissimilarities. It is thus the letter order that comprises orthographic memory.

Whole Language/Balanced Literacy. The assumption inherent in this approach is that the sentence or paragraph context is a significant contributor to word-level reading (Goodman, 1996, 2005). The idea is that three systems simultaneously cue the reader to gain meaning from print: context, linguistic information (grammar and syntax), and letter-sound knowledge (often only the first letter is needed to help the previous two cueing systems to correctly determine the word). A key assumption is that context plays a large role in identifying written words. This theory cannot account for the fact that skilled readers can quickly and accurately read words in isolation, while struggling readers cannot. Although context is essential for meaning, it is rarely necessary for instant and accurate word recognition (except for homographs like wind/wind, dove/dove, present/present).

5.1 A Note on Reading Approaches

All four of the classic approaches have origins in the 1800s or earlier (Adams, 1990; Smith, 1965), predating the scientific revolution in reading. Our current reading instruction and remediation continue to be based on the same assumptions, even though research in the last 40–50 years has invalidated many of these assumptions. The visual memory hypothesis, which plays a central role in the whole word approach and a supporting role in the phonics and linguistic approaches, is inconsistent with a vast amount of research findings (Kilpatrick, 2015). The fourth approach, whole language/balanced literacy, promotes strategies that are inconsistent with research findings about how we remember written words. Perhaps it is not surprising, then, that we have had a long history of discouraging results when addressing reading difficulties (Balu et al., 2015; Jacobson, 1999; Maughan et al., 1994; Protopapas et al., 2011; Short et al., 1986).

6 Contributions from the Orthographic Learning Research

Recall that orthographic learning involves remembering a word such that it is instantly, effortlessly, and accurately recalled and requires no phonic decoding or guessing. Miles and Ehri (Chap. 4, this volume) detail how this works, so it is recommended that the reader becomes familiar with that chapter in order to fully understand what follows. Yet, it may be useful to highlight some key points that will guide our understanding as we seek to interpret why different interventions yield widely differing outcomes.

First, orthographic learning research indicates that word reading is not based on visual memory (Ehri, 2005; Kilpatrick, 2015; Share, 1995). It should thus not be surprising that remedial approaches that focus primarily on visual exposure yield limited results. Visual memory-based intervention methods include expecting children to memorize words as unanalyzed wholes and reading practice approaches that assume visual exposure/repetitions will develop visual memories of those words.

Second, the two major cognitive theories of orthographic learning, Ehri’s orthographic mapping theory and Share’s self-teaching hypothesis, both affirm the centrality of letter-sound knowledge and phonemic skills for storing words in long-term memory. Although this may seem counterintuitive, it is strongly supported in the research literature (Cardoso-Martins, Mamede Resende, & Assunção Rodrigues, 2002; Dixon, Stuart, & Masterson, 2002; Ehri, 2005, 2014; Kilpatrick, 2015; Laing & Hulme, 1999; Miles & Ehri, this volume; Rack et al., 1994; Share, 1999; Stuart, Masterson, & Dixon, 2000). The implication is that instructional approaches that do not focus on letter-sound skills and phonemic skills should not be expected to yield optimal results.

Third, studies examining Share’s self-teaching hypothesis have shown that for typically developing readers from second grade on, only one to four exposures are needed before a newly encountered word becomes permanently stored for later, effortless retrieval (Cunningham, Perry, Stanovich, & Share, 2002; Share, 1999, 2004). If a student routinely requires many more exposures than that, the student’s orthographic learning ability is presumably impaired. Because orthographic learning is based on letter-sound skills and phonemic skills, it is the acquisition of those skills that will allow students to improve their ability to remember written words, not simply providing multiple exposures.

Fourth, orthographic mapping theory and the self-teaching hypothesis both propose that the process of remembering words is implicit. This tenet is easily confirmed. Consider the fact that adult skilled readers have an instantaneously accessible word reading vocabulary (called an orthographic lexicon or sight word vocabulary) ranging from 30,000 to 80,000 words (Crowder & Wagner, 1992; Rayner & Pollatsek, 1989), depending on reading experience. It seems fair to say that we do not remember putting conscious effort into storing tens of thousands of words (with occasional exceptions for very difficult or unusual words). The vast majority of words in our very large orthographic lexicons were added incidentally after encountering and sounding out new words (Share, 1995, 1999, 2011).

Research supporting Share’s self-teaching hypothesis indicates that students add new words to their sight vocabularies/orthographic lexicons after successful encounters, via phonic decoding, with previously unfamiliar words in the context of silent reading of real text (Cunningham et al., 2002; Share, 1999, 2004). If an unfamiliar word is not phonically decoded, the prospects for remembering the word diminish greatly (Share, 1999). Ehri’s orthographic mapping theory explains the cognitive mechanism underlying this memory process (Ehri, 2005; Kilpatrick, 2015; Miles & Ehri, Chap. 4, this volume). A word’s pronunciation is parsed into its segmented phonemes, which in turn are mapped onto the letters in the written word. What is already known and established in long-term memory is the oral form of the word. This known pronunciation is used to encode/remember the word’s written form (Ehri, 2005), which only happens if students have skilled access to the phonemes within the oral pronunciations. Lacking such proficient phonemic skills disrupts this connection-forming process.
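As a loose illustration of the idea that phonemes are bonded to letters in order, the toy sketch below pairs each phoneme of a word with the letter or digraph that spells it. It is not a model from the orthographic mapping literature. The alignments are hand-specified for three example words, the phoneme symbols follow a common General American transcription, and the names `ALIGNMENTS`, `spelling_from_alignment`, and `is_fully_mapped` are hypothetical.

```python
# Hand-specified (phoneme, grapheme) pairs; illustrative only.
ALIGNMENTS = {
    "ship":  [("ʃ", "sh"), ("ɪ", "i"), ("p", "p")],
    "cheek": [("tʃ", "ch"), ("iː", "ee"), ("k", "k")],
    "boat":  [("b", "b"), ("oʊ", "oa"), ("t", "t")],
}

def spelling_from_alignment(alignment):
    """Concatenate the graphemes of an alignment in phoneme order."""
    return "".join(grapheme for _, grapheme in alignment)

def is_fully_mapped(printed_word, alignment):
    """True when the aligned graphemes reproduce the word's letter sequence.

    Case is ignored, reflecting the point that letter order, not visual
    form, is what is stored.
    """
    return spelling_from_alignment(alignment) == printed_word.lower()

for word, alignment in ALIGNMENTS.items():
    pairs = ", ".join(f"/{p}/ -> {g}" for p, g in alignment)
    print(f"{word}: {pairs}")
```

In this sketch a word counts as mapped only when every letter is accounted for by a phoneme of the known pronunciation, which is one way to picture why weak access to phonemes would leave the written form without anchors.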

As mentioned, the connection-forming process behind orthographic mapping appears to be implicit, that is, automatic and largely unconscious. If the process of storing words in long-term memory is largely unconscious in nature, it necessarily follows that the letter-sound and phonemic skills required to support that process must also be at a level of proficiency such that they are automatic and unconscious.

7 Determining Why Students Struggle in Word-Level Reading

When making decisions about remediation for students with poor word reading skills, we should consider the large research literature which investigates why some students struggle in learning to read words. Most popular notions about poor word-level reading, particularly when using the term dyslexia, focus on presumed visual–spatial–perceptual deficits. Such notions are inconsistent with the scientific findings (Ahmed, Wagner, & Kantor, 2012; Fletcher et al., 2018; Vellutino, Fletcher, Snowling, & Scanlon, 2004). Reading researchers operationally define word reading difficulties/dyslexia as poor performance in word identification tests despite adequate effort and opportunity, and not due to blindness, deafness, or severe intellectual disability (Fletcher et al., 2018; Hulme & Snowling, 2009; Vellutino et al., 2004). Poor word-level reading combined with typical language skills is referred to as dyslexia, while poor word-level reading combined with weak language skills is referred to as mixed reading difficulty or garden-variety poor readers (Gough & Tunmer, 1986; Joshi, Chap. 1, this volume). Regardless, in either case the poor word-level reading appears to be the result of the same causal factors.

What causes poor word-level reading/dyslexia? Elsewhere in this volume, the genetic and neurodevelopmental factors are discussed (Byrne, Olson, & Samuelsson, Chap. 9; Sand & Bolger, Chap. 10). For our purposes, dyslexia is the result of the phonological-core deficit (Fletcher et al., 2018; Hulme & Snowling, 2009; Morris et al., 1998; Stanovich & Siegel, 1994; Vellutino et al., 2004). There is a consensus that individuals with the phonological-core deficit display one or more of the following:

  • Poor phonemic awareness/analysis

  • Poor phonemic blending/synthesis

  • Poor rapid automatized naming

  • Poor phonological working memory

  • Poor letter-sound knowledge/nonsense word reading.

For years, researchers have referred to the phonological-core deficit as the most common cause of dyslexia, which seems to leave the door open to other possible causes. It is worth noting, however, that a recent review of dyslexia research referred to the phonological-core deficit multiple times as the “universal cause” of dyslexia (Ahmed et al., 2012). The authors did not explain this important shift in terminology. However, their reasoning can be inferred from the dyslexia research literature in that (1) we fail to find students who are struggling word-level readers who receive a “clean bill of health” on all five phonological-core characteristics listed above (Morris et al., 1998), and (2) four decades of scientific research into dyslexia have yet to reveal a compelling case for alternative causal explanations. There may be correlational features that occur among students with dyslexia, but there is no evidence for causality (Vellutino et al., 2004). The conclusion drawn from this is that with the caveats mentioned above, poor word-level reading is caused by poor phonological processing at some level or another. This conclusion is not surprising given the nature of alphabetic writing.

The Alphabetic Principle. Alphabetic writing systems are designed to capture the speech stream. Characters (letters) within alphabetic writing represent the individual sounds (phonemes) produced when people speak. Letters and letter combinations (e.g., ch, th; ee, oa) represent phonemes, not words. In any alphabetic writing system, we write phoneme-based characters that we string together to form words. These letter strings represent the sequences of sounds in the pronunciations of oral words. English is more inconsistent in phoneme-to-letter representation than “regular” written languages such as Italian or Spanish (Seymour, Aro, & Erskine, 2003; Ziegler & Goswami, 2005). Nonetheless, English writing is designed to transcribe oral speech at the level of individual phonemes within oral language. The insight that the characters on the page represent the segmented phonemes within spoken words is called the alphabetic principle. Poor word-level readers have poor awareness of the phonemic structure of spoken language and thus struggle with developing and applying the alphabetic principle. Because phoneme-level skills are necessary for both phonic decoding and remembering words, poor conscious or unconscious access to the phonemic structure of spoken language makes it very difficult to acquire these central aspects of reading.

We can conclude from the research on dyslexia that students’ word-level reading difficulties are primarily the result of the phonological-core deficit (Ahmed et al., 2012; Morris et al., 1998; Vellutino et al., 2004). Their poor access to the phonemic structure of the spoken language makes reading difficult for them. Reading interventions that successfully address this underlying problem would be expected to have better results than interventions that do not address the source of their reading difficulty.

7.1 Summary of Prerequisite Issues

We have examined the five prerequisite issues that help to establish the groundwork for making sense of the reading intervention research literature. First, we will rely on the assumption that for word-level reading, normative standard score outcomes are a useful supplement to effect size for determining intervention effectiveness. Second, we acknowledge that instruction and intervention research focuses primarily on principles and approaches, rather than on validating specific reading programs. Thus, we will seek to abstract from that research the best practices in terms of principles and approaches. Third, all of the four classic ways of approaching reading instruction and intervention (phonics, linguistic, whole word, whole language/balanced literacy) were developed long before the scientific study of reading and are insufficiently consistent with the findings from that research. Although the phonics approach yields superior results compared to the other three, it lacks a reliable mechanism for helping students remember the words they read.

Fourth, the orthographic learning literature has generated findings that can be used to guide our understanding of reading intervention. These include (1) word storage and retrieval are not based on visual memory; (2) letter-sound skills and phonemic skills are central to remembering words; (3) from second grade on, new words are remembered after only 1–4 exposures in typically developing readers; and (4) memory for words is largely an implicit, unconscious process, so the letter-sound and phonemic skills that support that process must also be proficient enough to operate unconsciously. Fifth, the nature of alphabetic writing combined with the last 40 years of research into dyslexia suggests that the phonological-core deficit is centrally responsible for word-level reading difficulties. These five prerequisite considerations provide important organizing principles and generate predictions that will bring clarity to the large and growing word-level reading intervention research.

8 Orthographic Learning Findings “Predict” Prevention and Intervention Outcomes

As mentioned previously, the orthographic learning and the word-level intervention literatures function independently. There appear to be no prospective studies designed to examine prevention or intervention from the perspective of Ehri’s and Share’s orthographic learning theories. However, we can do a “thought experiment” that involves applying findings from the orthographic learning research to the existing prevention and intervention research. This can yield valuable insights into those existing literatures.

The general findings from the orthographic learning research yield three predictions, or more precisely, expectations (because they interpret preexisting data). First, attempts at teaching struggling readers using visual memory strategies would not be expected to produce strong results, whether via visual memorization of whole words or reading practice using sentences and paragraphs. For students not skilled at remembering the words they read, multiple exposures would have limited benefit.

A second expectation would be that instruction and intervention efforts that do not include both letter-sound instruction and instruction in phonemic awareness skills would not have results as strong as interventions that include both of those elements.

The third expectation would be that intervention efforts that promote letter-sound skills and phonemic skills to the point of automaticity would yield better results than those that produce only simple accuracy on such tasks without automaticity. It is presumed that automaticity in letter-sound skills and oral phoneme analysis skills would facilitate the implicit and unconscious orthographic mapping process (Ehri, 2005, 2014; Kilpatrick, 2015, 2018; Miles & Ehri, Chap. 4, this volume).

As mentioned, these expectations or predictions represent a thought experiment. It is nonetheless useful because it allows us to conceptually apply the orthographic learning research findings to understanding the instructional differences found among various intervention studies. The next sections provide an overview of the prevention and intervention research and will illustrate how orthographic learning research explains the widely varying standard score point outcomes we find within the intervention literature.

9 Prevention of Reading Difficulties

There is extensive evidence showing that a large proportion of reading difficulties can be prevented. The National Reading Panel (NRP) reviewed a large body of K-1 studies showing dramatic reductions in the number of struggling readers when students were explicitly taught phonological awareness and letter-sound relationships (NICHD, 2000).

The NRP found that students trained in kindergarten and/or first grade in phonological awareness performed at a level equivalent to 7 standard score points higher in reading than those who did not receive such training. This dropped off to 4–5 standard score points at follow-up, which is expected given that most students eventually learn basic phonological awareness skills without being taught (Kilpatrick, 2015). The picture was quite different with at-risk students. The Panel found an impressive 13 standard score point difference in reading between trained and untrained at-risk students. This difference increased to 20 points at follow-up, indicating the enduring benefit to at-risk readers. Such students do not appear to develop these skills on their own, so if these skill deficits are not addressed, most at-risk students continue to struggle in reading.

The NRP found similar results for teaching letter-sound skills in K-1. Those trained explicitly and systematically in letter-sound relationships averaged the equivalent of 6 or 7 standard score points higher on word reading tests than those without such instruction. At-risk students showed an even greater benefit. They performed at a level equivalent to 11 points higher than their untrained at-risk counterparts on tests of word reading.
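
The standard score point differences above can be related to standard deviation units with simple arithmetic. The following is a minimal sketch, assuming the common standard score scale with a mean of 100 and a standard deviation of 15; the function names are hypothetical and the scale is an assumption, not something the Panel's report is quoted on here.

```python
# Assumption: standard scores are on a scale with mean 100 and SD 15.
SD_POINTS = 15.0


def points_to_sd_units(points: float, sd: float = SD_POINTS) -> float:
    """Express a standard score point difference in standard deviation units."""
    return points / sd


def sd_units_to_points(sd_units: float, sd: float = SD_POINTS) -> float:
    """Express a difference in standard deviation units as standard score points."""
    return sd_units * sd


# Point differences taken from the paragraphs above.
nrp_differences = {
    "phonological awareness, all students": 7,
    "phonological awareness, at-risk": 13,
    "phonological awareness, at-risk, follow-up": 20,
    "letter-sound instruction, at-risk": 11,
}

for label, points in nrp_differences.items():
    print(f"{label}: {points} points = {points_to_sd_units(points):.2f} SD")
```

Under this assumption, the 13-point difference for at-risk students is a little under nine tenths of a standard deviation, and the 20-point follow-up difference is about one and one third standard deviations.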

Application of Orthographic Learning Research. Empirical studies that support Ehri’s orthographic mapping theory and Share’s self-teaching hypothesis affirm that words are remembered based on their letter sequence (i.e., orthographic memory), irrespective of the appearance of the word (uppercase, lowercase, differing fonts, and handwriting). They also affirm that letter-sound knowledge and phonemic awareness skills are both central to the word memory process. It is well established that letter-sound knowledge and phonological skills are important for phonic decoding (called phonological recoding by researchers; e.g., Share, 1995), yet their centrality for remembering newly encountered words (Cardoso-Martins et al., 2002; Dixon et al., 2002; Ehri, 2005, 2014; Kilpatrick, 2015, 2018; Laing & Hulme, 1999; Stuart et al., 2000) seems less well known.

The research on preventing reading difficulties, though conducted independently of the orthographic learning research, is consistent with its findings. The successful prevention studies routinely used control groups with instruction based on assumptions from the classic visual memory-based whole word approach and/or the traditional three-cueing system approach (the basis of whole language and balanced instruction). As suggested above, the orthographic learning research would predict that visual memory-based instruction and three-cueing-oriented instruction would not promote learning to read as well as instruction that directly focuses on letter-sound relationships and on the oral phonemic structure of spoken words, which is what the prevention research indicates. Thus, the orthographic learning research and the prevention research closely align to build a strong foundation for understanding the nature of reading development, as well as the specific K-1 curricular elements needed for helping to prevent reading difficulties.

10 Interventions for Students with Reading Difficulties

10.1 Previous Reviews of Research

Since 1999, there have been over three dozen reviews and meta-analyses of the reading intervention research (e.g., Bus & van IJzendoorn, 1999; Edmonds et al., 2009; Ehri, Nunes, Stahl, & Willows, 2001; Flynn, Zheng, & Swanson, 2012; NICHD, 2000; Suggate, 2016; Torgesen, 2004, 2005; Wanzek & Vaughn, 2007; Wanzek et al., 2013). It is not the purpose here to catalog those reviews, nor to independently review the hundreds of intervention studies that have been conducted over the last 40 years. Rather, the goal is to identify important trends in those reviews, and the intervention research more generally, which highlight a significant and encouraging pattern in the research results.

Factors Affecting Intervention Outcomes. The various reviews and meta-analyses have examined numerous mediating factors that may influence the outcomes of intervention efforts. Five of the most commonly researched mediating factors are (1) socioeconomic status (SES); (2) age/grade of the student; (3) instructor-to-student ratio; (4) severity of the reading problem; and (5) length of intervention.

The findings across these reviews do not necessarily align with intuition. The first two factors appear to have a small overall impact on intervention outcome. Although SES is highly correlated with reading scores in nonintervention research, its impact on intervention outcomes appears to be much more modest (e.g., Suggate, 2016). Also, younger students generally seem to benefit more from intervention than older students, although this is only a modest trend in the literature (e.g., Flynn et al., 2012). The other three factors do not show a consistent pattern across studies. For example, contrary to popular assumption, 1:1 instruction resulted in no better results than 1:2 or 1:3 (e.g., Elbaum, Vaughn, Hughes, & Moody, 2000).

Although perhaps counterintuitive, these findings are nonetheless encouraging, since we cannot change a student’s SES, nor his or her age. Also, 1:1 instruction and lengthy interventions are impractical and expensive. It is also encouraging that the most severe cases can make significant progress. In studies with the strongest outcomes, students gained approximately a standard deviation in reading regardless of their starting point. For example, 87 students in the McGuinness et al. (1996) study began about one standard deviation below the mean and finished at the mean. In the Torgesen et al. (2001) study, 60 students started, on average, two standard deviations below the mean and finished at about one standard deviation below the mean.

There are three common features found in the traditional reviews and meta-analyses of the reading intervention literature that are of interest here. First, they have all yielded generally modest results across reviews, presenting a rather non-optimistic picture for the prospects for struggling readers to normalize their reading performance. Second, most of the reviews focused on mediating factors like those mentioned above (age, intervention length, instructor/student ratio, SES, etc.). Surprisingly, few (e.g., Flynn et al., 2012) examined the nature of the remedial instruction as a mediating factor.

Third, most reviews and all meta-analyses used effect size as their primary or lone metric for determining the impact of the mediating factors, as well as their estimates of efficacy in general. For reasons previously discussed, reliance on effect size alone could yield results that obscure an underlying pattern, since this metric may overestimate or underestimate the impact of any given intervention, relative to normative gains. In two reviews (Flynn et al., 2012; Torgesen, 2005), the authors indicated that perhaps normative scores should also be considered when seeking to determine efficacy:

Standard scores are an excellent metric for determining the “success” or “failure” of interventions for children with reading disabilities. (Torgesen, 2005, p. 524)

Finally, researchers need to use norm-referenced measures of reading ability to ensure that intervention learning transfers to general skill application, as well as provides a reference with which one can compare performance. (Flynn et al., 2012, pp. 30–31)
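
The difference between the two metrics can be illustrated with a small worked example. This is a sketch under stated assumptions: both studies and all of their numbers are invented for illustration, the effect size is computed as a simple posttest difference between groups divided by a pooled standard deviation, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical study summary; all values are standard scores (mean 100, SD 15)."""
    name: str
    treatment_pre: float
    treatment_post: float
    comparison_post: float
    pooled_sd: float  # SD within the study sample, not the normative SD

    def effect_size(self) -> float:
        """Posttest difference between groups in pooled-SD units."""
        return (self.treatment_post - self.comparison_post) / self.pooled_sd

    def normative_gain(self) -> float:
        """Standard score points gained by the treatment group against national norms."""
        return self.treatment_post - self.treatment_pre


# Invented numbers, chosen only to show how the two metrics can diverge.
studies = [
    Study("A: comparison group loses ground", 80, 83, 76, 10),
    Study("B: comparison group also improves", 80, 94, 90, 10),
]

for study in studies:
    print(f"{study.name}: effect size = {study.effect_size():.2f}, "
          f"normative gain = {study.normative_gain():.0f} points")
```

In this illustration, Study A shows the larger effect size even though its students gained only 3 standard score points, because the comparison group fell further behind. Study B shows the smaller effect size although its students gained 14 points, because the comparison group also improved.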

Torgesen’s (2005) review was rare in that it focused on normative scores, but it was not a systematic review. It was a selective presentation of intervention findings based on a combination of case studies, an in-depth presentation of an earlier published study (Torgesen et al., 2001), and a listing and brief overview of 14 studies. However, Torgesen (2005) did not distinguish between some of the finer differences among the studies he reviewed in terms of the precise content of the phonics and phonological awareness instruction, nor did he explore the possible factors as to why some studies yielded moderate standard score gains (5–9 points) and others had stronger results (12–19 points).

Following the lead of Torgesen (2005), Kilpatrick (2015) focused on normative score gains when reviewing some of the more commonly cited and reviewed intervention studies. This synthesis revealed a pattern in which instructional approaches directly aligned with the magnitude of the standard score point gains. It appears that this pattern had not been previously identified. One speculation is that the reliance upon effect sizes, which have the potential of underestimating or overestimating the impact of particular interventions, may have obscured this pattern. Another possibility is that, as mentioned previously, few reviews examined the nature of the remedial instruction as a mediating factor. A summary of that non-meta-analytic research synthesis is presented below.

10.2 The Phonemic Proficiency Intervention Continuum

When one examines intervention studies using standard score gains on nationally normed word identification subtests, an interesting pattern emerges. Consistent with the orthographic learning research literature, instruction that focuses on visual memorization and visual exposure through reading practice results in minimal standard score gains among struggling readers. By contrast, much greater improvements have been found on normed word identification tests when reading interventions directly address and train the skills that the orthographic learning literature indicates are needed for remembering words (i.e., phonemic awareness and letter-sound skills). When examining actual instructional practices found in the intervention studies and using standard score gains as an index of intervention efficacy, three general levels of standard score point outcomes emerge. These levels align closely with three different levels of intensity of the phonemic awareness instruction across the various studies.

  • Minimal: 0–5.8 standard score point gains

  • Moderate: 6–9 standard score point gains

  • Highly effective: 10–25 standard score point gains.
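
The three outcome levels can be written as a simple classification rule. This is only a sketch: the function name is hypothetical, the cut points at 6 and 10 are an assumption that closes the small gaps between the listed ranges, and 25 is treated as the largest gain reported in the studies reviewed rather than as an upper limit.

```python
def classify_gain(points: float) -> str:
    """Assign a standard score point gain to one of the three outcome levels.

    Assumption: gains between the listed ranges (e.g., 5.9 or 9.5) are assigned
    to the lower level, and gains above 25 remain "highly effective".
    """
    if points < 6:
        return "minimal"
    if points < 10:
        return "moderate"
    return "highly effective"


# Gains reported for studies discussed in this chapter.
for label, gain in [
    ("READ 180 (Papalewis)", 2),
    ("Rashotte et al. (2001), grades 1-4", 8),
    ("Torgesen et al. (2001)", 14),
    ("Simos et al. (2002)", 25),
]:
    print(f"{label}: {gain} points -> {classify_gain(gain)}")
```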

Minimal: 0–5.8 Standard Score Point Gains. In this category are interventions that involve visual memorization, reading practice (including repeated readings), and phonics instruction not supplemented with oral-only phonemic awareness training. Most studies in this group of instructional approaches yielded 2–4 standard score points.

An example in this category is READ 180, which relies on practice and exposure and does not teach phonics or phonemic awareness. Most studies and reviews of this program only report effect sizes with no standard scores (e.g., Slavin, Cheung, Groff, & Lake, 2008). However, Papalewis reported two standard score point gains after a year in the program (from the 20th percentile to the 24th percentile, i.e., 87.5–89.5). Failure Free Reading is marketed as a “nonphonic” approach for making large reading gains through extensive reading practice during a 100-hour intervention program. The standard score results range from 1 to 5 points on normed word reading tests (Algozzine & Lockavitch, 1998; Keller & Just, 2009; Torgesen et al., 2007).
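
The translation from percentile ranks to standard scores used above can be reproduced as follows. This is a minimal sketch, assuming normally distributed standard scores with a mean of 100 and a standard deviation of 15; the function names are hypothetical, and published test norms may differ slightly from this idealized conversion.

```python
from statistics import NormalDist

# Assumption: standard scores follow a normal distribution with mean 100, SD 15.
STANDARD_SCORES = NormalDist(mu=100, sigma=15)


def percentile_to_standard_score(percentile: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to a standard score."""
    return STANDARD_SCORES.inv_cdf(percentile / 100)


def standard_score_to_percentile(score: float) -> float:
    """Convert a standard score to a percentile rank."""
    return STANDARD_SCORES.cdf(score) * 100


before = percentile_to_standard_score(20)
after = percentile_to_standard_score(24)
print(f"20th percentile: {before:.1f}")
print(f"24th percentile: {after:.1f}")
print(f"gain: {after - before:.1f} standard score points")
```

Under these assumptions, a move from the 20th to the 24th percentile corresponds to a gain of about 2 standard score points, in line with the figures given above.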

Repeated Reading. Repeated reading appears to be a popular intervention for struggling readers, but its efficacy is often assumed rather than demonstrated. A 2009 review of research on repeated readings did not find sufficient efficacy for the method (Chard, Ketterlin-Geller, Baker, Doabler, & Apichatabutra, 2009). Two recent reviews of repeated reading appear to present it in a somewhat positive light (Lee & Yoon, 2017; Stevens, Walker, & Vaughn, 2017). Yet, the authors said that they found support for improvement in reading rate “only by using nontransfer practiced passages for students with RD [reading disabilities]” (Lee & Yoon, 2017, p. 221). The review by Stevens et al. (2017) found very little evidence for transfer to unpracticed passages. Additionally, both reviews relied on effect size and neither review addressed the issue of standard score gains on normative assessments. In studies of repeated reading that report normative scores, gains tend to range from 1 to 5 standard score points (e.g., O’Connor, White, & Swanson, 2007; Wexler, Vaughn, Roberts, & Denton, 2010).

The orthographic learning literature provides a lens for interpreting these findings. For students not skilled in orthographic mapping (i.e., remembering written words), simple exposure and repetition do not improve their ability to retain newly encountered words in any substantial way. The theoretical basis for repeated readings (see Chard et al., 2009) does not adequately address why some students struggle in remembering words. Since repeated reading interventions do not teach the skills required for efficient orthographic mapping, they would not be expected to yield strong, sustained normative results with struggling readers. Likewise, interventions involving large amounts of reading practice (not repeated reading) have similar, limited results (O’Connor et al., 2007; Wexler et al., 2008; see comments above on Failure Free Reading and READ 180). Ultimately, there is no research evidence to suggest that repeated reading, or similar practice-based interventions, substantially closes the gap between struggling readers and their typical peers.

Phonics Instruction Without Additional Oral Phonemic Awareness. Letter-sound skills are essential for learning to read an alphabet-based writing system. They are also a necessary but not sufficient ingredient in orthographic learning (Ehri, 2005; Share, 1995). Phonological blending, which is the skill needed to blend phonemes into words, is a central element in phonic decoding (NICHD, 2000; Share, 1995). Thus, if a student can successfully sound out real or nonsense words, phoneme-level blending skills have been established. A beginning reader apparently does not require phoneme analysis skills to do phonic decoding. Thus, letter-sound knowledge + phoneme-level blending = phonic decoding.

However, according to Ehri’s theory of orthographic learning, memory for written words requires the additional phonological skill of phoneme analysis. As mentioned, there is ample empirical support for the notion that phonemic analysis skills are central to creating orthographic memories of written words. It appears that phoneme-level blending contributes to reading via its role in phonic decoding, while phonemic analysis appears to assist in establishing a memory of the letter order of a written word via attaching pronunciations of words to their written forms (Ehri, 2005; Kilpatrick, 2015). Note that the flow of information in this memory process goes from (1) stored oral pronunciations to (2) pronunciations segmented at the phoneme level to (3) the letters that represent those oral pronunciations. This represents the opposite flow of information from what we find in phonic decoding, which goes from (1) letters to (2) phonemes to (3) oral pronunciations. Skilled readers display proficiency in both directions.
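
The two directions of information flow can be pictured with a toy example. The sketch below is an illustration only, not a cognitive model and not a procedure drawn from the studies cited: the letter-sound table, the phoneme symbols, and all function names are hypothetical, and each phoneme is assumed to be written as a single symbol so that a pronunciation can be stored as a plain string.

```python
# Toy illustration only: a hypothetical letter-sound table for a few graphemes.
# Assumption: every phoneme is written as one IPA-like symbol.
LETTER_SOUNDS = {"sh": "ʃ", "i": "ɪ", "p": "p", "t": "t", "a": "æ", "m": "m"}


def split_graphemes(spelling: str) -> list[str]:
    """Split a spelling into known graphemes, trying two-letter units first."""
    graphemes, i = [], 0
    while i < len(spelling):
        for size in (2, 1):
            chunk = spelling[i:i + size]
            if chunk in LETTER_SOUNDS:
                graphemes.append(chunk)
                i += len(chunk)
                break
        else:
            raise ValueError(f"no letter-sound entry at position {i} of {spelling!r}")
    return graphemes


def decode(spelling: str) -> str:
    """Phonic decoding: (1) letters -> (2) phonemes -> (3) blended pronunciation."""
    phonemes = [LETTER_SOUNDS[g] for g in split_graphemes(spelling)]
    return "".join(phonemes)  # blending the phonemes into one pronunciation


def map_word(pronunciation: str, spelling: str) -> list[tuple[str, str]]:
    """Mapping: (1) stored pronunciation -> (2) segmented phonemes -> (3) letters."""
    phonemes = list(pronunciation)  # phoneme analysis: one symbol per phoneme
    graphemes = split_graphemes(spelling)
    if len(phonemes) != len(graphemes):
        raise ValueError("phonemes and graphemes do not align in this toy model")
    for phoneme, grapheme in zip(phonemes, graphemes):
        if LETTER_SOUNDS[grapheme] != phoneme:
            raise ValueError(f"{grapheme!r} does not represent {phoneme!r} here")
    return list(zip(phonemes, graphemes))  # each phoneme bonded to its letters


print(decode("ship"))
print(map_word("ʃɪp", "ship"))
```

The decoding function needs only the letter-sound table and blending, whereas the mapping function starts from a known pronunciation and must segment it before any letters can be attached, which mirrors the division of duty described above.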

This apparent division of duty between two phonological skills, blending and analysis, helps explain a common pattern in the research literature. When students are given explicit and systematic phonics instruction, but no additional oral-only phonemic awareness/analysis instruction, their normed nonsense word reading scores grow substantially, often 10, 15, or 20 standard score points. However, their gains on normative tests of real-word identification tend to be in the 2–5 standard score point range (Blachman et al., 2004; Kuder, 1990; Ritchey & Goeke, 2006; Stebbins et al., 2012; Torgesen et al., 2007; Vaughn et al., 2012). This can be accounted for from the orthographic learning literature. As mentioned, memory for words is implicit and thus the letter-sound and phonemic analysis skills that underlie this memory process must also be implicit. Below it will be argued that simple segmentation and blending accuracy, without automaticity, are not enough to efficiently add words to the orthographic lexicon. However, that degree of phonemic skills appears to be sufficient for phonic decoding, allowing them to make gains on tests of nonsense word reading.

Whether using a practice-based/visual memory approach or even an explicit and systematic phonic approach lacking oral phonemic awareness training, reading interventions that do not address the underlying phonemic inefficiencies of students with the phonological-core deficit do not display strong normative gains on real-word reading tests.

Other Approaches with Limited Outcomes. Other approaches display limited reading improvements, such as the use of color overlays or lenses, visual tracking training or other visual training, the use of a special font, and catering to students’ learning styles.

Visual color overlays and lenses might possibly address optical sensitivity but do not directly relate to reading difficulties (Wilkins, Lewis, Smith, Rowland, & Tweedie, 2001). Presumably, overlays or lenses make reading more comfortable for such individuals. There is no evidence, however, that such an optical condition causes dyslexia or that overlays can turn struggling readers into average readers.

Studies of visual tracking and other visual trainings have not resulted in improved reading scores. There are hundreds of studies showing that poor readers struggle with reading words in isolation, even though visual tracking is not required for single word reading. There appears to be no evidence that there are students who are competent readers of words in isolation but who, due to visual tracking problems, struggle in reading sentences and paragraphs. There may well be a correlation between visual tracking and dyslexia, but correlation is not the same as causation. Indeed, the evidence seems to suggest that poor reading causes poor visual tracking (Ahmed et al., 2012). The eyes of students who are poor readers dart back and forth to use context to understand what they read, because many words are not familiar to them, and they cannot reliably sound out those words. Research shows that when typical students are given text above their reading level, their visual tracking deteriorates as their eyes dart about the text in an effort to determine the meaning of many unfamiliar words (Ahmed et al., 2012; Hyönä & Olson, 1995). Also, students with alleged poor visual tracking display no such tracking issues when reading text that is easy for them. If such students had an inherent visual tracking problem, we would expect poor tracking at all levels of readability.

Some have observed that poor visual tracking in struggling readers may also apply to nonword stimuli. To understand a possible reason for this, consider the fact that aside from reading, there is no other activity during which humans use refined ocular-motor skills in which eyes sweep in a very smooth, precise, and consistent horizontal manner for long periods of time. Since students with dyslexia do very little reading, and from the outset their reading is characterized by eyes darting around for clues, it is difficult to know how they would develop precise and untiring horizontal ocular-motor scanning abilities similar to those of their typically developing peers. Such speculation aside, the American Academy of Pediatrics (2009) teamed up with professional optometric and ophthalmological associations to publish a joint statement, asserting that visual training practices do not benefit children in their reading skills.

The Dyslexie font was developed to help those with dyslexia to read. On the developer’s Web site, they say, “The most common reading errors of dyslexia are swapping, mirroring, changing, turning and melting letters together” (www.dyslexiefont.com/en/typeface/ retrieved August 6, 2018). It is not clear what research they were referring to, given that the most common reading errors in dyslexia have to do with simple accuracy and fluency, typically independently of the characteristics they mention. The transpositions of letters among struggling readers (e.g., reading form as from or spilt as split) are only one of several issues related to accuracy that such readers display. Orthographic learning research demonstrates that there is no need to appeal to confusions based on the visual characteristics of a given font, as long as it is legible to the reader. Rather, this letter transposition phenomenon is best understood as the student not having a precise memory for that specific letter sequence combined with inaccurate phonic decoding skills. A word that a student has not orthographically mapped does not have a stable existence in his or her long-term memory for the precise letter order. The Dyslexie font thus appears to be based on the misconception that dyslexia is characterized by visual confusion. A recent investigation of the Dyslexie font bears this out. Kuster, van Weerdenburg, Gompel, and Bosman (2018) conducted two studies of the Dyslexie font. One study included 170 children with dyslexia, and the second studied 147 students, some with dyslexia (n = 102) and some without (n = 45). They found in both studies that neither the students with dyslexia nor the typical readers showed any benefit from the Dyslexie font in either reading speed or accuracy compared to Arial or Times New Roman, nor did they prefer the Dyslexie font over the others.

Teaching to a student’s learning style (visual learner vs. auditory learner vs. kinesthetic learner; global learner vs. analytic learner; and left-brain learner vs. right-brain learner) is a highly intuitive concept that has been a mainstay in education. The popularity of instruction based on learning styles continues, despite four decades of research showing that it is not effective (for reviews, see Kavale & Forness, 1987; Pashler et al., 2008; Stahl, 1995; Stahl & Kuhn, 1999). Any time and effort devoted to a learning style approach is time and effort directed away from approaches that work.

Moderate: 6–9 Standard Score Point Gains. In this category are interventions that involve systematic phonics instruction and basic phonemic awareness instruction (segmentation and blending), combined with reading practice. It is notable that all the intervention studies with this level of results or higher (next category) included explicit and systematic phonics instruction. It appears there are no studies that have yielded and maintained normative standard score gains above 5 points on word-level reading tests that excluded explicit letter-sound instruction. This reinforces the notion that phonics skills are necessary but not sufficient for struggling readers to demonstrate substantial improvements.

Lovett and colleagues (e.g., Frijters, Lovett, Sevcik, & Morris, 2013; Lovett et al., 1994; Lovett, Lacerenza, Borden, Frijters, Steinbach, & De Palma, 2000; Lovett, Lacerenza, De Palma, & Frijters, 2012) have published numerous intervention studies, often with struggling readers from the late elementary level to the high school level. Across various studies, students were trained in letter-sound skills and other code-based reading strategies (e.g., looking for familiar letter sequences within unfamiliar words). From their descriptions of their interventions, it appears that basic phonological awareness is trained via segmentation and blending activities. Their studies tend to produce outcomes in the range of about 7–8 standard score point gains on normed tests of word identification (Frijters et al., 2013; Lovett et al., 2012).

Similar results were obtained by Rashotte, MacPhee, and Torgesen (2001). They studied first through sixth graders who received an intervention that consisted of phonemic segmentation and blending training, phonics, and “reading and writing for meaning” (p. 123). The students in first through fourth grades gained 8 standard score points on a normed word identification subtest, while the fifth- and sixth-grade participants gained 7 points. The intervention group gained 19 points on a normed nonsense word reading test. As mentioned previously, phonemic blending along with letter-sound skills appears to be all that is needed to develop phonic decoding skills, but is not sufficient for skilled orthographic learning. This may explain the large gains in the normative nonsense word reading test, while real-word reading demonstrated more moderate gains. Some of the literature reviews acknowledge this pattern of stronger gains with nonsense words relative to real words (e.g., Bus & van IJzendoorn, 1999; Torgesen, 2005). Most studies in this outcome category taught phoneme segmentation. It is argued below that simple segmentation training and assessment are not able to assure that segmentation skills are automatic, which appears to be necessary to become efficient at orthographic learning.

Highly Effective: 10–25 Standard Score Point Gains. In this category of studies are interventions using more challenging phonemic manipulation activities along with systematic phonics instruction and reading practice. One of the earliest such studies was that of Alexander, Andersen, Heilman, Voeller, and Torgesen (1991). They demonstrated an average of 12.5 standard score point gains on the WRMT-R Word Identification subtest among 7–12-year-olds. Their WRMT-R Word Attack (nonsense word reading) improved by 20 points. Their study was limited because there were only ten participants. Yet, it inspired other studies with similar, strong results.

The most influential study in this category is that of Torgesen and colleagues (2001), which played a role in prompting Tier 3 of RTI. These researchers intervened with 60 severely reading disabled third- through fifth-grade students. Their initial average score on the WRMT-R Word Identification subtest was at the second percentile. Half of the students were provided a commercially available intervention program consisting of phonemic manipulation activities, phonics instruction, and reading practice. Only about 5% of the instructional time was allotted to reading practice. The other half of the participants were provided an experimenter-designed program consisting of the same three elements, but 50% of the instructional time was allotted to reading practice.

Both groups of students gained an average of 14 standard score points on the WRMT-R Word Identification subtest and 20–27 points on the Word Attack subtest. At a two-year follow-up, additional testing showed the word identification score for both groups averaged 18 points above their pretest scores, suggesting additional improvement and no regression. The researchers indicated that 39.5% of these students did well enough that they no longer required special educational help in reading.

Simos et al. (2002) replicated the Torgesen et al. (2001) study while examining the impact of reading improvements on the brain. They did pre-intervention and post-intervention MSI brain scans with students with reading disabilities and age-matched peers who had typical reading skills. Due to the limits imposed by the cost of MSI scans, only eight students with reading disabilities participated. Their ages ranged from 7 to 17. They used two commercially available intervention programs that contained all three of the same key elements of phoneme manipulation training, explicit phonics instruction, and reading practice. Three students used one program, and five used the other. Six of the eight poor readers had initial normed word identification scores below the 3rd percentile, while the other two had scores at the 13th and 18th percentiles. After the intervention, their scores ranged from the 38th to the 60th percentile. When the percentiles were translated into standard scores and the individual performances tallied, these students made an average gain of 25 standard score points on the word identification test. Additionally, the clear pre-intervention differences in brain activation patterns between the reading disabled and typical readers on the MSI scan disappeared in the post-intervention scans.

Truch (1994) presented clinical data on 281 clients with dyslexia who ranged in age from 5 to 55. At that time, his clinic used the Lindamood Auditory Discrimination in Depth (ADD) program. The ADD program used intensive phoneme manipulation activities, letter-sound instruction, and reading practice. On average, clients gained 17 standard score points on the word reading subtest from the Wide Range Achievement Test-Revised (WRAT-R) and 17 points on the WRAT-R spelling subtest. Gains were equivalent across all age groups.

A noteworthy element of the Truch (1994) study was that only one client out of 281 did not improve his or her phonemic awareness in response to direct training. That represents less than one half of one percent of the study sample. Also, after the training, 75% of the clients reached the ceiling on the Lindamood Auditory Conceptualization Test, which assesses phonemic manipulation skills. For those individuals to reach ceiling suggests that they achieved a functionally average level of phonemic awareness as a result of this training. Given the large number of participants (compared to 10 in Alexander et al., 1991 and 8 in Simos et al., 2002), this suggests that nearly all individuals of any age (24 clients were between the ages of 18 and 55) can improve their phonemic awareness skills with appropriate training, and a large majority (75% in Truch’s study) can develop virtually normal phonemic awareness skills.

This finding deserves careful consideration. Most intervention studies either provide no oral phonemic training or only provide the more basic segmentation and/or blending training. In such circumstances, normed standard score gains ranged from 0 to 9 points. But when more advanced phonemic skills are trained, using phoneme manipulation activities, word reading score gains range from 10 points (Wise, Ring, & Olson, 1999) to 25 points (Simos et al., 2002). If there is a causal connection between these more advanced phonemic skills and reading, it is encouraging to note that the key skill that weak readers lack is indeed malleable and correctable. And when corrected, alongside explicit phonics instruction and reading practice, we see the largest intervention gains in all of the intervention literature. Treatment resistors, the name given to students who do not respond well to explicit phonics interventions and reading practice interventions, typically lack sufficient phonemic awareness skills and letter-sound skills (Torgesen, 2000). The Truch (1994) study and the other studies reviewed from this “highly effective” category show that the underlying deficits hindering the progress of dyslexic readers can be successfully remediated.

McGuinness et al. (1996) demonstrated an average gain of nearly 14 standard score points in real-word identification and 19.5 points in nonsense word reading using the Phono-Graphix program. That program includes the three key elements that produce the highest results in the research literature: phonemic manipulation training, phonics instruction, and reading practice. Their clinical study involved 87 students ranging in age from 6 to 16.

The 12-Hour Effect. An interesting finding with the McGuinness et al. (1996) study was that their results were achieved following only 12 hours of instruction. Since these results seemed overly positive, Truch (2003) sought to replicate them using the Phono-Graphix program. He had a larger clinical sample of 203 participants and achieved similarly sized standard score point gains, but it took an average of 80 hours of instruction to achieve this rather than 12. Being aware of the findings from McGuinness et al. (1996), Truch examined data on the tutored clients after the first 12 hours of instruction. He found an average gain of 7 standard score points in word reading after that brief period. By 80 hours, it had grown to 13.7. Truch (2003) accounted for the difference in the timing of the similar outcomes as being due to the fact that his clients initially had more severe reading difficulties than those in the McGuinness et al. (1996) study.

Truch (2003) identified what he called the “12-hour effect.” After reporting results with hundreds of individuals tutored in the ADD and Phono-Graphix programs (Truch, 1994, 2003), Truch (2004) developed his own intervention program called Discover Reading. It contained the same three key elements as the other two. He gathered data on 146 clients tutored in this program after the first 12 hours of intervention. He found they made an average gain of 6.5 standard score points in word reading after that short period, and by 80 hours reached an average gain of 14.4 standard score points. In each of these three studies (McGuinness et al., 1996; Truch, 2003, 2004), the phonemic awareness skills reached ceiling by 12 hours of instruction. After that, the students continued to grow in their word reading skills, presumably because they now had the cognitive architecture to more efficiently remember the words they were reading. With no intervention studies directly informed by the orthographic learning research, this interpretation remains speculative.
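The size of the 12-hour effect can be made concrete with a little arithmetic on the averages reported above. The sketch below is illustrative only: the class and method names are hypothetical, and it treats the 12-hour and 80-hour averages as points on a single trajectory and 80 hours as a fixed duration, which is a simplifying assumption rather than something the studies report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GainTrajectory:
    """Average word reading gains (standard score points) at two time points."""

    study: str
    gain_at_12_hours: float
    gain_at_80_hours: float

    def early_share(self) -> float:
        """Fraction of the 80-hour gain already present after 12 hours."""
        return self.gain_at_12_hours / self.gain_at_80_hours

    def points_per_hour(self) -> tuple[float, float]:
        """Average gain per hour in the first 12 hours and in the remaining 68."""
        early = self.gain_at_12_hours / 12
        later = (self.gain_at_80_hours - self.gain_at_12_hours) / (80 - 12)
        return early, later


# Averages as reported in the text.
TRAJECTORIES = [
    GainTrajectory("Truch (2003)", 7.0, 13.7),
    GainTrajectory("Truch (2004)", 6.5, 14.4),
]

if __name__ == "__main__":
    for t in TRAJECTORIES:
        early, later = t.points_per_hour()
        print(f"{t.study}: {t.early_share():.0%} of the gain by 12 hours; "
              f"{early:.2f} vs. {later:.2f} points per hour")
```

Under these assumptions, roughly half of the eventual gain appears in the first 12 hours, with a much slower rate of growth across the remaining hours of instruction.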

One interesting finding across the three studies just described (McGuinness et al., 1996; Truch, 2003, 2004) is that phonemic awareness skills were developed in struggling readers in a short period of time. Another study that demonstrated this was Bhat, Griffin, and Sindelar (2003). They provided phonemic manipulation training to 40 students in sixth to eighth grade whose average initial phonemic awareness skills were in the first to second percentile as assessed on the Comprehensive Test of Phonological Processing (CTOPP). After 18 sessions across four weeks, the researchers saw an improvement of 29 standard score points on the CTOPP Phonological Composite. However, there was no improvement in word reading, likely due to the fact that the study lacked any phonics instruction or reading practice, which the authors acknowledged. Despite this, Bhat et al. (2003) reinforced the studies by McGuinness et al. (1996) and Truch (2003, 2004), showing that deficient phonemic awareness skills can be remediated quickly using phonemic manipulation activities. It is also notable that nearly all of the studies discussed in this section required less than half of a school year. The rapid nature of these gains was acknowledged in a federal report (Torgesen et al., 2007). Citing studies reviewed in this and the previous section, the report stated, “Several studies have recently shown that intensive, skillfully delivered instruction can accelerate the development of reading skills in children with very severe reading disabilities, and do so at a much higher pace than is typically observed in special education programs” (Torgesen et al., 2007, p. 1).

Across various studies, several intervention programs were used to generate very strong results. Some were experimenter designed, while others were commercially available. Each of these programs had the same three elements: phonemic manipulation training, explicit and systematic phonics instruction, and reading practice. This illustrates one of the prerequisite considerations mentioned earlier in the chapter. One may argue that the ADD, Phono-Graphix, and Discover Reading programs are research-based, yet each has only a few studies that examined them directly. However, when we examine the common instructional elements across studies, we develop a picture of a more well-established research-based approach to addressing reading difficulties.

There are other studies that meet the criteria for the highly effective category (Torgesen et al., 1999, 2010; Torgesen, Rashotte, Alexander, Alexander, & MacPhee, 2003; Wise et al., 1999), which all share the same fundamental instructional characteristics. The success of these characteristics can be understood when we consider them in light of research on orthographic learning and on dyslexia (see below).
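To compare the word reading gains reported in this section on a common footing, they can be expressed in standard deviation units. The following sketch assumes the conventional standard score scale with a standard deviation of 15; the names are hypothetical, the McGuinness et al. (1996) value is entered as 14 although the text reports it as "nearly 14," and the unweighted mean ignores sample size, so it is an illustration rather than a meta-analytic estimate.

```python
# Assumed scale: normed standard scores with a standard deviation of 15.
SCALE_SD = 15.0

# Average word identification / word reading gains as reported in the text.
REPORTED_GAINS = {
    "Torgesen et al. (2001)": 14.0,
    "Simos et al. (2002)": 25.0,
    "Truch (1994)": 17.0,
    "McGuinness et al. (1996)": 14.0,  # reported as "nearly 14"; approximate
    "Truch (2003)": 13.7,
    "Truch (2004)": 14.4,
}


def gain_in_sd_units(points: float, scale_sd: float = SCALE_SD) -> float:
    """Express a standard score point gain as a fraction of one standard deviation."""
    return points / scale_sd


sd_gains = {study: gain_in_sd_units(points) for study, points in REPORTED_GAINS.items()}

# Unweighted mean across studies (ignores the very different sample sizes).
unweighted_mean = sum(sd_gains.values()) / len(sd_gains)

if __name__ == "__main__":
    for study, gain in sd_gains.items():
        print(f"{study}: {gain:.2f} SD")
    print(f"Unweighted mean: {unweighted_mean:.2f} SD")
```

Read this way, the reported averages cluster around one standard deviation of improvement.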

10.3 Summary of the Three Levels of Intervention Outcomes

The varying pattern of outcomes described above, based on differing instructional protocols, represents a phonemic proficiency intervention continuum. When no (or minimal) phonemic awareness is incorporated into an intervention, the gains are limited. When some phonemic skills are taught, but they represent primarily accuracy in the basic tasks of blending and segmentation, the results are stronger. However, when the phonemic awareness training includes the more challenging phonemic manipulation activities, the results represent the strongest outcomes in the word reading intervention literature. This pattern is consistent with the orthographic learning literature and was anticipated two decades ago. One of the studies with highly successful outcomes compared three intervention groups, each varying in the explicitness or nature of the phonemic awareness intervention. The authors noted, “The most phonemically explicit condition produced the strongest growth in word level reading skills” (Torgesen et al., 1999, p. 579).

Recall that phonemic skills are essential for efficient orthographic mapping to occur, that is, efficient storage of words for later retrieval. For this to happen, phonemic skills must be automatic and largely unconscious. Thus, when phonemic skills are trained to the level of accuracy, but not to automaticity, there may be improvements in phonic decoding skills, but limited improvements in the ability to efficiently add words to the orthographic lexicon (i.e., sight vocabulary). Yet when students receive more challenging phonemic awareness training, particularly using phoneme manipulation activities (phoneme deletion and substitution of phonemes within various positions within words), a greater degree of phonemic proficiency develops (see below). This presumably allows students to more easily remember the words they read, resulting in the largest standard score point gains found in the intervention literature (e.g., Alexander et al., 1991; McGuinness et al., 1996; Simos et al., 2002; Torgesen et al., 1999, 2001, 2010; Truch, 1994).

Kilpatrick (2015, 2018) offers an explanation for why phonemic manipulation activities likely provide a greater degree of phonemic proficiency than phonemic segmentation and blending training. Consider what is required to accomplish a phoneme deletion or substitution task. To delete or substitute a phoneme from a blend (e.g., to delete the /l/ from slip to get sip or change /l/ in fly to /r/ to get fry), one must (1) segment the word, (2) isolate the location of the target sound in that word, (3) delete or substitute the sound, and (4) blend the remaining sounds. Thus, skills associated with four conventional phonological awareness tasks (segmentation, isolation, manipulation, and blending) are all performed as part of a single task. If a student is able to respond to such items instantly, as typically developing readers can, then the amount of time devoted to any one of those four tasks is minimal, suggesting a substantial degree of proficiency.
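The four steps just described can be sketched in code to show how a single manipulation item bundles segmentation, isolation, manipulation, and blending. This is only an illustration of the task structure, not a model of how readers process words: the tiny pronunciation table, the phoneme symbols, and the function names are all assumptions introduced for the example.

```python
from typing import Optional

# Hypothetical mini-lexicon mapping spoken words to phoneme sequences (assumed symbols).
PRONUNCIATIONS = {
    "slip": ["s", "l", "ɪ", "p"],
    "fly": ["f", "l", "aɪ"],
}


def segment(word: str) -> list[str]:
    """Step 1: segment the word into its phonemes (here, a simple table lookup)."""
    return list(PRONUNCIATIONS[word])


def manipulate_phoneme(word: str, target: str, replacement: Optional[str] = None) -> str:
    """Delete `target` from `word`, or substitute `replacement` for it if one is given."""
    phonemes = segment(word)            # (1) segment the word
    position = phonemes.index(target)   # (2) isolate the location of the target sound
    if replacement is None:
        del phonemes[position]          # (3a) delete the sound
    else:
        phonemes[position] = replacement  # (3b) substitute the sound
    return "".join(phonemes)            # (4) blend the remaining sounds


if __name__ == "__main__":
    print(manipulate_phoneme("slip", "l"))      # phonemes of "sip"
    print(manipulate_phoneme("fly", "l", "r"))  # phonemes of "fry"
```

Each call passes through all four sub-steps, which mirrors the point that an instant correct response to such an item leaves little time for any one of them.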

The key skill needed for orthographic mapping is phoneme-level analysis/segmentation (Ehri, 2005). But when a response to a task requires only segmentation, there is no way to know for certain if an immediate response involves automaticity and unconscious access to the phonemes, or if a student quickly deconstructed the word to correctly respond to that segmentation task. However, if a student responds instantly to a phoneme manipulation task, where four conventional phonemic tasks occur in rapid succession, then one’s confidence is increased that the analysis/segmentation skill is automatic and unconscious.

The integration of the orthographic learning literature and the word reading intervention literature presented in this chapter currently lacks direct, empirical demonstration. As previously mentioned, there exist no studies in the intervention literature that were explicitly based on the orthographic learning theories of Ehri or Share. However, the practice of using the combination of phoneme manipulation activities, explicit phonics instruction, and reading practice yields the largest standard score point gains in all of the intervention literature, supported by moderate to strong effect sizes. This suggests that regardless of the theoretical reasons why this instructional formula is so successful, it appears to represent best practice with struggling word-level readers.

11 Summary and Conclusions

It was mentioned earlier that the existing reviews and meta-analyses of intervention research present a fairly pessimistic picture of the prospects of students with reading difficulties making large and sustained improvements in their reading skills. In this chapter, it has been argued that after addressing key prerequisite issues, a more optimistic picture comes into focus.

The use of standard score gains to determine intervention efficacy and the examination of the instructional components of intervention studies in light of the orthographic learning literature result in the emergence of a pattern of results not identified in previous intervention reviews and meta-analyses. This pattern should provide encouragement to educators because it indicates that when instructional and intervention efforts are aligned with a scientific understanding of word learning, struggling readers make far greater gains than we have seen with approaches based on older assumptions about reading.

Skilled word reading requires letter-sound skills and phonemic skills to the level that they allow students not only to sound out new words, but to efficiently remember words via orthographic mapping (Ehri, 2005, 2014). Prevention studies show that students trained in these skills in K-1 have fewer reading problems than those without such training. Struggling readers whose remedial interventions include these central elements outperform those whose interventions do not. Interventions that add more extensive phonemic training, using phonemic manipulation activities, fare even better.

The intervention research seems to be best understood in light of the orthographic learning research. When viewed from that perspective, we see a phonemic proficiency continuum emerge. Given the phonemic nature of our alphabetic writing system, this should not come as a surprise. The degree of progress in real-word reading appears to be related to the level of proficiency in phonemic skills trained in the intervention. When no phonemic awareness is directly trained, there are limited results. When basic accuracy in phonemic segmentation and/or blending is trained, there are measurably better results. With more in-depth instruction in phonemic awareness using phonemic manipulation training, which presumably fosters phonemic proficiency, students gain, on average, a full standard deviation in word reading. Although this pattern of outcomes does not spring from intuitive or traditional assumptions about reading, it is consistent with the orthographic learning literature. At present, it appears that incorporating these three elements into instruction and intervention represents best practice.