1 Introduction

Research on reading comprehension has focused on the central role of word recognition skills in skillful reading (e.g. Gough and Tunmer 1986; Hoover and Gough 1990; Perfetti 1985). According to the dual route cascaded model (DRC) proposed by Coltheart et al. (2001), word recognition can be achieved via two different routes: a phonological route by which every word has to be recoded letter by letter based on grapheme-phoneme-correspondence rules, and an orthographical route by which written words can be directly mapped onto mental representations of word forms without an intermediary step of grapheme-to-phoneme-translation. Several studies with English-speaking children and adults provided evidence for the relevance of the two routes (e.g. Paap and Noel 1991; Shankweiler et al. 1996) and for the implication of the dual route model that both phonological recoding skills and orthographical decoding skills should be fundamental for childrenʼs reading comprehension (Cunningham and Stanovich 1990; Shankweiler et al. 1999; Tunmer and Chapman 2012). However, most of the available evidence comes from studies with English-speaking children, and English differs from languages such as German in orthographic transparency. Thus, the question whether and to what extent phonological recoding skills and orthographical decoding skills also contribute to the reading comprehension of children learning to read in German requires further clarification. A second and related issue refers to the time course of development of both skills during reading acquisition. Frith (1986) assumes that children learning to read proceed from a letter-by-letter recoding strategy to an orthographical decoding strategy, which becomes the dominant strategy in skillful readers. In the present study we addressed these questions by examining the relative contributions of phonological recoding and orthographical decoding skills to reading comprehension skills in primary school children from Grades 2 to 4.

In the following sections we will highlight the dominant role of successful word recognition as precursor of reading comprehension. Afterwards, we will discuss the role of phonological recoding and orthographical decoding in visual word recognition and its implications for individual differences in reading comprehension skill. Subsequently, we will turn to the time course of development of word reading skills. Finally, we will present our study and discuss the findings and implications with respect to cognitive models of visual word recognition (e.g. DRC model, Coltheart et al. 2001; strong phonological model, Frost 1998) and the developmental model of word recognition proposed by Frith (1986).

2 The role of word recognition in reading comprehension

To become skillful readers children have to acquire several cognitive skills at the word, sentence, and text level (Müller and Richter 2014; Perfetti 2001; Richter and Christmann 2009). They have to learn to recognize written words, to retrieve their meanings from the mental lexicon, to integrate word meanings into sentence interpretations, and to build and continuously update a coherent mental model of the text. An important line of research has focused on the crucial role of well-functioning word-level processes for good reading comprehension. As pointed out by Perfetti in his verbal efficiency hypothesis (1985) and by Perfetti and Hart in their lexical quality hypothesis (2001; 2002; see also Perfetti’s DVC triangle of Decoding, Vocabulary, and Comprehension, 2010), reliable representations of word forms and meanings and their rapid and efficient retrieval are at the core of skillful reading comprehension. Efficient processes at the word level are assumed to release cognitive resources that can be used at higher levels of processing such as the sentence and text level. Another approach stressing the unique role of word recognition in reading comprehension is the simple view of reading (Gough and Tunmer 1986; Hoover and Gough 1990). According to this view, reading comprehension (R) is defined as the product of decoding skills (D) and linguistic comprehension (C): R = D × C. Thus, Gough and Tunmer assume that reading comprehension can be perfectly predicted by a reader’s ability to decode words from written text and his or her general ability to comprehend language. Consequently, the only predictor specific to reading comprehension, as opposed to spoken language comprehension, is an individualʼs ability to recognize written words. In support of this assumption, several studies provided evidence for a strong correlation of word recognition abilities and reading comprehension in children using various tasks to measure word recognition (e.g. word and pseudoword reading: Golinkoff and Rosinski 1976; Hoover and Gough 1990; Joshi and Aaron 2000; Shankweiler et al. 1999; letter and word identification: Kendeou et al. 2009; lexical decision tasks: Knoepke et al. 2013; Richter et al. 2013).

3 Word recognition: Phonological recoding and orthographical decoding

The question of how to conceptualize and to operationalize word recognition skills is a matter of debate (Kirby and Savage 2008; Knoepke et al. 2013; Tunmer and Chapman 2012; see also Hoover and Gough 1990). Gough and Tunmer (1986) stated that “word recognition skill (in an alphabetic orthography) is fundamentally dependent upon knowledge of letter-sound correspondence rules” (p. 7). However, this definition incorporates only one of two possible routes of word recognition assumed by the DRC model (Coltheart et al. 2001; Coltheart 2005), namely the route of phonological recoding or the non-lexical route. Via this phonological route, words are translated letter by letter into a phonological code by means of grapheme-phoneme-correspondence rules. Based on the phonological code the respective lexical entry is retrieved from the mental lexicon in a way similar to auditory word recognition. This route is most important for beginning readers because it enables the reader to recode new and unknown word forms based on single grapheme-phoneme mappings. However, more experienced readers have built up a sight vocabulary (Ehri 2005a) that allows more rapid and efficient word recognition. The reader recognizes orthographic word forms as a whole and maps them directly onto his or her lexical entries without the preliminary step of grapheme-to-phoneme translation. This route of orthographical decoding is called the lexical route (Coltheart et al. 2001; Coltheart 2005). Orthographical decoding allows effortless recognition of familiar words and words with irregular spelling which are already part of the sight vocabulary. According to the DRC model, both routes start to operate in parallel when encountering a word. The route that recognizes the word faster and more reliably, i.e. the more efficient route, gains stronger activation and accesses the lexical entry. In particular, when known words with high frequency are processed, the orthographical route is more efficient. In contrast, low-frequent and unknown words are more likely to be recognized via the phonological, non-lexical route (see Paap and Noel 1991; for an application of the DRC model to German see Ziegler et al. 2000).

The assumption of the DRC model that skillful readers use phonological information only for reading low-frequent or unknown words implies that orthographical decoding skills should be far more important in these readers than phonological recoding skills. In contrast to this view, the connectionist triangle model (Plaut et al. 1996; Seidenberg and McClelland 1989) and the strong phonological model (Frost 1998) suggest a slightly different conclusion. According to the triangle model, phonological information is stored in the mental lexicon and used along with orthographic and meaning representations whenever words are processed. Thus, word recognition regularly involves phonological information. The relative importance of phonological and orthographic information is assumed to depend on the word characteristics (frequency and consistency) and reading proficiency. Importantly, however, the triangle model implies that phonological information is always used to some degree in word recognition. Similarly, the strong phonological model (Frost 1998) posits that phonological information is accessed automatically and early in word processing, suggesting that it plays a regular role in word recognition (for meta-analytic evidence from masked priming studies, see Rastle and Brysbaert 2006).

4 Development of phonological recoding and orthographical decoding

Beginning readers must recode an unknown written word at least once letter by letter before information about its orthographic form can be added to the mental lexicon. As a consequence, phonological recoding is the prerequisite for the development of the orthographical decoding route. Thus, beginning readers should basically rely on the phonological route because almost every word in its written form is new and unknown to them. Only recognizing the same written word over and over again allows the reader to build up a representation of the word form as part of the mental lexicon. These orthographical representations can then be used for word recognition as well, either in a direct way (lexical route according to the DRC model, Coltheart et al. 2001) or in concert with phonological and meaning representations (according to the triangle model, Plaut et al. 1996). Thus, the assumption seems reasonable that word recognition in beginning readers is primarily accomplished via the phonological recoding route and in experienced readers via the orthographical decoding route (Hoover and Gough 1990; Tunmer and Chapman 2012).

A prominent approach describing the development of word-recognition skills during reading acquisition is the three-stage developmental model by Frith (1986). She assumes three reading strategies of word recognition, which are acquired in a serial order. It is important to note that the strategies in Frithʼs model are not under the strategic control of the reader: Rather, they describe processes that occur automatically and usually without conscious awareness of the reader, at least when they are sufficiently routinized (Ehri 2005b). The first strategy children employ is the logographic strategy. Often prior to entering reading instruction, children are able to recognize a small sample of words based on their graphic features, i.e. children are not aware of the grapheme structure of words but read them in an iconic fashion. The second strategy children use when they enter reading instruction is the alphabetic strategy. They systematically recode words letter by letter, translating each grapheme into its corresponding phoneme (this strategy is consistent with the phonological recoding route; for a more differentiated distinction of alphabetic strategies see Ehri 2005b). The final and most mature strategy children acquire when learning to read is the orthographic strategy. Children recognize words as whole orthographic units without translating graphemes into phonemes first (this strategy is consistent with the orthographical decoding route assumed in the DRC model, Coltheart et al. 2001). Frith further assumes that the acquisition of the three strategies is serial, i.e. children proceed from one strategy to the next whereby each strategy is built on the previous one. Accordingly, children proceed from the logographic strategy to the alphabetic strategy and subsequently to the orthographic strategy.

The developmental model of reading proposed by Frith (1986) differs from the dual route model (Coltheart et al. 2001) and, even more so, from the triangle model (Plaut et al. 1996) and the strong phonological model (Frost 1998) in its implications for the relative importance of phonological recoding and orthographical decoding skills. Although, as discussed earlier, experienced readers should assign more weight to the orthographical decoding route because it is more efficient, dual route models such as the DRC model (Coltheart et al. 2001) nevertheless predict that readers make use of both routes: In case of unknown or low-frequent words readers are expected to use the phonological recoding route and in case of familiar, high-frequent, and irregular words they should employ the route of orthographical decoding. The triangle model (Plaut et al. 1996) and the strong phonological model (Frost 1998) assign an even more prominent role to phonological recoding skills in more experienced readers. The reason is that these models predict that phonological information is regularly used whenever words are processed. In line with these assumptions, the importance of both orthographic and phonological skills was confirmed in several studies with children (e.g. Richter et al. 2013; Shankweiler et al. 1999; Tunmer and Chapman 2012), adolescents, and adults (Paap and Noel 1991; Shankweiler et al. 1996), which found both phonological recoding and orthographical decoding abilities to be highly predictive of reading comprehension. However, given that orthographical decoding is usually more efficient, it seems reasonable to assume that readers might at least gradually shift from a rather phonologically based to a rather orthographically based word recognition strategy during reading acquisition.

5 The present study

The present study followed three related aims. The first aim was to investigate whether both phonological recoding skills and orthographical decoding skills are predictive of reading comprehension skills in German primary school children and which unique contributions both skills make to reading comprehension skill. All models of word recognition discussed so far and the developmental three-stage model (Frith 1986) were originally proposed for English-speaking children. However, German is a language with higher orthographical consistency than English (e.g. Landerl et al. 1997; Wimmer and Goswami 1994), that is, most German words have a regular spelling perfectly consistent with the German grapheme-to-phoneme-correspondence rules. As a consequence, the phonological recoding route might be more efficient than it is in languages such as English which are characterized by a comparatively high amount of irregularly spelled words (e.g. Seymour et al. 2003). It would therefore not be surprising to find phonological skills to be more strongly predictive of reading comprehension in German than they are in English (see the discussion in Ziegler et al. 2000).

A second aim of the present study was to examine the potential shift from a rather phonologically based recoding strategy in beginning readers to a rather orthographically based decoding strategy in advanced readers as predicted by Frith (1986). If such a shift occurs, we would expect to find a decrease in the strength of the relationship of phonological recoding skills and reading comprehension skills with increasing grade level. At the same time, we would expect the strength of the relationship of orthographical decoding skills and reading comprehension skills to increase. However, if children do not shift from a phonological to an orthographical word recognition strategy but instead rely on both word recognition skills once they are sufficiently developed (Coltheart et al. 2001; Coltheart 2005), we would expect to find phonological recoding skills to remain highly predictive of reading comprehension skills across all grade levels. A somewhat different prediction is implied by the strong phonological model (Frost 1998) and by the connectionist triangle model (Plaut et al. 1996; Seidenberg and McClelland 1989) according to which phonological processing is always involved in visual word identification.

Another issue to be addressed by the present study was whether skills of word recognition gradually become less predictive of reading comprehension skills in children with enhanced phonological recoding and orthographical decoding skills. The simple view of reading formula R = D × C (Gough and Tunmer 1986; Hoover and Gough 1990) predicts that on a high level of word recognition skills (D) reading comprehension skills (R) should depend exclusively on general (higher-level) comprehension skills (C) and vice versa. Thus, the improvement of phonological recoding and orthographical decoding skills should reduce their predictive power for reading comprehension because variance in reading comprehension skills might then be better accounted for by general comprehension skills at the sentence and text level (such as inference skills and meta-cognitive strategies, e.g. Oakhill et al. 2003). In contrast to this view, studies with normally developing adolescents and adults found that even for older and more advanced readers both types of skills are highly predictive of reading comprehension (Paap and Noel 1991; Shankweiler et al. 1996). Thus, the third aim of the present study was to explore whether the relationships of phonological recoding and orthographical decoding skills with reading comprehension are linear across the complete range of individual differences in word-level skills or whether the relationships follow a quadratic trend with a stronger relationship in the lower range of word-level skills.
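The multiplicative logic of the simple view can be illustrated with a minimal sketch. It assumes, purely for illustration, that decoding (D) and linguistic comprehension (C) are scaled to the interval [0, 1]; the function names and the example values are hypothetical and are not taken from the study.

```python
def reading_comprehension(decoding: float, comprehension: float) -> float:
    """Simple view of reading: R = D * C (D and C assumed to lie in [0, 1])."""
    if not (0.0 <= decoding <= 1.0 and 0.0 <= comprehension <= 1.0):
        raise ValueError("decoding and comprehension must lie in [0, 1]")
    return decoding * comprehension


def spread_due_to_decoding(d_min: float, d_max: float, comprehension: float) -> float:
    """Range of R that decoding differences between d_min and d_max can
    produce while linguistic comprehension is held constant."""
    return (reading_comprehension(d_max, comprehension)
            - reading_comprehension(d_min, comprehension))


if __name__ == "__main__":
    # With perfect decoding, R equals C.
    print(reading_comprehension(1.0, 0.6))
    # Hypothetical beginning readers (D between 0.3 and 0.9) versus
    # hypothetical advanced readers (D between 0.9 and 1.0), same C.
    print(spread_due_to_decoding(0.3, 0.9, 0.8))
    print(spread_due_to_decoding(0.9, 1.0, 0.8))
```

Under these assumptions, differences in decoding leave little room for differences in R once decoding approaches its ceiling, which is the pattern the quadratic hypothesis is meant to test.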

6 Method

6.1 Participants

Participants were 992 primary school children recruited from 21 schools (72 classes) in Cologne, Frankfurt am Main and Kassel (Germany). Of these children, 967 took part in the sentence comprehension task and 214 children took part in the text comprehension task; 189 children participated in both tasks.

Participants of the sentence comprehension task.

The data of 100 children (10.3 %) were excluded from the analyses of the sentence comprehension data because data were missing for more than 20 % of the trials in at least one of the tasks included in the analysis. Moreover, the data of 201 non-native German speaking children (20.8 %) were also excluded from the analyses. Of the remaining 666 children (325 boys and 329 girls, for 12 children gender information was missing), 232 children were in Grade 2 (age: M = 8.41; SD = 0.39; Min = 7.25; Max = 10.25), 190 children were in Grade 3 (age: M = 9.44; SD = 0.56; Min = 8.17; Max = 11.83), and 244 children were in Grade 4 (age: M = 10.42; SD = 0.42; Min = 9.00; Max = 12.42).

Participants of the text comprehension task.

The data of 22 children (10.3 %) were excluded from the analyses of the text comprehension data because data were missing for more than 20 % of the trials in at least one of the tasks included in the analysis. The data of 43 non-native German speaking children (20.1 %) were also excluded from the analyses. Of the remaining 149 children (68 boys and 74 girls, for 7 children gender information was missing), 58 children were in Grade 2 (age: M = 8.29; SD = 0.35; Min = 7.58; Max = 9.08), 40 were in Grade 3 (age: M = 9.38; SD = 0.41; Min = 8.33; Max = 10.42), and 51 were in Grade 4 (age: M = 10.29; SD = 10.38; Min = 9; Max = 11.17).

Socio-demographic data were collected via a questionnaire completed by the parents and were supplemented by a questionnaire completed by the teacher when information was missing. Children only participated in the study when parents provided written consent.

6.2 Variables

The phonological recoding task, the orthographical decoding task, and the sentence comprehension task described in the following paragraphs were taken from the computerized German-language test battery ProDi-L (Richter et al. 2012; Richter et al. in press).

Phonological recoding skills.

Childrenʼs phonological recoding skills were assessed by the computerized phonological comparison task embedded in ProDi-L. Children were presented with 64 pairs of pseudowords (62 test pairs and 2 practice pairs), i.e. meaningless strings of phonemes or graphemes that were congruent with the phonological and orthographical rules of German. They were asked to indicate whether the two pseudowords matched (e.g. risamo-risamo) or mismatched (e.g. tebedika-tebudiki). The first pseudoword was presented orally over headphones; the second pseudoword was presented subsequently in written form on the notebook screen. Children responded by pressing a green button on the keyboard for match or a red button for mismatch. Half of the pseudoword pairs consisted of matching, the other half of mismatching pseudowords. The pseudowords consisted of one to four open syllables with a simple consonant-vowel structure. The mismatching pseudowords contained either one or two mismatching vowels, which occurred in stressed or unstressed syllables. The orally presented stimuli were recorded by a male speaker.

Orthographical decoding skills.

Childrenʼs ability to orthographically decode words was assessed by a computerized lexical decision task taken from ProDi-L. Children were presented with 92 written words (e.g. Traktor/tractor) and pseudowords (e.g. Spinfen). Their task was to indicate whether the presented item was a real word or not. Half of the presented items were existing German words and the other half were pseudowords. Children responded by pressing a green button on the keyboard for “yes, this is a real word” or a red button for “no, this is not a real word”. Both words and pseudowords varied in length (number of word characters: M = 5.68; SD = 1.08; Min = 3; Max = 10; number of pseudoword characters: M = 6.09; SD = 2.02; Min = 3; Max = 12) and frequency (log-transformed frequency of words: M = 1.81; SD = 1.03; Min = 0.00; Max = 3.77; Mannheim Corpus of the CELEX data base for written German; Baayen et al. 1995). The words which were used to construct the pseudowords (e.g. the pseudoword Maum was created by replacing the first letter of Baum/tree) were matched in frequency with the real words used in this task (log-transformed frequency: M = 1.66; SD = 0.97). Pseudowords varied in the degree to which they resembled real German words. Pseudowords similar to existing words (e.g. Nand) were based on words with a regular German spelling (e.g. Sand/sand) and pseudowords dissimilar to existing words (e.g. Koveau) were based on words with an irregular German spelling (e.g. Niveau/level). Some of the pseudowords were pseudohomophones, which sound like an existing German word but have a different orthography (e.g. Heckse instead of Hexe/witch).

Sentence comprehension skills.

Childrenʼs ability to comprehend sentences was assessed by a computerized sentence verification task taken from ProDi-L. Children were presented with 46 written declarative sentences (44 test sentences and 2 practice sentences), which either contained a true statement about the world (e.g. Züge fahren auf Schienen/Trains run on rails) or a false statement (e.g. Treppen sind ein rotes Gemüse/Stairs are red vegetables). Children were asked to verify the statements by pressing the green button on the keyboard for “yes, the statement is correct” or the red button for “no, the statement is not correct”. Half of the sentences contained true and the other half false statements. The sentences varied in the number of propositions and length (number of characters: M = 34.87; SD = 12.37; Min = 15; Max = 61). The true statements also varied in predictability (e.g. predictable: Giraffen haben lange Hälse/Giraffes have long necks; less predictable: Zitronen sind gesunde Früchte/Lemons are healthy fruits), and the false sentences varied in the degree to which they violated general world knowledge (e.g. Schnecken sind schnell/snails are fast), i.e. expressed propositions that were sensible but incongruent with well-known facts, or rather semantic knowledge (e.g. Husten ist blau/Cough is blue), i.e. expressed propositions that were incongruent with basic semantic features of the focal word (for a discussion of the distinction between world knowledge violations and semantic anomalies, see Isberner and Richter in press).

Text comprehension skills.

Childrenʼs text comprehension skills were assessed by the subtest Text Comprehension of ELFE 1–6 (computerized version, Lenhard and Schneider 2006), which is a standardized reading comprehension test battery for German primary school children from Grade 1 to Grade 6. Children were presented with 20 short texts and were asked to answer questions concerning the content of each text by choosing one of four multiple-choice items.

6.3 Procedure

Phonological recoding skills, orthographical decoding skills, and sentence comprehension skills were assessed in the context of a cross-sectional study investigating processes of listening and reading comprehension in German primary school children with various measures on the word, sentence, and text level (ProDi-L: Prozessbezogene Diagnostik des Leseverstehens bei Grundschulkindern [process-based assessment of reading comprehension in primary school children], Richter et al. in press; see also Richter et al. 2012, Richter et al. 2013). Children were tested together in classrooms of the participating schools. All written items were presented on notebook computers (font: Verdana, visual angle: 1.5 degrees). The three tasks were embedded in a story of an extraterrestrial named Reli who came to earth to learn the earthlings’ language. He asked the children to help him by indicating when he did something wrong. Reli introduced the tasks in short animated video clips and walked the children through them. All test items were presented in randomized order. Prior to each task, children saw two practice items for which they received feedback from Reli. When children gave an incorrect answer, the practice items were repeated until they answered all practice items correctly. Two measures of efficiency were recorded: log-transformed response times (measured from stimulus onset to the press of the response button), which are assumed to reflect the degree of routinisation of cognitive processes, and response accuracies, which are assumed to reflect the reliability of cognitive processes. The log-transformation served to normalize the distribution of the otherwise skewed response time data. For each child and each task, the mean was calculated across all items for response accuracies and log-transformed response times respectively. In addition, the log-transformed response times were screened and adjusted for outliers by a two-step procedure. First, response times 3 standard deviations or more below the item-specific mean were discarded from the analysis. Second, response times 2 standard deviations or more below or above the person-specific mean were discarded from the analysis. From the two measures, an integrated test score was calculated by dividing children’s mean accuracy by their mean log-transformed response times. These integrated test scores reflect both the degree of reliability and routinisation, i.e. the overall efficiency of a cognitive process. In the following analyses we will therefore primarily focus on the results found for integrated test scores while taking accuracy and response time results into account to support their interpretation.
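The person-specific screening step and the integrated test score described above can be sketched as follows. This is a minimal illustration, not the scoring code of ProDi-L: it assumes response times in milliseconds, the natural logarithm, and the sample standard deviation, none of which is specified in the text, and all function names and data are hypothetical.

```python
import math
from statistics import mean, stdev


def trim_log_rts(log_rts, lower_sd=None, upper_sd=None):
    """Discard values lying at least `lower_sd` SDs below or at least
    `upper_sd` SDs above the mean of `log_rts` (None disables a bound)."""
    if len(log_rts) < 2:
        return list(log_rts)
    m, s = mean(log_rts), stdev(log_rts)
    if s == 0:
        return list(log_rts)  # no variation, nothing to discard
    kept = []
    for x in log_rts:
        too_low = lower_sd is not None and x <= m - lower_sd * s
        too_high = upper_sd is not None and x >= m + upper_sd * s
        if not (too_low or too_high):
            kept.append(x)
    return kept


def integrated_score(accuracies, log_rts):
    """Mean accuracy divided by mean log-transformed response time."""
    return mean(accuracies) / mean(log_rts)


if __name__ == "__main__":
    # Hypothetical data for one child in one task (1.0 = correct response).
    rts_ms = [1200, 1500, 1350, 1100, 9000, 1400, 1250, 1300]
    accuracies = [1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0]
    log_rts = [math.log(rt) for rt in rts_ms]
    # Second screening step: 2 SDs below or above the person-specific mean.
    trimmed = trim_log_rts(log_rts, lower_sd=2.0, upper_sd=2.0)
    print(len(log_rts), len(trimmed))
    print(integrated_score(accuracies, trimmed))
```

The first screening step (3 standard deviations below the item-specific mean) would use the same helper with `lower_sd=3.0` and `upper_sd=None`, applied to all children's response times for one item. Because accuracy is the numerator and response time the denominator, higher integrated scores indicate more accurate and faster responding.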

Subsequent to the tasks of the cross-sectional study, the computerized ELFE subtest Text Comprehension was conducted. The ELFE test scores, which were calculated by counting the number of children’s correct responses, served as measure of text comprehension skills. The tasks were presented in two separate sessions, which lasted approximately 45 min, and were carried out on different days.

7 Results

Mean log-transformed response times, mean accuracies and integrated test scores as measures of sentence comprehension and the ELFE test scores as measures of text comprehension were analyzed using Hierarchical Linear Models (HLM, Raudenbush and Bryk 2002) with intercepts randomly varying between school classes. All models were estimated with the software package lme4 for R (Bates et al. 2011). All significance tests were based on a Type I error probability of 0.05 (two-tailed). Descriptive statistics for the sentence-comprehension measures are reported in Table 1 and for the text-comprehension measures in Table 2.

Table 1 Descriptive statistics for response time (log-transformed), accuracy, and integrated test scores as dependent variables in the phonological recoding task, orthographical decoding task, and the sentence comprehension task (N = 666)
Table 2 Descriptive statistics for response time (log-transformed), accuracy, and integrated test scores as dependent variables in the phonological recoding task and the orthographical decoding task, and the ELFE test scores of the text comprehension task (N = 149)

Separate models were estimated for sentence comprehension and text comprehension measures as dependent variables. For both the sentence comprehension and the text comprehension data, we first estimated a model for the integrated test scores, which reflect the efficiency of component processes of reading. To interpret the results for the integrated test scores, we subsequently estimated models for the response accuracies and the response times. Because the only measure that was available for the ELFE subtest Text Comprehension was the number of correct responses, the outcome variable was the same in all three models. The predictors were entered into the model in two steps. 1) In the first step, phonological recoding skills, orthographical decoding skills and grade level were included as grand-mean centered predictors. Moreover, grade-level interaction terms with phonological recoding skills and orthographical decoding skills were included in the model. These interaction terms accounted for possible developmental changes in the extent to which phonological recoding and orthographical decoding skills predict sentence and text comprehension skills. 2) In the second step, phonological recoding skills and orthographical decoding skills were squared and added to the model. The quadratic terms allowed us to test the assumption that the relationships between both word recognition measures and sentence and text comprehension skills might be more strongly pronounced for children with poor word recognition skills than for children with more advanced word recognition skills. Finally, we included the interaction terms of grade level with the squared predictors in the model to account for possible developmental changes in the quadratic relationships. In all models, the intercepts were allowed to vary randomly between school classes to account for the nested data structure (students nested within classes). The parameter estimates for the fixed and random effects in the HLMs are provided in Table 3 for the sentence comprehension measures as dependent variables and in Table 4 for the text comprehension measures as dependent variables.
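One way to write out the two-step specification described above is given below. The notation is an assumption introduced here for illustration: Y_ij is the comprehension score of child i in class j, PR and OD are the grand-mean centered phonological recoding and orthographical decoding scores, G is grand-mean centered grade level, u_j is the random class intercept, and e_ij is the residual.

```latex
% Model 1 (step 1): linear effects and their interactions with grade level
\begin{align*}
Y_{ij} ={}& \gamma_{0} + \gamma_{1}\,PR_{ij} + \gamma_{2}\,OD_{ij} + \gamma_{3}\,G_{ij} \\
          & + \gamma_{4}\,(G_{ij} \times PR_{ij}) + \gamma_{5}\,(G_{ij} \times OD_{ij})
            + u_{j} + e_{ij}
\end{align*}

% Model 2 (step 2): adds squared predictors and their interactions with grade level
\begin{align*}
Y_{ij} ={}& \gamma_{0} + \gamma_{1}\,PR_{ij} + \gamma_{2}\,OD_{ij} + \gamma_{3}\,G_{ij}
            + \gamma_{4}\,(G_{ij} \times PR_{ij}) + \gamma_{5}\,(G_{ij} \times OD_{ij}) \\
          & + \gamma_{6}\,PR_{ij}^{2} + \gamma_{7}\,OD_{ij}^{2}
            + \gamma_{8}\,(G_{ij} \times PR_{ij}^{2}) + \gamma_{9}\,(G_{ij} \times OD_{ij}^{2})
            + u_{j} + e_{ij}
\end{align*}

% Assumed distributions of the random intercept and the residual
\[
u_{j} \sim N(0, \tau^{2}), \qquad e_{ij} \sim N(0, \sigma^{2})
\]
```

Written this way, Model 1 contains six fixed effects and Model 2 ten, which would be consistent with the degrees of freedom reported for the sentence comprehension analyses (660 and 656 with N = 666).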

Table 3 Fixed effects and variance components in the HLM with response accuracy, response time (log-transformed), and integrated test scores as dependent variables for sentence comprehension skills and response accuracy, response time (log-transformed), and integrated test scores as measures of phonological recoding skills and orthographical decoding skills
Table 4 Fixed effects and variance components in the HLM with the ELFE test scores as dependent variable for text comprehension skills and response accuracy, response time (log-transformed), and integrated test scores as measures of phonological recoding skills and orthographical decoding skills

7.1 Sentence comprehension task

Integrated test scores.

The HLM for the integrated test scores revealed significant main effects for phonological recoding skills and orthographical decoding skills (Table 3, Model 1). Children with more efficient phonological recoding processes (b = 0.08; t (660) = 3.8; p < 0.05) and more efficient orthographical decoding processes (b = 0.37; t (660) = 16.3; p < 0.05) also exhibited more efficient sentence comprehension skills. The positive relationship with sentence comprehension skills was more strongly pronounced for orthographical decoding skills than for phonological recoding skills. Including the squared predictor variables and their interactions with grade level did not change the significance of the linear effects (Table 3, Model 2). However, Model 2 revealed significant main effects for the quadratic terms of phonological recoding skills (b = 2.49; t (656) = 3.24; p < 0.05) and orthographical decoding skills (b = − 4.03; t (656) = − 3.62; p < 0.05) as well as a significant interaction of grade level and squared orthographical decoding skills (b = − 3.54; t (656) = − 3.03; p < 0.05). The quadratic regression for the squared predictor variables is displayed in Fig. 1a and b. The positive relationship between orthographical decoding skills and sentence comprehension skills was strongest in the lower range of orthographical decoding skills and became weaker with increasing orthographical decoding skills. Moreover, this pattern became more pronounced with increasing grade level. In contrast, for phonological recoding skills, the quadratic relationship with sentence comprehension skills showed a reverse pattern: Unexpectedly, the positive relationship with sentence comprehension skills was weaker in the lower range of phonological recoding skills and became stronger with increasing phonological recoding skills.
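The direction of the quadratic effects can be read off the marginal slope of a quadratic regression, which for a predictor x with linear coefficient b_lin and quadratic coefficient b_quad equals b_lin + 2 · b_quad · x. The following sketch uses only the quadratic coefficients reported above; the predictor values and the coding of grade level (one unit per grade, centered) are assumptions made for illustration, and the function names are hypothetical.

```python
# Quadratic coefficients reported for Model 2 (integrated test scores).
B_QUAD_ORTHO = -4.03           # squared orthographical decoding skills
B_QUAD_PHONO = 2.49            # squared phonological recoding skills
B_QUAD_ORTHO_BY_GRADE = -3.54  # grade level x squared orthographical decoding


def slope_change(b_quad: float, x_from: float, x_to: float) -> float:
    """Change in the marginal slope (b_lin + 2 * b_quad * x) between two
    predictor values. The linear coefficient cancels out of the difference."""
    return 2.0 * b_quad * (x_to - x_from)


def quad_coefficient_at_grade(b_quad: float, b_quad_by_grade: float,
                              grade_centered: float) -> float:
    """Quadratic coefficient implied for a given (centered) grade level."""
    return b_quad + b_quad_by_grade * grade_centered


if __name__ == "__main__":
    # Moving 0.1 units up the (centered) predictor scale; 0.1 is hypothetical.
    print(slope_change(B_QUAD_ORTHO, 0.0, 0.1))  # negative: slope flattens
    print(slope_change(B_QUAD_PHONO, 0.0, 0.1))  # positive: slope steepens
    # Curvature one grade below versus one grade above the sample mean.
    print(quad_coefficient_at_grade(B_QUAD_ORTHO, B_QUAD_ORTHO_BY_GRADE, -1.0))
    print(quad_coefficient_at_grade(B_QUAD_ORTHO, B_QUAD_ORTHO_BY_GRADE, 1.0))
```

A negative quadratic coefficient thus corresponds to a relationship that weakens with increasing skill, a positive one to a relationship that strengthens, and the negative interaction with grade level to a curvature that grows more negative in higher grades.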

Fig. 1 Linear and quadratic relationships of (a) phonological recoding skills (integrated test scores) and (b) orthographical decoding skills (integrated test scores) with sentence comprehension skills (integrated test scores)

In order to clarify the sources of the effects for the integrated measures, we estimated two further models with the accuracy and reaction time data.

Accuracy data.

The HLM for the mean accuracy data revealed significant main effects for phonological recoding and orthographical decoding skills (Table 3, Model 1). Children who responded with high accuracy in the phonological recoding task (b = 0.09; t (660) = 3.6; p < 0.05) and with high accuracy in the orthographical decoding task (b = 0.30; t (660) = 9.0; p < 0.05) also responded with higher accuracy to the sentences in the sentence comprehension task. This relationship was stronger for orthographical decoding skills than it was for phonological recoding skills. Including the quadratic terms and their interaction with grade level in the model did not change the significance of the linear effects (Table 3, Model 2). However, Model 2 revealed a significant main effect for squared orthographical decoding skills (b = − 1.52; t (656) = − 5.64; p < 0.05) and a significant interaction of grade level and squared orthographical decoding skills (b = − 0.95; t (656) = − 3.42; p < 0.05). The relationship between orthographical decoding skills and sentence comprehension skills was strongest in the lower range of orthographical decoding accuracy and became weaker the more the accuracy values approached 100 %. This pattern became more pronounced with increasing grade level.

Response latencies.

The HLM for the mean log-transformed response times (Table 3, Model 1) again revealed significant main effects for phonological recoding skills (b = 0.10; t (660) = 2.8; p < 0.05), orthographical decoding skills (b = 0.75; t (660) = 24.1; p < 0.05), and grade level (b = − 0.06; t (660) = − 4.8; p < 0.05) as well as a significant interaction of grade level and orthographical decoding skills (b = − 0.09; t (660) = − 2.4; p < 0.05). The main effect for grade level indicates an overall increase in response speed with increasing grade level. Children who provided faster responses in the phonological recoding and the orthographical decoding task also responded faster in the sentence comprehension task. This relationship was slightly more pronounced for the response times in the orthographical decoding task compared to the phonological recoding task but it decreased slightly with grade level. Including the quadratic terms and their interactions with grade level into the model did not change the significance of the linear effects (Table 3, Model 2). In contrast to the integrated test scores and the accuracy data, none of the squared predictor variables or their interactions reached significance.

The results of the sentence comprehension task can be summarized as follows: First, consistent with the assumptions of the DRC model (Coltheart et al. 2001) and the triangle model (Plaut et al. 1996), orthographical decoding skills were predictive of sentence comprehension skills at all grade levels. Second, phonological recoding skills were also associated with sentence comprehension skills. This relationship was weaker than the one between orthographical decoding skills and sentence comprehension, but it was nevertheless substantial and did not decrease from Grade 2 to Grade 4. The consistently strong relationship of phonological recoding skills with sentence comprehension is compatible with the triangle model as well as the strong phonological model (Frost 1998) but less so with the DRC model and the developmental model by Frith (1986). The described pattern was found in the integrated test scores and both of its components, that is, the accuracy data and the reaction time data. Moreover, the positive relationship of orthographical decoding skills with sentence comprehension was strongest in the lower range of orthographical decoding skills. This pattern was found for the integrated test scores and also for the accuracy data but not for the reaction time data. In contrast, the relationship between phonological recoding skills and sentence comprehension skills was more strongly pronounced in the upper range of phonological recoding skills. This pattern was found only for the integrated test scores. Thus, the assumption derived from the simple view of reading (Gough and Tunmer 1986) that the relationship of word recognition skills and reading comprehension skills is weakest in the upper range of word recognition skills was supported only partially by the data.

7.2 Text comprehension task

Integrated test scores.

The HLM for the ELFE test scores and the integrated test scores as measure of phonological recoding and orthographical decoding skills revealed significant main effects for phonological recoding and orthographical decoding skills (Table 4, Model 1). Children with more efficient phonological recoding processes (b = 50.78; t (143) = 2.5; p < 0.05) and more efficient orthographical decoding processes (b = 201.71; t (143) = 8.0; p < 0.05) also reached higher test scores in the text comprehension task (see Fig. 2a and b). The positive relationship with text comprehension skills was more strongly pronounced for orthographical decoding skills than for phonological recoding skills. Including the squared predictor variables and their interactions with grade level did not change the significance of the linear effects (Table 4, Model 2). However, Model 2 revealed a significant main effect for the quadratic term of phonological recoding skills (b = 1970.45; t (139) = 2.4; p < 0.05). The quadratic regression is depicted in Fig. 2a. Again, unexpectedly, the positive relationship between phonological recoding skills and text comprehension skills was weaker in the lower range of phonological recoding skills.

Fig. 2 (a) Linear and quadratic relationships between phonological recoding skills (integrated test scores) and text comprehension skills (ELFE test scores) and (b) linear relationship between orthographical decoding skills (integrated test scores) and text comprehension skills

Again, in order to clarify the sources of the effects for the integrated measures, we estimated two further models with the accuracy and reaction time data.

Accuracy data.

The HLM for the ELFE test scores and the mean accuracies as measure of phonological recoding and orthographical decoding skills revealed a significant main effect for orthographical decoding skills (Table 4, Model 1). Children who provided highly accurate responses in the orthographical decoding task also provided highly accurate responses in the text comprehension task (b = 38.70; t (143) = 8.7; p < 0.05). Including the quadratic terms and their interaction with grade level changed the significance of one of the linear effects (Table 4, Model 2). Now, Model 2 revealed a significant main effect for phonological recoding skills (b = 10.14; t (139) = 2.4; p < 0.05). Children who responded with higher accuracy in the phonological recoding task also responded with higher accuracy in the text comprehension task. The positive relationship with text comprehension skills was more strongly pronounced for orthographical decoding skills than for phonological recoding. None of the squared predictor variables or their interactions reached significance.

Response latencies.

The HLM for the ELFE test scores and the mean log-transformed response times as measure of phonological recoding and orthographical decoding skills revealed significant main effects for grade level and orthographical decoding skills (Table 4, Model 1). Overall, children reached higher test scores in the text comprehension task with increasing grade level (b = 1.38; t (143) = 2.9; p < 0.05). Children who provided faster responses in the orthographical decoding task also provided more accurate responses in the text comprehension task (b = − 5.72; t (143) = − 4.1; p < 0.05). Including the quadratic terms and their interaction with grade level did not change the significance of the linear effects (Table 4, Model 2). However, Model 2 revealed a significant main effect for squared phonological recoding skills (b = − 7.31; t (139) = − 2.0; p < 0.05), indicating that the relation between phonological recoding skills and text comprehension skills was strongest in the upper range of response times in the phonological recoding task.

The results for the text comprehension task can be summarized as follows: Again, phonological recoding skills as well as orthographical decoding skills were predictive of text comprehension across all grade levels, as was predicted by the DRC model (Coltheart et al. 2001) and the triangle model (Plaut et al. 1996). Children with better phonological recoding skills and better orthographical decoding skills exhibited better text comprehension skills. This pattern was found in the integrated test scores and the accuracy data for phonological recoding and orthographical decoding skills and for orthographical decoding skills in the reaction time data. The positive relationship was strongest for orthographical decoding skills and text comprehension skills as predicted by the DRC model in particular. Nevertheless, the relationship of phonological recoding skills and text comprehension was substantial in all grade levels. This finding coheres well with the triangle model and the strong phonological model (Frost 1998). Finally, the strength of the positive relationship of phonological recoding skills and text comprehension skills was weaker in the lower range of phonological recoding skills. This was found for the integrated test scores. In contrast, the negative relationship of phonological recoding times and text comprehension skills was weakest for faster phonological recoding times. This finding was predicted by the simple view of reading (Gough and Tunmer 1986). However, the assumption that the relationship of word recognition skills and reading comprehension skills is weakest in the upper range of word recognition skills was supported only partially by the data.

8 Discussion

The first aim of the present study was to investigate whether and to what extent phonological recoding skills and orthographical decoding skills are both predictive of reading comprehension skills in German primary school children. We found both phonological recoding and orthographical decoding skills to be predictive of sentence and text comprehension skills at all grade levels. The one exception was that text comprehension skills were not predicted by the speed of phonological recoding processes but only by their accuracy and the integrated test scores. Given that the integrated test scores capture both the speed and the reliability aspect of the efficiency of phonological recoding processes, our results indicate that, overall, phonological recoding skills are predictive of text comprehension as well. Consistent with the DRC model of word recognition (Coltheart et al. 2001; Coltheart 2005), orthographical decoding skills appeared to be more strongly predictive of sentence and text comprehension skills than phonological recoding skills. Apparently, by the end of Grade 2, many German primary school children have already built a sufficient sight vocabulary that enables them to access the lexical entries of many words directly and efficiently from their written forms via the lexical route. Nevertheless, phonological recoding skills made a significant contribution to reading comprehension across all grade levels. Given that the test items of the sentence comprehension test (ProDi-L, Richter et al. 2012, in press) and the text comprehension test (ELFE 1–6, Lenhard and Schneider 2006) did not contain very rare words or words likely to be unknown to primary school children (at least not to those in upper grade levels), we infer that phonological skills are relevant for word recognition in primary school children beyond the restricted category of unknown or low-frequency words. Thus, the results indicate that the children regularly made use of phonological information whenever words were recognized. This conclusion is in line with the triangle model (Plaut et al. 1996) as well as the strong phonological model (Frost 1998). Furthermore, because of the high orthographical consistency of German words (Landerl et al. 1997; Wimmer and Goswami 1994), it is not surprising that phonological recoding abilities play a prominent role during reading comprehension in both beginning and more experienced German readers (see discussion in Ziegler et al. 2000).

The second aim of the present study was to investigate a potential shift from a rather phonologically based recoding strategy in beginning readers to a rather orthographically based decoding strategy as predicted by Frith’s (1986) three-stage developmental model of reading. Such a shift would be indicated by interactions of grade level with phonological recoding and orthographical decoding skills. However, only one interaction with grade level reached significance, and its pattern runs counter to the predictions implied by Frith’s model: The positive relationship between sentence comprehension speed and orthographical decoding speed decreased from Grade 2 to 4. No further grade-level interactions with one of the linear predictors reached significance. In sum, phonological recoding and orthographical decoding skills were both strongly predictive of reading comprehension at all grade levels. However, it is possible that we might have observed evidence in favor of a shift if we had investigated a broader range of grade levels. In contrast to children at the end of Grade 2, first graders and early second graders might have relied more strongly on a phonological strategy and less strongly on an orthographic strategy. Nevertheless, according to Frith we would then expect to find orthographical decoding skills to be exclusively predictive of reading comprehension in Grades 2 to 4. Although orthographical decoding skills were more strongly predictive of reading comprehension than phonological recoding skills, they were not exclusively predictive of reading comprehension. One explanation supporting Frith’s model might be that children gradually shift from one strategy to the next with an intervening phase where the two strategies overlap for some time (see Frith 1986). Our results might reflect this phase of overlap where children rely on both the phonological and the orthographical strategy before finally shifting to the orthographical strategy. However, it seems very unlikely that such an overlap would persist for two years without any indication of a change. Thus, we found no evidence in favor of the three-stage developmental model of reading proposed by Frith for German primary school children from Grades 2 to 4.

The final issue we addressed was whether the strength of the relationship of phonological recoding and orthographical decoding skills with reading comprehension depends on the level of children’s word recognition skills. To this end, we included phonological recoding and orthographical decoding skills as squared predictors in the multilevel regression analyses. We found that orthographical decoding skills predicted sentence comprehension more strongly in children with poorer decoding skills and less strongly in children with highly developed decoding skills (this was found for integrated test scores and accuracy data). In addition, this effect was more strongly pronounced for older than for younger children. One possible interpretation of this pattern of results is that the more efficiently orthographic decoding functions, the more variance in reading comprehension skills has to be attributed to other skills. This explanation is consistent with the simple view of reading (Gough and Tunmer 1986; Hoover and Gough 1990), which states that an improvement in word recognition D reduces its predictive power for reading comprehension R by leaving the remaining variance in R to be explained by listening comprehension skills C (see e.g. Stothard and Hulme 1992). It is also consistent with the lexical quality hypothesis (Perfetti and Hart 2001) and the verbal efficiency hypothesis (Perfetti 1985), according to which efficient word recognition skills are a necessary but by no means sufficient condition for successful reading comprehension. Previous studies have identified several skills that affect reading comprehension performance in primary school children beyond single word recognition, such as semantic integration processes, comprehension monitoring, working memory (Oakhill et al. 2003), inference making (Cain and Oakhill 1999), and grammatical sensitivity (Willows and Ryan 1986). Thus, we would assume that in children with highly developed orthographical decoding skills, such higher-order reading skills might play a more important role in reading comprehension. Remarkably, the squared predictor orthographical decoding skill explained a significant amount of variance only in sentence comprehension skills but not in text comprehension skills. One explanation for this difference could be the fact that different types of tasks were used to assess children’s sentence and text comprehension skills. We will return to this issue later when we address potential limitations of the present study.
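
The reasoning based on the simple view of reading can be illustrated with a small simulation. In its product formulation, R = D × C (Gough and Tunmer 1986), a group of readers whose decoding D is close to ceiling should show a weaker correlation between D and R and a stronger correlation between C and R than a group with poorer and more variable decoding. The Python sketch below is a toy illustration under assumed values only: the uniform distributions, the ranges for D and C, the independence of D and C, the sample size, and the function names are hypothetical and are not derived from the present data.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_group(d_low, d_high, n=5000, seed=0):
    """Simulate R = D * C for one group of readers.

    D (decoding) is drawn uniformly from [d_low, d_high]; C (listening
    comprehension) is drawn uniformly from [0.3, 1.0]. Both ranges are
    hypothetical. Returns (corr(D, R), corr(C, R)).
    """
    rng = random.Random(seed)
    d = [rng.uniform(d_low, d_high) for _ in range(n)]
    c = [rng.uniform(0.3, 1.0) for _ in range(n)]
    r = [di * ci for di, ci in zip(d, c)]
    return pearson(d, r), pearson(c, r)


if __name__ == "__main__":
    poor = simulate_group(0.2, 0.6)      # poorer, more variable decoders
    skilled = simulate_group(0.9, 1.0)   # decoders close to ceiling
    print("poor decoders:    corr(D, R) = %.2f, corr(C, R) = %.2f" % poor)
    print("skilled decoders: corr(D, R) = %.2f, corr(C, R) = %.2f" % skilled)
```

In this toy setting, the correlation between D and R is markedly lower in the group near ceiling, while the correlation between C and R is higher, which is the qualitative pattern the simple view of reading implies for highly skilled decoders.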

One unexpected finding requiring further clarification is that, in contrast to orthographical decoding skills, phonological recoding skills were more strongly associated with sentence and text comprehension skills in children with highly developed phonological recoding skills compared to children with poorly developed phonological recoding skills (this was found for integrated test scores in both the sentence and text comprehension task and for response times in the text comprehension task). How can this unexpected finding be explained? As can be seen from Tables 1 and 2, the majority of children have already developed fairly high phonological recoding skills and only a few children appear to have extensive difficulties with phonologically recoding pseudowords. This is not surprising. As discussed earlier, it seems reasonable to assume that by the end of Grade 2 the majority of children have already acquired sufficient phonological recoding skills. However, it seems that within this major group of children with highly developed phonological recoding skills there is a smaller but nevertheless systematic amount of variance left to account for differences in reading comprehension. In contrast, the variance in the smaller group of children with very poor phonological recoding skills might be rather unsystematic. A possible explanation is that children with poor phonological recoding skills use word recognition strategies other than phonological recoding to compensate for their poor word recognition performance. For example, they might recognize many words based on some salient graphical features rather than grapheme-to-phoneme translations (logographic strategy, Frith 1986). It might also be the case that they use sentence- or text-level skills such as context or world knowledge in a top-down fashion to infer the words they have difficulty recognizing (as assumed by the interactive compensatory model by Stanovich 1980). Overall, we assume that children with poor phonological recoding skills probably rely on other skills below or beyond the word level to compensate for poor word recognition abilities.

In sum, we found that both word recognition skills, phonological recoding and orthographical decoding, are associated with reading comprehension skills throughout all grade levels. If we assume a causal relationship between the two word-level skills and reading comprehension skills, namely that efficient phonological recoding and orthographical decoding are at the core of skilled reading comprehension even at Grades 3 and 4, our results have some practical implications with respect to reading education. First, because both skills, phonological recoding and orthographical decoding, make individual and separable contributions to reading comprehension, early reading acquisition should be supported by practical exercises aimed at fostering these skills. Furthermore, the fact that we found significant relationships of phonological recoding and orthographical decoding skills with reading comprehension skills even in third and fourth graders highlights the possibility that fostering both word-level skills even in older and more advanced readers might be fruitful to enhance their reading comprehension skills. Another implication concerns children who exhibit word recognition difficulties. Here, it is essential to find out exactly which word-level skill is impaired and to what extent, and to create an optimal individual support plan to ensure target-oriented training for the impaired readers.

The results of the present study need to be interpreted with its limitations in mind. First, we differentiate between only two skills of visual word recognition: phonological recoding and orthographical decoding skills. However, we have to consider the possibility that there are processing units on a level between single graphemes and whole orthographical word forms, such as morphemes and syllables, which might assist word recognition during reading. For example, several studies have established syllable-frequency effects or syllable-length effects in Spanish (e.g. Barber et al. 2004), French (e.g. Ferrand and New 2003), and German (e.g. Conrad and Jacobs 2004), suggesting that syllables play a role in visual word recognition. Because the phonological recoding task in our study does not differentiate between the recognition of single graphemes and whole syllables, we cannot rule out the possibility that our phonological recoding measures reflect syllable recognition skills to some extent. As a result, it is possible that children’s ability to efficiently recognize written syllables accounts for a unique portion of variance in reading comprehension. This issue requires further clarification in future research.

The second potential limitation concerns the comparability of the sentence- and text-comprehension tasks. Both tasks assess comprehension but are likely to differ in terms of cognitive requirements and measures. Whereas children had to verify whether sentence contents made sense in the sentence comprehension task by providing yes/no-answers under mild time pressure, the ELFE subtest text comprehension required the processing of text passages, of a multiple-choice-question following each passage, and the identification of the correct response out of four possibilities. Thus, performing the text comprehension task might have involved more complex linguistic cognitive processes (such as the establishment of local and global coherence, see van Dijk and Kintsch 1983) and extra-linguistic cognitive processes (such as keeping information active in working memory and comparing several alternative answers) than the sentence comprehension task. In part, these differences are simply due to the fact that text comprehension per se is a more complex task than sentence comprehension. However, it must be noted that due to the multiple-choice format of the text comprehension task the text comprehension data might also to some extent reflect offline comprehension processes and strategies that are not part of text comprehension itself. Nevertheless, the results we found for both the sentence and the text comprehension task were fairly comparable for all grade levels, schools, and school classes.

A third limitation of the present study is its cross-sectional design. In fact, the best way to investigate developmental questions (such as the applicability of Frith’s (1986) three-stage developmental model of reading to German primary school children) is by means of longitudinal designs. We cannot rule out the possibility that the absence of a potential shift from a rather phonological to a rather orthographical word recognition strategy as predicted by Frith (1986) in our data is due to accidental grade level differences. Moreover, Frith points out that each child progresses from one strategy to the next at his or her own pace independent of age or grade level. Thus, a longitudinal investigation might possibly reveal evidence in favor of her theory, which we failed to detect with a cross-sectional design. However, our findings are in line with several recent studies demonstrating that both phonological recoding and orthographical decoding are highly associated with reading comprehension in children, adolescents, and even adults (e.g. Paap and Noel 1991; Richter et al. 2013; Shankweiler et al. 1996; Shankweiler et al. 1999; Tunmer and Chapman 2012). Furthermore, our findings appear to be consistent throughout all three grade levels as well as for different schools and school classes. Therefore, it seems unlikely that our findings were simply due to accidental grade level differences.

To conclude, our results consistently demonstrate the significant role that both phonological recoding and orthographical decoding skills play in successful reading comprehension throughout the elementary school years, somewhat surprisingly even in Grades 3 and 4. Educators should take both routes of word recognition processes into account when designing reading curricula and interventions for poor readers.