Acquiring a system of lexical representations that permit efficient word recognition is an essential part of learning to read in any language (Perfetti, 2007). A key part of this system is learning the mapping between print and spoken word, which lays a solid foundation for lexical and phonological processing (Caravolas, Volin, & Hulme, 2005; Tibi & Kirby, 2018; Perfetti & Harris, 2013). Thus, a good association between orthography and phonology contributes to learning to read—at least in alphabetic languages, where orthographic and phonological knowledge mutually facilitate each other (Ehri, 1998).

However, there is a need to understand how these principles generalize cross-linguistically (e.g., Caravolas et al., 2005; Moll et al., 2014; Tibi & Kirby, 2018; Ziegler et al., 2010). For instance, it is less clear how these word-level and subject-level factors interact in nonalphabetic scripts, such as Chinese, especially from a developmental perspective. The role of orthographic knowledge has been emphasized in learning to read Chinese (e.g., Ho, Ng, & Ng, 2003; Ziegler et al., 2010); however, associating orthography with phonology may be important even in Chinese (Guan, Liu, & Perfetti, 2011; Moll et al., 2014).

Comprehensive models of word-recognition skills in developing readers, with an emphasis on children with reading disabilities (RD), have not been developed for word reading in Chinese. Recently, the development of crossed random effects models permits a closer look at these questions through item-level analysis of word reading (e.g., Steacy, Elleman, Lovett, & Compton, 2016). For instance, Kearns et al. (2016) used item-response crossed random effects models (Bates, Mächler, Bolker, & Walker, 2015) to explain variability in children’s English word recognition ability at the item level using a comprehensive set of item-, child-, and word-level predictors and revealed variability of contributions on each of these levels to word reading among 5th graders, especially between early vs. late emerging RD. Using a similar approach, Steacy et al. (2016) examined the transfer of decoding skills to English word reading among children with RD and found that sublexical emphasis on item-level characteristics, including specific decoding strategies taught in the Phonological and Strategy Training (PHAST) or Phonics for Reading (PFR) programs, facilitates transfer among those with better word reading skills. Additionally, Elleman, Steacy, Olinghouse, and Compton (2017) utilized such crossed random effects models of word recognition, which simultaneously take into account child-level and word-level factors, to understand the acquisition of novel English words in grades 3–5.

However, most such work has been done in alphabetic languages, and we still know little about how these word-level and child-level characteristics—and their interaction—contribute to early word reading across ages in the logographic language of Chinese. Hence, we apply such models to simultaneously examine the character- and child-level factors that contribute to character recognition development within a large sample (365,760 total trials) of good and poor readers, derived from a national-level reading assessment and intervention project in China (Guan, Kwok, & Wang, 2019a; Guan, Zhao, Kwok, & Wang, 2019b; NIES, 2012).

Character-level factors

Character recognition is the most important reading skill for children during primary school literacy education (Shu & Anderson, 1997). Several properties affect character recognition. Here, we focus on those relevant to learning the mapping between printed and spoken forms on the sublexical and lexical levels (Ho et al., 2003). Specifically, we examine the neighborhood structure of the two major mappings: orthographic to phonological consistency (Taraban & McClelland, 1987) and orthographic to semantic transparency (Marslen-Wilson, Tyler, Waksler, & Older, 1994).

Character naming is facilitated by a consistent mapping between particular orthographic representations and the corresponding phonology. In Chinese, approximately 80% of characters include phonetic and semantic radicals (Shu, Chen, Anderson, Wu, & Xuan, 2003). The phonetic radical gives a hint to the pronunciation, and the semantic radical gives a hint to the meaning of the character. Thus, orthography-to-phonology consistency can be defined in Chinese as the ratio of the number of characters containing the same phonetic radical with the same pronunciation to the total number of characters containing that phonetic radical.

The consistency effect refers to the fact that naming responses are faster and more accurate for words high in consistency (see examples under “Measures”), though mainly for low-frequency words, in both English and Chinese (Jared, 2002). This effect has been interpreted as supporting a single mechanism for converting printed words (or pseudowords) into speech sounds based on the statistical mapping observed between orthography and phonology. In particular, effects of consistency in Chinese imply that, in learning or developing the statistical mappings between orthography and phonology, orthographic similarity participates in the phonology of individual word representation (Hsu, Tsai, Lee, & Tzeng, 2009).

An analogous transparency effect may be obtained because the semantic radical of Chinese characters often gives a hint to the character meaning. Thus, semantically transparent characters may result in better performance because an opaque semantic radical activates divergent meanings, causing greater difficulties in understanding a character (Weekes et al., 2006). Indeed, increasing evidence indicates that, for adults, reading a complex character involves processing both its phonetic and semantic radicals (Ding, Peng, & Taft, 2004). However, less is known about how this ability progressively develops in children learning to read Chinese.

A further theoretical question is whether these two properties (phonological consistency and semantic transparency) interact. The bidirectional interactive activation model (BIAM; Grainger & Ferrand, 1994; Grainger & Ziegler, 2011) proposes that, when phonology is well-learned, phonological activation can reverberate to orthographic representation in visual word recognition (for evidence in Chinese, see Lee, Hsu, Chang, Chen, & Chao, 2015). Specifically, it is plausible that transparency matters less when there is high consistency. Characters high in phonological consistency may be easy to recognize because there are more examples of such characters containing similar phonological cues from the same phonetic radicals. In contrast, for low-consistency characters, less can be gleaned from past experience with statistical regularities, so recognizing the characters may depend on whether or not they are semantically transparent. Moreover, this interaction effect may vary with development; at higher grade levels, children may have sufficiently high-quality lexical representations that they no longer need to leverage semantic transparency to recognize characters. This possibility points to the need to also consider the interaction of item-level factors with child-level factors.

Child-level factors

Learning to read requires developing fully specified and precise phonological, orthographic, and semantic knowledge about words (the lexical quality hypothesis; Perfetti, 2007). According to the Universal Phonological Principle (Perfetti & Harris, 2013), phonology is automatically activated during character decoding. A key child-level factor in developing these representations may be phonological awareness, the ability to perceive and manipulate sound units of spoken language (Wagner & Torgesen, 1987). Evidence suggests that the awareness of the phonological structure of words plays a pivotal role in developing reading ability in alphabetic orthographies, such as English (Bradley & Bryant, 1983), as well as other kinds of orthographies (Hu & Catts, 1998), including Chinese (Siok & Fletcher, 2001).

Other general language awareness skills may also contribute to developing high-quality lexical representations (Goswami & Bryant, 1990). In particular, orthographic awareness refers to children’s understanding of the conventions used in the writing system of their language (Treiman & Cassar, 1997). In Chinese, orthographic awareness involves awareness that some orthographic features, including the sublexical form of radicals, convey information about the meanings of characters. Because large groups of characters sharing the same semantic radical are often related, children’s awareness of the functions of radicals may be a powerful tool for literacy learning. Ho et al. (2003) demonstrated that various types of semantic radical knowledge, including the position and semantic category of radicals, correlate significantly with character reading and sentence comprehension.

Reading acquisition in dyslexia or poor readers

The patterns of influences we describe above, among the general population of developing readers, may differ in learners with dyslexia or poor readers. Dyslexia is generally defined as a specific difficulty in the accuracy and/or fluency of word recognition, spelling, and decoding abilities despite normal intelligence and educational opportunities (Tunmer & Greaney, 2010). Researchers have proposed various causes for developmental dyslexia, but a key underlying deficit may be phonological problems, such as difficulty breaking down words into separate sounds (Stein, 2018). Thus, Liberman (1983) called developmental dyslexia a language disorder, a failure to acquire phonological skills even though many dyslexics seem to have no speech or language problems. It is thus implied that readers with poor reading skills may make less use of phonological information in particular.

Nevertheless, recent research on dyslexia showed that reading Chinese likely requires other abilities in addition to phonological processing (McBride, Wang, & Cheang, 2018; Peng, Wang, Tao, & Sun, 2017). Problems with orthographic knowledge and rapid automatized naming were particularly evident in Chinese dyslexic readers (Ho et al., 2003; Peng et al., 2017), leading some researchers to conclude that orthography-related difficulties may be the main problem in Chinese dyslexia (Ho et al., 2003). However, we know less about developmental changes in character recognition among dyslexic readers.

Toward models of reading development

Finally, it is important to consider how the effects described above may change with development (Juhasz, Yap, Raoul, & Kaye, 2018). Although there is general evidence that many item-level effects seem to diminish with increasing age (Davies, Arnell, Birchenough, Grimmond, & Houlson, 2017), this may not be true for all such properties. In particular, theoretical models of reading development (e.g., Zevin & Seidenberg, 2004, 2006) predict that, as experience with a writing system accumulates, the degree to which a word or character matches the overall orthography-to-phonology mapping (i.e., its consistency) should become more important.

Here, we not only examine how the effects of these variables vary across grade levels but also consider the form of this development. A few crucial studies suggest that developmental changes might not be linear. For example, Berninger, Abbott, Nagy, and Carlisle (2010) conducted a growth curve analysis on three kinds of linguistic awareness (phonological, orthographic, and morphological) and found the greatest growth during the first three grades but some additional growth thereafter for phonological awareness, and substantial growth after fourth grade for orthographic and morphological awareness.

Examining developmental trajectories is particularly informative in the present study, in which we also aim to capture effects among poor readers. Some work (e.g., Peng et al., 2019) suggests that, although poor or struggling readers are disadvantaged in overall reading, they show a similar trajectory in growth (e.g., deceleration over the elementary grades) to typically developing readers. That is, the best descriptor of poor readers may simply be a deficit in initial or overall performance (e.g., Mancilla-Martinez & Lesaux, 2010). Here, we examine whether poor readers indeed show the same pattern of influences within and across grades as the broader population of readers.

Present study

We examined the speed and accuracy of lexical decision from the first through the sixth grade cross-sectionally. We applied crossed random effects modeling to simultaneously examine, at the item level, influences of both character-level (transparency and consistency) and child-level properties (phonological awareness and orthographic awareness).

We apply growth-curve analysis (e.g., Berninger et al., 2010; Francis, Shaywitz, Stuebing, Shaywitz, & Fletcher, 1996) by allowing both overall performance and the magnitude of each effect of interest (e.g., the consistency effect) to vary across grade levels. As noted above, some prior work suggests growth in reading skill decelerates—and perhaps even reaches a plateau—across this time period. To capture any possibility of such a pattern, we included a quadratic growth term as well as a linear one.

Lastly, we considered whether poor readers show the same effects as those observed among all readers.

Method

Participants

We recruited 762 (328 female) native Mandarin-speaking students from three elementary schools in China. To consider effects among poor readers, we identified students (n = 80) who scored at or below the 10th percentile for their grade level on the combination of their word reading, reading comprehension, and standardized Chinese Academic Performance Test (CAPT) scores (NIES, 2012). This screening procedure (see Footnote 1) is based on previous literature on developmental dyslexia in both Chinese and English (Chen, Zhou, Dunlap, & Perfetti, 2007; Cortese & Schock, 2013; Shu, McBride-Chang, Wu, & Liu, 2006; Vaughn et al., 2008). The correlation coefficients of all these variables appear in Table 1 and descriptive statistics in Table 2. All parents signed informed consent forms.

Table 1 Correlation coefficients of student-level variables for normal readers (lower triangle) and poor readers (upper triangle)
Table 2 Means and (in parentheses) standard deviations of all measures among all readers in each of six grades and among poor readers

Measures

Lexical decision

We selected 240 characters (40 from each grade level) from the students’ curriculum. The characters were representative of the compound regularities and basic configurations of Chinese characters, including left-right (妈), top-down (骂), and outside-inside (闯).

We assessed the consistency of each character based on its orthography-to-phonology mapping. Consistency has been defined in the literature both dichotomously and continuously (Fang, Horng, & Tzeng, 1986); here, we use a continuous measure to maximize power (Cohen, 1983). Specifically, analogous to the definition used by Jared, McRae, and Seidenberg (1990) for English, we define the consistency value as the proportion of characters with the same pronunciation out of all characters that share the phonetic radical. For example, there are twelve characters that include the phonetic radical 由 (you). Among these, 迪 and 笛 are pronounced as di and thus have a consistency value of 0.17 (i.e., 2/12). We included the full range of gradient consistency values in our statistical models, but for simplicity in visualization, we dichotomize consistency into high (> 0.5) and low (< 0.5) categories in figures (Shu & Anderson, 1997).
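The consistency definition above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the procedure used in the study: the function name `consistency_values` is hypothetical, tone is ignored, and the ten placeholder entries stand in for the remaining members of the 由 family, which the text does not list.

```python
from collections import Counter


def consistency_values(family):
    """Map each character in one phonetic-radical family to its consistency.

    `family` maps character -> pronunciation. Consistency is the number of
    family members sharing that pronunciation divided by the family size.
    """
    counts = Counter(family.values())
    total = len(family)
    return {char: counts[pron] / total for char, pron in family.items()}


# Only 迪 and 笛 (both "di") come from the text. The other ten entries are
# placeholders with made-up pronunciations, added to reach twelve members.
family = {"迪": "di", "笛": "di"}
family.update({f"placeholder_{i}": f"other_{i}" for i in range(10)})

values = consistency_values(family)
print(round(values["迪"], 2))  # 0.17, i.e., 2/12
```

In a real character database the other family members would share pronunciations with one another, so their values would differ from those of the placeholders here.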

We also categorized characters as transparent if their meaning could be deduced from the orthographic form based on the character curriculum database (Shu et al., 2003) or opaque if not. We counterbalanced consistency (high vs. low) and transparency (transparent vs. opaque), with a quarter of characters in each cell of the design.

Another 240 pseudo-characters were created by adding, deleting or shifting one stroke from the radical within a legal character.

Orthographic awareness

This task tested stroke awareness and radical knowledge (Guan, Perfetti, & Meng, 2015). For stroke awareness, students tried to reproduce a character one stroke at a time in the order they perceived. The maximum score (20) was earned by writing all 20 characters using appropriate stroke order. For radical knowledge, students were shown a novel character first, and then asked to identify the constituent radicals in that novel character. For example, for the character “晴,” the participants should select the appropriate constituent radicals “日” and “青” out of four semantic radicals (日, 口, 目, 月) and four phonetic radicals (青, 靑, 亲, 庆). The maximum score (20) was earned by correctly identifying all radicals. The scores on these two tasks were summed up as the orthographic awareness score (maximum 40).

Phonological awareness

Participants heard a novel character pronounced and were asked to (a) select its pinyin among four choices and (b) write down tone 1, 2, 3, or 4, representing flat, rising, falling-rising, and falling tones, respectively. The maximum score (60) was earned by producing the correct pinyin onset, rime, and tone for each of 20 characters. Although this task requires knowledge of pinyin (the alphabetic orthography used to write words for beginning Chinese readers) in addition to Chinese phonological awareness, all of our participants had received extensive pinyin training, so the variability in their performance is likely to reflect phonological awareness rather than pinyin knowledge.

Procedure

Participants completed the tasks in groups in their classrooms. The lexical decision (20 min) and orthographic awareness (3 min) tasks were computerized whereas the stroke awareness (20 min) and phonological awareness (15 min) tasks were completed on paper. Across classrooms, we counterbalanced whether the computerized or the paper block was presented first. The paper-and-pencil tasks were scored by two research assistants; inter-rater reliability (Pearson correlation) was above 0.90.

Analytic strategy

We analyzed our data using crossed random effects models (Steacy et al., 2016). The unit of analysis is the outcome of an individual trial rather than the average across multiple trials. Examining lexical processing at this level allowed us to simultaneously examine the influence of child and character factors, critical to the goals of this project.

We examined two dependent measures: the accuracy of lexical decision, modeled as the log odds (logit) of correctly responding to a character, and the response time (RT) for correct lexical decisions. Because RTs are positively skewed, we log-transformed RT (in ms) prior to analysis (van der Linden, 2006), although this decision did not affect any of the patterns of significance.
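As a minimal sketch of the two transformations just described, the following Python snippet computes log odds for a probability and the log of a response time. The helper names are hypothetical, the input values are illustrative rather than data from the study, and the natural logarithm is an assumption because the text does not state the base.

```python
import math


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def log_rt(rt_ms):
    """Natural log of a response time in milliseconds (base is assumed)."""
    return math.log(rt_ms)


# Illustrative values only, not data from the study.
print(logit(0.5))  # 0.0: even odds of a correct response

# The log transform compresses the long right tail of the RT distribution:
# the raw gap between 600 ms and 2500 ms is large, the log gap much smaller.
rts = [400, 500, 600, 2500]
print([round(log_rt(rt), 2) for rt in rts])
```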

Both models included multiple fixed effects of theoretical interest. At the character level, we included consistency (ranging from 0 to 1), transparency (transparent vs. opaque), and their interaction. At the child level, we included orthographic awareness, phonological awareness, and their interaction. All predictor variables were mean-centered to obtain estimates of the main effects analogous to those from an ANOVA (Cohen, Cohen, West, & Aiken, 2003, pp. 357–358). To compare the effect sizes of our variables of interest, we z-scored them so that parameter estimates were expressed in terms of the effect of a 1-standard-deviation change in each variable (Cohen et al., 2003, p. 512).
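The centering and standardization step can be sketched as follows. This is an illustration only: the function name `z_score` is hypothetical, the scores are made up, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def z_score(values):
    """Mean-center a predictor and scale it to unit standard deviation."""
    m = mean(values)
    sd = stdev(values)  # sample SD; an assumption, see the lead-in
    return [(v - m) / sd for v in values]


# Hypothetical orthographic awareness scores (maximum 40), not study data.
oa_scores = [22, 28, 31, 35, 39]
oa_z = z_score(oa_scores)

# After z-scoring, a coefficient on this predictor describes the change in
# the outcome associated with a 1-standard-deviation change in the score.
print([round(z, 2) for z in oa_z])
```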

A further goal of the study was to examine how the influence of child and character properties varied developmentally across grades. As noted above, we applied growth-curve analysis (e.g., Berninger et al., 2010; Francis et al., 1996) by allowing both the intercept (overall performance) and each effect of interest to vary across grade levels. To capture the possibility of decelerating growth (i.e., a plateau effect), we incorporated a quadratic growth term as well as a linear one.
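One way to read this growth-curve coding is that each effect of interest becomes a function of centered grade. The sketch below makes that explicit. All names and coefficient values are hypothetical, and the mean grade of 3.5 assumes equal numbers of students in grades 1 through 6, which the text does not state.

```python
def growth_terms(grade, mean_grade=3.5):
    """Return the linear and quadratic growth terms for a grade level."""
    centered = grade - mean_grade
    return centered, centered ** 2


def effect_at_grade(main, linear, quadratic, grade, mean_grade=3.5):
    """Size of a predictor's effect at one grade under the growth coding.

    main      : effect at the mean grade
    linear    : change in the effect per grade (predictor x grade)
    quadratic : curvature of that change (predictor x grade squared)
    """
    centered, centered_sq = growth_terms(grade, mean_grade)
    return main + linear * centered + quadratic * centered_sq


# Hypothetical coefficients, chosen only to show how an effect can change
# sign across grades; they are not estimates from the study.
for grade in range(1, 7):
    print(grade, round(effect_at_grade(0.0, 0.10, 0.02, grade), 3))
```

A nonzero quadratic coefficient lets the effect change at an uneven rate across grades, which is what allows the model to represent a plateau.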

Lastly, to account for the nested structure of the data (Baayen, Davidson, & Bates, 2008; Raudenbush & Bryk, 1988), we included random intercepts for participant, classroom, and item (character).

Thus, our model of response time (using mixed-effects notation; Matuschek, Kliegl, Vasishth, Baayen, & Bates, 2015) for subject i in classroom j responding to item k took the following form:

$$ \begin{aligned}
\log\left(y_{ijk}\right) ={}& \gamma_{0000}+\gamma_{1000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)+\gamma_{2000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)^2 \\
&+\gamma_{3000}\,\mathrm{Consistency}_k+\gamma_{13000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)\mathrm{Consistency}_k+\gamma_{23000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)^2\mathrm{Consistency}_k \\
&+\gamma_{4000}\,\mathrm{Transparency}_k+\gamma_{14000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)\mathrm{Transparency}_k+\gamma_{24000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)^2\mathrm{Transparency}_k \\
&+\gamma_{34000}\,\mathrm{Consistency}_k\,\mathrm{Transparency}_k+\gamma_{134000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)\mathrm{Consistency}_k\,\mathrm{Transparency}_k \\
&+\gamma_{234000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)^2\mathrm{Consistency}_k\,\mathrm{Transparency}_k \\
&+\gamma_{5000}\,\mathrm{PA}_{ij}+\gamma_{15000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)\mathrm{PA}_{ij}+\gamma_{25000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)^2\mathrm{PA}_{ij} \\
&+\gamma_{6000}\,\mathrm{OA}_{ij}+\gamma_{16000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)\mathrm{OA}_{ij}+\gamma_{26000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)^2\mathrm{OA}_{ij} \\
&+\gamma_{56000}\,\mathrm{PA}_{ij}\,\mathrm{OA}_{ij}+\gamma_{156000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)\mathrm{PA}_{ij}\,\mathrm{OA}_{ij}+\gamma_{256000}\left(\mathrm{Grade}_{ij}-\overline{\mathrm{Grade}}\right)^2\mathrm{PA}_{ij}\,\mathrm{OA}_{ij} \\
&+u_{ij0}+v_{0j0}+w_{00k}+e_{ijk}
\end{aligned} $$

where \( u_{ij0} \) is the random intercept for subject i (independently sampled from a normal distribution of subject effects with mean 0 and variance \( {\uptau}_{\mathrm{U}}^2 \)), \( v_{0j0} \) is the random intercept for classroom j (independently sampled from a normal distribution of classroom effects with mean 0 and variance \( {\uptau}_{\mathrm{V}}^2 \)), \( w_{00k} \) is the random intercept for item k (independently sampled from a normal distribution of item effects with mean 0 and variance \( {\uptau}_{\mathrm{W}}^2 \)), and \( e_{ijk} \) is a random trial-level error term (independently sampled from a normal distribution with mean 0 and variance \( {\upsigma}_e^2 \)). The model of logit accuracy for subject i in classroom j responding to item k was the same except that the trial-level error term was omitted and the dependent measure was \( \log\left(\frac{y_{ijk}}{1-y_{ijk}}\right) \), where \( y_{ijk} \) is the probability of subject i in classroom j responding correctly to item k.
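To make the structure of the random effects concrete, the sketch below simulates log response times from a deliberately reduced version of this model: an intercept, the two grade terms, a consistency effect, the three random intercepts, and trial-level error. It is not the fitted model. The function name, all parameter values, the sample sizes, and the assignment of classrooms to grades are hypothetical.

```python
import random


def simulate_log_rt(n_subjects=12, n_classrooms=6, n_items=8, seed=1):
    """Simulate trial-level log RTs with crossed subject and item effects."""
    rng = random.Random(seed)

    # Hypothetical fixed effects and standard deviations (illustration only).
    gamma_0, gamma_grade, gamma_grade_sq, gamma_cons = 6.8, -0.05, 0.01, -0.02
    tau_u, tau_v, tau_w, sigma_e = 0.10, 0.05, 0.08, 0.20
    mean_grade = 3.5

    u = [rng.gauss(0, tau_u) for _ in range(n_subjects)]    # subjects
    v = [rng.gauss(0, tau_v) for _ in range(n_classrooms)]  # classrooms
    w = [rng.gauss(0, tau_w) for _ in range(n_items)]       # items
    consistency = [rng.uniform(-1, 1) for _ in range(n_items)]

    # Subjects are nested in classrooms, and each classroom has one grade.
    classroom = [i % n_classrooms for i in range(n_subjects)]
    grade = [1 + c % 6 for c in classroom]

    rows = []
    for i in range(n_subjects):
        centered = grade[i] - mean_grade
        for k in range(n_items):  # every subject responds to every item
            fixed = (gamma_0 + gamma_grade * centered
                     + gamma_grade_sq * centered ** 2
                     + gamma_cons * consistency[k])
            error = rng.gauss(0, sigma_e)
            y = fixed + u[i] + v[classroom[i]] + w[k] + error
            rows.append((i, classroom[i], k, y))
    return rows


rows = simulate_log_rt()
print(len(rows))  # 96 trials: 12 subjects x 8 items
```

Because every subject responds to every item, subject and item effects are crossed rather than nested, while subjects remain nested within classrooms.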

We adopted a model-based approach to outlier detection by eliminating observations with residuals more than 3 standard deviations from the mean, then refitting each model. This procedure identifies observations that are outlying after considering all experimental variables, subject differences, and item differences (Baayen, 2008, p. 207). All models were fit in R using package lme4 (Bates et al., 2015).
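The trimming rule can be sketched in a few lines of Python. The sketch covers only the flagging step and takes the fitted values as given; in the procedure described above, the model would then be refit to the retained observations. The function name and the data are hypothetical.

```python
from statistics import mean, pstdev


def flag_retained(observed, fitted, criterion=3.0):
    """Return True for observations whose residual lies within
    `criterion` standard deviations of the mean residual."""
    residuals = [o - f for o, f in zip(observed, fitted)]
    m = mean(residuals)
    sd = pstdev(residuals)
    return [abs(r - m) <= criterion * sd for r in residuals]


# Hypothetical log RTs: 19 observations close to their fitted value of 6.5
# and one response that is far slower than the model predicts.
fitted = [6.5] * 20
observed = [6.5 + 0.01 * ((i % 5) - 2) for i in range(19)] + [9.0]

keep = flag_retained(observed, fitted)
print(keep.count(False))  # 1: only the final observation is flagged
```

Because the rule works on residuals rather than raw scores, a slow response from a generally slow child, or to a generally hard character, is not flagged merely for being slow.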

Results

Figure 1 displays overall accuracy and response time (RT) for each grade level among all readers. Lexical decision accuracy increased sharply from the third to the fourth grade but plateaued afterwards; RT showed a mostly linear decline (i.e., increased speed) across grades.

Fig. 1 Proportion accuracy for lexical decisions (left panel) and response time for correct lexical decisions (right panel) as a function of student grade level. Error bars indicate the standard error across subjects

Effects of character-level variables

Accuracy

The top half of Table 3 displays character-level effects in our model of accuracy, with fewer than 0.1% of observations removed as outliers. This model revealed two significant character-level effects, visualized in the top panels of Fig. 2. First, radical consistency affected lexical decision accuracy, but this effect was qualified by grade. As can be seen in the top left panel of Fig. 2, at higher grade levels, characters with consistent radicals (dark lines) were responded to more accurately than characters with inconsistent radicals (light lines), whereas this effect was largely absent for grades 3 and 4 and reversed for lower grade levels. Second, transparency effects also varied across grades in a quadratic pattern: as seen in the top middle panel of Fig. 2, opaque characters were better recognized than transparent ones at early grades (grades 1 and 2) and higher grades (grades 5 and 6), but this effect was absent in the middle grade levels (3 and 4). There were no interactions of transparency and consistency in accuracy.

Table 3 Fixed-effect estimates from crossed random effects logit model of accuracy
Fig. 2 Proportion accuracy for lexical decisions as a function of (a) radical consistency only (top left panel), (b) transparency only (top middle panel), (c) consistency and transparency (top right panel), (d) orthographic awareness only (bottom left panel), (e) phonological awareness only (bottom middle panel), and (f) orthographic awareness and phonological awareness (bottom right panel), as well as student grade. Consistency, orthographic awareness, and phonological awareness are depicted as median splits for purposes of visualization but were entered as continuous variables into the crossed random effects models

Response time

Table 4 displays the estimates from our model of RT in accurate lexical decisions, with 0.8% of outlying RTs removed. This model revealed two significant character-level effects, visualized in the top panels of Fig. 3. First, there was an effect of transparency that varied across grade levels: as depicted in the top middle panel of Fig. 3, at lower grade levels, transparent characters were responded to more quickly than opaque ones, but this effect diminished at higher grade levels. Second, transparency and consistency interacted; the difference between opaque and transparent characters was driven by lower-consistency radicals (gray points and lines in the top right panel of Fig. 3) and was much smaller for high-consistency radicals (black points and lines). Again, this effect was qualified by grade level such that it was stronger in lower grades.

Table 4 Fixed-effect estimates from crossed random effects model of response time in accurate lexical decisions
Fig. 3 Response times for correct lexical decisions as a function of (a) radical consistency only (top left panel), (b) transparency only (top middle panel), (c) consistency and transparency (top right panel), (d) orthographic awareness only (bottom left panel), (e) phonological awareness only (bottom middle panel), and (f) orthographic awareness and phonological awareness (bottom right panel), as well as student grade. Consistency, orthographic awareness, and phonological awareness are depicted as median splits for purposes of visualization but were entered as continuous variables into the crossed random effects models

Effects of child-level variables

Accuracy

The bottom half of Table 3 displays the child-level effects on lexical decision accuracy. As seen in the bottom panels of Fig. 2, students with higher phonological awareness were significantly more accurate in lexical decision. This effect was evident across grade levels in a main effect, but its strength did vary somewhat with grade. Specifically, the quadratic trend indicates that the magnitude of the phonological awareness effect was smallest in intermediate grades 3 and 4 but larger in lower and higher grade levels.

Orthographic awareness (bottom middle panel of Fig. 2) was also associated with higher lexical decision accuracy and, indeed, had the single largest effect on accuracy. Like phonological awareness, this effect was evident across grades (i.e., a main effect). Nevertheless, the linear and quadratic trends indicate that the magnitude of the orthographic awareness effect generally declined as grade levels advanced, and that this decline was sharpest across the transition from first to third grade.

Orthographic awareness and phonological awareness did not significantly interact regardless of grade level; rather, they appear to be two independent, additive skills.

Response time

The effects of student characteristics on RT are displayed in the bottom halves of Table 4 and Fig. 3. Orthographic awareness, again, had the strongest influence. This effect was evident in a significant main effect across grade levels; although the bottom middle panel of Fig. 3 suggests the magnitude of the effect declined somewhat in higher grade levels, this trend did not approach conventional levels of significance.

Beyond orthographic awareness, students with higher phonological awareness also had faster RTs. The effect of phonological awareness varied linearly across grade levels (bottom left panel of Fig. 3), with the largest effect coming at lower grade levels.

There were, again, no significant interactions of orthographic awareness and phonological awareness.

Performance of poor readers

We also considered whether the above character- and child-level effects similarly occurred among the poor readers. Given the size of the dataset as a whole, this still left 38,400 lexical decision trials for analysis.
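The trial counts reported for the full sample and for the poor readers can be checked with simple arithmetic. The sketch below assumes that every student responded to all 240 characters and all 240 pseudo-characters. The text does not state this directly, but the assumption reproduces both reported totals.

```python
n_characters = 240
n_pseudo_characters = 240
# Assumption: each student completed one trial per stimulus.
trials_per_student = n_characters + n_pseudo_characters

n_all_readers = 762
n_poor_readers = 80

total_trials = n_all_readers * trials_per_student
poor_reader_trials = n_poor_readers * trials_per_student

print(total_trials)        # 365760
print(poor_reader_trials)  # 38400
```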

Character-level variables among poor readers

Whereas lexical decision accuracy increased linearly across student grade levels for the sample as a whole, it showed quadratic growth among poor readers; accuracy rapidly increased among the first few grade levels but then reached an asymptote. Table 5 and Fig. 4 display the model of lexical decision accuracy among the poor readers, with 0.04% of observations excluded as outliers. The top half of Table 5 indicates that the poor readers were sensitive to neither radical consistency nor the difference between semantically transparent and opaque characters.

Table 5 Fixed-effect estimates from crossed random effects logit model of accuracy, for poor readers only
Fig. 4 Proportion accuracy for lexical decisions among poor readers as a function of (a) radical consistency only (top left panel), (b) transparency only (top middle panel), (c) consistency and transparency (top right panel), (d) orthographic awareness only (bottom left panel), (e) phonological awareness only (bottom middle panel), and (f) orthographic awareness and phonological awareness (bottom right panel), as well as student grade. Consistency, orthographic awareness, and phonological awareness are depicted as median splits for purposes of visualization but were entered as continuous variables into the crossed random effects models. Not all combinations of orthographic awareness, phonological awareness, and grade level were represented among poor readers

Table 6 and Fig. 5 display the corresponding results on RT in accurate trials, with outlying observations (0.9% of the data) removed. Again, poor readers showed no statistically reliable influences of consistency or transparency on their RTs.

Table 6 Fixed-effect estimates from crossed random effects model of response time in accurate lexical decisions, for poor readers only
Fig. 5 Response times for correct lexical decisions among poor readers as a function of (a) radical consistency only (top left panel), (b) transparency only (top middle panel), (c) consistency and transparency (top right panel), (d) orthographic awareness only (bottom left panel), (e) phonological awareness only (bottom middle panel), and (f) orthographic awareness and phonological awareness (bottom right panel), as well as student grade. Consistency, orthographic awareness, and phonological awareness are depicted as median splits for purposes of visualization but were entered as continuous variables into the crossed random effects models. Not all combinations of orthographic awareness, phonological awareness, and grade level were represented among poor readers

Child-level variables among poor readers

There was still variation in both phonological and orthographic awareness even among the group of poor readers.

The bottom halves of Table 5 and Fig. 4 depict the effects of these child-level variables on lexical decision accuracy specifically among poor readers. Even within this restricted sample, orthographic awareness predicted response accuracy, as it did for the entire sample. One difference is that, among poor readers, the orthographic awareness effect was equally strong across grade levels, whereas for all readers it was most evident in early grades. By contrast, variation in phonological awareness among poor readers did not directly relate to lexical decision accuracy. However, orthographic and phonological awareness did interact to predict lexical decision accuracy such that benefits of orthographic awareness were larger among students who also had good phonological awareness (though less so in middle grades).
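To make the meaning of this interaction concrete, the child-level part of a crossed random effects logit model can be written schematically as follows. This is a simplified sketch rather than the fitted model: the notation is assumed for illustration, and the grade terms and character-level predictors are omitted for brevity. Here $Y_{ij}$ is the accuracy of child $i$ on character $j$, $\mathrm{OA}_i$ and $\mathrm{PA}_i$ are that child's orthographic and phonological awareness scores, and $u_i$ and $w_j$ are the crossed random intercepts for children and characters.

```latex
\begin{align*}
\operatorname{logit}\, P(Y_{ij} = 1) &= \beta_0 + \beta_1\,\mathrm{OA}_i + \beta_2\,\mathrm{PA}_i
  + \beta_3\,(\mathrm{OA}_i \times \mathrm{PA}_i) + u_i + w_j, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad w_j \sim \mathcal{N}(0, \sigma_w^2), \\
\frac{\partial\, \operatorname{logit}\, P(Y_{ij} = 1)}{\partial\, \mathrm{OA}_i} &= \beta_1 + \beta_3\,\mathrm{PA}_i .
\end{align*}
```

Under this sketch, the pattern described above would correspond to a positive $\beta_1$, a $\beta_2$ not reliably different from zero, and a positive $\beta_3$, so that the benefit of orthographic awareness on the logit scale grows with phonological awareness.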

Lastly, the bottom halves of Table 6 and Fig. 5 show the influence of child characteristics on RT among the poor readers. Neither orthographic awareness nor phonological awareness predicted RT when examining just the poor readers.

Discussion

In the present study, we explored the effects of character-level and child-level factors on the development of children’s character recognition. The major findings are fourfold.

First, lexical decision speed and accuracy increased from the first grade to the sixth in an asymptotic pattern, with the largest gains in accuracy coming between the third and fourth grade.

Second, both phonetic radical consistency and semantic transparency influenced lexical decision. Further, consistency and transparency interacted such that the response-time difference between transparent and opaque words was driven largely by words with lower-consistency phonetic radicals. Moreover, poor readers did not show these character-level effects.

Third, these influences showed developmental trajectories such that, at the earliest grades, words with more consistent radicals were judged less accurately, but by intermediate grades, these effects had disappeared, and by higher grades, the effects had reversed such that higher consistency was associated with a higher probability of accurate lexical decision.

Fourth, both phonological awareness and especially orthographic awareness were associated with substantial benefits in lexical decision among the broader sample of readers; among poor readers, only orthographic awareness showed a main effect. Further, while these skills had independent, additive effects among the full sample, they interacted in poor readers such that orthographic awareness had a greater benefit among students who also had good phonological awareness.

Lexical quality in Chinese reading

These results have several important implications. First, our findings explicate how the lexical quality hypothesis can be applied to reading Chinese characters. Traditionally, this hypothesis claims that high lexical quality includes well-specified and partly redundant representations of form (orthography and phonology) and flexible representations of meaning, allowing for rapid and reliable meaning retrieval (Perfetti, 2007). Our study suggests that, in particular, Chinese character recognition is facilitated by form representations; namely, knowledge of the orthographic and phonological properties of radicals.

However, these types of knowledge are not equal in their importance. We observed an interaction of consistency and transparency such that the influence of characters’ semantic transparency emerged primarily when the orthography (i.e., the phonetic radical) did not consistently map to a phonological representation. This suggests that the orthography-to-phonology mapping is the primary source of lexical quality in Chinese character reading, and the semantic transparency of characters is only a secondary mechanism. This finding is consistent with the Universal Phonological Principle (Perfetti & Harris, 2013), which points to a core role of phonology in character decoding.

Second, the time course of these effects suggests important limitations on the influence of radical knowledge. We found that benefits of orthography-to-phonology consistency on character reading were limited to the fifth and sixth grades. This limitation may be surprising because, by the time children reach the third grade, most Chinese children rated as average or high ability are functionally aware that the radicals in compound characters contain information about meaning, which can be used to learn and remember characters and to derive meanings of unfamiliar characters (Shu & Anderson, 1997). However, our results suggest that younger children, as well as older children low in orthographic and phonological awareness, either are not aware of the function of radicals or do not systematically use the orthographic or phonological information in the new characters to be acquired.

This slowly developing effect of consistency accords with the theoretical model of Zevin and Seidenberg (2004, 2006). As students’ learning experience accumulates and skill develops, the effects of psycholinguistic properties change as the oral reading system approaches maximal efficiency. Indeed, such learning effects have been argued to be inherent in connectionist network systems, i.e., asymptotic learning based on distributed representations and a nonlinear input-output function (Plaut, McClelland, Seidenberg, & Patterson, 1996; Van Orden, Pennington, & Stone, 1990). Our current study provides empirical evidence supporting this theory and contributes toward building a theoretical model that specifies more precisely how word-level and student-level factors affect word reading development across ages and, in the future, perhaps across languages (Guan & Fraundorf, 2019).

Deficits and compensation among poor readers

Our results help to characterize the deficits faced by children with poor Chinese reading skill. Although it has been suggested that the primary deficit in dyslexia is phonological (e.g., Liberman, 1983; Stein, 2018), we found that poor readers were also restricted in their use of semantic information (i.e., character transparency) in word recognition.

Further, these deficits (Wolf & Bowers, 1999) differed from those experienced by beginning readers; among the broader sample, even readers in grades 1 or 2 were sensitive to transparency, whereas poor readers never were. That is, our poor readers were not simply less proficient readers; they exhibited unique deficits that rendered them qualitatively distinct from readers who merely lack experience.

Nevertheless, our results also suggest the existence of a compensatory mechanism among poor readers. For general readers, both orthographic awareness and phonological awareness contributed to accuracy, whereas for poor readers, only orthographic awareness directly predicted performance. (Phonological awareness did, however, act in concert with orthographic awareness such that orthographic awareness had a greater benefit among students with good phonological awareness.) That is, orthographic awareness may help to compensate for reading skills that are otherwise poor: although these children were poor in their use of phonological and semantic information, what they could leverage to identify words was the words' physical form. Thus, one possible way to improve the word recognition ability of poor readers may be to raise their orthographic awareness. However, one important caveat is that the poor readers in our study were identified solely by their academic performance and did not necessarily have a formal diagnosis of reading disability. In the future, it would be valuable to test whether these same conclusions apply to children diagnosed with dyslexia.

More broadly, this study implies that both phonological awareness and orthographic awareness are critical in children’s language learning. One implication is that, at the student level, it may be beneficial for language teaching to focus on students’ phonological and orthographic awareness, especially for students in lower grade levels. At the word level, it may be important to teach first- and second-grade students to master the radicals in Chinese characters so that they can use this knowledge in higher grades. Some studies have already explored early intervention (Anderson, Li, Ku, Shu, & Wu, 2003; He, Wang, & Anderson, 2005), and recent research (Hsuan, Tsai, & Stainthorp, 2018) has also suggested the relevance of both PA and OA training among students in lower grades in Taiwan. Furthermore, the current mainstream Chinese curriculum (NIES, 2012) emphasizes compound awareness and radical awareness among students to benefit their later reading development.

Limitations

Here, we focused on the roles of phonological and orthographic awareness in Chinese, using language-specific measures. To facilitate comparison and generalization across languages, it would be beneficial to design and validate more comparable language-specific measures of PA and OA (Wright & Cervetti, 2017).

Although we examined interactions between the two character-level variables and the two student-level variables, there may also be character × student interactions; for instance, students with high PA might make more use of radical information. Further, our student-level variables were relatively metalinguistic measures of awareness, and other word-level variables that are also relevant to reading development, such as frequency and age of acquisition (AOA), could be considered using similar analytic methods (including in our own ongoing work; Guan & Fraundorf, 2019).

Conclusion

Lexical decision speed steadily progressed from grades 1 to 6 among Chinese elementary students; however, character recognition accuracy reached a near-asymptote by the fourth grade. Analysis of character- and child-level variables helped to extend the lexical-quality hypothesis by revealing the pivotal effect of orthography-to-phonology mapping in learning to read in Chinese. Finally, poor readers displayed a compensatory use of orthographic form, which could possibly be trained among poor or potentially dyslexic readers.