Introduction

The primary purpose of the present study was to investigate the extent to which Chinese single character reading and 2-character word reading may reflect somewhat different processes in Hong Kong Chinese young children. Chinese word recognition involves both single-character identification and multi-character word reading. Approximately 65% of Chinese words comprise two or more characters, while 35% of Chinese words are single characters (McBride, 2016; Modern Chinese Frequency Dictionary, 1986; Sun et al., 1996). The Chinese word can be comprised of one or more characters with independent meanings; this is conceptually similar to the concept of a word in an alphabetic script (McBride, 2016). However, in this paper, we use the term “word” more narrowly to refer to a word that is made up of two characters (e.g., 鷄蛋) or more, while we refer to “character” as a single character (e.g., 鷄) for clarity. We examine how the processing of the basic graphic unit, the character, is different from the processing of words among Chinese children.

This concept of character vs. word in Chinese is best demonstrated with illustrations of compound words in both English and Chinese. For example, in English, snow and ball can be combined to form an English compound word snowball; in Chinese, 雪(syut3, meaning snow) and 球 (kau4, meaning ball) can also be combined to form a Chinese compound word 雪球 (syut3 kau4, meaning snowball). McBride (2016) proposed a model of the relationship of the acquisition of single characters and multiple-character words in Chinese reading. In the model, compound/multiple-character words can be divided into several characters via analysis, while two or more single characters can comprise words via lexical compounding. However, unlike the case in some alphabetic scripts, there is no clear space boundary between Chinese words. For instance, in the sentence /今天天氣很好/, 今天/gam1 tin1/ (meaning today) and 天氣/tin1 hei3/ (meaning weather) are both words (The entire sentence together means “Today the weather is very good.”). The word 天天 /tin1 tin1/ (meaning everyday) also appears in this sentence if one ignores the surrounding characters. However, interpreting these two characters together as meaning天天 /tin1 tin1/ (meaning everyday) here would be incorrect in context. One can see in this example that by focusing on connections between different characters, different lexicalized meanings are possible. Therefore, Chinese readers have to identify the words expressed by the sentence through the context (Chen et al., 2018).
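The segmentation ambiguity in this example can be made concrete with a small sketch. It lists every two-character lexicon entry that occurs in the unspaced sentence and then applies a simple greedy left-to-right segmentation. The toy lexicon, the function names, and the greedy matching rule are illustrative assumptions only; they are not a model of how readers resolve the ambiguity, which the text attributes to context.

```python
def overlapping_words(sentence, lexicon, max_len=2):
    """Return every (start, word) where a multi-character lexicon entry occurs."""
    hits = []
    for start in range(len(sentence)):
        for length in range(2, max_len + 1):
            chunk = sentence[start:start + length]
            # Skip truncated chunks at the end of the sentence.
            if len(chunk) == length and chunk in lexicon:
                hits.append((start, chunk))
    return hits


def forward_max_match(sentence, lexicon, max_len=2):
    """Greedy segmentation: take the longest lexicon entry at each position."""
    words, pos = [], 0
    while pos < len(sentence):
        for length in range(min(max_len, len(sentence) - pos), 0, -1):
            chunk = sentence[pos:pos + length]
            # A single character is always accepted as a fallback.
            if length == 1 or chunk in lexicon:
                words.append(chunk)
                pos += length
                break
    return words


# Hypothetical toy lexicon built from the example sentence in the text.
TOY_LEXICON = {"今天", "天天", "天氣", "很", "好"}
SENTENCE = "今天天氣很好"

print(overlapping_words(SENTENCE, TOY_LEXICON))
print(forward_max_match(SENTENCE, TOY_LEXICON))
```

Three overlapping candidates (今天, 天天, 天氣) are found in the same six-character string, yet only two of them belong to the intended reading. The greedy rule happens to recover that reading here, but it would not do so in general.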

Such differences make reading of Chinese practically, and perhaps theoretically, substantially different from alphabetic word reading. Therefore, a focus on single character vs. 2-character word recognition is potentially important in both theoretical and practical ways. Theoretically, word reading models generally tend to operate at the word level for alphabetic reading but at both the character and word levels simultaneously in Chinese. Practically, if character and word recognition constitute somewhat different processes, then learning of characters and words may require somewhat different skills. Teachers and parents might, therefore, take advantage of different strategies when instructing children in their learning of Chinese characters and words.

Character reading and word reading are not (exactly) the same

Character reading and word reading are highly associated, as words are made up of characters, and sometimes individual characters are also words (Ho & Bryant, 1997; Zhou et al., 1999). Both character reading and word reading have been considered important fundamental processes in reading acquisition among young Chinese children (Li, 2002; McBride-Chang & Ho, 2000). However, studies are beginning to highlight some differences between character reading and word reading among children (e.g., Chu & Leung, 2005; McBride, 2016) from at least three different perspectives. First, a character usually conveys a different meaning within a word than it does on its own (Chen et al., 2009). For example, /熱情/ means “enthusiastic” as a whole, but the meanings of the two individual characters are “hot” and “emotion” respectively. Second, the same character tends to be read more accurately in the context of a word than in isolation, because children can extract useful information from the surrounding character(s) to help them recognize the target one (Wang & McBride, 2016). For instance, children are more likely to recognize /蛋/ (meaning eggs) when it appears in the word /鷄蛋/ (meaning chicken eggs), because they can infer the pronunciation and the meaning with the familiar character of /鷄/ (meaning chicken). An example of a similar phenomenon in English would be reading ache better in the context of the word backache because of an ability to make an inference about the whole word given the word back.

A third perspective on potential character-word differences highlights reading strategies occurring in character as compared to word recognition. In several studies, young Chinese readers have tended to use an analytic strategy to process characters while using a holistic strategy to process multi-character words (Chen et al., 2009; Chung et al., 2010; McBride, 2016). On the one hand, children need to analyze the positions of radicals in a character to know its meaning(s) and pronunciation(s) (Xing et al., 2004). On the other hand, knowing one character within the word can help children to distinguish the meanings and pronunciations of the other new ones by using a holistic strategy (Leong, 1997; McBride, 2016). For example, the single character /奇/ can be pronounced either as /kei4/ or /gei1/. In the word /奇怪/ (meaning weird), the character will be read as /kei4/ and in the word /奇數/ (meaning odd numbers) it will be read as /gei1/. Consequently, character reading and word reading should reflect somewhat different processes.

In the past few decades, studies have highlighted at least three critical metalinguistic skills for Chinese character reading and word reading. These skills are phonological awareness, morphological awareness, and orthographic knowledge (Ho & Ma, 1999; Packard et al., 2006; Shu et al., 2008). Apart from these three core metalinguistic skills, rapid automatized naming (RAN) is another important correlate of word reading in Chinese, involving children’s ability to name a list of familiar stimuli as quickly as possible (Pan et al., 2011). Given that character and word recognition in Chinese are somewhat different, certain cognitive-linguistic skills associated with them may manifest themselves distinctively from one to the other in young children. The logic by which we make this argument is introduced in detail below.

Underlying cognitive-linguistic skills for character vs. word recognition

The relative importance of phonological awareness for character and word reading is unclear. On the one hand, because access to Chinese phonology is required for both character reading (Ho & Ma, 1999; Hu & Catts, 1998; Li et al., 2012; Shu et al., 2008) and word reading (Hu & Catts, 1998) concurrently, phonological awareness might be consistently utilized across both. On the other hand, Chinese single characters by themselves convey very limited phonological information, especially in Hong Kong traditional Chinese recognition, which has typically been taught in the absence of any phonological coding systems (i.e., Pinyin and Zhuyin-Fuhao) (McBride-Chang, Bialystok, Chong & Li, 2004). The importance of phonological awareness for reading in Chinese likely diminishes with development (e.g., McBride, 2016), and, even for young children, training in phonological awareness in Chinese does not appear to result in improvement in Chinese reading abilities (e.g., Zhou et al., 2012), in contrast to similar training carried out in Indo-European languages (e.g., Ehri et al., 2001; Schneider et al., 2000).

As noted above, the majority of Chinese words are compound words, composed of two or more constituent characters. Yet individual characters are usually morphemes too, so when children do process characters/words as a whole relatively early in the developmental trajectory, morphological awareness should be helpful in this process. Moreover, it is possible that morphological awareness might have a more crucial role for word reading than for character reading in Chinese. That is because most Chinese words comprise two or more morphemes (Chen, 1996; Li & McBride-Chang, 2014; Lo et al., 2019; Wong et al., 2014), and morphological awareness can help readers to disambiguate homophones across words (e.g., McBride, 2016). For example, /少/ (siu3, meaning young) of /少年/ and /笑/ (siu3, meaning smile) of /微笑/ share the same pronunciation but have different meanings.

Interestingly, orthographic knowledge might be particularly important for character as compared to word reading (Yang et al., 2019a). When recognizing single characters, Chinese children are likely to focus on the internal orthographic structures of characters, including knowledge of the forms and the functions of phonetic radicals and semantic radicals (McBride-Chang, Shu, Zhou, Wat, & Wagner, 2003). As shown by Liu et al. (2010), children tend to focus more on the processing of visual-orthographic information when analyzing the radicals at the character level than at the word level. Indeed, previous studies have demonstrated the importance of visual-orthographic knowledge in character reading concurrently (e.g., Siok & Fletcher, 2001). Nevertheless, many Chinese characters that are learned by children in the early stages are semiregular and irregular characters (Wang & Tao, 1993), making it somewhat difficult to take advantage of orthographic knowledge in the early years of reading acquisition. It remains unclear whether orthographic knowledge is related to character reading as well as word reading.

Finally, RAN is thought to reflect children’s rate of lexical access to phonological information stored in long-term memory (Wagner & Torgesen, 1987). RAN likely captures facets not only of phonological processing (Pan & Shu, 2014), but also of orthographic (e.g., Liao et al., 2008; Logan et al., 2011; Wolf & Bowers, 1999) and other oral language-writing system aspects, such as visual-verbal associations (Georgiou et al., 2013). On the one hand, the phonological processing and access to the phonological information in long-term memory are very helpful to word reading (Wei et al., 2015). On the other hand, RAN reflects the arbitrary mapping from visual orthography to phonology, which is also very important for character reading (Koponen et al., 2016; Wagner & Torgesen, 1987). In Chinese Mandarin-speaking children, some studies suggest that RAN is more important for word reading (Wei et al., 2015) than for character reading (Liao et al., 2008; Pan et al., 2011), while others have found that RAN was more salient in character reading than in word reading (Wang & McBride, 2016; Yang & McBride, 2020). However, little evidence has yet emerged for the relations of RAN with character reading versus word reading for Hong Kong Cantonese-speaking children.

Most previous studies of Chinese reading have effectively treated character reading and word reading as the same process and, correspondingly, assessed character reading or word reading, rather than both reading skills (e.g., Liu et al., 2017; Song et al., 2016). Studies tapping both character and word recognition in Chinese children are rare. We are aware of only three studies that have included both word reading and character reading (Li et al., 2017; Liu & Zhu, 2016; Wang & McBride, 2016). Li and colleagues (2017) showed that knowledge of words facilitated individual character recognition; this study did not investigate cognitive-linguistic skills in relation to character or word recognition. Liu and Zhu (2016) concurrently assessed the fluency of word reading and the accuracy of character reading; they found that RAN predicted word reading fluency when character reading accuracy was controlled. The unique associations of other cognitive-linguistic skills with character and word reading were not clear. Wang and McBride (2016) concurrently examined the cognitive-linguistic correlates of character reading and word reading among mainland Chinese children. They found that orthographic awareness appeared to be more strongly related to character as compared to word reading with the other reading skill statistically controlled. Longitudinal studies are important for a comprehensive understanding of character and word reading development. Since characters form the building blocks of words, we predicted that children’s character reading acquisition would predict children’s subsequent word reading.

The present study

Our longitudinal study focused on whether and how Chinese character reading and word reading reflect different processes and compared the contributions of metalinguistic awareness, including morphological awareness, phonological awareness, and orthographic awareness, and RAN, to character and word recognition. To explore the basic units in Chinese reading, beginning readers must grasp the underlying skills required for reading of characters and words; for these, metalinguistic awareness and RAN are particularly crucial (e.g., Pan et al., 2011; Yang et al., 2019a, b). Therefore, we examined whether these important cognitive-linguistic skills might be associated with character reading and word reading somewhat differently, both concurrently and longitudinally. Traditional characters among Hong Kong children were specifically examined. Traditional characters contain more visual-orthographic information than do simplified characters. Thus, they might be particularly instructive in distinguishing character and word reading. Character reading was measured using a list of single characters, while word reading was measured with a list of two-character words.

Children were initially tested in the third year of kindergarten (Time 1, T1) in order to examine the concurrent associations of various cognitive-linguistic skills with word and character reading. These participants were then tracked to grade one (Time 2, T2); character reading and word reading were assessed again at this time. This age range was selected because it represents a crucial period of early literacy acquisition for Hong Kong Chinese children. To summarize, we examined concurrent associations at Time 1 in order to reveal the simple relationships among the variables, and we examined longitudinal prediction at Time 2, for which we statistically controlled for the autoregressive effects of character/word reading to determine the unique contributions of the predictors.

Method

Participants

The present study is part of an ongoing longitudinal project in [name deleted to maintain the integrity of the review process]; it was designed to examine Chinese reading and spelling development. The original sample in the first wave included 325 [name deleted to maintain the integrity of the review process] kindergarteners. All the participants were recruited from nine local kindergartens located in three geographical regions of [name deleted to maintain the integrity of the review process]. They were typically developing Cantonese-speaking children without any reported special education needs. In the second wave, 42 children dropped out of the study for unidentified reasons. In total, 283 [name deleted to maintain the integrity of the review process] kindergarteners (mean age = 66.24 months, SD = 5.48; 161 boys) completed all tasks in both waves. In [name deleted to maintain the integrity of the review process], formal literacy training begins in the first year of kindergarten (K1) when children are around 3.5 years old. Children are expected to acquire 150-200 characters by the end of the third year of kindergarten (K3; [name deleted to maintain the integrity of the review process] Education Department Curriculum Development Institute, 1996).

Measures

A set of individual tests, covering phonological awareness, morphological awareness, orthographic knowledge, rapid automatized naming, vocabulary knowledge, character reading, and word reading, was administered to the children by trained research assistants majoring in psychology. Testing was performed individually during school hours in a quiet room. Additionally, a nonverbal IQ test was administered in a group format. All tasks were administered in Cantonese.

Phonological awareness

This measure was adopted from a previous study that was conducted among Hong Kong children (Chung, McBride-Chang, & Wong, 2008). It consisted of two parts: syllable deletion and onset deletion. The syllable deletion section included 15 three-syllable real words and 14 three-syllable pseudo-word test items. Children were first required to repeat a three-syllable word aloud; the experimenter then asked them to delete one syllable from this word and say the remaining phrase aloud. For example, children were required to say aloud /ning4/ /mung4/ /caa4/ (檸檬茶, lemon tea) without /caa4/ (茶, tea). The correct answer is /ning4/ /mung4/. The onset deletion part consisted of 10 real and 12 pseudo one-syllable words. In each item, children were required to repeat a one-syllable word first and then the experimenter asked them to drop the first consonant and say what was left in the syllable. For example, children were asked to say aloud /po4/ (婆, old woman) without the initial sound. The correct answer would be /o4/ (哦, oh). The scores of these two parts were summed to represent a total phonological awareness score. Performances on syllable deletion and onset deletion were significantly associated with each other, r = 0.48, p < 0.001. The potential maximum score for this task was 51.
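The composite described above can be sketched as a simple sum with range checks. The item counts are taken from the task description; the function and constant names are hypothetical, and the sketch assumes one point per correct item, which is consistent with the stated maximum of 51.

```python
# Item counts from the task description (real + pseudo-word items).
SYLLABLE_ITEMS = 15 + 14   # three-syllable items in syllable deletion
ONSET_ITEMS = 10 + 12      # one-syllable items in onset deletion
MAX_TOTAL = SYLLABLE_ITEMS + ONSET_ITEMS


def phonological_total(syllable_correct, onset_correct):
    """Sum the two subtest scores after checking they are in range."""
    if not 0 <= syllable_correct <= SYLLABLE_ITEMS:
        raise ValueError("syllable deletion score out of range")
    if not 0 <= onset_correct <= ONSET_ITEMS:
        raise ValueError("onset deletion score out of range")
    return syllable_correct + onset_correct


print(MAX_TOTAL)                   # 29 + 22 = 51
print(phonological_total(20, 9))   # 29
```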

Morphological awareness

This measure was adopted from two previous studies (McBride-Chang, Shu, Zhou, Wat, & Wagner, 2003). It consisted of four practice items and 48 experimental items. In each item, a scenario was presented orally in no more than two sentences. Children were asked to construct a novel compound word from known morphemes to describe the object or concept in that scenario. For example, one story was “日頭出嚟, 我地會叫佢做日出 /yat6 ceot1/; 咁月亮出嚟, 我哋會點叫佢啊?” (When the sun rises, we call that a sunrise. What would we call it if the moon rises?). The correct answer in this example is “月出 /yuet6 ceot1/ (moonrise)”. The possible maximum score was 48, with one mark for each correct answer.

Rapid automatized naming (RAN)

Rapid automatized naming performance was measured with a rapid number naming test (Denckla & Rudel, 1976). For this test, eight rows of five numbers (e.g., 7, 4, 9, 6 and 2) were visually presented. These five numbers were displayed in various orders for each row. Children were asked to name the numbers row by row as quickly and as accurately as possible. They were required to complete two trials, and the average time in seconds was recorded.

Orthographic knowledge

A Chinese character decision test (Tong et al., 2009) was used to tap individual orthographic knowledge by measuring knowledge of the character structure and the radical position. It consisted of 10 visual symbols, 10 pseudo-characters, 20 non-characters, and 30 real characters. All items were presented in 14 sets with 5 items each in a paper booklet. Children were asked to decide whether each visually presented item was a real character or not. The correct answers for the 30 real characters were yes, and no for the other items (non-characters, pseudo-characters, and visual symbol items). The potential maximum score for this task was 70 points with one point for each correctly identified item.
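The scoring rule for this decision task can be sketched as follows: a response earns one point when the child says "yes" to a real character or "no" to any other item type. The item counts come from the description above; the type labels and function name are hypothetical.

```python
# Number of items of each type, as described for the task (70 in total).
ITEM_COUNTS = {"real": 30, "pseudo": 10, "non": 20, "symbol": 10}


def score_decisions(responses):
    """Count correct decisions.

    `responses` is an iterable of (item_type, said_yes) pairs; "yes" is
    correct only for real characters, "no" is correct for everything else.
    """
    return sum(1 for item_type, said_yes in responses
               if said_yes == (item_type == "real"))


# A child who accepts every item is credited only for the real characters.
always_yes = [(t, True) for t, n in ITEM_COUNTS.items() for _ in range(n)]
print(score_decisions(always_yes))   # 30

# A child who answers every item correctly reaches the maximum.
perfect = [(t, t == "real") for t, n in ITEM_COUNTS.items() for _ in range(n)]
print(score_decisions(perfect))      # 70
```

The first case shows why the non-character, pseudo-character, and visual symbol items matter: a child who simply accepts everything cannot score above 30 of the 70 points.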

Vocabulary knowledge

A vocabulary test tapping receptive vocabulary, expressive vocabulary, and vocabulary definitions was used to measure children’s vocabulary knowledge (Ho et al., 2017). In the receptive vocabulary subtest, there were 10 items in total. For each item, children heard a word presented orally and were asked to identify one out of four pictures that best represented the word they heard by pointing to the correct answer on the page. In the expressive vocabulary subtest, there were also 10 items. This required children to name the presented picture for each item. These two sections were conducted in sequence. One point was given for each correct answer. Each section was terminated only if and when children gave five 0-point responses consecutively. In the vocabulary definition subtest, the experimenter read aloud a word and children were required to explain the word they heard. This section included 5 test items in total. Children’s answers were rated by two trained experimenters based on rating criteria determined through pilot testing and a previous study (see McBride-Chang et al., 2008). Two points were given for the best answer that completely expressed the meaning of the word, one point was given for an answer that partially expressed the meaning of the word, and zero points were given for irrelevant answers. The scores of the three parts were summed as a proxy for a basic vocabulary knowledge score. The maximum possible score for this measure was 30.
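The discontinuation rule and the summed total can be illustrated with a short sketch. It assumes that the five-consecutive-zero rule applies to the receptive and expressive sections only, and that definition items are simply summed; the description does not state whether the rule also applied to the definition subtest. All names are hypothetical.

```python
def score_section(responses, stop_after=5):
    """Sum item scores, stopping once `stop_after` consecutive zeros occur."""
    total, zero_run = 0, 0
    for points in responses:
        total += points
        zero_run = zero_run + 1 if points == 0 else 0
        if zero_run == stop_after:
            break  # section terminated; later items are not administered
    return total


def vocabulary_total(receptive, expressive, definitions):
    """Receptive and expressive items score 0/1; definition items score 0/1/2."""
    return (score_section(receptive)
            + score_section(expressive)
            + sum(definitions))


# Toy data: the receptive section stops after the fifth consecutive zero,
# so the three trailing items are never credited.
receptive = [1, 1, 0, 0, 0, 0, 0, 1, 1, 1]
expressive = [1] * 10
definitions = [2, 1, 0, 2, 1]
print(vocabulary_total(receptive, expressive, definitions))   # 2 + 10 + 6 = 18
```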

Nonverbal IQ

Raven’s Standard Progressive Matrices were used as a standardized test to measure children’s nonverbal IQ. There were five sets with 12 items each. In this study, children were required to complete the short version, including Sets A, B, and C. For each item, children were presented with a visual matrix which has a missing part. They were asked to select the best matching piece to complete the visual matrix from among six to eight alternatives. Each correct answer was marked as worth one point. The potential maximum score of this measure was 36.

Chinese character reading

This untimed character reading test has been used in previous studies (Ho & Bryant, 1997; McBride-Chang et al., 2003; Zhou et al., 2012). This measure included 27 characters, and all items were presented in order of ascending difficulty. These characters were selected from five of the most commonly used reading texts for Hong Kong kindergarteners (Ho & Bryant, 1997). All children were asked to read the characters aloud from the beginning. One point was allotted for each correct answer, and the possible maximum score was 27.

Chinese word reading

Word reading was administered following the character reading test (Ho & Bryant, 1997). This measure included 34 two-character words, and all items were presented in order of ascending difficulty. These words were included on the basis of common words typically assessed in young children as noted by the local Education Department. Similar to our protocol for the character reading test, all children were asked to read these words aloud from the beginning without time limit. One point was given for each correct answer. The maximum possible score for children’s word reading performance was 34.

Procedure

Written consent was obtained from children’s parents before testing. Children were first tested when they were in K3 (third year of kindergarten, Time 1). Testing at kindergarten consisted of a set of individual tests covering phonological awareness, morphological awareness, orthographic knowledge, rapid automatized naming, vocabulary knowledge, character reading, and word reading. It also included a nonverbal IQ test, which was administered in a group. It took about 50 min to finish all the tests; a short break was allowed if necessary. Children’s character and word reading were tested again one year later when they were in Grade 1 (Time 2). This testing took about 5 min. The same measures were used at both Times 1 and 2 to test children’s character and word reading. At kindergarten, children were tested in their classrooms. At grade one, depending on parents’ preferences, children were tested either at their homes in a quiet place or in a laboratory setting. All the tests were administered by formally trained experimenters who were native Cantonese speakers.

Results

Table 1 shows descriptive statistics of all measures in this study. Generally, the reliabilities of all measures were moderate to high, from 0.72 to 0.98. The distributional properties of most measures were acceptable, with skewness values ranging from −1.92 to 0.90; however, word reading at kindergarten and character reading at both time points were not normally distributed (|skewness| > 1). Generalized estimating equations (GEE) were therefore used for the analyses, because GEE addresses the dependency inherent in within-subject longitudinal data and yields more efficient and unbiased regression parameter estimates than ordinary least squares regression. Moreover, the advantage of this technique is that it is not dependent on the distribution of the data; indeed, it can provide a valid inference regardless of the distribution of the data (Ballinger, 2004; Feng et al., 2014). The R package geepack (Halekoh et al., 2006), which implements the GEE approach, was employed to fit the generalized linear models to clustered data. Paired-sample t-tests showed that children’s performance on character recognition progressed from 21.67 at kindergarten to 24.77 at grade one (t(282) = 18.54, p < 0.001) and word reading progressed from 18.77 at kindergarten to 26.57 at grade one (t(282) = 10.79, p < 0.001). Pearson correlation coefficients are shown in Table 2. The correlation between character reading and word reading was 0.82 at both time 1 and time 2, demonstrating a strong overlap. Moreover, character reading and word reading across these two time points were significantly correlated with nonverbal IQ, vocabulary, RAN, phonological awareness, morphological awareness, and orthographic knowledge measured at kindergarten.
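For readers less familiar with the paired-sample test reported above, the statistic can be sketched in a few lines using only the Python standard library. The scores below are invented toy values, not data from this study, and the function name is hypothetical. The degrees of freedom equal the number of pairs minus one, which is why a sample of 283 children yields t(282).

```python
import math
from statistics import mean, stdev


def paired_t(time1, time2):
    """Return (t, df) for a paired-sample t-test on two equal-length lists."""
    if len(time1) != len(time2) or len(time1) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [later - earlier for earlier, later in zip(time1, time2)]
    n = len(diffs)
    # Standard error of the mean difference (sample SD uses n - 1).
    # Note: this divides by zero if every difference is identical.
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Toy scores for five hypothetical children at two time points.
kindergarten = [20, 22, 19, 24, 21]
grade_one = [24, 25, 21, 27, 26]
t_value, df = paired_t(kindergarten, grade_one)
print(round(t_value, 3), df)
```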

Table 1 Descriptive statistics of all variables measured at kindergarten and grade one
Table 2 Zero-order correlations among all variables measured at kindergarten and grade one

Concurrent generalized estimating equations (GEE) analyses at kindergarten

Generalized estimating equations (GEE) were fitted to investigate the respective contributions of phonological awareness, morphological awareness, orthographic knowledge and RAN to character reading and word reading at kindergarten. We additionally statistically controlled for age, nonverbal IQ, and vocabulary knowledge in all the following analyses to rule out their effects. Gender was not statistically controlled in these analyses because it did not correlate with any other variables of the present study. Results showed that vocabulary knowledge significantly explained unique variance in character reading (estimate = 0.33, 95% CI = [0.20, 0.46], p < 0.001) and in word reading (estimate = 0.33, 95% CI = [0.20, 0.47], p < 0.001), as shown in Table 3. Vocabulary knowledge explained 9.0% of the variance in character reading and 9.1% of the variance in word reading. Moreover, the final estimates for RAN were significant for both character reading (estimate = −0.18, 95% CI = [−0.26, −0.11], p < 0.001) and word reading (estimate = −0.24, 95% CI = [−0.35, −0.13], p < 0.001). Among the metalinguistic skills, both phonological awareness and morphological awareness significantly accounted for unique variance in character reading (for phonological awareness, estimate = 0.19, 95% CI = [0.02, 0.35], p = 0.029; for morphological awareness, estimate = 0.19, 95% CI = [0.11, 0.26], p < 0.001), and in word reading (for phonological awareness, estimate = 0.20, 95% CI = [0.01, 0.39], p = 0.042; for morphological awareness, estimate = 0.24, 95% CI = [0.16, 0.32], p < 0.001). Metalinguistic skills uniquely explained 14.4% of the variance in character reading and 22.7% of the variance in word reading.

Table 3 Generalized estimating equations (GEE) explaining Chinese character reading and word reading at kindergarten

In order to look more strictly at unique correlates of word and character reading given their substantial overlap, Table 4 presents generalized estimating equations (GEE) results explaining character reading with word reading statistically controlled. Age, nonverbal IQ and vocabulary were entered as control variables, and word reading was entered at Step 3. Results showed that when word reading was included in the generalized estimating equation, other variables in Step 3 did not account for unique variance in character reading. In contrast, the final estimate for word reading was significant (estimate = 0.78, 95% CI = [0.59, 0.97], p < 0.001). Thus, only word reading made a unique contribution to character reading when other variables were statistically controlled. Word reading uniquely explained 44.8% of the variance in character reading. Similarly, we examined the concurrent correlates of word reading with character reading additionally entered in the equations. Table 5 shows that when age and general cognitive variables were controlled, character reading accounted for unique variance in word reading (estimate = 0.74, 95% CI = [0.66, 0.83], p < 0.001). Character reading uniquely explained 42.8% of the variance in word reading. The final estimates for RAN (estimate = −0.13, 95% CI = [−0.21, −0.05], p = 0.002) and morphological awareness (estimate = 0.12, 95% CI = [0.07, 0.17], p < 0.001) were significant, whereas the estimate for phonological awareness was not (estimate = 0.08, 95% CI = [−0.06, 0.23], p = 0.256). Thus, RAN and morphological awareness uniquely explained variance in word reading even after statistically controlling for character reading and other variables; RAN, phonological awareness, and morphological awareness together explained 4.8% of the variance in word reading.
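As a quick consistency check on the interval estimates reported for word reading with character reading controlled, the sketch below flags which 95% confidence intervals exclude zero, the usual criterion for significance at the 0.05 level. The interval bounds are copied from the paragraph above; the function name and dictionary layout are illustrative assumptions and are not part of the study's analysis code.

```python
# 95% confidence intervals reported for the word reading model
# with character reading statistically controlled.
REPORTED_INTERVALS = {
    "RAN": (-0.21, -0.05),
    "phonological awareness": (-0.06, 0.23),
    "morphological awareness": (0.07, 0.17),
}


def excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


for predictor, (lower, upper) in REPORTED_INTERVALS.items():
    verdict = "excludes zero" if excludes_zero(lower, upper) else "includes zero"
    print(f"{predictor}: [{lower}, {upper}] {verdict}")
```

Under this check, the intervals for RAN and morphological awareness exclude zero, while the interval for phonological awareness includes it, in line with its reported p value of 0.256.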

Table 4 Generalized estimating equations (GEE) explaining character reading with word reading controlled at kindergarten
Table 5 Generalized estimating equations (GEE) explaining Chinese word reading with character reading controlled at kindergarten

Longitudinal generalized estimating equations (GEE) analyses from kindergarten to grade one

In the final analyses, we examined how word reading and character reading at grade one were predicted by one another in kindergarten, together with other cognitive-linguistic skills administered at time 1. As shown in the longitudinal generalized estimating equations (Table 6), when predicting character reading at time 2, the final estimate for time 1 character reading was significant (estimate = 0.34, 95% CI = [0.23, 0.44], p < 0.001). In contrast, word reading at time 1 failed to account for unique variance in character reading (estimate = 0.02, 95% CI = [−0.02, 0.06], p = 0.348). Moreover, neither RAN nor metalinguistic awareness significantly predicted character reading at grade one after controlling for other variables at kindergarten. Table 6 also presents the longitudinal prediction of word reading at grade one from all variables at kindergarten. The final estimates for both character reading (estimate = 0.35, 95% CI = [0.15, 0.55], p = 0.001) and word reading (estimate = 0.49, 95% CI = [0.35, 0.63], p < 0.001) at kindergarten were significant. Furthermore, in Step 4, the final estimates for RAN and the three metalinguistic awareness variables were nonsignificant. The models explained 57.0% of the variance in character reading and 69.5% of the variance in word reading. Taken together, character reading at kindergarten was predictive of grade one word reading along with prior word reading beyond other variables. In contrast, word reading at kindergarten did not significantly predict later character reading.

Table 6 Generalized estimating equations (GEE) predicting character reading and word reading at grade one from all variables measured at kindergarten

Discussion

The present study highlighted some unique features of young children’s acquisition of single Chinese character recognition as compared to 2-character word reading. Morphological awareness, phonological awareness, orthographic awareness, and RAN were all associated with both single character and word reading from kindergarten to grade one. This result was consistent with many previous Chinese literacy studies (e.g., Chen et al., 2009; Ho & Bryant, 1997; Pan et al., 2011). Furthermore, morphological awareness and RAN explained unique variance in Chinese word reading at kindergarten when character reading was controlled, while character reading had no such unique cognitive-linguistic correlates when word reading was controlled.

A further indication of differences between the acquisition of character recognition and word recognition, despite a correlation of 0.82 between these two variables at both times 1 and 2, comes from the longitudinal prediction of each. That is, character recognition in first grade was not longitudinally explained by word reading at kindergarten. In contrast, word reading in first grade was indeed explained by character recognition in kindergarten. Thus, across both the associated cognitive correlates and the reading variables, word reading was explained by a broader set of skills than was character recognition. These findings underscore the importance of considering word reading as a broader literacy skill than character recognition. Moreover, the results suggest that models of word reading in Chinese tend to operate at both the character and word levels simultaneously (McBride, 2016), unlike Indo-European reading models, which operate mainly at the word level (e.g., Rayner & Reichle, 2010). Wang and McBride (2016) demonstrated that kindergarteners scored significantly higher on reading the same character when it was embedded within a word than when it was presented alone. This phenomenon held even up through fifth grade (Li et al., 2017).

Previous studies revealed that a holistic strategy is useful for Chinese readers to recognize multi-character words. For example, one eye movement study showed that Chinese children in second grade tend to process short, two-character words as whole units (Chen et al., 2003). Moreover, other evidence from grade 3 (Li et al., 2017), grade 4 (Liu et al., 2010), and even kindergarten children (Wang & McBride, 2014) has suggested that children tend to process two-character words relatively holistically because they can use the contextual information in the word to help them to recognize unfamiliar characters within the words.

Below, we consider in more detail, separately, the cognitive-linguistic skills tested vis-à-vis both character and word recognition.

Morphological awareness, RAN, character reading, and word reading

Concurrently, morphological awareness and RAN explained unique variance in Chinese word reading at kindergarten when character reading was controlled; character reading had no such unique cognitive-linguistic correlates when word reading was controlled. Given the difficulty of reading in Chinese, young children tend to rely heavily on morphological processing, as well as individual character recognition, in order to derive the final meaning of a given word (Chung et al., 2010; Liu, Chung, Zhang, & Lu, 2014). Considering the compounding characteristics of Chinese words, word reading may require more lexical compounding knowledge than character reading, and depend more on morphological awareness. For example, a child recognizing “籃球” might do so because he/she simply recognized the relatively frequently used character “球” rather than “籃”, but then guessed the whole word. An example of a parallel phenomenon in English might be a child recognizing “basketball” partly because he/she recognizes the word “ball” (but finds “basket” too long and unclear) and then guessing that the rest of the word might be “basket” given what he/she knows about compound words. Therefore, reading Chinese at the word level involves somewhat different underlying skills as compared to reading Chinese at the character level, and morphological awareness might particularly explain the difference between character reading and word reading.

Interestingly, the current study also found that RAN contributed to children’s word reading when character reading was controlled for. This is similar to the finding of Liu et al. (2016) that RAN at primary school accounted for unique variance in word reading concurrently. Notably, RAN has explained unique variance in children’s Chinese word reading skills above and beyond metalinguistic skills, including phonological awareness, morphological awareness, and orthographic knowledge (e.g., Liao et al., 2008; Liu & Zhu, 2016), and beyond general factors such as age and nonverbal IQ (Liao et al., 2008; Wang & McBride, 2016; Yang & McBride, 2020). Whereas previous research has conceptualized RAN as a combination of phonological sensitivity (Liao et al., 2008; Pan & Shu, 2014; Wagner & Torgesen, 1987) and orthographic skills (Liao et al., 2008; Wolf & Bowers, 1999), our results further show that RAN has explanatory power independent of phonological awareness and orthographic knowledge. Therefore, unlike phonological awareness and orthographic knowledge, constructs that are largely influenced by task characteristics and children’s developmental status, RAN taps into a language-universal cognitive mechanism in relation to word reading (Landerl et al., 2019; Pan et al., 2011; Ziegler et al., 2010). In contrast, RAN failed to explain unique variance in character reading when word reading was statistically controlled at kindergarten. This might be because RAN explained much less variance in character reading than in word reading. Although RAN was associated with both character reading and word reading, it seemed to be slightly more strongly associated with word reading than with character reading in Hong Kong young children at both time points. The mechanism underlying the RAN-word reading relationship might be that rapid naming integrates the visual and verbal skills required during word recognition and further reflects the fluency of simultaneous processing of multiple verbal stimuli (Kirby et al., 2010; Song et al., 2016; Wei et al., 2015).

Phonological awareness, orthographic knowledge, character reading, and word reading

Our phonological awareness test made use of syllable and onset knowledge. The correlational data showed a moderate association between this task and both character and word reading, but no unique association with either reading skill once the other (word or character recognition) was statistically controlled, which contrasts with findings for alphabetic languages (Ehri et al., 2001). The relatively simple phonological structure of the Chinese language, compared to alphabetic languages, might limit the importance of phonological awareness in the reading of Chinese characters and words among Hong Kong primary school children (McBride-Chang et al., 2005; Pan et al., 2021). Nevertheless, there might be other explanations, since phonological awareness was significant before the inclusion of word reading in the model. The reduction in the effect of phonological awareness on character reading could be due to the overlap in the variance explained by word reading and phonological awareness.

In addition, orthographic knowledge was not a significant predictor of either character reading or word reading in any of the generalized estimating equations. This finding is inconsistent with a previous study in which orthographic knowledge uniquely explained character reading and word reading of Hong Kong children from grade one to grade three (Pan et al., 2021). This might have been partly because our participants were kindergarteners, and children need time to learn to recognize the patterns of radical types and positions in characters. Some studies suggest that orthographic knowledge typically develops during the primary school period (Ho, Ng, & Ng, 2003; Tong & McBride-Chang, 2010), and children at around 6 years old can only distinguish nonwords and pseudowords to some extent (Chan & Nunes, 1998). Previous evidence has also shown that the importance of orthographic skills for character recognition might become stronger as children’s experience and understanding of both regularities and irregularities in characters increase (Ho, Ng, et al., 2003; Ho, Yau, et al., 2003). Our inconsistent finding might also be due to the fact that, beyond the measure we employed, there are many other measures of orthographic knowledge, including awareness of the compositional structure of compound characters (Ho, Ng, & Ng, 2003; Ho, Yau, et al., 2003), discrimination of incorrect radical forms within a character (Qian et al., 2015), and understanding of positional and functional radicals (Qian et al., 2015). Future work comparing characters and words in literacy acquisition in Chinese children should continue to include more measures of orthographic knowledge to determine the extent to which our findings were age- and/or culture- or script-specific.

There were some limitations in the present study. First, we only assessed children at the early stages of reading, namely, in kindergarten and first grade. Since cognitive-linguistic skills might differentially contribute to reading outcomes across grades (e.g., Tong et al., 2011), future studies are needed to compare the predictive effects of these cognitive skills at different ages and grades. Second, although we tried to consider both kindergarten and primary school children when selecting and utilizing the reading measures, the scores of some of the older participants on the reading measures were skewed at grade one. We therefore analyzed the present data using generalized estimating equations, which have been shown to be valid for skewed data and do not depend on the distribution of the data (Feng et al., 2014). Finally, in future work, words comprising three or more characters, although not frequently used, might be included and compared with single-character words in order to provide a broader picture of whether and how single-character words, two-character words, and multiple-character words are processed.

Conclusions and implications

In conclusion, the current study showed that metalinguistic skills were somewhat differently associated with character reading and word reading, at least in their salience for reading. Specifically, when character reading was statistically controlled in kindergarten, morphological awareness and RAN remained unique correlates of word reading. In contrast, when word reading was controlled, neither metalinguistic skills nor RAN (nor other skills) significantly contributed to character reading. Moreover, longitudinally, we found that character reading predicted word reading but word reading did not predict character reading. Theoretically, these findings extend previous research by demonstrating the importance of morphological awareness, as well as RAN, in explaining the differences between character reading and word reading among Hong Kong Chinese young children. In contrast with previous models of Chinese word reading that have often used the Chinese character as the primary unit of Chinese literacy (e.g., Xing et al., 2004; Yang et al., 2009), our results suggest that this approach to understanding the Chinese word reading process may have been an oversimplification. Practically, this study suggests that teachers of Chinese to both native and foreign language learners should focus on both single characters and 2-character words in order to optimize early Chinese literacy learning. This implies a focus on visual-orthographic knowledge capturing the individual features of each Chinese character and also an emphasis on how characters connect to create new words.