7.1 Introduction

Hong Kong being a Special Administrative Region of China, there is a natural expectation for younger generations of Hongkongers to be conversant in Putonghua, the national lingua franca, when communicating with Chinese Mainlanders. Accordingly, Putonghua has a special place in the postcolonial language-in-education policy of biliteracy and trilingualism:Footnote 1 in writing, being able to read and write Chinese and English, and, in speech, to interact with others in Putonghua, in addition to Cantonese and English (cf. Wang and Kirkpatrick 2015). It was against this background that various options for including Putonghua in the local curriculum were explored before the handover. For instance, three alternative models of teaching Chinese in Putonghua (TCP)Footnote 2 curriculum design were considered (see Ho et al. 2005, pp. 68–88):

  (a) TCP without Putonghua being taught as a separate subject;

  (b) TCP with Putonghua being taught as a separate subject; and

  (c) TCP with Putonghua being taught as a separate subject, Putonghua elements (esp. pronunciation features) being infused into the TCP curriculum.

Since 1998, Putonghua has been a compulsory core subject in primary school and an elective subject in secondary school. From 2000, Putonghua was offered as an optional subject in the Hong Kong Certificate of Education Examination (HKCEE), which was abolished and replaced with the Hong Kong Diploma of Secondary Education (HKDSE) in 2012/13. Apart from teaching Putonghua as a subject (typically up to two hours per week, Chau 2004, p. 132), another move was the piloting of using Putonghua to teach the Chinese Language subject at primary level. Before this move, the Chinese Language subject, like all other subjects (except English) in most primary schools and Chinese-medium secondary schools (including Chinese in English-medium schools), had always been taught in Cantonese. When first introduced in the first few years of the new millennium, the government-funded TCP initiative was taken up by only a small number of primary schools and an even smaller number of secondary schools. Limited curriculum space has been one major challenge. As the primary curriculum is already quite packed, it is not obvious how Putonghua could be conveniently incorporated without disrupting the teaching and key learning outcomes of other subjects. For Cantonese-L1 students, Putonghua medium of instruction (PMI) for learning content subjects is clearly not an option. This is why in most of the primary schools Putonghua was taught as a subject for two or three 35–40-minute lessons per week, with or without Putonghua being used as the MoI for teaching the Chinese Language subject. By mid-2016, according to media reports, about 70% of the 400+ primary schools had experimented with teaching Chinese in Putonghua in one way or another (i-Cable report 2016; Sing Tao Daily 2016).

In the two decades since the 1990s, various issues related to the teaching of Putonghua, including TCP, have received greater attention in Hong Kong and generated a sizable body of research, including small-scale studies on the effectiveness and assessment of Putonghua teaching, often explorative in nature. Much of this body of research appears in specialized monographs written in Chinese, some of which carry a clear focus on the teaching and learning of Putonghua. A wide range of topics is covered: from a collection of articles by experienced teachers and researchers on various pedagogical issues in the teaching of Putonghua (e.g., Education Department 1997; Tian 1997) to more theoretical deliberations (e.g., Ho et al. 2005; Kwok 2005; Lai 2010), and from issues more specifically related to curriculum design and teaching methods (e.g., Tong et al. 2000, 2006) to one local secondary school’s sharing of TCP experienceFootnote 3 (Cho 2005; cf. Cho and Kwo 2005). In anticipation of wider interest among teachers and educationists, and of commonly heard queries regarding the feasibility and methods of TCP, Ho (2002a) adopts a trouble-shooting style by structuring the book in the form of experts’ responses to a list of frequently asked questions. The quality and level of Chinese teachers’ Putonghua pronunciation is evidently a matter of concern to the Education Department (1997), which is probably why several articles in that monograph are devoted to the teaching of Putonghua using pinyin, focusing on Cantonese speakers’ Putonghua pronunciation problems and how teachers may cope with them (Ching 1997a, b; Hui 1997; Wu 1997). Below is an overview of some of the recurrent topics and views expressed.

Ho (1999), whose detailed analysis of Putonghua pronunciation errors was reviewed in Chap. 3, is to my knowledge the most comprehensive Cantonese-Putonghua contrastive study to date (see also Chan and Zhu 2010, 2015; Ho 2002b, 2005; Lee-Wong 2013; Ng 2001; Tsang 1991, 2002, 2003, 2014; P.-K. Wong 1997).Footnote 4 In 23 chapters, Si et al. (1997) review the current and future status of Putonghua teaching and learning in Hong Kong, and give a comprehensive coverage and discussion of relevant theories and practices in five sections: Overview, curriculum design, teaching methods, compilation of teaching materials, and teacher training.

Yiu (2010) discusses the status of Putonghua as L1 or L2 and its implications for TCP teacher training (cf. Yiu 2013). Yu (2012) compares one teacher’s teaching of separate classes in PMI (Putonghua medium of instruction) and CMI (Cantonese medium of instruction), and points to the urgency of TCP teacher training (cf. S.-M. Tse 2012; Yu 2013). Leung and Fan (2010) draw attention to common pedagogic problems in TCP classes. For instance, reading aloud is by far the most popular teaching strategy, partly because many teachers have no confidence elucidating meanings clearly in fluent Putonghua,Footnote 5 and so they tend to use reading aloud as a strategy to help students appreciate the meaning of the text.Footnote 6 This led Leung and Fan to appeal for using conversation-enriched texts to drive students’ Putonghua practice. Y.-N. Wong (2012) evaluates the impact of Putonghua textbooks on students’ learning outcomes. Lau (2012) discusses important pedagogic principles in the assessment of different types of Putonghua listening competence, while Kau and Lee (2012) underscore the usefulness of various task-based learning activities (e.g., information gap, jigsaw activities, task-completion, information-gathering, opinion-sharing) in facilitating the scaffolding of students’ classroom interaction in Putonghua.

In terms of learning effectiveness, Huang and Yang’s (2000) quasi-experimental study is particularly instructive. They compared two groups of Cantonese-L1 Primary 1 pupils (age 6) learning Putonghua from scratch over a 10-month period: one group under school-based immersion conditions (n = 13), the other learning it as a subject in two 35-minute lessons per week in a regular Cantonese-medium school (n = 33). The Putonghua-immersion group followed their normal curriculum, while special curriculum materials were designed for the separate-subject group. Apart from class observation, audio-recorded reading-aloud data were also collected from the pupils for analysis. The findings showed that by the eighth month, the immersion group gradually reached a spontaneous-use stage in Putonghua after going through a silent stage (two months), a ‘Cantonese-Putonghua mixing’ stage (two months), and a semi-spontaneous-use stage (three months). As for the separate-subject group, while their level of attainment was clearly not as high, their gain or achievement in Putonghua was also quite remarkable. This was attributed to two main design features of the specially prepared teaching materials, namely (a) the recycling of keywords already introduced in two subjects, Chinese Language and Arithmetic; and (b) the use of interesting short, rhyming texts intended to be memorized in preparation for reading-aloud classroom practice (individual or group) or performing in front of the class. As a teaching strategy, the conscious use of competition was especially productive and welcomed by the pupils. Marked progress took place over two five-month stages: an ‘initial contact with Putonghua’ stage, followed by a ‘Putonghua beginner’ stage. In terms of the types of learning difficulties as reflected in the two groups’ non-standard Putonghua features at both the segmental and suprasegmental levels, both the immersion group and the separate-subject group appeared to be going through very similar interlanguage processes. These encouraging findings led Huang and Yang (2000) to conclude that, provided interesting, pedagogically sound and interactive teaching materials are in place, Cantonese-L1 schoolchildren at P1 level can achieve a lot in Putonghua. For one thing, the ‘language across the curriculum’ principle helps reinforce the learning of content subjects while minimizing vocabulary problems (cf. ‘mental lexicon’, S.-K. Tse 2001, 2014; S.-K. Tse et al. 2007; Lee et al. 2011). As for rote-learning, rather than being something to avoid, Huang and Yang (2000) demonstrate that the use of short rhyming texts intended to be chanted out loud or performed (e.g., in a class competition) can be a pedagogically productive teaching and learning method.

Similar empirical studies have also been conducted with a view to identifying factors that impact on Putonghua teaching and learning effectiveness. According to classroom-based TCP data collected at 20 participating schools (11 primary, 9 secondary) in 2004, six factors were identified as having an impact on the learning outcomes of TCP (SCOLAR 2008). They are listed in descending order of relative significance as follows (for an informative discussion and review, see S.-F. Tang 2008; cf. Chau 2004):

  (a) qualified teachers (師資)

  (b) school management’s attitudes and strategies (學校管理層的態度及策略)

  (c) language environment (語言環境)

  (d) students’ aptitude and learning ability (學生的學習能力)

  (e) curriculum, pedagogy and teaching materials (課程、教學及教材安排)

  (f) support for teaching and learning (教與學的支援)

In general, research shows that while there was some indication of improvement in students’ Putonghua, there was little evidence of improvement in students’ Chinese-language learning outcomes – the main objective of the Chinese Language subject (SCOLAR 2008; cf. Tong et al. 2000, 2006, p. 343). Quite the contrary, in a few news stories on the teaching effectiveness of TCP classes, it was reported that students’ performance in the Chinese Language subject had actually deteriorated (S.-F. Tang 2008, p. 2). Among the main pedagogical problems identified were:

  (a) TCP teachers’ Putonghua was non-standard

  (b) Too much time was spent teaching Putonghua and offering corrective feedback to students’ pronunciation

  (c) TCP teachers’ neglect of students’ learning outcomes in Putonghua

  (d) The quality of teaching was compromised as many TCP teachers did not have confidence using teaching strategies that they would normally use when teaching Chinese in Cantonese

  (e) There was no evidence of improvement in students’ Chinese-language output, e.g., use of grade-relevant vocabulary and the quality of their prose in creative writing

The brief review of the TCP-focused literature above suggests that much more basic research is needed, with regard both to TCP curriculum design at the policy level and to the provision of logistical support at the level of implementation.

In Chap. 3, we saw that linguistically, the learning of SWC and Putonghua by Cantonese-L1 learners is riddled with plenty of cross-linguistic and literacy-related challenges. At the same time, our discussion in Chap. 6 suggests that sociolinguistically, for mainly identity-related reasons, natural exposure to and opportunities for using Putonghua spontaneously for intra-ethnic communication are hard to come by. Coupled with the perennial problem of a lack of professionally trained teachers who are confident and proficient in teaching Chinese in Putonghua, our students’ poor Putonghua learning outcomes – as shown in the majority of experimental TCP studies – are hardly surprising. To counteract the linguistic hurdles and unfavorable sociolinguistic learning conditions, our best bet would seem to be a re-examination of the timing of Putonghua input as well as its curriculum design. For the requisite evidence and support, we will review a number of empirical studies: (i) psycholinguistic research in reading and literacy development in Chinese and/or English (as L1 or L2), and (ii) neuroscience research in the acquisition of one or more languages in early life, with a view to elucidating facilitative factors that are likely to be conducive to students’ Putonghua development. Then, based on insights extrapolated from these two research areas, we will draw policy implications by recommending a number of changes in the curricular arrangements, in the hope that the teaching and learning of Putonghua in Hong Kong could take place more effectively and productively.

7.2 Psycholinguistic Research in Reading and Literacy Development in L1 and L2

There is no shortage of empirical, especially experimental, studies of how reading and literacy in Chinese develop vis-à-vis other languages such as English. In their 9-month longitudinal study of the phonological processing skills and early reading abilities of Hong Kong Chinese kindergarteners (mean age 4.88 years; range 3.80–6.20 years) learning to read English as a second language, Chow et al. (2005) found that:

phonological awareness is not only important for learning alphabetical languages but also for Chinese reading acquisition (…), representing the ability to manipulate sound units and mapping sound units to written symbols, seems to be an essential element of reading across orthographies. Using phonological elements to process written languages may be a universal process of reading development no matter how limited the presentation of phonological cues are in written form. (Chow et al. 2005, p. 85)

Of greater interest are Chow et al.’s (2005) two further closely related findings. The first one concerns the bi-directional relationship between phonological awareness and Chinese reading, which is consonant with earlier empirical findings regarding a similarly reciprocal, mutually supportive role of phonemic awareness and learning to read not only in English (cf. Perfetti 1985; Perfetti et al. 1982, Perfetti et al. 1987), but also in Chinese (e.g., Hu and Catts 1998):

the development of phonological awareness and Chinese reading abilities proceeds hand in hand. Thus, phonological awareness skills aid in reading acquisition in Chinese and they are also the by-products of learning to read at the same time. (...) In Chinese, the basic phonological unit is the syllable. Every character represents a single syllable. Thus, for beginning readers, experience with print may sensitize children to syllable-level units, just as learning to read English sensitizes children to phoneme-level units. (Chow et al. 2005, p. 85)

A second finding in Chow et al.’s (2005) study involving Cantonese-L1 kindergarteners points to ‘phonological transfer’ between written Chinese and English, in that

phonological awareness in Chinese [here Cantonese] can aid concurrent and subsequent English language acquisition. (...) This finding highlights the importance of certain phonological processing skills in Chinese for learning to decode English. (...) Phonological transfer is not restricted to languages with similar structures. Phonological processing skills in a nonalphabetic language can aid in the acquisition of an alphabetic language, and it appears that some phonological processing skills are intrinsic to children’s language acquisition across orthographies. (Chow et al. 2005, pp. 85–86; cf. Perfetti et al. 1992)

In Chow et al.’s (2005) study, the participating kindergarteners did not receive any explicit training in phonological coding, such as activities guiding them to manipulate sound segments in English through the teaching of phonics, or the segmentation of Cantonese syllables through a romanization system like JyutPing (粵拼, Tang et al. 2002). Does explicit training in phonological coding, such as the teaching of pinyin, have any impact on young learners’ Chinese literacy development, for example, character recognition and reading performance in general? This was one of the research questions in Shu et al.’s (2008) study.

Previous research has shown a strong correlation between syllable awareness and literacy development such as character recognition in Chinese among early readers (e.g., Chow et al. 2005; McBride-Chang and Ho 2000, 2005; cf. McBride 2016). In addition to syllable awareness, phonemic awareness (the onset, coda of a syllable) also helps explain variance in Korean students’ reading performance in Hangul (McBride-Chang and Kail 2002; cf. Cho and McBride-Chang 2005). On the basis of empirical evidence to date, Shu et al. (2008) hypothesized that two aspects of phonological awareness – syllable awareness and rhyme awareness – are developmentally influenced by age changes and experience with language through exposure and use. They further hypothesized that formal literacy instruction, that is, teaching children explicitly how Putonghua speech sounds at the phonemic level are coded in pinyin, would enhance their phonological awareness, including tone awareness, which in turn would impact positively on their literacy development. With these premises and hypotheses in mind, Shu et al. (2008) investigated the development and interrelations of four aspects of phonological sensitivity among 3- to 6-year-old children. They administered a series of psycholinguistic experiments – syllable deletion, rime detection, onset detection, and tone detection – to a total of 146 children in Beijing. Their grade levels, age ranges and gender distribution are listed in Table 7.1.

Table 7.1 Participants’ grade level, age range and gender distribution in Shu et al.’s (2008) ‘Study 1’ and ‘Study 2’

Shu et al.’s (2008) hypotheses were largely confirmed in both Study 1 and Study 2 reported in the same paper. In Study 1, the focus was on the development of four levels of phonological awareness and how it relates to age and pinyin instruction. The results indicated that, whereas syllable and rhyme awareness gradually became more mature with age developmentally, phonological coding instruction and training in pinyin appeared to boost children’s phonemic awareness (onset) and tonal awareness dramatically. More specifically, K1–K3 (aged 3–5) pupils’ awareness of phoneme onset and tone showed little variation (i.e., comparable “chance-level success”). By contrast, the first-graders, who had received formal training in pinyin, demonstrated much greater sensitivity to onsets and rimes of Chinese morpho-syllables, and their accuracy in phoneme onset and tone (both over 70% accurate) exceeded that of K1–K3 pupils by a wide margin. According to Shu et al. (2008, p. 173), this is probably because learning pinyin helps “make implicitly learned lexical tones explicit and, thus, highlight the salience of tone for young children”, which is especially useful when children are confronted with homophones.
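The four task types used in these experiments (syllable deletion, rime detection, onset detection and tone detection) each target a different unit of the Putonghua syllable. As a rough illustration only, the Python sketch below splits tone-numbered pinyin syllables into onset, rime and tone and then compares them. It is not a reconstruction of Shu et al.’s (2008) test materials: the function names, the simplified onset inventory (which treats y and w as onsets) and the example items are all assumptions made for illustration.

```python
# Illustrative sketch only; names and the onset inventory are assumptions,
# not taken from Shu et al. (2008).

# Two-letter onsets are listed first so that "zh" is matched before "z".
# Treating "y" and "w" as onsets is a simplification of pinyin spelling rules.
ONSETS = ["zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
          "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w"]


def split_syllable(syllable):
    """Split a tone-numbered pinyin syllable, e.g. 'zhu2', into (onset, rime, tone)."""
    if syllable[-1].isdigit():
        body, tone = syllable[:-1], int(syllable[-1])
    else:
        body, tone = syllable, 0  # 0 stands for an unmarked (neutral) tone
    for onset in ONSETS:
        if body.startswith(onset):
            return onset, body[len(onset):], tone
    return "", body, tone  # syllable with no initial consonant, e.g. 'an1'


def same_onset(a, b):
    """Onset detection: do two syllables begin with the same initial?"""
    return split_syllable(a)[0] == split_syllable(b)[0]


def same_rime(a, b):
    """Rime detection: do two syllables share the same final?"""
    return split_syllable(a)[1] == split_syllable(b)[1]


def same_tone(a, b):
    """Tone detection: do two syllables carry the same lexical tone?"""
    return split_syllable(a)[2] == split_syllable(b)[2]


def delete_syllable(word, index):
    """Syllable deletion: return the word (a list of syllables) without one syllable."""
    return word[:index] + word[index + 1:]


# Hypothetical items: the idiom 胸有成竹 written as tone-numbered syllables.
idiom = ["xiong1", "you3", "cheng2", "zhu2"]
print(split_syllable("zhu2"))        # ('zh', 'u', 2)
print(same_tone("cheng2", "zhu2"))   # True
print(same_rime("ma1", "ba4"))       # True
print(same_onset("ma1", "ba4"))      # False
print(delete_syllable(idiom, 1))     # ['xiong1', 'cheng2', 'zhu2']
```

The point of the decomposition is that two syllables can match on one unit and differ on the others (ma1 and ba4 share a rime but neither onset nor tone), which is why the four tasks can tap separable aspects of phonological awareness.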

In Study 2, Shu et al. (2008) examined whether different levels of phonological awareness may help account for variance in (mono- and bi-syllabic) Chinese word recognition among children with no prior reading instruction. Shu et al. (2008) administered six tests to 202 K1–K3 pupils in Beijing: syllable deletion (16 items, half real, half nonsense words), rime detection, tone detection, rapid naming, vocabulary, and Chinese character recognition. The results showed that “both tone detection and syllable deletion skills independently explained variance in early Chinese character recognition” (Shu et al. 2008, p. 178).

Drawing implications from both Study 1 and Study 2, Shu et al. (2008, p. 171) conclude that their findings “underscore the unique importance of both tone and syllable for early character acquisition in Chinese children”. This is consonant with earlier findings. For instance, in Huang and Hanley’s (1994) comparative study of Hong Kong and Taiwanese students’ ability to delete phonemes from Chinese syllables, Taiwanese children who had received instruction in Zhuyin fuhao,Footnote 7 the phonological coding system in Taiwan, significantly outperformed their Hong Kong counterparts who had not received any phonological instruction and training (cf. Huang and Hanley 1997). There is thus strong evidence that “children who receive reading instruction that makes phoneme awareness explicit typically learn to identify phonemes earlier than do those who do not” (McBride-Chang et al. 2003, p. 746; cf. McBride 2016). Hence, apart from phonemic and tonal awareness being a natural developmental, maturational outcome, as evidenced in Ciocca and Lui’s (2003) study involving Cantonese-L1 children, formal instruction and training in a phonological coding system like pinyin or Zhuyin fuhao has been shown to have good potential for enhancing preschoolers’ sensitivity to the onsets, rimes, and tones of Chinese characters.

In a separate study on cross-language and writing system transfer in students’ Chinese-English biliteracy acquisition, Wang et al. (2005, p. 72) predicted that “sensitivity in English and in Chinese to onset and rime, common linguistic units in both languages, will be correlated” and that “pinyin reading skills will correlate with English word reading, since the two systems share the alphabetic principle”. The subjects were 46 weekend Chinese school students in Washington, D.C. with the mean age of 8 years and 2 months (Grade 2 or 3). Both of these predictions were borne out in their findings. More specifically:

The finding that Pinyin naming skill was highly correlated with English phoneme deletion and pseudoword naming suggests that reading skills in two alphabetic systems are related. It is interesting that when children are learning Chinese characters and Pinyin simultaneously, the Pinyin naming and English reading skills facilitate each other, but the Chinese character naming and English reading skills do not. It is interpretable given the sharp distinction between the two writing systems. (Wang et al. 2005, p. 83)

These empirical findings suggest that knowledge of pinyin not only facilitates the learning of Putonghua but is also conducive to developing reading skills in English.

The relative ease with which preschoolers aged 4–6 are able to develop a certain level of phonological awareness in Chinese and English to facilitate literacy development – word/character reading and recognition – as found in psycholinguistic experiments discussed above, is in sharp contrast with the difficulties encountered by many of our TCP teachers, who often feel frustrated and exhausted attending to their students’ Putonghua pronunciation (e.g., Leung and Fan 2010). On the other hand, research in the psycholinguistics of emergent reading in Chinese and English suggests that those Primary 1 students who have already developed a certain level of sensitivity to Putonghua tend to perform better in reading, probably because deeper knowledge of Putonghua and character recognition allow them to better concentrate on Chinese literacy-focused activities.

The empirical insights discussed above suggest that, with regard to the goal of sharpening young learners’ sensitivity to Putonghua, the age range 4–6, corresponding to K1–K3, seems to be the optimal biological stage, in that exposure to Putonghua at this stage is acquisitionally more fruitful and productive than exposure delayed till early primary. Compared with the current policy and practice, this would mean bringing the onset time of Putonghua in the curriculum forward by two to three years, from P1 to K1.Footnote 8 Of course, certain conditions must be met if this policy is to be implemented Hong Kong-wide: the kindergarten teachers must have attained the required standards in Putonghua (ideally PSC level 2A or above) and be thoroughly trained in teaching Chinese in Putonghua. In terms of the percentage of Putonghua in the kindergarten curriculum, it may be anywhere between one-third and half of the curriculum space. To the extent that young children aged 4–6 have the ability to distinguish between discrete languages, translanguaging between Cantonese and Putonghua (or even English) should not present any major problem, acquisitionally or otherwise (cf. Huang and Yang 2000).

While the putative benefits of earlier exposure in terms of relative acquisitional ease of Putonghua have yet to be tested out, awaiting confirmation in rigorous empirical research, anecdotal evidence suggests that earlier exposure to Putonghua tends to yield positive results. In a documentary on TCP (ATV Home 2014), a primary school principal who adopted a whole-school approach to TCP shared the key findings of a 5-year longitudinal study, in which the same teacher taught two Primary 2 classes, one in Cantonese, the other in Putonghua. The results showed that about 33% of all TCP pupils, including the weakest ones, had made progress in the Chinese Language subject. According to that principal, the schoolchildren’s success could be attributed to their deeper knowledge of Putonghua. In a separate interview, the teacher of Chinese involved in that study observed a general tendency for TCP students to be more adept and resourceful in using four-character or four-syllable idiomsFootnote 9 derived from historical allusions such as 胸有成竹 Footnote 10 and 成竹在胸,Footnote 11 both meaning ‘confident’ or ‘have a well-thought-out plan’. By contrast, those students in Cantonese-medium classes would tend to render that meaning using the SWC or Cantonese equivalent 有把握 (‘confident’).Footnote 12 Further anecdotal evidence may be found in Susane Wong, a trilingual student who attained outstanding HKDSE performance in Chinese, English and Spanish in 2014, and who started learning Putonghua in kindergarten. What is particularly noteworthy in her case is that she “grew up to be a voracious reader”, relishing, at age 11, a martial arts novel like ‘The Legend of the Condor Heroes’Footnote 13 (918,093 characters) written by the celebrated ‘swordplay’Footnote 14 novelist Jin Yong (Chik Wiseman 2014).Footnote 15 Anecdotal though these exemplary cases of Chinese literacy acquisition may be, there seems to be a missing link that merits closer scrutiny through careful research: to what extent does progress in Putonghua learning facilitate Chinese literacy-focused activities such as leisure reading and free, creative writing?

7.3 Critical Period and Neurobiological Window of Language Acquisition: Insights from Neuroscience Research

As is well known, language is a species-specific faculty that sets humans apart from other animals. Except in extreme circumstances such as the deprivation of contact with the social world, no known infants or young children have failed to master a language, regardless of skin color, ethnicity, level of IQ or socioeconomic status. In all societies, big or small, with rare exceptions all children ‘pick up’ one or more languages of the locality effortlessly as they grow up, so long as the patterns of language learning and use approximate those of first-language acquisition. Consider, for example, the large number of French-German bilinguals in the border regions between France, Germany and Switzerland, often in addition to the local vernacular such as Swiss German in Switzerland and Alsatian, a German dialect spoken in Alsace, France. In an increasingly globalized world characterized by ease of mobility and massive people movement, simultaneous acquisition of two or more first languages is no longer rare, the only constraint being regular exposure to input of the language(s) in question. Where a target language is learned and used not as a first language (L1), but as a second (L2) or foreign language (FL), however, there is a limit as to how successfully that language is acquired. There is ample empirical evidence showing that, regardless of languages and cultures, adults tend to fare worse in the learning of an additional language compared with teenagers, while teenagers are no match for children in terms of the extent to which the target additional language is mastered up to a native-like level of competence, even though teenagers may perform better than young children at initial stages, for example, in the learning of morphology and syntax (Snow and Hoefnagel-Höhle 1978, p. 1115). Language being a classic example of a ‘critical’ or ‘sensitive’ period in neurobiology (Kuhl 2010, p. 716), the onset age of learning is thus a fairly robust factor that predicts the ultimate level of language learning attainment under normal language learning conditions. That this is the case may be gauged by the title of the monograph, The scientist in the crib: What early learning tells us about the mind (Gopnik et al. 2000). Such a research insight is not lost on laypeople. In Hong Kong, many – parents in particular – are convinced that ‘earlier is better’ when it comes to their children’s learning of a prestige language such as English, and their action (e.g., choice of kindergarten and school for their children) is often guided by a widely shared Chinese adage:

不要讓小孩輸在起跑線上 Footnote 16

‘Don't let the child(ren) lose at the starting line.’

This is why English-medium (pre)schools are so popular for those parents who can afford it. But how far backwards, on the age scale, can onset age be stretched as an advantage that predicts language learning success? In other words, if children tend to outperform teenagers and adults in language learning, do they fare any better compared with their even younger peers, infants or even newborns? According to insights adduced from cutting-edge neuroscience research in the last two decades, the answer is a resounding ‘yes’, albeit with a caveat: newborns are indeed expert language learners, but with maturation setting in from childhood to later biological stages in life, such an advantage is progressively lost. This phenomenon, generally referred to as the ‘critical period’, has been rigorously researched and hotly debated since the 1960s.

Compared with infants and young children, adults may be cognitively more developed and mature, but their performance in learning the pronunciation patterns, morphology and syntax, and the finite set of grammatical rules of an additional language tends to be disappointing compared with younger learners learning that same language as their L1. None of these pose any difficulty to infants and young children, so long as the target language in question is learned under L1 learning conditions. For decades, scholars in several neighboring disciplines, notably psychology, psycholinguistics, neuroscience and brain science, have tried to explain why infants the world over are gifted with “incredible abilities to learn once exposed to natural language” (Kuhl 2010, p. 713), an amazing feat that no known computers have been able to replicate, however powerful they may be.

The puzzle surrounding the critical period has preoccupied many psychologists and psycholinguists from the 1950s – barely four decades after modern linguistics, the scientific study of language, was founded and recognized as a new academic discipline since the publication of Ferdinand de Saussure’s influential work Cours de linguistique générale (‘Course in General Linguistics’) in 1916. Various theories have been advanced by scholars from different persuasions and disciplines to explain the relative ease in L1 acquisition by young children regardless of the typological status of their first language(s), ethnicity, intelligence quotient (IQ), or socioeconomic background. An early attempt was made in the 1950s by B. F. Skinner (1957), a Harvard psychologist, who postulated that language was not unlike other forms of human behavior. Behaviorists believe that learning by humans or non-humans alike results from association. For instance, after being presented with food and the sounding of a bell several times, a dog would salivate in response to the sounding of a bell (conditioned stimulus) without any food being presented (unconditioned stimulus). Such a process is known as ‘classical conditioning’. Language learning, Skinner argued, is not unlike other forms of human behavior in that it develops along the principle of ‘operant conditioning’: those behaviors that receive positive reinforcement will be imitated and gradually become an automatized response to the stimulus, while those that meet with negative reinforcement will be withdrawn over time. In the 1950s, such a view to language learning was highly influential in second or foreign language teaching methodologies known as audio-lingualism. Accordingly, language teachers were advised to help learners approximate target language norms through imitation, repetition and drilling.

The behaviorist view of language learning was challenged by Noam Chomsky (1959), who argued that language output by humans is first and foremost creative, in that no amount of imitation or drilling could explain, for example, an English speaker’s ability to produce a semantically nonsensical but grammatically well-formed sentence like ‘colorless green ideas sleep furiously’. If humans are able to utter grammatically well-formed sentences (in any language) that they have never heard or seen before, attributing such a universal ability to stimulus–response or imitation is hardly convincing. Underlying this grammatical competence is a finite set of grammatical rules that allow for the generation of any and all sentences that conform to the grammatical norms of the language in question (here, English, e.g., subject-verb agreement; the fronting of wh- words in wh- questions like ‘Who are you?’). What is particularly amazing is that all children appear to acquire a high level of grammatical competence in their first language(s) effortlessly by the age of four or five in the absence of any explicit instruction. On the contrary, much of the interactional input children are exposed to is linguistically imperfect (e.g., sentences that are incomplete, often with structural anomalies such as false starts, or characterized by caretaker features like ‘motherese’). Accordingly, it is generally believed that the missing piece in the puzzle lies not so much in first-language learners’ and users’ observable behaviors as in the brain mechanisms at work when infants are engaged in language learning and use. It follows that all humans are born with some built-in ‘language acquisition device’ (LAD) which, short of access to how the LAD actually functions in the human brain, came to be known as the ‘black box’ (Chomsky 1959).

Chomsky’s ‘generativist’ account outlined above is clearly more convincing in terms of explanatory adequacy, which is why, for decades since the 1960s, the research agenda he championed toward a coherent theory of Universal Grammar (UG) has attracted many followers. The ongoing debate concerning an optimal UG model led advocates to advance highly abstract underlying principles or parameters in order that the innate linguistic structures of any and all languages could be accounted for despite overt typological differences (e.g., basic word order SVO/SOV/VSO; the obligatory presence of a grammaticalized subject in English like It’s raining as opposed to the ‘pro-drop’ feature in Chinese such as 落雨啦 (lok 22 jyu 23 laa 33) and 下雨了 (xià yǔ le), both meaning ‘it’s raining’).

One may or may not subscribe to UG as the theoretically most promising research direction for explaining young children’s innate language learning abilities. Meanwhile, thanks to exciting breakthroughs in brain science since the 1970s, there is some indication that it would not take long for the Chomskyan black box to see the light. Today, there is increasing consensus that the electronically traceable and measurable pathways in the language-active parts of the human brain (e.g., Broca’s area, Wernicke’s area), and the neural mechanisms thus identified, hold the key to the puzzle of why and how, in terms of language learning performance, cognitively more mature adults (under L2 or FL learning conditions) tend to be no match for babbling infants or toddlers (under L1 learning conditions). In this regard, Lenneberg’s (1967) ‘Critical Period Hypothesis’ (CPH) is probably the best-known explanatory model to date (cf. Penfield and Roberts 1959). Lenneberg postulates that L1 acquisition relies on neuroplasticity in the brain, which declines with age due to maturation, resulting in progressive loss of neural sensitivity to fine nuances at all linguistic levels. Lenneberg further postulates that the loss of neuroplasticity and the resultant cerebral lateralization generally culminate at puberty (about age 10–16, de Boysson-Bardies 1999, p. 31), which helps explain why those who start learning a language in their teens or later would find it more difficult to attain native-like proficiency in that language. This is especially true with regard to accent. Since the 1970s, CPH has inspired a lot of empirical research, but the findings are far from convergent (see, e.g., Snow and Hoefnagel-Höhle 1978). One of the limitations is methodological design; in principle, data obtained from longitudinal studies have greater potential for generating robust and hard evidence, but longitudinal studies are methodologically more challenging to organize compared with cross-sectional studies.

Already in the 1970s, a number of studies showed that infants are able to hear or perceive the fine differences between discrete speech sounds (especially vowels and consonants, the building blocks of words) or phonetic units that belong to different languages (Eimas 1975; Eimas et al. 1971; Lasky et al. 1975; Werker and Lalonde 1988). In the 1980s, it was further discovered that infants’ universal ability to perceive all possible phonetic units peaks at around 6 months of age, and progressively becomes more and more language-specific by 1 year of age (Werker and Tees 1984). Similar results were later obtained in Kuhl (1993) and Kuhl et al.’s (1992) studies. A succinct summary of this consolidated research finding is presented by de Boysson-Bardies (1999) as follows:

According to Kuhl, the initial sound space is divided by universal psychoacoustic boundaries. By six months, as a result of contact with the language spoken around them, babies have reorganized and simplified this space: they have made it pertinent to their particular language. Thus nonpertinent categories in the native language disappear (…). In a matter of weeks, then, infants have selected the elements compatible with their linguistic environments. They begin to fail to hear those elements that are generally absent from the phonetic structures that they perceive in their usual experience of language. (de Boysson-Bardies 1999, p. 42)

On the basis of this psychoacoustic development in infants at 6 months of age, Kuhl (1993) puts forward the ‘native language magnet theory’. More recently, based on an analysis of brain measurements of perceptions of the /r–l/ contrast in American English collected from infants aged 6 to 8 months and 10 to 12 months in the United States and Japan, Kuhl et al. (2006) found evidence of “directional asymmetry” in infants’ developmental change in phonetic perception during their first year of life. That is, over the same biological stage during the period 6–12 months of age, whereas the performance of native language perception of the AmEng /r–l/ contrast increased significantly (US group), the performance of the non-native language perception of the same contrast declined (Japanese group). What this means is that, by the end of the first year of life, infants’ brain architecture as reflected in their perception of discrete phonetic units progressively becomes more specialized or neurally committed to the phonetic properties of their native language. As infants’ abilities to perceive and process native-language phonetic units are progressively enhanced, their abilities to perceive and process non-native-language phonetic units will undergo a gradual decline correspondingly. Similar findings have also been obtained using the Spanish /b–p/ contrast (e.g., bano versus pano) as the focus of investigation in the perception performance of American and Spanish infants who were controlled for age: whereas the Spanish infants perceived /b/ and /p/ as discrete phonemes differentiating word meanings, their American counterparts ignored the overt difference in these two phonetic units, which are non-phonemic in English (i.e., they manifest as allophones appearing in complementary distribution, witness, e.g., the pronunciation of /p/ in Eng. pain, [pʰ], akin to Span. pano, as opposed to Eng. Spain, [p], akin to Span. bano). This led Kuhl et al. (2006, p. F13) to conclude that “neural commitment to native-language phonetic properties explains the pattern of developmental change in the first year”. This finding, termed ‘native language neural commitment’ (NLNC), has subsequently been shown to be supported among L2 learners or users from different languages and cultures in a migrant context like the US, for example, Korean and Chinese users of English (Johnson and Newport 1989); Korean-L1 and/or Korean-L2 speakers of English in the US (Flege et al. 1999; Yeni-Komshian et al. 2000); and Spanish-L1 speakers of English (Birdsong and Molis 2001). In general, age on arrival is a fairly good predictor of native-like pronunciation of the language in the host country (e.g., English in the US). Yeni-Komshian et al. (2000), for instance, found that Korean participants who arrived in the US before the age of 9 tended to have better pronunciation in English than Korean, while the opposite was true of Korean participants arriving at the age of 12–23 (i.e., better Korean pronunciation than English). This finding is consistent with one observation in empirical L1 acquisition studies that suggests “in normally developing children, complete mastery of phonology, productive control of most of syntactic structures, and early literacy are achieved by about age eight” (Yeni-Komshian et al. 2000, p. 146).

One particularly instructive study was conducted by Mayberry and Lock (2003), who used two tasks as instruments (timed grammatical judgement and untimed sentence-to-picture matching) to measure the English grammatical abilities of deaf and hearing adults (two groups each, n = 54). The purpose was to examine the impact of the participants’ linguistic experiences, spoken or signed, during early childhood on their English grammatical abilities. Thirteen of the 14 normal hearing adults (7 men, 6 women, aged from 17 to 57, mean age 32.46) were non-native users of English who had acquired another language as their L1 from birth: Urdu (8), French (2), German (1), Italian (1) and Greek (1). Their English-medium schooling started at different ages, from 6 to 13 (mean starting age = 9). By contrast, the 13 profoundly deaf participants were born to English-speaking parents. Due to deafness, they received negligible speech input either in the family or preschool from age 3 to 6.Footnote 17 The twelve deaf participants were subsequently switched to schools where sign language was used when they were aged 6 to 13 (mean age at which the switch took place = 9.4). Unlike the normal hearing participants who made up the ‘Early Language’ group, the group of profoundly deaf participants was characterized as ‘No Early Language’, although one group received some speech (English) input at preschool between the age of 3 and 5, while the other ‘Early Sign’ group’s input at that same age range was primarily restricted to sign language. Data analysis was controlled for age of English exposure and length of English use. No discernible differences were found with regard to the degree of hearing loss (the ‘No Early Language’ group), non-verbal IQ, age of preschool entry, method of English instruction, or non-language cognitive test performance (Mayberry and Lock 2003, p. 374). The English grammaticality task tested adult participants on five different sentence structures: simple sentences, dative sentences, conjoined sentences, passive sentences, and relative clause sentences. The results showed that:

adults who acquired a language in early life performed at near-native levels on a second language [here, English] regardless of whether they were hearing or deaf or whether the early language was spoken or signed. By contrast, deaf adults who experienced little or no accessible language in early life performed poorly. These results indicate that the onset of language acquisition in early human development dramatically alters the capacity to learn language throughout life, independent of the sensory-motor form of the early experience. (Mayberry and Lock 2003, p. 369)

These findings led Mayberry and Lock (2003) to conclude that:

Instead of being a phenomenon of diminishing ability to learn language caused by increasing brain growth, the critical period for language would instead be a time-delimited window in early life where the degree and complexity of neurocortical development underlying the language system is governed, in part, by linguistic stimulation from the environment which together with neurocortical development creates the capacity to learn language. (...) early language experience helps create the ability to learn language throughout life, independent of sensory-motor modality. Conversely, a lack of language experience in early life seriously compromises development of the ability to learn any language throughout life. These findings mean that timely first-language acquisition is necessary, but not sufficient, for the successful outcome of second language learning. (Mayberry and Lock 2003, p. 382; emphasis added)

Tomasello (2003) reaches a similar conclusion after reviewing a number of empirical studies designed to assess the validity of the critical period. He compares the negative impact of missing exposure to a target language in early life with the low level of performance in various sports activities or skills (e.g., playing the piano) by adult learners and remarks that:

It is usually very easy to identify in a group of skiers or tennis players or piano players those who began learning their skill in early childhood and those who are adult learners – and language is no exception. This final consideration is especially important in explaining the relative lack of fluency of deaf persons who are not exposed to their first language (sign language) until late childhood or adulthood. (Tomasello 2003, p. 287)

There is thus some evidence showing a “time-delimited window in early life” (Mayberry and Lock 2003) being crucial for infants’ developmental brain architecture, subject to the only constraint of regular exposure to one or more natural languages. Within that window, children will progressively get attuned to fine phonemic contrasts that hold between dissimilar phonetic units in their native language, while non-phonemic contrasts (e.g., allophones) are ignored (Kuhl 2007, 2010). Beyond phonology, there is also some indication that, without the needed exposure to language at infancy and early childhood, subsequent language learning efficiency and performance in the development of grammatical competence would also be adversely affected (Mayberry and Lock 2003).

The intricate, interlocking neuro-pathways and mechanisms of the human brain remained scientifically inaccessible until recently. However, armed with technological advances and increasingly sophisticated tools of investigation in the last two decades, including Electroencephalography (EEG), Event-related Potentials (ERPs), functional Magnetic Resonance Imaging (fMRI), Magnetoencephalography (MEG), and Near-Infrared Spectroscopy (NIRS), neuroscience is on the verge of some exciting breakthroughs in infants’ NLNC beyond their phonetic perceptions up to the first year of age. Neuroscientists like Kuhl (2010) have high hopes that with further research in the 2010s and beyond, at least part of the Chomskyan black box will soon see the light, making it possible for us to envision if not visualize the nuts and bolts of that hitherto mysterious Language Acquisition Device. It remains unclear, as predicted by the Critical Period Hypothesis (Lenneberg 1967), whether puberty (around age 10–16) is the absolute cut-off biological stage beyond which native-like proficiency in the learning of a new language is virtually unattainable. One thing is certain, however: the human brain is predisposed to NLNC following regular exposure to one or more dominant first languages, in that “neural circuitry and overall architecture develops early in infancy to detect the phonetic and prosodic patterns of speech” (Kuhl 2010, p. 716; cf. Kuhl 2004; Y. Zhang et al. 2005, 2009). At the same time, through “statistical learning” in computational terms, as the human brain gets increasingly specialized or attuned to the linguistic subsystems in the infant’s first language(s), its ability to process fine linguistic nuances in subsequent languages (e.g., encountered or studied from around age 10 onwards) is neuro-biologically pre-programmed to decline progressively over time:

This architecture is designed to maximize the efficiency of processing for the language(s) experienced by the infant. Once established, the neural architecture arising from French or Tagalog, for example, impedes learning of new patterns that do not conform. (Kuhl 2010, p. 716)

A significant breakthrough has thus been achieved in understanding infants’ perception of phonetic units in their first language(s). What about other linguistic subsystems such as morphology, syntax and vocabulary? While more neuroscience research is being conducted to probe into these areas, there is some indication that the “temporally defined critical ‘windows’” are asymmetric (Kuhl 2010, p. 716):

The developmental timing of critical periods for learning phonetic, lexical, and syntactic levels of language vary, though studies cannot yet document the precise timing at each individual level. Studies indicate, for example, that the critical period for phonetic learning occurs prior to the end of the first year, whereas syntactic learning flourishes between 18 and 36 months of age. Vocabulary development ‘explodes’ at 18 months of age, but does not appear to be as restricted by age as other aspects of language learning—one can learn new vocabulary items at any age. (Kuhl 2010, p. 716)

More work in neuroscience research is underway, with the objective of unlocking the respective onset and closing critical periods of other linguistic levels beyond phonetic perception and acquisition of L1 phonology, and better understanding the ways they function.

The findings outlined above were obtained under laboratory conditions. Can such findings be replicated when infants and young children are engaged in social interaction with others, for example, their parents or caretakers who tend to use ‘infant-directed speech’ or ‘motherese’?Footnote 18 Kuhl and her colleagues have conducted a number of studies probing into the possible effects of social interaction on infants’ brain mechanisms, and found that interaction with a live person (e.g., parent, caretaker or tutor), as opposed to an inanimate source such as video-recorded TV programs, creates a social context which has fundamental, positive influence on the infant’s quality and quantity of language learning (Kuhl et al. 2003). In a number of studies in which infants living in an English-speaking environment were exposed to words in a non-local language such as Spanish, the results show that:

The degree of infants’ social engagement during sessions predicted both phonetic and word learning—infants who were more socially engaged showed greater learning as reflected by ERP [Event-related Potential] brain measures of both phonetic and word learning. (...) Taken as a whole, the data are consistent with the notion that cognitive skills [e.g., executive control of attention] are strongly linked to phonetic learning at the initial stage of phonetic development (Kuhl 2010, p. 721)

A number of social or interactional factors conducive to the quantity and quality of language acquisition have been identified in subsequent analysis: (1) attention and/or arousal, (2) information, (3) a sense of relationship, and (4) activation of brain mechanisms linking perception and action (Kuhl 2010, p. 720). Some of the key findings are as follows (cf. Conboy and Kuhl 2010; Conboy et al. 2008):

  1. (a)

    the amount of attention, in terms of ‘infant looking time’ measures, correlates positively with vocalization performance (‘low attenders’ are outperformed by ‘high attenders’);

  2. (b)

    the amount of the infant’s visual gaze at objects of reference to which the speaker’s gaze is directed correlates positively with vocalization performance;

  3. (c)

    the infant appears to interpret the speaker’s gaze as a social cue and follows it; it is likely that such social interactions activate brain mechanisms that lead to a growing awareness of the self and the other – the cognitive basis of a social relationship; and

  4. (d)

    infants’ periodical exposure to a non-local language leads to “an early coupling of sensory-motor learning in speech” (Kuhl 2010, p. 722), which is conducive to the vocalization of words in that language.

Based on the empirical findings outlined above, Kuhl (2010) concludes that “early mastery of the phonetic units of language requires learning in a social context” (p. 713), without which language acquisition would be adversely affected. By way of comparison, Kuhl (2010) points to children diagnosed with autism, who tend to face problems in social cognition as well as language learning and use. All this led Kuhl to the ‘Social Gating Hypothesis’ (2007, 2010), whereby the computational mechanisms underlying the brain’s statistical [language] learning require that the ‘social brain network’ be activated, metaphorically like an opened gate. If not (i.e., with the gate closed), the hypothesis predicts that the infant’s statistical learning could not proceed, in which case language acquisition would be seriously impeded.

The insights from recent neuroscience research outlined above clearly have implications for the language-in-education policy of a multilingual context like Hong Kong. In particular, the earlier infants are exposed to regular, high-quality input in the target language(s), the greater the likelihood that they will develop native-like proficiency in those languages. Measured against these insights, however, the current policy provisions appear to be lopsided, in that resources and funding support for language learning are heavily tilted toward secondary and tertiary (as opposed to pre-primary and primary) levels for students aged 12 or above, whose language learning efficiency or acquisitional ease has generally become more sluggish, to say the least. By comparison, government funding for pre-primary education is insignificant. At a biological stage when preschoolers’ sensitivity to language inputs and language learning tasks is much higher, government support is meager relative to the goal of optimizing schoolchildren’s learning outcomes in the target languages, English and Putonghua. We will further explore the policy implications in the last chapter.

7.4 Learning Putonghua as an Additional Language: A Sequential Approach to Developing Additive Bilingualism

Cantonese is the regional lingua franca in the Pearl River Delta, and it has been actively used in the domains of government, media (broadcast and print, cf. ‘written Cantonese’, see Chap. 3), education, business, films and other lingua-cultural consumables such as karaoke video discs in Hong Kong since the 1990s; there is as yet no evidence that it is under any threat of language shift or loss (compare Bauer 2000; Li 2000). At the same time, curriculum space being limited, it is imperative for the education authorities to identify efficient and effective means to help younger generations of Hongkongers to develop a high level of communicative competence in Putonghua, not only in keeping with the national language policy of ‘dialect bilingualism’ or ‘bidialectalism’ (Erbaugh 1995; Li 2006), but also to facilitate communication with Chinese Mainlanders from non-Cantonese-speaking areas. The key to successfully promoting Putonghua in the SAR is to ensure that any advancement in its community-wide promotion does not take place at the expense of the majority’s first language, Cantonese. In other words, rather than the much dreaded scenario of subtractive bilingualism, additive bilingualism should be the target model, to be supported by empirically sound learning outcomes.

Nearly two decades have elapsed since Putonghua was made an integral part of the primary (and, to a lesser extent, secondary) school curriculum. Despite some encouraging signs that the early introduction of Putonghua in the lower primary curriculum since the year 2000 has yielded some positive results, as reflected in the Putonghua competence of secondary and tertiary students today (Zhu et al. 2012), the progress attained by studying Putonghua for 2–3 hours per week is slow. There is general consensus among scholars, school principals and teachers of Chinese that on its own, teaching Putonghua as a separate subject is unlikely to bring about any major impact relative to developing students’ Putonghua competence. Given the closer lexico-grammatical affinity between Putonghua and SWC, embedding Putonghua into the teaching of the Chinese Language subject (i.e., TCP) would seem to be a reasonable alternative and goal. Provided a pedagogically sound curriculum design is in place to overcome the contrastive phonological differences between Cantonese and Putonghua (Chap. 3) with demonstrably attainable goals, setting TCP as a long-term objective (SCOLAR 2003) is entirely worth supporting.

Critical voices and dissenting views among scholars and educators, in public as well as social media on the internet, are not rare. For example, some short essays critiquing TCP as an ideology and practice may be found on the Internet.Footnote 19 A dispreference for using Putonghua to teach Chinese may also be found in more neutral reports. For instance, the trilingual student Susane Wong Yui-Hin cited above, who at age 17 achieved outstanding HKDSEFootnote 20 results (5** in Chinese Language, Chinese History, Chinese Literature, English, Mathematics, Economics and Liberal Studies, plus grade ‘A’ in Spanish), chose Chinese as the major for her undergraduate degree program at CUHK (Chik Wiseman 2014). Of particular interest here is the fact that the formative stage of her Chinese literacy acquisition, from kindergarten to Form 3 (Grade 9), took place in Putonghua:

Wong spent her primary years at C. & M.A. Chui Chak Lam Memorial School in Yuen Long, where Chinese classes were taught in Putonghua, but she had already started learning it in kindergarten. The language skill made it easy for her to transition to the secondary school, as the lower forms also use Putonghua to teach Chinese. In the last three years leading up to the DSE, however, the subject was once again taught in Cantonese. (Chik Wiseman 2014)

Despite being fluent in Putonghua, Susane reportedly felt unsure about using Putonghua to study Chinese. When asked which medium of instruction she would recommend for studying Chinese, she was quoted as saying “I think that Cantonese is still a better option for teaching the Chinese subject as it will serve most people” (Chik Wiseman 2014).

Apart from pedagogical concerns such as the availability of fluent and professionally trained TCP teachers and the attitudes of the school management, on purely curricular grounds strong reservation against TCP may be broadly accounted for by two main concerns: (a) a lack of empirical evidence that TCP students’ performance in the Chinese Language subject is at least on par with, if not better than, their peers’ performance in Cantonese-medium Chinese Language classes; and (b) a fear of subtractive bilingualism being the learning outcome, as expressed in a concern that TCP students would lose the ability to articulate their ideas and thoughts in colloquial, idiomatic Cantonese (see, e.g., parents’ views, ATV Home 2014).

In light of the scientific insights extrapolated from psycholinguistic and neurolinguistic research above, I will recommend a few strategies that in my view would help enhance the efficiency and effectiveness of Putonghua teaching and learning among Cantonese-L1 young learners.

7.5 Teaching Putonghua to Cantonese-L1 Learners: Proposed Strategies

In light of the review of the relevant literature in reading and literacy development in L1 and L2, plus instructive insights obtained from neuroscience research above, I will recommend three strategies to enhance the quality of Putonghua teaching and learning: (i) early exposure to Putonghua, K1–K3; (ii) teaching pinyin at Primary 1; and (iii) teaching Chinese in Putonghua, P1–P3.

Recommended Strategy 1: Early Exposure to Putonghua, K1–K3

Research in reading development has shown that literacy, in any language, is mediated by speech (Chap. 3). This insight has received strong empirical support in numerous psycholinguistic word reading and recognition experiments in English and Chinese, suggesting that effective language acquisition, be it an alphabetic language like English or a logographic language like Chinese, is premised on the learner’s phonological awareness of the target language(s). Phonological awareness refers to:

being aware of the fact that a speech stream can be segmented into small discrete units such as syllables and phonemes, which can be counted, deleted, and manipulated in other ways. One such discrete unit is represented in each graph of a writing system: A phoneme in an alphabetic letter; a syllable in a syllable graph; and a tone syllable in a Chinese character. (Taylor and Taylor 2014, p. 143)

Phonological awareness is absolutely crucial for Chinese children’s reading acquisition and literacy development, as Tseng (2002) puts it:

We cannot understand a writing system without considering the spoken language it attempts to transcribe (...). In fact, a major task in learning to read is for the reader to come to an understanding of the nature of the correspondence between the written script and the spoken language. (Tseng 2002, p. 5)

The intimate link and inter-dependency between spoken and written language as an important key to literacy development is also clearly evidenced in Gudschinsky’s (1976, p. 3) definition of a literate person:

That person is literate who, in a language s/he speaks, can read with understanding anything s/he would have understood if it had been spoken to him [or her]; and can write, so that it can be read, anything s/he can say. (Cited in Stubbs 1980, p. 13)

Particularly worth highlighting in this “single quotable statement” is “the critical element of reciprocity between oral and written competence together with scrupulous neutrality in respect of the area to which the skills of literacy are applied” (Carrington 1997, p. 82). As Stubbs (1980, p. 13) observes, this “useful and careful definition” of functional literacy is grounded in Gudschinsky’s lifelong involvement and experience in organizing literacy programs for people from different language backgrounds in various developing countries. One significant implication for literacy training, in any language, is the need to equip learners with speech associated with that language. During the colonial era, before Putonghua came into the language-in-education policy matrix, Cantonese was the only Chinese variety used for bridging the link between the vernacular and written Chinese in Hong Kong. With Putonghua and written Chinese added in the postcolonial language policy goal of biliteracy and trilingualism, such a link may be strategically extended from Cantonese to include Putonghua within part of its primary curriculum, in keeping with the principle and goal of developing additive bilingualism. The question is: ‘How?’

In Hong Kong, one of the targets of literacy development is Standard Written Chinese (SWC), which adopts a logographic writing system and, in the two Special Administrative Regions Hong Kong and Macao, is written in the traditional rather than the simplified script. This makes the acquisition of Chinese literacy a relatively more cumbersome task for Hong Kong Chinese students compared with their peers in Mainland China (Chap. 3). Further, given that SWC is more closely aligned with Putonghua than Cantonese, to capitalize on the lexico-grammatical affinity between Putonghua and SWC, Cantonese-L1 students should ideally be exposed to Putonghua as early as possible, preferably through such multi-modal resources as songs, nursery rhymes, games, riddles, poems, and extracts of verses adapted from primers such as the ‘Three Character Classic’Footnote 21 and ‘The Book of Family Names’.Footnote 22 Far from being a chore, rote learning or committing words to memory, in any language, is what preschoolers are good at, provided the right kinds and amounts of interesting input in the target language are assured. Such a practice has been a tried-and-tested, age-old method in traditional Chinese literacy training for young children in China (Tao and Qian 2012a, b; cf. ZHANG Zhigong 1992). Hao (2001b) also echoes Zhang’s (1992) suggestion that selected extracts from traditional primers be incorporated into the contemporary primary curriculum, with a view to speeding up children’s grasp of basic Chinese literacy and reading skills development:

The ‘Three Character Classic’ is easy at the beginning, with few characters that are difficult to write, recognize and read. ‘The Book of Family Names’ contains four characters per phrase, with a focus on written forms rather than meaning, which is conducive to enhancing [children’s receptive] knowledge of Chinese characters (...). (Hao 2001b, p. 104, my translation; cf. ZHANG Zhigong 1992)Footnote 23

In addition to exposing children to everyday, general knowledge in Putonghua, a traditional primer like the ‘Three Character Classic’ also teaches about Chinese ethics, and so content-wise it lends itself very well to meeting young children’s needs for basic literacy and general education (Hao 2001b, p. 104; cf. ZHANG Zhigong 1992). That rote learning in an L2 could be handled by preschoolers relatively effortlessly is partly evidenced by Cantonese-L1 kindergarteners performing linguistically sophisticated tasks at choral speaking competitions and recital contests, in English or Putonghua, individually or in groups. Such challenging tasks could not have been accomplished without preschoolers first memorizing the poems or verses in accordance with the norms of pronunciation required. News stories on such preschoolers’ marvelous performance in English and Putonghua are reported from time to time; for instance, one news story in March 2015 features three adjudicators giving a thumbs-up overall appraisal of the performance of 14 K1–K3 finalists at the Second Hong Kong Kindergarten Choral Speaking Contest.Footnote 24

In terms of learning goal, however, it is crucial that the pedagogical priority at pre-primary and early primary levels be focused on developing young children’s receptive competence in recognizing and reading (aloud) Chinese characters in their home language (Cantonese) and Putonghua, rather than developing productive competence in writing them correctly. That is, nurturing young children’s ability to recite, sing or read (aloud) texts composed in various poetic genres and recognize the characters thus memorized is far more important than their ability to produce them in writing following the mandatory sequence of strokes. This is so because physiologically, young children’s hands are not yet fully developed to handle the writing of characters repeatedly, especially those that involve a large number of strokes (S.-K. Tse 2001). S.-K. Tse and his colleagues made a very good point that in Hong Kong, children are often required to memorize and dictate a large number of high-frequency characters that are judged to be important almost exclusively from adults’ point of view (Lee et al. 2011, p. 667; cf. S.-K. Tse et al. 2007; S.-K. Tse 2001, 2014). This is what makes the learning of Chinese characters in literacy-focused activities such a tedious, boring and demotivating chore. Guided by the phenomenographic theory of learning, through the ‘integrative perceptual approach’ to teaching and learning Chinese characters, S.-K. Tse et al. (2007) and Lee et al. (2011) have demonstrated how the learning of Chinese characters can be made more enjoyable. 
The key is for kindergarten teachers to accommodate young pupils’ mental lexiconFootnote 25 (Aitchison 2003) when planning their literacy-focused activities, such that lexical items that children bring to the classroom by virtue of their frequent occurrence in everyday life (e.g., kinship terms; food items like rice, pork, beef, fish, hamburger and sundry vegetables; names of the children’s neighborhood; TV cartoon figures, and so forth) may be exploited and used as a stepping stone for raising their awareness of various orthographic principles of character formation, for example, introducing characters formed similarly by phonetic compounding with the same phonetic component or semantic radicalFootnote 26 (S.-K. Tse et al. 2001; cf. Hao 2001a, 2001b; S.-K. Tse 2001, 2014). In addition to exploiting preschoolers’ mental lexicon in their home language Cantonese, however, I would suggest extending this teaching strategy to Putonghua on a trial basis, and monitoring the preschoolers’ performance.

Drawing on extensive bilingual acquisition data involving six children bilingual in Cantonese and English growing up in Hong Kong,Footnote 27 Yip and Matthews (2007) have demonstrated convincingly how these children developed bilinguality naturally. They also found a variety of linguistic evidence showing:

how a dominant language influences the development of a weaker language and vice versa in a number of grammatical domains, resulting in bidirectional cross-linguistic influence; and how bilingual children may take strikingly different paths from monolingual children to reach the target grammar. (Yip and Matthews 2007, p. 256)

Hong Kong-based bilingual acquisition research is therefore not at all a tabula rasa. There is much that we can learn from exemplary studies towards establishing a theoretically informed research agenda concerning the feasibility and desirability of Teaching Chinese in Putonghua (TCP) in multilingual Hong Kong. In my view, Yip and Matthews’s (2007) carefully conducted longitudinal research, and the data-driven groundwork that they have laid in children’s bilingual acquisition of Cantonese and English, may serve as a useful model or starting point for conceptualizing and extending children’s bilingual acquisition research to include the acquisition of Putonghua (cf. Yip 2006).

Recommended Strategy 2: Teaching Pinyin at Primary 1

Provided preschoolers have developed a certain level of sensitivity to Putonghua before entering primary school, it will be an opportune time, at Primary 1, to consolidate their phonological awareness in Putonghua by teaching them pinyin systematically. In general, local schools’ current practice still adheres to the SAR’s curriculum guidelines devised in 1997, whereby pinyin is taught over a 6-year curriculum: teaching tones in P1–P3, vowels and consonants from P4, and revision in P5–P6 to consolidate students’ pinyin knowledge. S.-M. Tse (2010, 2013) reviews the role of pinyin in the TCP policy and rightly points out that such a pace is too slow; instead, she recommends that pinyin should be introduced thoroughly and much earlier, at P1–P2. In a small-scale study she conducted (S.-M. Tse 2013) involving the teaching of pinyin to P1–P2 students within 3 to 6 months, including revision and consolidation, encouraging results were obtained. She then draws implications by comparing her pilot scheme with the policies and practices in Mainland China, Taiwan and Singapore, where the phonetic transcription system – pinyin in Mainland China and Singapore, Zhuyin Fuhao in Taiwan – is taught in early primary (S.-M. Tse 2013, pp. 222–223; see Table on p. 225; cf. Cheung and Lo 2006). In Taiwan, Zhuyin Fuhao is taught over 10 weeks, while in Singapore the teaching of pinyin is embedded in other teaching objectives, including conversational skills and the reading and writing of Chinese characters over a 14-week period.

In mainland China, depending on the schoolchildren’s ‘dialect’ background, the teaching of pinyin may be completed within 6–8 weeks in Mandarin-speaking areas but may take up to 12 weeks in ‘dialect’ areas (cf. Ingulsrud and Allen 1999, 2003). Dai (2001, p. 150) outlines the typical curricular arrangement widely followed by Han-Chinese primary schools in mainland China, whereby pinyin is introduced in the first term at Primary 1, but is gradually phased out over a 3-year transitional process until the second term of Primary 3, as follows:

  • Primary 1, 1st term: focus on reading pinyin texts; gradual transition to hanzi texts by beginning of P2;

  • Primary 2, 2nd term: principally hanzi texts, supplemented with pinyin for difficult hanzi;

  • Primary 3, 2nd term: only difficult characters are supplemented with pinyin – children should be able to pronounce them.

The auxiliary role of pinyin in this curriculum design, according to Dai (2001), is to facilitate and promote schoolchildren’s reading development through independent learning.Footnote 28 One important merit of earlier introduction of pinyin at P1–P2 is for pupils to master this important tool to facilitate self-learning of Chinese characters, in particular to look up vocabulary words in dictionaries using pinyin where necessary. It should be emphasized that for the teaching of pinyin to work at early primary level, that is, for P1–P2 students to discover its patterned phonological features – segmental or suprasegmental – in Putonghua, preschoolers must have developed a fairly high level of sensitivity to its speech sounds, preferably through exposure to multi-modal, visuals-enriched resources (Chap. 9). In other words, recommended strategy 1 is a pre-condition for recommended strategy 2.Footnote 29

One argument that militates against the early introduction of the Roman-alphabet based phonological coding system pinyin is possible confusion with the pronunciation of alphabetically based English words. This is understandable given that in some cases, the same letter combination in pinyin and English is associated with very different normative pronunciations (compare, e.g., pinyin bān, 班: /pɑːn/, which is rather different from its English near-homograph ban: /bæn/). However, provided the sound-spelling patterns in both target languages are supported by unambiguous pronunciation and illustrated with ample examples in context (e.g., by different teachers in English versus Putonghua classes) – assuming quality classroom input – such confusion should gradually give way to metalinguistic awareness of language-specific pronunciations among young learners over time.

Recommended Strategy 3: Teaching Chinese in Putonghua, P1–P3

Relative to the goal of extending Cantonese-L1 students’ linguistic sensitivity to Putonghua and consolidating their grasp of its phonological system and rules such as tone sandhi, using Putonghua to teach Chinese is probably the most productive at lower primary level, P1–P3. Following the practice in Mainland China, all Chinese characters in the main texts of the course books should have pinyin clearly indicated to facilitate the learning of their pronunciation. Care should be taken to ensure that all pinyin marks are reader-friendly, in that their legibility would not be unduly affected by the choice of a poor color scheme (e.g., legibility problems will likely arise if the characters are printed in black, pinyin in blue, and stress marks in red, see, e.g., Y.-N. Wong 2012, p. 114). In addition, to fully capitalize on the linguistic affinity between Putonghua and written Chinese texts, it is advisable to avoid teaching classical texts with wenyan elements such as poems in Putonghua; rather, where necessary, poetic genres are more appropriately taught in Cantonese. In other words, Putonghua would be more productively taught if the texts in question contain interactional features in conversation (Leung and Fan 2010; cf. Tong et al. 2006).

In terms of pedagogical support for teachers, in addition to traditional methods such as reading aloud (Lo 2000a, b), a number of teaching strategies or practices may be helpful. First, it is suggested that all character texts in the P1–P3 curriculum should be made available and accessible in standard Putonghua (e.g., online) to facilitate imitation and practice. Second, it would be a good idea to develop appropriate teaching materials such as nursery rhymes to help raise students’ phonological awareness and sense of rhythm, for example, through reading aloud (朗讀, long 23 duk 22 /lǎng dú):

大熊貓,//不是貓,

dà xióng māo, // bú shì māo,

‘big pandas are not cats’

黑色//白色//全身毛,

hēi se // bái se // quán shēn máo,

‘black and white, whole body covered in hair’

愛吃竹葉//不吃肉,

ài chī zhú yè // bú chī ròu,

‘like to eat bamboo leaves but not meat’

孩子見了//個個笑。

hái zi jiàn le // ge ge xiào.

‘children [who] see them will all laugh’

 

(Tong and Mok 2000, p. 107, pausing at ‘//’)

Apart from interesting short rhyming texts like this one, S.-M. Tse (2010, p. 179) offers a few instructive illustrations of a productive teaching strategy for raising students’ phonological awareness:

(a)

meaning-focused mnemonic (here, targeting the consonant in the last syllable), e.g.:

收聽廣播 bō bō bō

shōu tīng guǎng bō // bō bō bō

‘listen to [radio] broadcast’

爬上山坡 pō pō pō

pá shàng shān pō // pō pō pō

‘climb up [hill] slope’

(b)

rhyming verses / couplets, e.g.:

小獅子,過生日,

xiǎo shī zi, guò shēng rì,

‘little lion, have birthday’

好朋友,全到齊

hǎo péng yǒu, quán dào qí.

‘good friends, all arrived’

吃蛋糕,喝果汁,

chī dàn gāo, hē guǒ zhī,

‘eat cake, drink juice’

慶生日,真歡喜

qìng shēng rì, zhēn huān xǐ.

‘celebrate birthday, really happy’

Third, being good at rote-learning, students aged 6–8 may be tasked periodically with memorizing rhyming short texts, which may be reinforced through unseen dictation (individually) or recitation (individually or in groups). Inter-class or inter-school competitions in Putonghua, in the form of games and riddles, may also be held to stimulate students’ interest and motivation (Huang and Yang 2000, pp. 219–220). All this takes place in tandem with the teaching of the target number of Chinese characters set for P1–P3 (cf. S.-K. Tse 2014).

Expected learning outcomes in Putonghua and the Chinese Language subject by the end of Primary 3, age 8–9. Following the recommended strategies 1–3 above, I believe there is good potential for students to have acquired Putonghua up to a fairly high level (cf. Huang and Yang 2000). At the same time, being familiar with pinyin as a learning aid or tool, they will stand a better chance of being able to look up unfamiliar characters independently for their meanings or normative pronunciation in Putonghua. Together, these two learning outcomes will hopefully lay a solid foundation for P4 students to become life-long learners as they move up the education ladder from primary to tertiary in preparation for their work life.

The status of Putonghua in Cantonese-dominant Hong Kong, whether it is more like a second or foreign language, or a ‘half first, half second language’ (i.e., L1.5, Lai-Au Yeung 1997), depends essentially on the learner’s onset age and how it is taught and learned. If it is introduced early at K1–K3 (age 4–6) more or less along the lines outlined above, thanks to the “time-delimited window in early life” (Mayberry and Lock 2003, p. 382), the condition of Putonghua learning may well be comparable to L1 acquisition, even though its support in the home may be inadequate or lacking. On the other hand, delaying the teaching of Putonghua and pinyin as a learning tool in the curriculum till late primary level (i.e., P4–P6, age 9–11, cf. late immersion) would make the learning of it more like an L2. Those students who learn Putonghua from scratch at secondary level (age 13 and beyond), with or without also learning pinyin, would be learning it like a foreign language.

Apart from being psycholinguistically and neurolinguistically informed – age 4–6 being a neurobiological time-delimited window for effective language acquisition – the three recommended strategies above are premised on two policy assumptions if they are to bear fruit as hoped:

  1. (a)

    Cantonese will continue to be widely used in society and in school, as MoI in other subjects, so that TCP from P1 to P3 would pose no risk to the vitality of Cantonese in society; and

  2. (b)

    Education authorities should make it very clear that the proposed strategies for teaching Putonghua above, if implemented, are guided by the principle of additive, rather than subtractive bilingualism.

In addition, the following support measures may raise the odds of success of the proposed strategies:

  1. (a)

    Education authorities should step up the training of qualified Putonghua teachers to take up the teaching of Putonghua (as a subject or TCP) from preschool and kindergarten K1–K3 to lower primary P1–P3;

  2. (b)

    The quality of Putonghua teaching materials (e.g., TCP textbooks) should be monitored closely, while support for their development should be strengthened considerably;

  3. (c)

    Scholars with expertise in Putonghua teaching and Putonghua teacher training should be engaged to monitor the learning outcomes of Putonghua and to tackle any problems arising, be they policy-driven or pedagogy-related;

  4. (d)

    A user-friendly website, bilingual in Chinese and English, where Putonghua teachers’ problems can be posted should be set up and serve as a platform for addressing queries and exchanging views regarding how best to resolve them, much like the online forum, expert advice and support provided to front-line mother-tongue education teachers by CECLER (2004);Footnote 30 and

  5. (e)

    Advice and assistance should be provided to teachers of Putonghua at kindergartens and schools, with a view to creating an environment which is conducive to using and learning Putonghua.

The approach to TCP proposed here may be characterized as a ‘sequential additive plurilingualism’ model: Putonghua is introduced at a linguistically sensitive neurobiological stage to capitalize on the “time-delimited window in early life” (Mayberry and Lock 2003, p. 382) from K1–P3 (age 4–8). Apart from becoming familiar with a subset of the Chinese characters in Putonghua by the end of lower primary level, schoolchildren will also be equipped with an important tool for independent learning (e.g., checking the Putonghua pronunciation of unfamiliar characters by applying their knowledge of pinyin, or vice versa; using their knowledge of pinyin to check the written forms of known characters). Above all, provided the MoI is switched back to Cantonese at Primary 4 (age 8–9), and given that most of the P1–P3 Chinese vocabulary taught in Putonghua will be recycled from Primary 4 onwards, there is little risk of Cantonese-L1 pupils’ mother tongue being lost or compromised. For one thing, poetic genres and texts with classical elements will continue to be read and accessible in Cantonese, while pupils at upper primary level or above should be able to work out the corresponding Putonghua version if they so wish. These are among the most obvious advantages of the proposed ‘sequential additive plurilingualism’ model of TCP which, incidentally, is comparable to the “remedial action” proposed by Lord and T’sou (1985) some 30 years earlier, which consists of introducing:

a sound Chinese curriculum into the schools, based on Modern Chinese usage, and supported by a carefully phased introduction of Putonghua [such that effective bilingualism] will rest on the twin pillars of Modern Standard Chinese/Putonghua and English. If that happens the problem of literacy in standard Chinese will largely take care of itself. But there is no point in pretending all this can be achieved by the aid of a magic wand. We need a very careful and properly piloted planned and phased curriculum development, from kindergarten right through to tertiary level and beyond.

(Lord and T’sou 1985, p. 7; also cited in Lord 1987, p. 10)

In their comparative study of the learning of Putonghua by Cantonese-L1 pupils at Primary 1 under school-based immersion conditions versus as a separate subject, Huang and Yang (2000) conclude that the teaching of Putonghua at secondary level could be obviated provided a solid foundation has been laid at primary level (p. 215). The quality of input at early primary level is therefore absolutely crucial. To ensure success at early primary level, a carefully planned Putonghua curriculum at preschool level is needed to take full advantage of the “time-delimited window in early life” (Mayberry and Lock 2003, p. 382). In terms of curriculum design, as Huang and Yang (2000), among others, have demonstrated, the ‘language across the curriculum’ (LAC) approach, and the more recent ‘content-and-language integrated learning’ (CLIL) paradigm, would seem to be an important pedagogic principle that has good potential for guiding both language and content teachers to work fruitfully together, with a view to making tangible contributions to the extension of our future pillars’ biliteracy and trilingual skills development to include a high level of attainment in Putonghua.