Introduction

The relationship between language and music systems has been investigated for many years (Christiansen & Kirby, 2003; Honing, 2018; Masataka, 2020). Although music, defined as sounds temporally organized on an isochronous grid and by frequency, differs from verbal language, which is regarded as structurally arbitrary and continuous (Koelsch, 2019; Reybrouck & Podlipniak, 2019), the literature suggests that their underlying mechanisms might both derive from the processing of temporally organized acoustic signals (Asaridou & McQueen, 2013; Masataka, 2020; Patel, 2010). This might be observed through the similar developmental and neurobiological patterns of these two systems. For instance, longitudinal studies found that music and language structures are indistinguishable for infants until the emergence of the first word (Nikolsky, 2020). Moreover, a recent review concluded that music and language processing share intimate neural connections in the human brain and are particularly undifferentiated in infancy (Koelsch, 2019).

Studies have attempted to disentangle this underlying relationship by investigating specific features of language and music. Musical rhythm, a nonlinguistic beat-based sequential pattern of time intervals (Kasdan et al., 2022; Ozernov-Palchik & Patel, 2018), has been found to be associated with, and to show promising predictive effects on, various linguistic domains, including phonological awareness and reading skills (Lee et al., 2020; Ozernov-Palchik & Patel, 2018; Tierney et al., 2021). For example, Politimou et al. (2019) found that phonological awareness was significantly correlated with musical rhythm perception and synchronization to the beat in children aged between 3 and 4 years. Additionally, studies have shown that phonological awareness skills can be effectively predicted by musical rhythm skills (Anvari et al., 2002; Ozernov-Palchik et al., 2018; Swaminathan & Schellenberg, 2020). Apart from the rhythm-phonology relationship, a link between musical rhythm and reading skills has also been reported. Tierney et al. (2021) found positive correlations between musical rhythm and reading ability in multiple modalities in children aged between 7 and 11 years. Moreover, Ozernov-Palchik et al. (2018) examined the relationship between musical rhythm and reading abilities in preschoolers and found a significant contribution of beat-based rhythm to reading after controlling for cognitive abilities and phonological awareness. They assumed that the rhythm-language link was partially mediated by phonological awareness, implying that the temporal signal processing skills shared by musical rhythm and phonology might be crucial for the rhythm-language relationship.

However, inconsistencies have also been reported. Patscheke et al. (2018) administered either rhythm or pitch training to children between 4 and 6 years old. Their results revealed that only the pitch training group significantly improved in phonological awareness, compared to the rhythm training and control groups. However, they noted that the materials in the pitch training also involved musical rhythm elements, albeit with an emphasis on melody-related elements. Their results indicate that pitch-related musical performance, such as melody perception and reproduction, might also be associated with phonological awareness. Verney (2013) also observed significant improvement in 4-to-5-year-old children’s phonological awareness and reading skill after they received rhythm-related interventions. He assumed that the music aptitude required for tapping to an isochronous beat is directly linked to phonological awareness skills. Nevertheless, the study by Bonacina et al. (2020) yielded a different pattern. They found that, in children aged between 5 and 8 years, phonological awareness was significantly related to performance on the reproduction of rhythmic patterns (i.e., asynchronous tapping) but not to real-time musical rhythm reproduction (i.e., drumming to an isochronous beat), whereas the reverse held for reading skill. They assumed that the association between phonological awareness and musical rhythm reproduction might be underpinned by shared abilities to retain and integrate temporal information, whereas isochronous rhythmic reproduction and reading skill both involve temporal coordination between action and predictable external cues. These findings suggest that the specific effects of different rhythmic components on specific language abilities warrant further exploration. Moreover, the choice of methodology may also affect which aspects of the rhythm-language relationship are examined. Overy et al. (2003) reviewed previous literature and found that most musical aptitude tests require complex cognitive processes. Thus, they developed a series of simplified but reliable music aptitude tests for dyslexic children by reducing task demands and task-unrelated cognitive load. In line with Overy et al. (2003), the current study adopted their test designs in order to examine the maximal performance of the preschoolers. By untangling the relationships among phonological awareness, reading skills, and the perception and reproduction of musical rhythm and melody in preschoolers, we may offer future directions for music interventions that target the musical rhythmic components that foster language development most effectively (Bonacina et al., 2020).

Apart from phonological awareness, the rhythm-language relationship might also be affected by native language experience. Behavioural and neurobiological evidence suggests that native language experience might shape one’s auditory perception, which is associated with the temporal patterning of linguistic units and with music aptitude (Kuhl et al., 2006; Luo et al., 2007; Xu et al., 2006). Kuhl et al. (2006) found that infants’ phonetic perception becomes less sensitive to non-native phonetic properties after 12 months of age. Taking Mandarin-Chinese and English speakers as an example, Luo et al. (2007) reported that Chinese-speaking adults could identify frequency-modulated signals more accurately than English-speaking adults, suggesting a correlation between tonal language perception and frequency identification. This is also consistent with the difficulty English speakers have in learning Chinese tones (Ling & Grüter, 2022; Pelzl et al., 2019; Shen & Froud, 2019), revealing how native language experience might affect one’s acoustic signal processing skills. Through its association with auditory perception, native language experience might also affect music aptitude. Zhang et al. (2020) examined musical perception in Japanese-speaking and Chinese-speaking adults and found that Chinese native speakers performed better on melody perception than on rhythm perception. Moreover, Liu et al. (2021) investigated adults’ music perception skills across 40 languages in a large and representative sample, and reported that tonal language speakers performed better on melody discrimination, but not on pitch and musical rhythm perception, compared to non-tonal and pitch-accented language speakers. Therefore, the influence of native language experience, which might shape individual auditory perception and the associated music aptitude, should be considered when examining the rhythm-language relationship.

Nevertheless, the review by Ozernov-Palchik and Patel (2018) identified a lack of cross-language research in this field, meaning that native language experience, which might saliently affect individual auditory perception and the related phonological awareness and musical rhythm skills, has been overlooked. Bekius et al. (2016) examined 26 adults with 11 different native languages and found a universal relationship between musical rhythm and reading skills. However, the age range of the recruited participants varied widely, and preschoolers, who may have fewer experiential influences such as music training or non-native language exposure, were not included (Bekius et al., 2016; Zhang et al., 2020). Moreover, some studies did not have a representative sample size, which may have affected the effect sizes of their results. In addition, few studies have examined vocabulary knowledge, a fundamental ability for reading, or divided it into receptive and expressive domains, and more precise measures of linguistic performance in reading-related skills have therefore been omitted.

Accordingly, this study aimed to determine whether native language experience affects musical rhythm skills and whether musical rhythm skills can predict vocabulary knowledge in preschoolers. We investigated the relationships among musical rhythm, phonological awareness, and vocabulary knowledge in Chinese-speaking and English-speaking preschoolers aged between 4 and 6 years. The predictive effects of musical rhythm on receptive and expressive vocabulary knowledge were also examined. The current results may provide evidence for the robustness of the rhythm-language relationship across different native languages.

Methods

Participants

A total of 185 participants were recruited: 112 monolingual Chinese-speaking preschoolers (hereafter Chinese group; mean age = 67.52 ± 6.54 months; 52% girls) from Taiwan and 73 monolingual English-speaking preschoolers (hereafter English group; mean age = 61.42 ± 6.67 months; 49% girls) from the United States (US). Owing to the pandemic, the numbers of participants recruited for the two groups were unequal. The inclusion criteria were as follows: (1) the native language was Mandarin Chinese (for the Chinese group) or English (for the English group); (2) the child had not received any music-related extracurricular classes; (3) the child had no known intellectual, physical, mental, or neurological disabilities, developmental delay, or language impairment. Based on the demographic information, participants in both groups were recruited from areas of middle-class residence, as defined by the local city standard. Written consent was obtained from the participants’ parents prior to the experiment. Ethical approval for the Chinese and English groups was obtained from the Centre of Research Ethics at National Taiwan Normal University and the Protection of Human Subjects Committee at Southwestern Oklahoma State University, respectively.

Procedure

In the Chinese group, all participants completed a series of linguistic and musical tests, including standardized assessments of vocabulary knowledge, a self-designed tone awareness test, and music ability tests. Two 30-min experimental sessions were conducted on two different days on a one-to-one basis. Prior to each test, practice trials were provided to ensure that the participant fully understood the task. In the English group, three 30-min sessions were administered on a one-to-one basis. Standardized linguistic measures and self-designed music ability tests identical to those used for the Chinese group were administered.

Measures

Table 1 presents the summary and comparison of all measures administered to the Chinese group and the English group. The following sections will introduce these measures separately.

Table 1 Summary and comparison of measures implemented in the Chinese group and the English group

Receptive and Expressive Vocabulary Test (REVT)

The REVT is a standardized assessment evaluating the vocabulary knowledge of Chinese-speaking children aged between 3 and 6 years (Huang et al., 2010). The full score of the REVT is generated from the receptive and expressive vocabulary tests, each of which is composed of the following subtests: (1) Naming subtest, in which the child either pointed to the picture named by the instructor (receptive) or named the picture pointed to by the instructor (expressive); (2) Category subtest, in which the child identified 4 pictures presented one-by-one by pointing (receptive) or verbally listed things of the demanded category (expressive); (3) Definition subtest, in which the child identified the item that met the description given by the instructor (receptive) or described the characteristics or functions of an object (expressive); and (4) Reasoning subtest, in which the child pointed to one semantically dissimilar item out of four options (receptive) or expressed the similarity between two items (expressive). Good to excellent internal consistency (Cronbach’s α = 0.80–0.96) has been reported.

Tone Awareness (TA) Task

The TA task is a self-designed assessment measuring tone discrimination ability in Mandarin Chinese, adapted from the same-different paradigm. A total of 4 practice trials and 16 test trials, containing half real and half nonsense syllables, were administered. Each trial contained three monosyllabic words, each corresponding to a cartoon picture. The child was required to identify the syllable with the odd tone. Each correct answer was scored as one point (maximum score = 20), with higher scores reflecting better tone discrimination ability. The detailed task design was replicated from our previous publications (Wang et al., 2012, 2019). The TA task was reported to have moderate internal consistency (Cronbach’s α = 0.63).

Peabody Picture Vocabulary-Fifth Edition (PPVT-5)

The PPVT-5 is designed to assess the receptive vocabulary ability in individuals aged above 2.5 years old (Dunn & Dunn, 2019). The child was required to point out one picture the instructor named out of four options. Pictures were shown in a digital stimulus book through the screen of a laptop. Excellent internal consistency was reported by estimating Spearman-Brown corrected split-half correlations (r = .94–.98).

Expressive Vocabulary Test-Third Edition (EVT-3)

The EVT-3 measures the expressive vocabulary and word retrieval in individuals aged above 2.5 years old (Williams, 2019). The instructor asked questions about a picture presented to the child and instructed him/her to answer with one word that goes with the picture. Pictures were shown in a digital stimulus book through the screen of a laptop. Good test–retest reliability has been reported (r = .90).

Comprehensive Test of Phonological Processing-Second Edition (CTOPP-2)

The CTOPP-2 is designed to assess phonological awareness in individuals aged between 4 and 24 years (Wagner et al., 2013). The following 3 subtests were applied in the current study to obtain a phonological awareness composite score: (1) Elision subtest, in which the child was asked to repeat a word and then say the word without a specific sound (e.g., saying “help” without the sound of /h/); (2) Blending Words subtest, in which the child was asked to listen to the fragmented sounds of a word via CD and then combine those sounds into an intact word; and (3) Sound Matching subtest, in which the child was asked to identify which of the pictures matched either the initial or the final sound of the word corresponding to the target picture. Good average internal consistency for this composite was reported (Cronbach’s α = 0.85).

Melody Discrimination (MD) Task

The MD task is a self-made computer-based assessment designed to examine children’s perception of pitch. This task was modelled after the melody discrimination test in Overy et al. (2003). The child was asked to judge whether two short melodies were identical or not. A total of 10 trials, including two practice trials, were conducted in the task. Each test trial was scored as one point if the response was correct (maximum score = 8), with higher scores indicating better melodic perception ability. The complexity of the melodies gradually increased as the task proceeded. The standard melody, lasting approximately 1.5 s with a 500 ms fade-out, was generated with synthetic sounds (Steinway grand piano, Logic Pro X). The first melody of each trial served as the standard melody, which was composed of three consecutive quarter notes, 262 Hz/C4, 294 Hz/D4, and 330 Hz/E4, and was presented at 120 beats per minute (bpm). The second melody of each trial was either the same as the standard melody or a deviant one in which the second note was raised by one to four semitones. Moderate internal consistency was reported (Cronbach’s α = 0.61).
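The stimulus arithmetic in this description can be checked with a short sketch. It is illustrative only: the equal-tempered semitone formula and the helper names are our assumptions, since the text does not state how the deviant pitches were computed in the original stimulus files.

```python
# Minimal sketch of the MD stimulus arithmetic (hypothetical helper names).
# Assumption: deviant notes follow twelve-tone equal temperament.

STANDARD_HZ = {"C4": 262.0, "D4": 294.0, "E4": 330.0}  # values given in the text
TEMPO_BPM = 120


def raise_by_semitones(freq_hz: float, semitones: int) -> float:
    """Shift a frequency upward by a whole number of equal-tempered semitones."""
    return freq_hz * 2 ** (semitones / 12)


def melody_duration_s(n_quarter_notes: int, bpm: int) -> float:
    """Duration of consecutive quarter notes when one quarter note equals one beat."""
    return n_quarter_notes * 60.0 / bpm


# Candidate frequencies for the deviant second note (D4 raised by 1 to 4 semitones).
deviant_second_notes = {
    step: raise_by_semitones(STANDARD_HZ["D4"], step) for step in range(1, 5)
}

# Three quarter notes at 120 bpm last 1.5 s, matching the reported melody length.
standard_duration = melody_duration_s(3, TEMPO_BPM)
```

Under this assumption, raising D4 by two semitones yields approximately 330 Hz, which coincides with the third note of the standard melody.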

Rhythm Discrimination (RD) Task

The RD task is a self-designed computer-based assessment evaluating children’s abilities to perceive musical rhythm. This task was modified from the rhythm task in Bishop-Liebler et al. (2014). Similar to the MD task, the child was asked to judge whether two rhythmic patterns were identical. Two practice trials and 12 test trials were implemented (maximum score = 12). Each rhythmic pattern lasted 1.6 s and contained five 200 ms pure tones at 500 Hz. Each tone had 50 ms rise and fall times. The different rhythmic patterns in the trials were created by manipulating the time intervals between the five tones. The inter-stimulus intervals (ISIs) between the first and second tones and between the second and third tones were fixed at 150 ms. The ISI between the third and fourth tones varied, linearly ranging from 15 to 210 ms (e.g., 15, 45, 60, 150, and 210 ms). The ISI between the fourth and fifth tones was set according to the ISI between the third and fourth tones so that the sum of these two ISIs was always 300 ms. Examples of rhythmic patterns represented by the time intervals between the tones included “150 ms, 150 ms, 150 ms, 150 ms” and “150 ms, 150 ms, 15 ms, 285 ms.” Moderate internal consistency was reported (Cronbach’s α = 0.67).
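The timing constraints of the RD patterns can be expressed as a small sketch. The function names are hypothetical and do not come from the original task software; only the numeric values are taken from the description above.

```python
# Minimal sketch of the RD pattern timing (hypothetical helper names).

TONE_MS = 200               # duration of each pure tone
N_TONES = 5                 # tones per rhythmic pattern
FIXED_ISI_MS = 150          # ISI after the first and after the second tone
VARIABLE_PAIR_SUM_MS = 300  # third ISI + fourth ISI always sum to this value


def build_pattern(third_isi_ms: int) -> list[int]:
    """Return the four ISIs (ms) of a pattern, given the third-to-fourth-tone ISI."""
    if not 0 < third_isi_ms < VARIABLE_PAIR_SUM_MS:
        raise ValueError("third ISI must lie strictly between 0 and 300 ms")
    fourth_isi_ms = VARIABLE_PAIR_SUM_MS - third_isi_ms
    return [FIXED_ISI_MS, FIXED_ISI_MS, third_isi_ms, fourth_isi_ms]


def pattern_duration_ms(isis: list[int]) -> int:
    """Total pattern length: all tone durations plus all silent intervals."""
    return N_TONES * TONE_MS + sum(isis)


# The example third-ISI values listed in the task description.
EXAMPLE_THIRD_ISIS = [15, 45, 60, 150, 210]
patterns = [build_pattern(isi) for isi in EXAMPLE_THIRD_ISIS]
durations = [pattern_duration_ms(p) for p in patterns]
```

Because the two variable ISIs always sum to 300 ms, every pattern has the same total length of 1600 ms, so the patterns differ only in their internal timing and not in overall duration.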

Rhythm Reproduction (RR) Task

The RR task is a self-made assessment examining children’s abilities to reproduce musical rhythms. This task was modified from the rhythm copying task in Overy et al. (2003), in which the child copied a short rhythm on the keyboard. Unlike in the rhythm copying task in Overy et al. (2003), in the RR task the child was required to play back the rhythmic patterns by tapping on the desk with a stick held in the dominant hand after hearing the rhythmic patterns. These rhythmic patterns were composed of tubular bell tones, synthesized with Logic Pro X, played at a tempo of 120 bpm. There were five rhythmic patterns composed of a combination of five to nine eighth, quarter, and/or half notes, shown in Fig. 1a. An additional rhythmic pattern was provided for practice. The first three trials were four beats in length and the last two were eight beats in length. For each trial, the child was asked to tap twice, and the better performance was scored by two instructors based on rhythm and speed. Children received a score between 0 and 4 for their performance on each trial (maximum score = 4). The total score was divided by 5 to obtain the final score of the child. The RR task has been reported to have good internal consistency (Cronbach’s α = 0.85).
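The scoring rule can be illustrated with a brief sketch. The function name and the example trial scores are hypothetical, and the sketch assumes that a single agreed score per trial is already available; how the two instructors’ ratings were combined is not specified in the text.

```python
# Minimal sketch of the reproduction-task scoring (hypothetical names and data).
# Assumption: one agreed 0-4 score per trial is available for each child.


def final_score(trial_scores: list[float], n_trials: int, max_per_trial: int = 4) -> float:
    """Average the per-trial scores so that the final score stays on the 0-4 scale."""
    if len(trial_scores) != n_trials:
        raise ValueError("one score per test trial is required")
    if any(not 0 <= score <= max_per_trial for score in trial_scores):
        raise ValueError("each trial score must lie between 0 and the maximum")
    return sum(trial_scores) / n_trials


# Hypothetical child: five RR trials scored 4, 3, 3, 2, and 1 (total 13, final 13 / 5).
rr_final = final_score([4, 3, 3, 2, 1], n_trials=5)
```

Dividing the total by the number of trials keeps the final score on the same 0 to 4 scale as the per-trial scores, which is why the maximum final score is 4.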

Fig. 1

a The musical rhythmic patterns used for the Rhythm reproduction task. b The musical rhythmic patterns constructed for the Song rhythm reproduction task

Song Rhythm Reproduction (SRR) Task

The SRR task is another self-made assessment examining children’s abilities to reproduce musical rhythms. It was modified from the song rhythm task in Overy et al. (2003). The child was asked to tap in the same way as in the RR task; in this task, however, they listened to musical phrases from “Happy Birthday” (Yamaha grand piano, synthesized with Logic Pro X) played at a tempo of 120 bpm. Figure 1b presents the four test trials and two practice trials. The test trials had musical phrases composed of a combination of six to seven eighth, quarter, and/or half notes to make six beats. The two practice trials included the first phrase of the chorus of “Jingle Bells” and the first phrase of “Twinkle, Twinkle Little Star.” These musical phrases had a combination of six to seven quarter and half notes to make eight beats. The child’s performance was scored by two instructors in the same way as in the RR task (maximum score = 4). The total score was divided by 4 to obtain the final score of the child. Good internal consistency has been reported (Cronbach’s α = 0.87).

Data Analyses

SPSS Version 26.0 was used for the statistical analyses. Preliminary analyses confirmed that the assumptions of univariate normality and linearity were met. Outliers (i.e., scores more than two standard deviations from the mean) were replaced by the mean score of the corresponding task in both groups. The relationships between the linguistic and musical assessments were examined with partial correlation analyses. For variables whose correlation coefficients reached the significance level (p < .05), hierarchical regression analyses were conducted to examine the predictive effect of musical rhythm performance on receptive and expressive vocabulary knowledge. The specific analytical steps conducted in the Chinese and English groups are further described in the corresponding sections. Beta (β) was used as the effect size measure for the hierarchical regression analyses.
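For readers who wish to reproduce the preprocessing and correlation steps outside SPSS, a minimal sketch follows. It is not the SPSS procedure itself: the function names and example scores are hypothetical, the sample standard deviation is assumed for the outlier rule, and the partial correlation uses the standard first-order formula with age as the single covariate.

```python
# Minimal sketch of outlier replacement and an age-controlled partial correlation.
# All names and data are hypothetical; this does not reproduce SPSS output.
import math
import statistics


def replace_outliers_with_mean(scores: list[float], n_sd: float = 2.0) -> list[float]:
    """Replace scores lying more than n_sd sample SDs from the mean with the mean."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # assumption: sample (n - 1) standard deviation
    return [mean if abs(s - mean) > n_sd * sd else s for s in scores]


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x: list[float], y: list[float], z: list[float]) -> float:
    """First-order partial correlation of x and y, controlling for z (e.g., age)."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical task scores: the last value lies more than 2 SDs above the mean.
raw_scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 40]
cleaned_scores = replace_outliers_with_mean(raw_scores)
```

In this example only the score of 40 is replaced, by the task mean of 13.2 computed from the original scores; whether the mean should instead be recomputed without the outlier is a design choice that the text does not specify.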

Results

Tables 2 and 3 summarize the descriptive statistics of the age and all task performances in the Chinese and English groups, respectively. No ceiling or floor effect was found in any tasks.

Table 2 Averaged age and scores of the task performances in the Chinese group
Table 3 Averaged age and scores of the task performances in the English group

Outcomes of the Chinese Group

Partial Correlations Among Linguistic and Musical Assessments

Table 4 presents the partial correlations among the linguistic and musical assessments in the Chinese group, controlling for age. Receptive vocabulary knowledge was positively associated with the RD (r = .208, p < .05) and RR scores (r = .257, p < .01), suggesting an association between receptive vocabulary knowledge and both musical rhythm perception and reproduction. Expressive vocabulary knowledge, in contrast, was related to both the RR and the SRR performances (r = .382, p < .001; r = .403, p < .001, respectively), indicating an association between expressive vocabulary knowledge and musical rhythm reproduction. Moreover, the TA score was significantly associated with the RD (r = .298, p < .01), RR (r = .348, p < .001), and SRR (r = .316, p < .001) task scores.

Table 4 Partial correlations among vocabulary development, phonological awareness and musical rhythm processing, controlled for age, in the Chinese group

Predictive Effect of Musical Performance to Specific Vocabulary Knowledge

Based on the results of the partial correlation analyses, hierarchical regression analyses focusing on the associated musical factors (i.e., RD, RR, and SRR) were conducted to investigate the extent to which music processing accounted for specific vocabulary knowledge in the Chinese group.

Table 5 shows the steps of the analyses investigating the predictive effects of musical performance on receptive vocabulary knowledge. TA performance was the only factor exhibiting a significant predictive effect on receptive vocabulary knowledge, β = .328, △R2 = .102, p = .001, suggesting that phonological awareness accounted for an additional 10.2% of the variance in receptive vocabulary knowledge. None of the associated musical rhythm measures showed a significant predictive effect on receptive vocabulary knowledge. Unlike receptive vocabulary knowledge, expressive vocabulary knowledge was significantly predicted by age, TA, RR, and SRR performance (Table 6). TA skill was the strongest predictor of expressive vocabulary knowledge, β = .431, △R2 = .176, p = .001, suggesting that phonological awareness accounted for an additional 17.6% of the variance in expressive vocabulary knowledge. As for musical rhythm reproduction, the RR task accounted for 6.4% of unique variance, β = .283, △R2 = .064, p = .01, slightly more than the SRR task, β = .255, △R2 = .054, p = .01. Surprisingly, age also accounted for 4.3% of the unique variance in expressive vocabulary knowledge, β = .208, △R2 = .043, p < .05.
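The △R2 values reported above express how much additional variance a predictor explains once the variables entered at earlier steps are already in the model. The sketch below illustrates this logic for the simplest case of one control variable followed by one predictor. It is a simplification of the multi-step models in Tables 5 and 6, the function names and toy data are hypothetical, and the closed-form expression for R2 with two predictors is used in place of a general regression routine.

```python
# Minimal sketch of a two-step hierarchical regression (hypothetical names and data).
# Step 1 enters a control variable; step 2 adds one predictor of interest.
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared_two_predictors(y: list[float], x1: list[float], x2: list[float]) -> float:
    """R squared of y regressed on x1 and x2, from the pairwise correlations."""
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def delta_r_squared(y: list[float], control: list[float], predictor: list[float]) -> float:
    """Variance in y explained by the predictor beyond the control variable."""
    r2_step1 = pearson_r(y, control) ** 2
    r2_step2 = r_squared_two_predictors(y, control, predictor)
    return r2_step2 - r2_step1


# Toy, centred data in which age and rhythm are uncorrelated by construction.
age = [-1.0, -1.0, 1.0, 1.0]
rhythm = [-1.0, 1.0, -1.0, 1.0]
vocab = [2 * a + r for a, r in zip(age, rhythm)]
gain = delta_r_squared(vocab, age, rhythm)
```

In the toy data, age alone explains 80% of the variance in the vocabulary scores and adding the rhythm score explains the remaining 20%, so the gain is 0.2 apart from floating-point rounding.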

Table 5 The contribution of musical rhythm perception and production to receptive vocabulary in the Chinese group
Table 6 The contribution of musical rhythm perception and production to expressive vocabulary in the Chinese group

Outcomes of the English Group

Partial Correlations Among Linguistic and Musical Assessments

Table 7 presents the partial correlations among the linguistic and musical assessments in the English group, with age controlled. Receptive and expressive vocabulary knowledge were both significantly associated with the MD (r = .478, p < .001; r = .252, p < .05, respectively) and the RD performances (r = .368, p < .01; r = .309, p < .05, respectively), indicating an association between vocabulary knowledge and the perception of melody and musical rhythm. Moreover, the CTOPP-2 scores were significantly associated with the SRR scores (r = .277, p < .05), suggesting a significant relationship between phonological awareness and musical rhythm reproduction.

Table 7 Partial correlations among vocabulary development, phonological awareness and musical rhythm processing, controlled for age, in the English group

Predictive Effect of Musical Performance to Specific Vocabulary Knowledge

To further explore the contributions of different musical performances to specific vocabulary knowledge in the English group, hierarchical regression analyses focusing on the associated musical factors (i.e., MD and RD) were conducted.

The steps of the analysis of the predictive effect of musical performance on receptive vocabulary knowledge are shown in Table 8. The results revealed that the MD score was the strongest predictor of receptive vocabulary knowledge, β = .432, △R2 = .148, p < .001, suggesting that this factor accounted for an additional 14.8% of the variance in receptive vocabulary knowledge. The CTOPP-2 composite score made a smaller contribution than the MD task, accounting for 11.5% of unique variance in receptive vocabulary knowledge, β = .349, △R2 = .115, p < .01, while the RD task accounted for only 5.7% of unique variance in receptive vocabulary knowledge, β = .250, △R2 = .057, p < .05. Table 9 presents the steps of the analysis of the predictive effect of musical performance on expressive vocabulary knowledge. Unlike for receptive vocabulary knowledge, none of the factors significantly predicted expressive vocabulary knowledge except for the CTOPP-2 composite score, β = .323, △R2 = .097, p < .01, which accounted for 9.7% of unique variance in expressive vocabulary knowledge.

Table 8 The contribution of musical rhythm performance to receptive vocabulary in the English group
Table 9 The contribution of musical rhythm performance to expressive vocabulary in the English group

Discussion

The current research examined the relationships among musical rhythm perception and reproduction, phonological awareness, and vocabulary knowledge in 4-to-6-year-old Chinese-speaking and English-speaking preschoolers. Our empirical results indicated that vocabulary knowledge is significantly associated with musical rhythm perception and reproduction across languages. In the Chinese-speaking preschoolers, musical rhythm perception and reproduction were significantly related to receptive vocabulary knowledge, whereas only musical rhythm reproduction was significantly related to expressive vocabulary knowledge. In the English-speaking preschoolers, on the other hand, both types of vocabulary knowledge were related to musical rhythm perception. Interestingly, regarding the predictive effect of musical performance on vocabulary knowledge, musical rhythm reproduction skills effectively predicted expressive vocabulary knowledge in the Chinese-speaking preschoolers, whilst musical rhythm perception skills predicted receptive vocabulary knowledge in the English-speaking preschoolers. Our results may not only offer empirical evidence for the relationship between musical rhythm and language abilities across different languages, but also reveal the specific relationship between musical rhythm and vocabulary knowledge.

Relationship Between Musical Rhythm and Vocabulary Knowledge Across Languages

Consistent with the previous literature (Asaridou & McQueen, 2013; Koelsch, 2019; Nikolsky, 2020), the abovementioned associations between language and musical rhythm performance might support the argument that music and language mechanisms both derive from the processing of temporally organized acoustic signals, starting from early childhood (Ozernov-Palchik et al., 2018; Patel et al., 1998; Politimou et al., 2019). In addition, our study expanded previous investigations by examining preschoolers with different native languages. The correlational results of the Chinese group revealed that receptive vocabulary knowledge was positively associated with both musical rhythm perception and reproduction, suggesting that children who are more sensitive to the auditory cues of rhythmic patterning had better receptive vocabulary knowledge. A possible interpretation is that musical rhythm is a valuable perceptual cue to the syllabic structure within an utterance. Young children tend to exploit this characteristic to perceive the suprasegmental patterns of their native languages and to produce proto-syllables (Goswami et al., 2002). The reproduction of vocabulary and the reproduction of musical rhythm might also affect each other, as our results for expressive vocabulary knowledge suggest. In contrast to the Chinese group, the results of the English group showed that both types of vocabulary knowledge were related to musical rhythm perception but not reproduction, consistent with the Precise Auditory Timing Hypothesis proposed by Tierney and Kraus (2014), which argues that language development demands the ability to sense precise timing, an ability associated with musical rhythm perception. Though our results might provide a preliminary picture of the relationships between specific musical rhythm performance and vocabulary knowledge across languages, their underlying mechanisms remain unclear. Therefore, future exploration is required of why specific vocabulary knowledge is related to different aspects of musical rhythm performance and why this specific rhythm-language relationship differs between native languages.

Relationship Between Musical Rhythm and Phonological Awareness Across Languages

In addition to the correlation between musical rhythm and vocabulary knowledge, our study further extended previous behavioural research on the rhythm-phonology relationship (Anvari et al., 2002; Montague, 2002). In the Chinese group, lexical tone awareness was related to both musical rhythm perception and reproduction, whereas in the English group, phonological awareness was related to musical rhythm reproduction. These results were consistent with Moritz et al. (2013), who found that the ability to copy rhythmic patterns was related to the ability to segment words in sentences and syllables in words. Similarly, the current results were aligned with the findings of Bonacina et al. (2020), which revealed a significant relationship between phonological awareness and asynchronous musical rhythm reproduction in both groups, indicating a shared ability to retain and integrate temporal information. A longitudinal study suggested a significantly positive correlation between beat reproduction and phonological awareness in 53 six-year-old children (David et al., 2007). The authors concluded that this correlation between phonological awareness and rhythm perception might be related to fundamental skills in blending and segmenting sounds as well as in processing temporal sequences. A hypothesis by Peynircioglu et al. (2002) suggested that phonemic awareness and tonal processing might both involve similar skills in sound discrimination, temporal sequencing, attention, and working memory. Through exposure to speech and musical experiences, children practice and develop auditory skills, such as analysing auditory patterns, which may also contribute to the overlap among musical perception, speech perception, and phonological awareness. Specific differences in the rhythm-phonology relationship were found between the Chinese and English groups, extending the study by Patscheke et al. (2019) and the Shared Sound Category Learning Mechanism Hypothesis (SSCLM) of Patel (2010), which claims that the categorical building blocks of language (e.g., phonemes) and music (e.g., melody and music notes) are correlated. In the current results, phonological awareness was associated with the melody-related reproduction task (i.e., the SRR task) in both groups, consistent with previous findings, whereas significant associations with musical rhythm perception and reproduction were also observed in the Chinese group. These results might provide a direction for exploring the commonalities and distinctions between music aptitude and the development of different languages.

Predictive Effect of Musical Rhythm to Vocabulary Knowledge

Additionally, our study expanded previous investigations of the predictive effect of musical rhythm on language abilities. By examining specific domains of vocabulary knowledge and musical rhythm under different native language experiences, we found that only expressive vocabulary knowledge in Mandarin Chinese and receptive vocabulary knowledge in English were predicted by musical rhythm reproduction and perception, respectively. Interestingly, no musical rhythm measure contributed significantly to Chinese receptive vocabulary knowledge or English expressive vocabulary knowledge. This pattern might be affected by the contribution of phonological awareness to vocabulary knowledge, supporting the assumption of Ozernov-Palchik et al. (2018) that phonological awareness mediates this rhythm-language link. These results indicate that the temporal signal processing skills shared by musical rhythm and phonology might affect the rhythm-language relationship. Future research may examine the longitudinal influence of phonological awareness on specific vocabulary knowledge in different languages.

Limitations

Although the current study provided evidence of the relationship between musical rhythm and vocabulary knowledge, several limitations should be noted. First, the sample in each group might not be sufficiently large or representative. Thus, larger and more diverse samples should be included to obtain more robust evidence. In addition, because of the pandemic, the sample sizes of the two groups were unequal, which limited comparison of the abovementioned results between groups. Future studies may use more balanced designs to compare the differences between groups. Second, parent education level was not examined in the current study. Previous research suggested that parent education level might affect the language development of children (Becker, 2011). Therefore, further studies may control for participants' demographic information to prevent potential confounding and to compare the differences between groups. Finally, the current study applied several self-designed assessments that achieved only moderate reliability and validity. The psychometric properties of these musical assessments should be further established, or well-known standardized measures applied, to ensure better fidelity of the results. Due to the aforementioned limitations, the current study could not directly compare the Chinese group to the English group. However, our results still offer a preliminary exploration of the rhythm-language relationship across languages.

In conclusion, the current study contributes to the growing body of research investigating the relationship between musical rhythm and language abilities. We examined the relationships among musical rhythm, phonological awareness, and vocabulary knowledge in Chinese-speaking and English-speaking preschoolers. We found significant associations between musical rhythm performance and vocabulary knowledge across languages, suggesting music aptitude as a potential predictor and facilitator of language abilities. Future studies may explore the longitudinal influence of musical rhythm on language development across languages.