Autism spectrum disorder (ASD) refers to a range of neurodevelopmental conditions characterized by persistent deficits in social communication and interaction, along with restricted, repetitive patterns of behavior, interests, or activities (American Psychiatric Association, 2022; Ennis-Cole, 2019). ASD has become more common globally. The US Centers for Disease Control and Prevention (2023) reported that the autism rate among children in the US increased from 1 in 44 in 2018 to 1 in 36 in 2020. In China, the first nationwide population-based study measuring the prevalence of ASD among children aged 6 to 12 was conducted between 2014 and 2016. This study reported an estimated prevalence of 0.70% (Zhou et al., 2020). Relevant studies have consistently indicated that children with ASD show poor language and cognitive performance (Cantio et al., 2018; Kelly, 2011).

In terms of language performance, previous research has demonstrated that, generally speaking, children with ASD have deficits in basic language domains and possess poor higher-order language skills involving the full use of primary language abilities and cognitive skills (e.g., Kelly, 2011; Kim & Pilcher, 2016; Knight & Sartini, 2015). Among the higher-order language skills, listening comprehension of literal and inferential information is greatly impaired in the ASD group (Knight & Sartini, 2015; Sorenson Duncan et al., 2021). Literal comprehension involves understanding information that is explicitly presented, whereas inferential comprehension entails making connections and integrating information to go beyond what is directly stated (Dawes et al., 2019a; van Kleeck, 2006). These listening comprehension skills are of great importance in that they play an integral part in individuals’ academic achievements, career accomplishments, and social relationships (Dawes et al., 2019b; Kalandadze, 2018). Deficits in literal listening comprehension have prevented children with ASD from gaining factual knowledge, and their impairments in listening comprehension of inferential statements make it challenging for them to establish successful social communication in different contexts. Therefore, listening comprehension in children with ASD deserves more attention because improving associated skills is crucial for their academic success and essential for fostering better interpersonal relationships.

As for the cognitive performance of children with ASD, prior research has highlighted a robust and extensive deficit in theory of mind (ToM) among children with ASD (for a review, see Andreou & Skrimpa, 2020). ToM refers to a cognitive mechanism related to human beings’ abilities to attribute mental states, such as knowledge, desires, emotions, or beliefs, to oneself and others (Fu et al., 2023). It involves a broad range of cognitive functions, such as executive functions and reasoning. Typically developing (TD) children are assumed to develop ToM in a consistent and predictable pattern, probably starting from 18 months of age (Frith & Frith, 2003; Westby & Robinson, 2014). However, understanding others’ mental states can be quite difficult for children with ASD. For instance, an early study by Baron-Cohen et al. (1986) demonstrated that children with autism performed very poorly in appreciating characters’ beliefs in given narratives. Further, Baron-Cohen (1997) considered the poor performance of children with ASD in interpreting others’ mental states as a mindreading deficit and a kind of mind-blindness. Generally, impairments in ToM abilities contribute to social, behavioral, and communication deficits in children with ASD, as they encounter difficulties perceiving that behavior is driven by mental states (Andreou & Skrimpa, 2020).

Listening comprehension involves a complex process that requires more than just basic language skills; cognitive abilities such as ToM, inhibitory control, and comprehension monitoring are also essential for listening comprehension proficiency (Kim & Phillips, 2014). Research among TD children has shown that ToM contributes statistically significantly to listening comprehension, as both require interpreting others’ intentions and emotions (e.g., Jackson et al., 2022; Kim, 2016; Kim, 2017). Given ToM’s crucial role in effective listening comprehension for TD children, a ToM deficit commonly found in children with ASD is likely to pose challenges to their listening comprehension abilities. Therefore, to explore the mechanism underlying their listening comprehension, it is necessary to shed more light on the possible relationships between impairments in literal and inferential listening comprehension and the ToM deficit among children with ASD.

The following literature review aims to investigate how ToM might directly and indirectly (via other cognitive abilities like intelligence) influence literal and inferential listening comprehension in children with and without ASD, with the intention of building a rationale for further investigation into these interconnected aspects.

The Situation Model and Literal and Inferential Listening Comprehension

Generally, language comprehension entails constructing “a mental representation of the described situation” (Zwaan & Radvansky, 1998, p. 162). More specifically, according to the situation model proposed by van Dijk and Kintsch (1983), successful comprehension requires developing a mental representation of the wording (a surface representation), focusing on the explicitly conveyed meaning of given statements or texts (a propositional textbase), and activating world knowledge associated with a specific situation (a situation model). In this sense, successful language comprehension involves understanding the surface code, using inferential skills to integrate propositions, and constructing a situation model to process explicitly and implicitly conveyed information (Kim, 2016).

The situation model can also be applied to listening comprehension (van Dijk & Kintsch, 1983). Listening comprehension refers to the ability to listen to and comprehend oral language (Kim & Pilcher, 2016). Based on the situation model, listeners must understand the literal and non-literal meaning of utterances for effective listening comprehension. Driven by this idea, researchers have shed some light on the literal and inferential listening comprehension of preschool children. Regarding children’s listening comprehension performance, some studies have highlighted the developmental trajectory of inferential comprehension and the challenges preschool children face with it. In terms of the developmental course, Filiatrault-Veilleux et al. (2015) identified key milestones in inferential comprehension across various ages. At age three, children demonstrate an emerging understanding of the relationship between emotions and situations. At age four, children begin to comprehend the structural and causal elements of narratives. They also develop the ability to correctly infer a character’s goal and the problem to be solved. Between ages five and six, children show sensitivity to consequences or solutions within a story.

As for difficulties in comprehending inferential language, Florit et al. (2011) administered a listening comprehension test to 221 TD preschool children aged between 4 and 6 to evaluate their performance in processing explicitly and implicitly conveyed information with the Test for Listening Comprehension for 3 to 8 year olds (TOR 3–8). They found that TD children developed the ability to understand explicit and implicit information between ages 4 and 6. Their listening comprehension of both explicit and implicit information was appropriate for their age. A fixed-order hierarchical multiple regression analysis showed that type of information (literal vs. inferential) accounted for 10% of the variance in the number of correct answers on the listening comprehension test TOR 3–8, with children performing better on explicit than on implicit questions.

Regarding children with ASD, Zhao et al. (2021) conducted a study among 98 Mandarin-speaking children with ASD to evaluate their competence in understanding literal and inferential statements, revealing that children with ASD performed better at literal comprehension than inferential comprehension. Another study by Cheung et al. (2020) found that for Cantonese-speaking children with ASD, the scores for literal statements were marginally significantly higher than for similes.

While these studies have offered insights into how children with and without ASD understand literal and inferential statements, there has been a paucity of direct comparisons between TD children and children with ASD that probe their differences in processing literal and inferential statements. Cheung et al. (2020) examined the comprehension of literal statements and similes in Cantonese-speaking children with ASD and TD children, matched for both chronological age and verbal mental age. They found that the ASD group was generally less accurate than TD children in understanding both literal statements and similes. Still, the study mainly focused on children’s interpretations of similes, rather than exploring children’s inferential language comprehension in more diversified contexts. The limited availability of such direct comparisons from multiple perspectives has hindered our understanding of the specific differences in how these two groups comprehend and interpret explicit and implicit information.

The Relationship Between ToM and Listening Comprehension

Appreciating others’ mental states plays an essential part in effective communication. Many previous studies have identified some possible connections between ToM and language development. Miller (2006) pointed out the interdependence of ToM and language in development, arguing that successful communication entails understanding others’ mental states while language provides interlocutors with more opportunities to learn about ToM.

As a higher-order cognitive skill, ToM is considered a basis for the construction of the situation model and effective listening comprehension. Kim (2015) pointed out that ToM is closely associated with constructing the situation model, as ToM can help individuals integrate various propositions and make appropriate inferences to establish local and global coherence. Some scholars have further explored the relationship between ToM and language comprehension among children. More specifically, Kim (2016) conducted a study among 201 TD Korean children in Grade 1 to examine the role of ToM in listening comprehension. The study showed that participants’ ToM performance was statistically significantly related to their listening comprehension and that the total effect of ToM on their listening comprehension was the largest (0.52). According to another study by Kim (2017) among 350 second graders in the United States, the ability to understand others’ perspectives, as measured by ToM, was independently associated with listening comprehension, even after considering knowledge-based inference, comprehension monitoring, and other fundamental language and cognitive abilities. Additionally, a longitudinal study by Jackson et al. (2022) found that at Time 2 (mean age = 5;11), ToM had a direct impact on listening comprehension. However, longitudinal observations indicated that ToM at Time 1 (mean age = 4;1) did not directly influence later listening comprehension at Time 2; instead, earlier ToM impacted later listening comprehension via concurrent ToM at Time 2.

Nevertheless, although the connection between ToM and listening comprehension has been explored among children in Korea (Kim, 2016), the United States (Kim, 2017), and the United Kingdom (Jackson et al., 2022), there has been a paucity of research into the above-mentioned relationship among Chinese preschool children. Investigating this relationship within the Chinese context remains crucial. Liu et al. (2008) and Cheung et al. (2022) pointed out several cultural differences that are relevant to factors known to influence ToM development, including societal expectations (e.g., an emphasis on implicit communication and social harmony), parental practices (e.g., efforts to promote their children’s interpersonal communication skills), language characteristics (e.g., exposure to mental state verbs), and executive functioning (e.g., strong impulse control). These cultural and cognitive factors are likely to impact language comprehension. Therefore, conducting relevant research specific to Chinese preschool children allows for a better understanding of how cultural factors shape these cognitive processes and further language understanding within the Chinese context, contributing to a more comprehensive understanding of these constructs across diverse populations.

Additionally, while much literature has focused on the link between ToM and listening comprehension among TD children, only a handful of studies have touched upon the relationship between ToM performance and listening comprehension among children with ASD. Indeed, investigating the connection between ToM and listening comprehension can provide insights into how ToM deficits impact the ability to process auditory information in preschool children with ASD, helping to gain a more nuanced understanding of the mechanisms underlying their listening comprehension. On this point, more relevant analyses are necessary.

The Connection of IQ with ToM and Listening Comprehension

ToM has long been considered a higher-order ability that is thought to be related to multiple cognitive abilities, such as intelligence (i.e., verbal IQ and nonverbal IQ). As for the relationship between ToM and verbal IQ, De Mulder et al. (2019) conducted a longitudinal study of 101 Dutch-speaking kindergartners. The study revealed statistically significant correlations between ToM and verbal IQ measured by the Peabody Picture Vocabulary Test III at both testing waves (first testing wave: r = .61, p < .001; second testing wave: r = .57, p < .001) and showed that earlier ToM was a statistically significant predictor of later verbal IQ, t(97) = 3.60, p = .001. Concerning nonverbal IQ, Ibanez et al. (2013) investigated individual differences in ToM among 424 school-aged students, confirming that nonverbal IQ measured by the Raven’s Standard Progressive Matrices had a positive effect on ToM of 0.25 (p < .01).

When it comes to listening comprehension, it is a higher-order linguistic task with high cognitive demand. Intelligence is pivotal in developing linguistic and cognitive abilities. Previous research has identified the direct and indirect relationships between IQ and listening comprehension (e.g., Florit et al., 2011; Pan & Lin, 2022; Zhao et al., 2019; Zhao et al., 2021).

Empirical evidence has highlighted that verbal IQ is a statistically significant predictor of listening comprehension. More specifically, for TD children, Florit and colleagues (2011) carried out research among 221 TD children aged 4 to 5;11 years to explore factors influencing participants’ understanding of the explicit and implicit information. Results demonstrated that verbal IQ measured with the Peabody Picture Vocabulary Test-Revised (PPVT-R) was statistically significantly related to participants’ listening comprehension of explicit (r = .50, p < .01) and implicit information (r = .36, p < .01). Kim (2015) conducted research among 145 children in Korea, reporting that verbal IQ evaluated with the Peabody Picture Vocabulary Test IV was directly related to listening comprehension. As for children with ASD, Zhao et al. (2021) revealed the correlation between verbal IQ measured by PPVT-R and literal listening comprehension (r = .26, p < .05) among 98 Chinese preschool children with ASD.

In terms of nonverbal IQ, a study by Pan and Lin (2022) among 179 TD kindergarteners in Hong Kong revealed that nonverbal IQ measured by the Raven’s Standard Progressive Matrices was statistically significantly correlated with listening comprehension (r = .28, p < .01). Moreover, it was found that nonverbal IQ was linked to reading comprehension through listening comprehension. For children with ASD, Paynter et al. (2023) reported that preschool nonverbal IQ evaluated with the Mullen Scales of Early Learning visual reception and fine motor subtests and listening comprehension showed a statistically significant correlation with each other (r = .61, p < .01).

Nevertheless, limited research has addressed the mediating role of IQ in the relationship between ToM and listening comprehension. Understanding the mediating role of IQ in this relationship is of significance as it helps us recognize the cognitive abilities that contribute to the link between ToM and listening comprehension.

The Present Study

Given the research gap mentioned above, the present study aimed to compare ToM performance and literal and inferential listening comprehension in children with ASD and age- and gender-matched TD peers. Also, the present study sought to examine the direct and indirect relationships between ToM and listening comprehension of literal and inferential statements, as well as the potential mediating role of IQ, among preschoolers with and without ASD in the Chinese context. Two research questions of this study are as follows:

(1) How did Chinese preschool children with ASD perform on linguistic tasks and cognitive tests compared to their age- and gender-matched TD peers?

(2) Were there relationships between ToM and listening comprehension among Chinese preschool children with and without ASD, respectively? What, if any, were the direct and indirect relationships between their ToM and listening comprehension, considering the role of IQ?

By addressing these research questions, this study aimed to facilitate insights into cognitive and linguistic profiles among preschoolers with and without ASD in the Chinese context and the interplay between ToM and listening comprehension, hoping to inform these constructs across diverse populations. Furthermore, this study intended to gain a more nuanced understanding of the cognitive processes involved in language processing by examining the role of IQ.

Method

Participants

Forty-nine preschool children with ASD (n = 49; mean age = 58.90 months; SD = 7.30 months; 38 boys, 11 girls) were recruited from special education institutes in Southern China. Fifty-two preschool TD children (n = 52; mean age = 60.13 months; SD = 8.23 months; 35 boys, 17 girls) were recruited from mainstream kindergartens in Southern China. Caregivers of all participants signed the informed consent form before the study began and approved the testing in the present study. A Mann-Whitney U test indicated no statistically significant difference between the two groups regarding age (U = 1194.50, p = .589). A chi-square test of independence was performed to examine the relation between group and gender. The relationship between these variables was not statistically significant, χ²(1, N = 101) = 1.32, p = .250.
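As a rough check on the gender comparison, the chi-square statistic can be recomputed from the counts reported above (38 boys and 11 girls in the ASD group; 35 boys and 17 girls in the TD group). The sketch below is illustrative only and is not the authors' analysis script. It assumes a Pearson chi-square without continuity correction, which the text does not state explicitly, and the function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction.

    Returns (chi2, p). With 1 degree of freedom, the upper-tail p-value
    equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Counts from the Participants paragraph: rows = group (ASD, TD), columns = gender (boys, girls).
chi2, p = chi_square_2x2(38, 11, 35, 17)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under these assumptions, the recomputed statistic rounds to the reported value of 1.32 with p of about .250.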

Participants in the ASD group had previously been diagnosed by experienced pediatricians according to the DSM-5 diagnostic criteria for ASD (American Psychiatric Association, 2013). For the confirmation of participants’ diagnoses, their caregivers were required to complete the Chinese version of the Autism Spectrum Quotient: Children’s Version (AQ-Child; Auyeung et al., 2008), which is a highly sensitive and reliable 50-item parent-report questionnaire assessing autism traits of children with and without ASD. The average score of the AQ-Child in the ASD group was 77.08 (SD = 12.94), above the cut-off score of the AQ-Child. In the ASD group, children with full or partial loss of hearing were not included in the study, but those with comorbidity participated in the study. Participants in the TD group were not diagnosed with ASD or other types of developmental disorders. The average score of the AQ-Child in the TD group was 62.36 (SD = 15.51), statistically significantly below the cut-off score of the AQ-Child.

Assessment and Measures

All children involved in this study were given assessments of listening comprehension with literal and inferential statements and administered ToM tasks. Also, they were requested to perform tasks evaluating their verbal and nonverbal IQ.

Literal and Inferential Listening Comprehension

The listening comprehension task is a newly developed task based on a subscale from the Hong Kong Cantonese Oral Language Assessment Scale (HKCOLAS, Department of Health of Hong Kong SAR, 2006) and previous studies on language acquisition and inferential language development among individuals with ASD (e.g., Dennis et al., 2001; Eigsti et al., 2011). The instructions and statements in this task were presented in Mandarin Chinese.

In this study, twelve statements with literal meaning and nine with inferential meaning were used to examine participants’ knowledge of some basic language domains and their performance in the inferential language task. In the literal listening comprehension task, the participants were presented with three pictures and instructed to point to the one that most closely matched the given statement. In the inferential listening comprehension task, the participants were shown three pictures and required to choose the picture that best represented the implications, figurative language, intentions, or logical outcomes described in the statements.

More specifically, the literal listening comprehension subtask examined participants’ abilities to understand information related to the aspect (e.g., My dad has finished his meals), quantity (e.g., Many children are sitting in the classroom listening to the teacher’s lecture), location (e.g., There is a bottle of water on the table), and deixis (e.g., A girl is playing on the swings and she has long hair). The subtask also assessed how participants processed sentences using passive voice (e.g., The umbrella was blown away by the wind) and compound sentences (e.g., My sister put a bottle of water on the table and started eating the cake).

In the present study, the inferential listening comprehension subtask evaluated participants’ understanding of mental state verbs involving implications (e.g., Dad did not forget to buy apples) and figurative language (e.g., My sister was smiling like a flower). The subtask also examined how participants made inferences about others’ intentions (e.g., The girl said, “Your cookies look delicious.” What should the boy do?) and made logical inferences based on their world knowledge (e.g., My sister got up late today. What would happen?). Cronbach’s alpha for scores of the listening comprehension task was 0.85. According to the performance of the top 27% and the bottom 27% of the participants in the present study, the task had an upper difficulty index of 0.95 and a lower difficulty index of 0.49. Also, it had an overall ideal discrimination index of 0.47.
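The difficulty and discrimination indices mentioned above can be illustrated with a short sketch. It assumes that each difficulty index is the proportion of the maximum score obtained by the top or bottom 27% of participants, and that the discrimination index is the difference between the two; the paper does not spell out its exact formulas, so this is one common convention rather than the authors' procedure. The scores are made up, and the function name is hypothetical.

```python
def difficulty_and_discrimination(total_scores, max_score, fraction=0.27):
    """Upper and lower difficulty indices and their difference (discrimination)."""
    ranked = sorted(total_scores, reverse=True)
    k = max(1, round(fraction * len(ranked)))  # size of each extreme group
    upper, lower = ranked[:k], ranked[-k:]
    p_upper = sum(upper) / (k * max_score)  # proportion correct in the top group
    p_lower = sum(lower) / (k * max_score)  # proportion correct in the bottom group
    return p_upper, p_lower, p_upper - p_lower

# Made-up total scores (12 literal + 9 inferential = 21 items) for ten hypothetical children.
scores = [21, 20, 19, 18, 16, 15, 12, 10, 9, 8]
p_u, p_l, d = difficulty_and_discrimination(scores, max_score=21)
print(f"upper = {p_u:.2f}, lower = {p_l:.2f}, discrimination = {d:.2f}")
```

A larger gap between the two extreme groups yields a higher discrimination index, which is why the index is often read as a sign of how well a task separates stronger from weaker performers.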

ToM Performance

The Chinese version of a ToM understanding task battery developed by Wellman and Liu (2004) was used to evaluate participants’ ToM performance from multiple perspectives. The task battery is widely used among preschool children. Given the developmental period of preschool children, ToM tasks devised by Wellman and Liu emphasized younger children’s abilities to understand desires, emotions, knowledge, and beliefs. These scaled tasks have been conducive to the comprehensive understanding of children’s ToM and in-depth studies on individual differences in ToM.

The research among 74 Chinese children by Wu and Su (2014) used the battery and reported the Cohen’s Kappa value of 1.00 for the ToM task, showing good inter-rater reliability. The highly-scalable battery comprises five subtasks examining individuals’ abilities to understand others’ beliefs, desires, and intentions from different dimensions (i.e., Diverse Desires task, Diverse Beliefs task, Knowledge Access task, Contents False Belief task, and Real-Apparent Emotion task).

In the present study, the Diverse Desires task examined participants’ ability to differentiate between their own desires and others’ desires for the same thing. The Diverse Beliefs task assessed participants’ capability to distinguish between their own beliefs and the beliefs of others and to make choices based on others’ beliefs. The Knowledge Access task focused on children’s understanding of others’ ability to access knowledge without ever being exposed to the content material and the relationship between what people saw and what they knew. The Contents False Belief task evaluated children’s ability to understand the unconventional and to predict the beliefs of others. The Real-Apparent Emotion task focused on children’s ability to distinguish between mental feelings and external signs of emotions. Participants received one point for each ToM task they passed, so total scores on the ToM task battery ranged from 0 to 5. Cronbach’s alpha for scores of the five subtasks was 0.70.
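The scoring rule and the reliability coefficient described above can be sketched as follows. The pass/fail matrix is invented for illustration and does not come from the study, and the function name is hypothetical. Cronbach's alpha is computed with the standard formula, alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores).

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """item_scores: one row per participant, one column per subtask."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Invented pass (1) / fail (0) results for six hypothetical children on the five subtasks:
# Diverse Desires, Diverse Beliefs, Knowledge Access, Contents False Belief, Real-Apparent Emotion.
results = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
tom_totals = [sum(row) for row in results]  # one point per passed task, range 0 to 5
alpha = cronbach_alpha(results)
print(tom_totals, round(alpha, 2))
```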

Verbal IQ

As a measure of verbal IQ in Chinese, the Chinese version of the Peabody Picture Vocabulary Test-Revised (PPVT-R) was used to evaluate participants’ vocabulary (Sang & Miao, 1990). It was adapted from the Peabody Picture Vocabulary Test-Revised (Dunn & Dunn, 1981). Pilot studies were conducted among 600 Chinese children from 3.5 to 9 years old in Shanghai for standardization. In the present study, children were instructed to listen to the experimenter and select, out of four pictures, the one that best matched the word uttered by the experimenter. Testing was discontinued when a participant gave six incorrect responses within eight consecutive items.
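The discontinue rule described above (six incorrect responses within eight consecutive items) can be expressed as a sliding-window check. This is a minimal sketch of that rule alone, under the assumption that it applies as soon as six errors have accumulated, even before eight items have been given. It does not reproduce the rest of the PPVT-R administration or scoring procedure, and the function name is hypothetical.

```python
from collections import deque

def items_administered(responses, window=8, max_errors=6):
    """Return how many items are given before the discontinue rule stops testing.

    responses: booleans in administration order (True = correct).
    Testing stops once the most recent `window` items contain `max_errors` errors.
    """
    recent = deque(maxlen=window)  # keeps only the latest `window` responses
    count = 0
    for count, correct in enumerate(responses, start=1):
        recent.append(bool(correct))
        errors = len(recent) - sum(recent)
        if errors >= max_errors:
            return count
    return count  # rule never triggered: every available item was administered

# Hypothetical response pattern: ten correct answers, then a run with many errors.
pattern = [True] * 10 + [False, False, True, False, False, True, False, False] + [True] * 5
print(items_administered(pattern))
```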

Nonverbal IQ

The nonverbal IQ of participants was measured by the Chinese Combined Raven’s Test (Li et al., 1988). This test was developed based on the Raven’s Colored Progressive Matrices and sections C, D, and E of the Raven’s Standard Progressive Matrices (Raven, 1960, 1965). The test comprises six sets of 12 items each (A, AB, B, C, D, E), where each item presents a target matrix with a missing component. The children participating in the study were required to select the part that would best complete the matrix from a selection of six to eight options.

Results

Performance on Linguistic and Cognitive Tasks

A Shapiro-Wilk test of normality was conducted to determine whether the data on participants’ performance on linguistic and cognitive tasks were normally distributed. For the ASD group, the data on overall listening comprehension (p = .038), literal listening comprehension (p = .005), total score of ToM (p = .006), five ToM subtasks (all ps < .001), verbal IQ (p = .040), and nonverbal IQ (p = .012) were not normally distributed, while inferential listening comprehension was normally distributed (p = .250). For the TD group, the data on overall listening comprehension (p < .001), literal listening comprehension (p < .001), inferential listening comprehension (p < .001), total score of ToM (p < .001), five ToM subtasks (all ps < .001), and nonverbal IQ (p = .023) were not normally distributed, while results also revealed that the data on verbal IQ were normally distributed (p = .088). Given the distribution of the data, nonparametric tests were adopted below.

Table 1 shows how Chinese preschool children with and without ASD performed on listening comprehension of literal and inferential statements, the ToM task battery, and IQ measures (i.e., verbal and nonverbal IQ). It highlights the means, standard deviations, range, and mean rank of participants’ scores for tasks in the present study. Generally, for the TD group and the ASD group, participants had a substantially higher accuracy rate of listening comprehension of literal statements than inferential statements (TD, U = 781.00, p < .001; ASD, U = 694.00, p = .010). The scores of children with ASD on multiple linguistic and cognitive tasks were statistically significantly different from TD children’s scores. More specifically, children with ASD got statistically significantly lower scores in comparison to their TD peers in terms of overall listening comprehension (U = 255.00, p < .001), literal listening comprehension (U = 388.50, p < .001), inferential listening comprehension (U = 294.00, p < .001), general ToM performance (U = 295.00, p < .001), Knowledge Access task (U = 427.50, p < .001), Contents False Belief task (U = 274.50, p < .001), Real-Apparent Emotion task (U = 699.50, p < .001), verbal IQ (U = 210.50, p < .001), and nonverbal IQ (U = 615.00, p < .001). No statistically significant difference in the performance on the Diverse Desires task (U = 1021.00, p = .256) and the Diverse Beliefs task (U = 959.50, p = .112) between children with and without ASD was identified.
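For readers less familiar with the Mann-Whitney U test used in these group comparisons, the following sketch shows how U is obtained from pooled ranks. It uses invented scores rather than the study data, reports the smaller of the two U values, and derives a two-sided p-value from a tie-corrected normal approximation without continuity correction. Whether the authors' software used exactly this approximation is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are zero-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U, two-sided p) using a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (u - n1 * n2 / 2) / sigma
    return u, math.erfc(abs(z) / math.sqrt(2))

# Invented scores for two small hypothetical groups.
group_a = [3, 4, 2, 6, 2, 5]
group_b = [9, 7, 5, 10, 6, 8]
u, p = mann_whitney_u(group_a, group_b)
print(u, round(p, 3))
```

With samples this small, an exact test would normally be preferred; the normal approximation is shown only because it is compact and behaves well at sample sizes like those in the present study.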

Table 1 Performance on linguistic and cognitive tasks among TD children and children with ASD

Intercorrelations of Age, Listening Comprehension, ToM, Verbal IQ, and Nonverbal IQ Among Chinese Preschool Children with and without ASD

Table 2 presents the intercorrelations of age, listening comprehension, ToM performance, and verbal and nonverbal IQ among Chinese preschool children with and without ASD, respectively. For the ASD group, Spearman’s rank correlation showed that their age was statistically significantly correlated with their overall listening comprehension (rs = 0.34, p = .024), inferential listening comprehension (rs = 0.42, p = .004), and general ToM performance (rs = 0.38, p = .012). Their overall listening comprehension was positively correlated with their general performance on the ToM task battery (rs = 0.37, p = .017), verbal IQ (rs = 0.59, p < .001), and nonverbal IQ (rs = 0.38, p = .016). More specifically, there was a statistically significant positive relationship between literal listening comprehension of the ASD group and their inferential listening comprehension (rs = 0.59, p < .001), general ToM performance (rs = 0.31, p = .044), verbal IQ (rs = 0.64, p < .001), and nonverbal IQ (rs = 0.45, p = .004). The ASD group’s inferential listening comprehension was statistically significantly correlated with their general ToM performance (rs = 0.38, p = .013) and verbal IQ (rs = 0.39, p = .010). Their general ToM performance was statistically significantly correlated with verbal IQ (rs = 0.49, p = .001).

Table 2 Intercorrelations of age, listening comprehension, ToM performance, verbal IQ, and nonverbal IQ among participants

For the TD group, Spearman’s rank correlation showed that TD participants’ age was statistically significantly correlated with their overall listening comprehension (rs = 0.30, p = .030), literal listening comprehension (rs = 0.35, p = .012), and nonverbal IQ (rs = 0.51, p < .001). A marginal correlation could be found between participants’ age and ToM abilities (rs = 0.25, p = .077). Additionally, their overall listening comprehension was positively correlated with their general performance on the ToM task battery (rs = 0.41, p = .003) and nonverbal IQ (rs = 0.39, p = .005). More specifically, their literal listening comprehension was statistically significantly correlated with their general ToM performance (rs = 0.44, p = .001) and nonverbal IQ (rs = 0.59, p < .001). Also, there was a statistically significant positive relationship between participants’ ToM performance and their nonverbal IQ (rs = 0.38, p = .007). Nevertheless, we found no statistically significant correlations between overall ToM performance and inferential listening comprehension (rs = 0.22, p = .125), nor between ToM performance and verbal IQ (rs = 0.20, p = .188).

The Roles of ToM and IQ in Literal and Inferential Listening Comprehension Among Chinese Preschool Children with ASD

On the basis of the statistically significant intercorrelations among variables, we conducted path analyses (Hayes, 2018) to explore the roles of ToM and IQ in literal and inferential listening comprehension. The standardized path coefficients for the path models are presented in Fig. 1. We tested a parallel mediation model, Model (1a), to evaluate the roles of verbal and nonverbal IQ in literal listening comprehension among participants with ASD. Results showed that ToM performance statistically significantly predicted verbal IQ (β = 0.46, p = .004) and that verbal IQ predicted literal listening comprehension (β = 0.47, p = .005). In contrast, ToM did not statistically significantly predict literal listening comprehension (β = 0.12, p = .441) or nonverbal IQ (β = 0.13, p = .448). The path from nonverbal IQ to literal listening comprehension was also non-significant (β = 0.22, p = .118). Model (1a) accounted for 39.71% of the variance in literal listening comprehension (p < .001).

Fig. 1 Final path models from ToM to literal and inferential listening comprehension among participants with ASD

According to a simple mediation analysis in Model (1b), ToM performance among participants with ASD statistically significantly predicted their verbal IQ (β = 0.45, p = .003) and marginally predicted their inferential listening comprehension (β = 0.32, p = .056); however, their verbal IQ did not statistically significantly predict their inferential listening comprehension (β = 0.19, p = .262). Model (1b) explained 19.34% of the variance in inferential listening comprehension among participants in the ASD group (p = .017).

Table 3 presents the direct, indirect, and total effects of ToM performance on literal and inferential listening comprehension abilities among participants with ASD. To estimate the 95% confidence intervals (CIs) of the indirect effects, a bias-corrected bootstrap procedure with 5,000 resamples was employed. For Model (1a), the direct effect of ToM on literal listening comprehension was not statistically significant, b = 0.31, 95% CI [-0.50, 1.13]. The indirect effect of ToM on literal listening comprehension through verbal IQ was statistically significant after controlling for nonverbal IQ, b = 0.57, 95% CI [0.20, 1.17], whereas the indirect effect through nonverbal IQ was non-significant when holding verbal IQ constant, b = 0.08, 95% CI [-0.11, 0.38]. As for Model (1b), the direct effect of ToM on inferential listening comprehension was non-significant, b = 0.59, 95% CI [-0.01, 1.19]. The indirect effect of ToM on inferential listening comprehension through verbal IQ was also non-significant, b = 0.15, 95% CI [-0.07, 0.51].
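To make the logic of a bootstrapped indirect effect concrete, the following minimal sketch estimates a simple mediation model (X → M → Y) by ordinary least squares and builds a confidence interval by resampling cases. It is an illustration under stated assumptions, not the procedure used in the study: it uses a plain percentile interval rather than the bias-corrected interval reported above, it runs on simulated data, and all function names are hypothetical.

```python
import random


def simulate_data(n=40, seed=42):
    """Generate hypothetical scores in which M carries part of the effect of X on Y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
    y = [0.1 * xi + 0.6 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
    return x, m, y


def mediation_effects(x, m, y):
    """Return (a, b, direct, indirect) from two OLS regressions.

    a: slope of M on X; b: slope of Y on M controlling for X;
    direct: slope of Y on X controlling for M; indirect: a * b.
    """
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    dx = [v - mx for v in x]
    dm = [v - mm for v in m]
    dy = [v - my for v in y]
    sxx = sum(p * p for p in dx)
    smm = sum(p * p for p in dm)
    sxm = sum(p * q for p, q in zip(dx, dm))
    sxy = sum(p * q for p, q in zip(dx, dy))
    smy = sum(p * q for p, q in zip(dm, dy))
    a = sxm / sxx
    # Solve the 2x2 normal equations for Y ~ X + M (assumes X and M are not collinear).
    det = sxx * smm - sxm ** 2
    b = (sxx * smy - sxm * sxy) / det
    direct = (smm * sxy - sxm * smy) / det
    return a, b, direct, a * b


def bootstrap_indirect_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the indirect effect, resampling whole cases."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        estimates.append(mediation_effects(xs, ms, ys)[3])
    estimates.sort()
    k = int(n_boot * alpha / 2)  # number of estimates trimmed from each tail
    return estimates[k], estimates[n_boot - 1 - k]


if __name__ == "__main__":
    x, m, y = simulate_data()
    a, b, direct, indirect = mediation_effects(x, m, y)
    low, high = bootstrap_indirect_ci(x, m, y)
    print(f"indirect = {indirect:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Under this approach, an indirect effect is treated as statistically significant when its interval excludes zero, which is how the intervals above are interpreted.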

Table 3 Direct, indirect, and total effects of ToM on literal and inferential listening comprehension among participants with ASD

The Roles of ToM and IQ in Literal Listening Comprehension Among Chinese Preschool Children without ASD

Similarly, we developed a simple mediation model using the PROCESS macro for SPSS (Hayes, 2018) to examine the contributions of ToM and nonverbal IQ to TD participants’ literal listening comprehension, based on the statistically significant intercorrelations among the variables. Fig. 2 displays the standardized path coefficients for the path analysis. In Model (2a), ToM performance statistically significantly predicted nonverbal IQ (β = 0.31, p = .029), and nonverbal IQ statistically significantly predicted literal listening comprehension (β = 0.49, p < .001), while ToM marginally predicted literal listening comprehension (β = 0.21, p = .090). Model (2a) explained 35.29% of the variance in literal listening comprehension among TD participants (p < .001).

Fig. 2 The final path model from ToM to literal listening comprehension among TD children

Table 4 presents the direct, indirect, and total effects of ToM performance on literal listening comprehension abilities through nonverbal IQ among TD participants. In Model (2a), the direct effect of ToM on literal listening comprehension was not statistically significant, b = 0.17, 95% CI [-0.03, 0.36]. The indirect effect of ToM on literal listening comprehension through nonverbal IQ was statistically significant, b = 0.12, 95% CI [0.03, 0.25].
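For readers less familiar with mediation analysis, the decomposition behind these effects can be written out as follows. This is a sketch under the assumption that both equations are estimated by ordinary least squares, as in PROCESS. The symbols are introduced here for illustration: X denotes ToM, M nonverbal IQ, and Y literal listening comprehension. The implied total effect is computed from the rounded direct and indirect effects reported above, so it may differ slightly from the tabulated value.

```latex
\begin{align}
M &= i_M + aX + e_M \\
Y &= i_Y + c'X + bM + e_Y \\
c &= c' + ab \approx 0.17 + 0.12 = 0.29
\end{align}
```

Here c' is the direct effect, ab is the indirect effect through nonverbal IQ, and c is the total effect of ToM on literal listening comprehension.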

Table 4 Direct, indirect, and total effects of ToM on literal listening comprehension among TD children

Discussion

This study focused on Chinese preschool children with ASD and their TD peers to explore their performance on linguistic and cognitive tasks and to identify the direct and indirect relationships between ToM and listening comprehension. Specifically, we compared the two groups on their listening comprehension of literal and inferential statements, ToM, and verbal and nonverbal IQ. We further explored whether ToM influenced different dimensions of listening comprehension skills in the ASD and TD groups, respectively.

Abilities to Tackle Linguistic and Cognitive Tasks

We found that participants in both groups encountered more challenges in understanding inferential statements compared to literal statements. The result was in line with extant studies (Cheung et al., 2020; Florit et al., 2011; Zhao et al., 2021) that found difficulties in processing inferential statements among preschool children with and without ASD. Typically, literal comprehension is considered a basic level of comprehension that is essential for and less challenging than more advanced inferential comprehension (Kim & Petscher, 2021). Literal comprehension is closely related to surface-level representation, whereas inferential comprehension entails inferencing skills to construct the textbase and the situation model (Dawes et al., 2019b). Inferential comprehension involves going beyond the surface interpretation and connecting information and world knowledge, which might pose enormous obstacles for preschool children with limited cognitive abilities and world knowledge.

Also, the study found statistically significant differences between children with and without ASD in listening comprehension, ToM abilities, verbal IQ, and nonverbal IQ. The current findings on listening comprehension of literal and inferential statements were consistent with those of Cheung et al. (2020), who highlighted the general difficulties in processing orally presented literal statements and similes among children with ASD when compared to their TD peers matched for chronological age and verbal mental age. In essence, literal and inferential listening comprehension are complex linguistic and cognitive tasks involving the construction of the situation model at different levels. In this sense, the common deficits in language abilities and cognitive skills among children with ASD might prevent them from performing well in listening comprehension (Cantio et al., 2018; Kelly, 2011).

Moreover, the ASD group’s relatively poor performance on ToM tasks supported the hypothesis concerning the prevalent ToM deficits among individuals with ASD (e.g., Baron-Cohen et al., 1986). Individuals with ASD, for the most part, exhibit reduced ToM abilities compared to their TD peers, which is closely connected with their core deficits (for a review, see Kimhi, 2014). We also found that ToM was statistically significantly associated with age in participants with ASD, whereas only a marginal correlation was found between age and ToM among TD participants. The findings implied that participants’ ToM abilities developed with age in early childhood, consistent with previous studies identifying the effect of age on children’s ToM performance (Peterson et al., 2005; Wellman & Liu, 2004).

Concerning verbal IQ, the result was consistent with Yi et al. (2013), who found that Chinese preschool children with ASD scored statistically significantly lower than their chronological age-matched TD peers on verbal IQ measured by the Chinese version of the PPVT-R (p < .001). As for nonverbal IQ, the result was partly aligned with the study by Ellis Weismer et al. (2018), who found statistically significantly lower nonverbal IQ, evaluated with the Wechsler Intelligence Scale for Children–Fourth Edition, among school-aged children with ASD when compared to their chronological age-matched TD peers, t(116) = 3.03, p < .01.

The Direct and Indirect Relationships Between ToM and Listening Comprehension

The study showed that, for Chinese preschool children with ASD, their ToM performance was significantly positively correlated with their overall listening comprehension, literal listening comprehension, and inferential listening comprehension. Furthermore, TD participants’ ToM abilities were found to be positively associated with their overall listening comprehension and literal listening comprehension; however, no statistically significant correlation was identified between their ToM performance and inferential listening comprehension.

The statistically significant correlations between ToM and listening comprehension were consistent with extant studies that showed a connection between concurrent ToM and listening comprehension among preschoolers (Kim, 2015; Kim, 2016). Also, our findings might reflect a reciprocal relation between oral language and ToM identified by de Villiers (2000). More specifically, ToM involves individuals’ ability to understand and infer others’ mental states and predict others’ behaviors accordingly (Kim & Phillips, 2014), which is the basis of constructing a coherent representation and a situation model to achieve successful literal and inferential listening comprehension. In turn, listening comprehension of literal and inferential statements enables individuals to enhance their ToM abilities through the need to interpret factual information and make appropriate inferences.

Nevertheless, it should be noted that our findings suggested statistically significant correlations between ToM and listening comprehension in the ASD group, whereas Zhao et al. (2021) did not find such relationships among children with ASD. We propose that the inconsistency might be due to differences in the difficulty levels of the tests used for assessing listening comprehension. The listening comprehension test adopted by Zhao et al. (2021) was characterized by a relatively higher difficulty level, whereas this study used a simplified version tailored to the cognitive and linguistic development of children with ASD by providing fewer options and reducing context complexity. We also obtained higher score reliability in the current listening comprehension test.

Furthermore, the non-significant association between ToM and inferential listening comprehension in the TD group seemed somewhat counterintuitive and contradicted previous studies that highlighted a statistically significant contribution of ToM to listening comprehension among TD children (e.g., Jackson et al., 2022; Kim, 2016; Kim, 2017). We partly attribute this inconsistency to the tendency of previous studies to combine literal and inferential comprehension questions within a single measure of overall listening comprehension, thereby potentially overlooking nuanced differences in the mechanisms underlying literal and inferential comprehension. By not separately examining the comprehension of explicit and implicit statements, these studies might have failed to capture the specific relationship between ToM and inferential listening comprehension among TD children, which might explain the disparity in findings. Additionally, the reduced complexity of the inferential listening comprehension task in the current study might not have challenged TD children enough to fully engage their ToM skills. As a result, most TD children could perform well on the task (M = 7.75, SD = 0.99) without relying on their ToM abilities, thus weakening the observed correlation between ToM and inferential comprehension.

Path analyses were performed to examine the mechanisms underlying the relationship between ToM and listening comprehension in more detail. For the ASD group, the effect of ToM on their literal listening comprehension was fully mediated by verbal IQ. Participants with ASD who performed better on ToM tasks were more likely to have a higher verbal IQ. The finding was broadly in line with the longitudinal study of 101 Dutch-speaking kindergartners by De Mulder et al. (2019). A possible explanation is that the ability to comprehend others’ mental states provides children with a conceptual foundation for acquiring and using vocabulary that accurately conveys this understanding. Further, our finding implied that participants with a higher verbal IQ tended to perform better on listening comprehension. This connection has been well established in previous studies (e.g., Florit et al., 2011; Kim, 2015). Verbal IQ is essential for constructing propositions and building the situation model required for listening comprehension, a view consistent with the lexical quality hypothesis, which highlights the role of vocabulary in meaning integration during comprehension (Perfetti, 2007).

In terms of the TD group, the effect of ToM on their literal listening comprehension was fully mediated by nonverbal IQ, suggesting that the mechanism underlying listening comprehension of literal statements may differ from that in the ASD group. More specifically, those who performed better on the ToM task battery tended to have a higher nonverbal IQ. Indeed, the link between nonverbal IQ and ToM has been identified in previous research (e.g., Ibanez et al., 2013). In essence, ToM refers to individuals’ capacity to infer the mental states of others (Howlin et al., 1999), whereas nonverbal IQ reflects individuals’ cognitive abilities to process information and solve problems logically and reasonably, involving abstract reasoning, conceptualization, and motor abilities (Kuschner, 2013). In this sense, there might be an overlap in core competence between ToM and nonverbal IQ, suggesting that ToM can be a potential predictor of nonverbal IQ to some extent. Furthermore, TD participants with a higher nonverbal IQ tended to score higher in literal listening comprehension. The finding was comparable to that of Tighe et al. (2015), who showed that nonverbal reasoning, measured by the Wechsler Abbreviated Scale of Intelligence, was one of the strongest predictors of third-grade listening comprehension and the strongest predictor of listening comprehension in seventh and tenth graders. As a measure of abstract reasoning and conceptualization, nonverbal IQ plays a pivotal role in integrating information logically and constructing a situation model.

Implications

The implications of the findings are manifold. First, given the difficulties in inferential listening comprehension among children with ASD, their caregivers are encouraged to use more concrete expressions and literal language to achieve effective communication.

Second, measures should be taken to train multiple aspects of children’s ToM abilities to support better performance on listening comprehension. Some empirical studies have examined interventions targeting specific components of ToM. For example, Fletcher-Watson et al. (2014) reviewed a series of intervention studies aiming to develop ToM abilities among children with ASD, pointing out the potential benefits of emotion recognition training and therapist-led joint attention interventions.

Third, the enhancement of verbal and nonverbal IQ might bring tangible benefits to children’s listening comprehension. For instance, in terms of verbal IQ, primary caregivers can consider paying more attention to the development of children’s reading ability by providing them with more opportunities for book reading, because it has been suggested that the ability to read might be a predictor of the potential increase or decrease in verbal IQ over time (Ramsden et al., 2013). When it comes to nonverbal IQ, caregivers can also involve children in hands-on tasks (e.g., puzzles and mazes) to boost their nonverbal IQ.

Limitations

The limitations of the present study can be summarized as follows. The study could have paid more attention to the heterogeneity of ToM performance in the ASD group. Limited by the sample size, we were unable to identify subgroups of participants with ASD, which would have allowed more detailed analyses. Future research should recruit larger samples of children with ASD and identify subgroups on the basis of their traits to examine their ToM performance and its correlations with listening comprehension. Additionally, while children with comorbidities participated in the ASD group, the study did not specify the types or prevalence of these comorbidities, nor did we analyze their potential influence on the results. Future research should clearly specify comorbidities, analyze their impact, and consider subgroup analyses to ensure a comprehensive understanding.

Our study did not explore the direct and indirect relationships between other ToM-related factors (e.g., executive function, reasoning) and listening comprehension among Chinese preschool children with and without ASD. In the future, researchers should take these factors into consideration to uncover the mechanisms underlying listening comprehension through the mediation of possible cognitive factors.

The newly developed listening comprehension task did not account for age differences among preschoolers, so it could not fully reflect the differences in linguistic and cognitive abilities among preschool children of different ages. Thus, researchers should pay more attention to the age factor to devise a more specific task to evaluate the listening comprehension of preschoolers of different ages. Furthermore, since we took the prevalent impairments in linguistic and cognitive abilities among children with ASD into consideration while developing the listening comprehension task, the task might be too easy for TD participants, making it challenging to precisely examine TD children’s performance on listening comprehension. Future research should try to strike a balance to develop a listening comprehension test suitable for both TD participants and participants with ASD.

Finally, the variability in the syntactic structure and complexity of sentences could affect listening comprehension, as some sentences might be inherently easier or harder to comprehend regardless of their literal or inferential nature. Future studies should attempt to control for sentence complexity, ensuring that differences in comprehension are due to ToM and listening skills rather than sentence structure.