Introduction

Neuroconstructivist approaches to child development (e.g. Karmiloff-Smith 1998) emphasise the importance of small, seemingly insignificant differences between infants with and without developmental disorders. These tiny differences are hypothesised, over the lifespan, to develop into larger recognisable patterns of symptoms, constituting clinical syndromes (Karmiloff-Smith 1998). This suggests that from similar origins, a disorder can have diverse presentations, as the original deviance from a ‘typical’ pattern of development triggers further deviance, and more obvious signs of neurodevelopmental difference. Therefore, Karmiloff-Smith (1998) argued that neuropsychologists should focus on the small, micro-level differences between children with and without developmental disorders, rather than investigating wide-ranging cognitive abilities.

Sensory processing has been of interest to autism researchers for decades, with the literature suggesting that people with ASD experience sensory events in a different way to people without ASD (for a review, see Iarocci and McDonald 2006). Audiovisual integration, or the integration of sight and sound, is particularly relevant to autism researchers, because of the importance of audiovisual integration in face-to-face communication and speech perception (Calvert et al. 1998). The existing literature on audiovisual integration in ASD reports mixed findings. Generally, studies involving language tasks have suggested audiovisual integration impairments in ASD (e.g. de Gelder et al. 1991; Smith and Bennetto 2007; Williams et al. 2004) and studies using non-language tasks have found audiovisual integration in ASD comparable to matched control groups (e.g. Bebko et al. 2006; van der Smagt et al. 2007). This suggests that there may be a language-specific deficit in audiovisual integration in ASD.

The McGurk effect (McGurk and MacDonald 1976) is a well-known illustration that audiovisual integration is important in speech processing. McGurk and MacDonald (1976) presented auditory speech sounds (e.g. ‘ba’) in conjunction with incongruent visual speech stimuli (e.g. ‘ga’), and demonstrated that the sound reported by participants was generally a ‘fusion’ response; that is, a response different to either the visual or auditory signal (e.g. ‘da’). The McGurk effect is a robust effect that has been demonstrated even when participants are told to attend to the auditory or visual stimulus only (Massaro 1987). For these reasons, the McGurk illusion represents a reliable method of investigating audiovisual integration in autism. To our knowledge, three published studies have investigated the McGurk effect in people with ASD compared to typically developing control groups; all found a reduced effect in ASD, but suggested different explanations. One study (Williams et al. 2004) concluded that reduced speech-reading ability (the ability to identify speech sounds from seen lip-movements) in ASD led to reduced audiovisual integration. In contrast, de Gelder et al. (1991) found a reduced McGurk effect in autism, but did not find poor speech-reading ability, suggesting that speech-reading deficits did not underlie reduced audiovisual integration. Mongillo et al. (2008) found a reduced McGurk effect in children with ASD, but did not measure speech-reading, so it is impossible to conclude whether speech-reading deficits influenced the results. However, no study adequately takes into account the developmental delay typically found in ASD. Evidence from typically developing children suggests that audiovisual integration (and therefore the size of the McGurk effect) increases with age (Dupont et al. 2005; Tremblay et al. 2007), with one study finding that only 50% of 10–12-year-olds tested displayed a McGurk effect equal in size to those of adults tested (Hockley and Polka 1994).
These findings suggest an interesting possibility for autism researchers, as it is possible that evidence of reduced audiovisual integration in speech tasks represents a delay in the development of audiovisual integration in ASD, rather than a deficit that is constant across age. If this were the case, children with ASD might go on to develop ‘normal’ audiovisual integration by the time they reach adulthood. Previous studies that have examined the McGurk effect in ASD tell us little about the process of development of audiovisual integration, nor about where deviance from typical development begins.

This study aims to investigate the development of audiovisual integration in children with ASD compared to typically developing children. The McGurk task will be used to investigate this, as there is evidence that the McGurk effect increases with age (Dupont et al. 2005). Developmental trajectories (e.g. Karmiloff-Smith et al. 2004) will be devised showing the development of auditory accuracy (the ability to identify auditory speech syllables), visual accuracy (the ability to identify speech syllables by lip-reading), and audiovisual integration across age for the ASD group and control group. The trajectories will be compared to see whether either the rate of development (indexed by the gradient of the best-fit line) or the level of performance at the youngest age tested (indexed by the intercept of the best-fit line, and indicative of developmental delay at the youngest age tested) differs between the ASD and control groups. It is hypothesised that there will be no significant differences in the development of auditory accuracy between groups, given the simple nature of repeating auditory speech syllables. Given evidence of poorer speech-reading in children with ASD (Williams et al. 2004), it is expected that the ASD group will be delayed in visual accuracy compared to the control group at the youngest age tested, shown by a lower intercept in the ASD trajectory. Furthermore, it is expected that the ASD group will be impaired in audiovisual integration at the youngest age tested compared to the control group, demonstrated by a lower intercept in the trajectory best-fit line for the ASD group.

Methods

Participants

Ethical permission for the study was granted by the university ethics committee, and informed consent for participation was obtained from both the child and their caregivers. Participants were 24 children with ASD aged 7:11–16:5 years (ASD group), recruited from local schools (both specialist ASD schools and mainstream schools with specialised resource units, in which children with ASD are taught in mainstream school with extra support), and 30 children without ASD aged 8:4–16:5 years (control group), recruited from local mainstream schools. This age range was chosen because children within this age range could understand the experimental task, but should still show development of audiovisual integration (Hockley and Polka 1994). Each child with ASD had a pre-existing diagnosis of ASD (High-functioning Autism, Autism, or Asperger Syndrome) made by a qualified practitioner based upon criteria specified in the Diagnostic and Statistical Manual of Mental Disorders, 4th edition (APA 1994) and was receiving specialist help within school because of this diagnosis. Each child was also rated by the experimenter using the Childhood Autism Rating Scale (CARS; Schopler et al. 2002), and parents were asked to fill in the Social Communication Questionnaire (SCQ; Rutter et al. 2003), a parent-report measure designed to screen for pervasive developmental disorders. All participants completed the Raven’s Standard Progressive Matrices (RSPM; Raven et al. 1998) and the British Picture Vocabulary Scale (BPVS-II; Dunn et al. 1997). Descriptive statistics for chronological age, RSPM, BPVS-II, CARS and SCQ scores are shown in Table 1.

Table 1 Descriptive statistics for participants: mean (range; SD)

Table 1 shows that the children with ASD in this study were mainly high-functioning children who were mildly to moderately affected by ASD symptoms at the time of testing. Two children with ASD did not meet the cut-off ASD score of 15 on the SCQ. However, these children were included in the study because of their relatively high CARS scores (for children with high-functioning ASD), and because they had reliable pre-existing diagnoses. Similarly, two control children scored close to cut-off on the SCQ. These children were included in the study because of their low CARS scores.

The process for matching control children is summarised in Fig. 1. Each child in the ASD group was individually matched on the basis of chronological age (to within 12 months), sex and non-verbal ability (to within 11 points, as indexed by RSPM) to a control child without autism. Moreover, where the child with ASD had a verbal mental age (as indexed by the BPVS-II) that differed from their chronological age by more than 1–2 years, a further control child was recruited with the same chronological age as the child with ASD’s verbal mental age. Given the problems discussed in the literature of exact matching based upon measures such as the BPVS and RSPM (in which such measures typically overestimate the abilities of children with ASD, e.g. Burack et al. 2004; Mottron 2004), this group was not intended to be an exact verbal mental age match to the ASD group, but intended to ensure that the control group roughly encompassed the ASD group’s verbal mental and chronological ages. A benefit of the developmental trajectory approach is that the issue of control group matching has less potential to influence the results than in more traditional clinical vs. control group comparisons, as the factor of interest is development over a wide age range (thus a wide range of mental ages and ability-levels), rather than performance at a fixed age and ability level (Karmiloff-Smith et al. 2004). The control group in this study spanned both the chronological and verbal mental ages of the ASD group, consistent with the developmental trajectory approach (Karmiloff-Smith et al. 2004).
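The individual matching criteria described above can be expressed as a simple check. This is a minimal sketch rather than the procedure actually used: the dictionary keys, the function name, and the treatment of "within" as an inclusive bound are all assumptions made for illustration, and the example children are hypothetical.

```python
def is_match(asd_child, control_child, max_age_gap_months=12, max_rspm_gap=11):
    """Return True if a control child meets the three matching criteria.

    Both arguments are dicts with hypothetical keys 'sex', 'age_months'
    and 'rspm'. Bounds are treated as inclusive (an assumption).
    """
    return (
        asd_child["sex"] == control_child["sex"]
        and abs(asd_child["age_months"] - control_child["age_months"]) <= max_age_gap_months
        and abs(asd_child["rspm"] - control_child["rspm"]) <= max_rspm_gap
    )

# Hypothetical children, for illustration only.
asd = {"sex": "M", "age_months": 130, "rspm": 35}
candidate_a = {"sex": "M", "age_months": 138, "rspm": 40}   # within both bounds
candidate_b = {"sex": "M", "age_months": 150, "rspm": 36}   # age gap of 20 months
```

Here `is_match(asd, candidate_a)` holds, while `candidate_b` fails on the age criterion alone.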

Fig. 1 Diagram to show the matching process for control children

Materials

RSPM and BPVS-II are individually administered tests designed to measure nonverbal ability and verbal ability, respectively. Both tests have demonstrated reliability and validity, and are widely used both clinically and in research (Raven et al. 1998; Dunn et al. 1997). The CARS is an observer-rating scale designed to screen for autism, and has demonstrated good reliability and validity (Schopler et al. 2002). The SCQ is a parent-report questionnaire designed to screen for pervasive developmental disorders, and has shown good reliability and validity in clinical samples (Rutter et al. 2003).

McGurk stimuli used by Bohning et al. (2002) in their study of audiovisual integration in individuals with Williams Syndrome were used here as these stimuli have been shown to elicit reliable McGurk effects. Stimuli consisted of a range of disyllables (/aba/, /ava/, /atha/, /ada/, and /aga/) spoken by an unfamiliar female English speaker and presented on a laptop computer with a 13 × 8 inch high-definition screen. For further details of how the stimuli were generated, please see Bohning et al. (2002). Auditory only stimuli were syllables played on the soundtrack accompanying a blank computer screen, and visual only stimuli were syllables played visually (so that the speaker’s face was visible) without the auditory sound track. Audiovisual stimuli were stimuli in which the participant could both see and hear the speaker. Audiovisual stimuli were generated for every combination of auditory and visual disyllables, so that five items were congruent (the auditory soundtrack matched the visual track) and 20 items were incongruent (the auditory soundtrack was different to the visual track). The stimuli were organised into discrete lists, with each list containing the same 35 items representing the five auditory-only items, the five visual-only items, the five congruent audiovisual items, and the 20 incongruent audiovisual items. The order of items was different in each list. Each item consisted of the speech segment (1 s) and a 3 s blank screen in which the participant was asked to respond by repeating ‘what the lady said’. Participants first completed five practice items to ensure that the child understood the procedure.
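The composition of each 35-item list follows directly from crossing the five disyllables. The sketch below enumerates the items and shuffles them; it is illustrative only. The tuple representation, the function name and the use of seeded shuffling are assumptions, not a description of how the original lists were ordered.

```python
import itertools
import random

DISYLLABLES = ["aba", "ava", "atha", "ada", "aga"]

def build_list(seed):
    """Return one shuffled list of (auditory, visual) items.

    None marks an absent channel, so ('aba', None) is auditory-only
    and (None, 'aba') is visual-only.
    """
    auditory_only = [(a, None) for a in DISYLLABLES]
    visual_only = [(None, v) for v in DISYLLABLES]
    # Every auditory/visual pairing: 5 congruent + 20 incongruent items.
    audiovisual = list(itertools.product(DISYLLABLES, repeat=2))
    items = auditory_only + visual_only + audiovisual
    random.Random(seed).shuffle(items)  # seeded shuffle is an assumption
    return items

# Four lists, as in the experiment; each holds the same 35 items.
lists = [build_list(seed) for seed in range(4)]
congruent = [i for i in lists[0] if i[0] is not None and i[0] == i[1]]
incongruent = [i for i in lists[0] if None not in i and i[0] != i[1]]
```

Different seeds make different orders very likely but do not guarantee them, so a real implementation would want to check that no two lists coincide.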

In addition to the five practice items at the beginning of the experiment, participants completed four lists of items, meaning that each participant viewed each item four times, with a total of 140 items (excluding the practice items). Presentation of all four lists took approximately 14½ min. Children sat approximately 1 m from the computer screen, and the auditory sound track was presented via internal headphones to minimise extraneous noise where possible. Some participants in the ASD group would not use headphones. In these circumstances, the corresponding control child/children were also tested without headphones in order to equate the McGurk task presentation between groups. The experimenter sat with the child during the task to ensure that they looked at the computer screen during every trial.

Procedure

Each child in the ASD group completed the BPVS-II, RSPM and McGurk task in a 1 h individual testing session which was held in a quiet room at the child’s school, usually during a lesson period. The order of the session was either BPVS-II or RSPM, followed by the McGurk task, followed by BPVS-II or RSPM (depending upon which had been completed at the beginning of the session). The order of the BPVS-II or RSPM was alternated with each participant to counterbalance order effects. The McGurk task was always kept in the middle of the session to try to maximise concentration (the child had settled in but should not be tired).

Control participants completed the RSPM first in class groups, which was necessary within the time constraints of the study. This established whether children were suitable in terms of age, sex and non-verbal ability. The RSPM manual (Raven et al. 1998) states that the results obtained by group testing sessions are equivalent to those obtained by individual sessions where the individual is left to do the task themselves (without interaction), as with the ASD group. Matched control children (see Fig. 1 for details of the matching process) were invited to a 25 min individual testing session in which the BPVS-II and the McGurk task were completed. These individual sessions were held in a quiet room in school. The order of the BPVS-II and McGurk task was alternated between participants to control for order effects.

Results

Suitable matched control children within mainstream schools could not be found for four children with ASD due to the low ability levels of these children. However, these children were included in the analysis (and in Table 1) because it focuses upon developing trajectories or models of development for each group, and increasing the number of available data points can improve the accuracy of the resulting models. In addition, two children (one in the ASD group and one in the control group) only completed three out of four lists of stimuli due to time constraints. Visual inspection of the results suggested that none of these children were outliers in the McGurk task, and their results were therefore included in the analysis.

Similarly to Bohning et al. (2002), McGurk task data were scored so that participants gained credit (a score of 1) for correctly identifying the auditory disyllable (/aba/, /ava/, /atha/, /ada/ and /aga/) in auditory only trials and correctly identifying the viseme in visual trials (/aba/, /ama/, and /apa/ were scored as correct for the visual stimulus /aba/; /afa/ and /ava/ for visual /ava/; /atha/ for visual /atha/; /ata/ and /ada/ for visual /ada/; /aka/ and /aga/ for visual /aga/). McGurk scores across consonant types and trials were averaged so that mean scores for auditory only and visual only stimuli were calculated. Audiovisual trials were scored so that participants gained credit for correctly identifying the auditory disyllables for incongruent and congruent audiovisual stimuli (see below for further details). From the McGurk task scores, three dependent variables were identified: auditory accuracy; visual accuracy; and McGurk effect (audiovisual integration).
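The scoring rule for single-modality trials can be written as a lookup of accepted responses. The viseme classes below are taken from the description above; the function names and the representation of responses as plain strings are assumptions made for this sketch.

```python
# Responses accepted as correct for each visual-only stimulus.
VISEME_CLASSES = {
    "aba": {"aba", "ama", "apa"},
    "ava": {"afa", "ava"},
    "atha": {"atha"},
    "ada": {"ata", "ada"},
    "aga": {"aka", "aga"},
}

def score_visual(stimulus, response):
    """Score 1 if the response falls in the stimulus's viseme class, else 0."""
    return 1 if response in VISEME_CLASSES[stimulus] else 0

def score_auditory(stimulus, response):
    """Score 1 only for an exact match with the auditory disyllable."""
    return 1 if response == stimulus else 0
```

For example, a response of /ama/ earns credit on a visual-only /aba/ trial but not on an auditory-only /aba/ trial.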

Each dependent variable (auditory accuracy, visual accuracy and McGurk effect) was plotted against chronological age separately for the ASD group and control group. Linear regression of task score on chronological age was used to plot best-fit lines depicting the linear relationship between age and task score for each group. These best-fit linear models were labelled ‘developmental trajectories’ (Thomas et al. 2009). Statistically significant linear models (with high R² and p < .05) suggested that there was a reliable linear relationship between chronological age and task score in a group. Statistically non-significant models (p > .05) suggested that the linear relationship was unreliable. For all generated trajectories, Cook’s D statistics were calculated to identify whether any cases (participants) exerted undue influence upon the regression model, and cases with values >1 were excluded as outliers. Residuals were examined and z statistics for skew were calculated. In line with the conventions described by Tabachnick and Fidell (1996), z statistics exceeding 2.58 for regression models were seen as indicators that linear models were inappropriate to characterise the data.
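A single-group trajectory of this kind can be sketched as ordinary least squares with one predictor, together with Cook's D for each case. This is an illustration under stated assumptions rather than the analysis software actually used: the function name and returned keys are invented here, and the ages and scores are hypothetical.

```python
def fit_trajectory(ages, scores):
    """Fit score = intercept + gradient * age and return fit diagnostics."""
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in ages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, scores))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    residuals = [y - (intercept + gradient * x) for x, y in zip(ages, scores)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in scores)
    mse = ss_res / (n - 2)                      # residual mean square, df = n - 2
    # Leverage of each case in simple regression.
    leverage = [1 / n + (x - mean_x) ** 2 / sxx for x in ages]
    # Cook's D with p = 2 estimated parameters (intercept and gradient).
    cooks_d = [(e ** 2 / (2 * mse)) * h / (1 - h) ** 2
               for e, h in zip(residuals, leverage)]
    return {
        "gradient": gradient,
        "intercept": intercept,
        "r_squared": 1 - ss_res / ss_tot,
        "f_stat": (ss_tot - ss_res) / mse,      # F(1, n - 2)
        "cooks_d": cooks_d,
    }

# Hypothetical data, for illustration only.
ages = [8, 10, 12, 14, 16]
scores = [2.0, 3.0, 2.5, 4.0, 4.5]
fit = fit_trajectory(ages, scores)
flagged = [i for i, d in enumerate(fit["cooks_d"]) if d > 1]  # cases to exclude
```

With one predictor, F and R² are tied by F = R² / (1 − R²) × (n − 2), which offers a quick consistency check on reported values.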

Cross-sectional Analysis of Covariance (ANCOVA) was used to establish whether the dependent variables for each group differed significantly in performance at the youngest age tested (intercept) and rate of development (gradient; Thomas et al. 2009). A significant main effect of age indicated a relationship between task score and age when the groups were combined. A significant main effect of group indicated that task score was different between groups, and that the intercept (performance at youngest age tested) was different between groups (Thomas et al. 2009). A significant interaction between age and group indicated that the rate of development (gradient) was different between groups (Thomas et al. 2009).
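One way to write the model behind this analysis is shown below. The notation is an assumption introduced here, not taken from the original: $G_i$ is a dummy code for group (0 = control, 1 = ASD), $A_i$ is chronological age, and age is rescaled to count from the youngest age tested, $A_{\min}$, so that the intercept refers to performance at that age.

```latex
\text{score}_i = \beta_0 + \beta_1 G_i + \beta_2 (A_i - A_{\min})
               + \beta_3 \, G_i (A_i - A_{\min}) + \varepsilon_i
```

Under this coding, $\beta_1$ is the group difference in intercept (performance at the youngest age tested) and $\beta_3$ is the group difference in gradient (rate of development), corresponding to the main effect of group and the group × age interaction, respectively.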

Auditory Accuracy

Auditory accuracy was the sum of the mean scores for each auditory-only disyllable, with a maximum possible score of 5 and a minimum possible score of 0. Each participant obtained a mean score (between 0 and 1) for each disyllable (representing four trials), and these mean scores were summed across the five disyllables (/aba/, /ava/, /atha/, /ada/ and /aga/). Auditory accuracy thus represents 20 auditory only trials (four trials of each of the five consonant disyllables). Trajectories for the ASD and control groups are shown in Fig. 2. Linear regression suggested that the relationship between chronological age and auditory accuracy was reliable for the ASD group (R² = .19, F(1, 22) = 5.172, p < .05), and auditory accuracy appeared to increase with chronological age. In contrast, the relationship between chronological age and auditory accuracy was not reliable for the control group (R² = .003, F(1, 28) = .074, p = .787). The lack of relationship between chronological age and auditory accuracy in the control group appears to reflect a ceiling effect, as most participants scored 80% correct or more. No Cook’s D statistics exceeded 1.
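The accuracy score described above is a sum of per-disyllable means. A minimal sketch, assuming trial scores are stored as lists of 0/1 values keyed by disyllable (the function name and the example responses are hypothetical):

```python
def accuracy_score(trial_scores):
    """Sum the mean 0/1 score for each disyllable (range 0 to 5)."""
    return sum(sum(trials) / len(trials) for trials in trial_scores.values())

# Hypothetical participant: all trials correct except two /atha/ trials.
example = {
    "aba": [1, 1, 1, 1],
    "ava": [1, 1, 1, 1],
    "atha": [1, 0, 1, 0],
    "ada": [1, 1, 1, 1],
    "aga": [1, 1, 1, 1],
}
score = accuracy_score(example)  # 4.5 out of a possible 5
```

Averaging within each disyllable before summing keeps the 0 to 5 scale even when a participant has fewer than four trials per disyllable; whether incomplete data were handled this way in the study is not stated.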

Fig. 2 Developmental trajectories showing auditory accuracy for the ASD group and the control group, with auditory accuracy plotted against chronological age

ANCOVA was used to compare the rate of change in performance relative to chronological age, and the age at onset (intercept) between groups. Auditory accuracy was entered as the dependent variable, group as the independent variable, and chronological age was entered as the covariate. Following Thomas et al. (2009), the interaction of group × covariate (chronological age) was also entered into the ANCOVA model in order to examine whether auditory accuracy varied differently with chronological age across the two groups. There were no statistically significant effects of chronological age (F(1,50) = 1.845, p = .181, ηp² = .036), group (F(1,50) = 3.789, p = .057, ηp² = .07), or group × chronological age interaction (F(1,50) = 3.024, p = .088, ηp² = .057). As illustrated in Fig. 2, this suggests that the development and onset (performance at the youngest age tested) of the ASD trajectory for auditory accuracy was not significantly different from the control group trajectory, although the lack of statistical reliability of the control group model means that this model should be treated with caution.

Visual Accuracy (Speech-Reading)

Visual accuracy was the sum of the mean scores for the 20 visual only items (the mean of the four trials of each disyllable, summed for all five disyllables). The maximum possible score was 5 and the minimum possible score was 0. Trajectories for the ASD and control group are shown in Fig. 3. Linear regression suggested that visual accuracy reliably increased with chronological age for both the ASD group (R² = .249, F(1, 22) = 7.298, p < .05) and the control group (R² = .151, F(1, 28) = 4.982, p < .05). No Cook’s D statistics exceeded 1. ANCOVA was used to compare the gradient and intercept of the regression lines between groups. Visual accuracy was entered as the dependent variable, group as the independent variable, and chronological age was entered as the covariate. As with auditory accuracy, the interaction of group × chronological age was also entered into the model. There were statistically significant main effects of group (F(1,50) = 12.735, p < .01, ηp² = .203) and chronological age (F(1,50) = 13.287, p < .01, ηp² = .210), but there was not a significant group × chronological age interaction (F(1,50) = 1.982, p = .165, ηp² = .038). As illustrated in Fig. 3, these results suggest that the ASD group was significantly delayed in performance at the youngest age tested relative to the control group, but that the rate of development of visual accuracy (gradient) in the ASD group was not significantly different from that in the control group.

Fig. 3 Developmental trajectories showing visual accuracy against chronological age for both groups

McGurk Effect

Initially, McGurk scores were calculated using the metric of Bohning et al. (2002). Mean scores for incongruent audiovisual stimuli (the mean score for the four trials of each of the 20 items in which the auditory soundtrack was different from the visual stimulus, summed to give a score out of 20) were calculated. This gave an indication of whether identification of the correct auditory disyllable was affected by the presence of an incongruent visual stimulus, and thus of the extent of the McGurk effect. Initial trajectory analysis using linear regression suggested that neither the ASD group scores (R² = .009, F(1,22) = .209, p = .652), nor the control group scores (R² = .005, F(1,28) = .133, p = .718) improved reliably with age. There were no significant effects of chronological age (F(1,48) = .323, p = .572, ηp² = .007), group (F(1,48) = 1.342, p = .252, ηp² = .027) or chronological age × group interaction (F(1,48) = .523, p = .473, ηp² = .011). However, it was felt that this scoring method could have been affected by children who had poor auditory recognition skills, and in fact when auditory accuracy was entered into the ANCOVA model as a covariate, its main effect reached significance (F(1,48) = 20.114, p < .001, ηp² = .295), whilst visual accuracy did not (F(1,48) = 2.726, p = .105, ηp² = .054). In the Bohning et al. (2002) scoring, incorrect responses to incongruent audiovisual stimuli were taken as indicators of increased McGurk effect, so poor auditory recognition skills may have been confounded with audiovisual integration.
To address this issue, baseline auditory accuracy was then corrected for by subtracting the mean individual score for incongruent audiovisual stimuli (the mean score for the 20 items in which the auditory disyllable was different from the visual disyllable, ranging between 0 and 1) from the mean individual score for congruent audiovisual stimuli (the mean score for the five items in which the visual disyllable was the same as the auditory disyllable, ranging between 0 and 1) for each participant. The resulting difference score was labelled ‘McGurk effect’, and taken to represent level of audiovisual integration.
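The corrected measure is a difference between two proportions of correct auditory identifications. The sketch below assumes trial-level 0/1 scores pooled across items, which gives the same result as averaging item means when every item has the same number of trials; the function name and the example counts are hypothetical.

```python
def mcgurk_effect(congruent_scores, incongruent_scores):
    """Mean congruent score minus mean incongruent score (range -1 to 1)."""
    mean_congruent = sum(congruent_scores) / len(congruent_scores)
    mean_incongruent = sum(incongruent_scores) / len(incongruent_scores)
    return mean_congruent - mean_incongruent

# Hypothetical participant: 19 of 20 congruent trials correct (5 items x 4 lists)
# and 48 of 80 incongruent trials correct (20 items x 4 lists).
congruent = [1] * 19 + [0] * 1
incongruent = [1] * 48 + [0] * 32
effect = mcgurk_effect(congruent, incongruent)  # approximately 0.35
```

A participant with poor auditory recognition loses credit on congruent and incongruent trials alike, so the difference stays near zero, whereas a strong influence of the incongruent face lowers only the incongruent mean and produces a larger score.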

Figure 4 depicts trajectories showing McGurk effect for the ASD and control groups. Linear regression suggested that McGurk effect increased reliably with age in the ASD group (R² = .447, F(1, 22) = 17.749, p < .001), but not in the control group (R² < .001, F(1, 28) = .025, p = .875). No Cook’s D statistics exceeded 1.

Fig. 4 Developmental trajectories showing McGurk effect against chronological age for both groups

ANCOVA was used to compare performance at the youngest age tested (intercept) and rate of development (gradient) between groups. McGurk effect was entered as the dependent variable, group as the independent variable, and chronological age as the covariate. Results showed a statistically significant main effect of group (F(1,50) = 16.176, p < .0001, ηp² = .244), chronological age (F(1,50) = 5.027, p < .05, ηp² = .091), and group × chronological age interaction (F(1,50) = 6.165, p < .05, ηp² = .110). As illustrated in Fig. 4, these results suggest that the ASD group were delayed in frequency of McGurk effect at the earliest age tested (intercept) but showed a faster rate of development (gradient) relative to the control group, resulting in similar scores to the control group at the older ages tested. However, due to the unreliable linear model for the control group, these results should be treated with caution.

To examine whether the differences found between groups in McGurk effect could be attributed to poorer visual accuracy, ANCOVA was repeated as before with visual accuracy entered as an additional covariate. Similarly, results showed a significant main effect of group (F(1,49) = 7.788, p < .01, ηp² = .137), and a significant group × chronological age interaction (F(1,49) = 4.473, p < .05, ηp² = .084). The main effect of visual accuracy also reached statistical significance (F(1,49) = 4.064, p < .05, ηp² = .077), but the main effect of chronological age did not (F(1,49) = 1.276, p = .264, ηp² = .025). This suggested that although visual accuracy influenced frequency of McGurk effect, when visual accuracy was controlled for, the ASD group still displayed a delayed performance at the youngest age tested and a faster rate of development across chronological age than the control group.

Discussion

This study investigated the development of audiovisual integration in a group of high-functioning children with ASD and a group of typically developing children (control group). Results suggested that the ASD group were delayed at the youngest age tested (relative to the control group) in audiovisual integration and in visual accuracy. However, the ASD group developed audiovisual integration skills at a faster rate than the control group, resulting in the ASD group ‘catching-up’ with the control group at the older ages tested. Reduced audiovisual integration in the ASD group was partly (but not exclusively) attributable to reduced visual accuracy.

Findings of delayed audiovisual integration skills at the youngest age tested in the ASD group were consistent with the initial hypothesis. However, the ASD group subsequently developed audiovisual integration at a faster rate than the control group. Whilst the unreliable regression models for the control group mean that conclusions about the delay at youngest age tested (intercept) and rate of development (gradient) might be limited, the statistically significant main effects of group in the McGurk effect ANCOVA suggest that there were genuine differences in mean audiovisual integration scores between groups, with the ASD group showing lower levels of audiovisual integration than the control group. Thus, it can be concluded that the high-functioning ASD sample in this study showed reduced audiovisual integration compared to typically developing control children at the younger ages tested. Moreover, this effect occurred even when visual accuracy was controlled for, suggesting that although visual accuracy is important, reduced audiovisual integration scores could not be wholly attributed to poorer lip-reading ability in the ASD group. This is consistent with the findings of de Gelder et al. (1991).

One interpretation of the findings of reduced audiovisual integration in the ASD sample is that the control group had mainly developed their audiovisual integration skills by the youngest age tested in this study (8 years). In contrast, the ASD group seemed to mature in these skills across the age range tested. To confirm this, however, it would be necessary to include much younger children in the control group to see whether audiovisual integration develops at a younger age in typically developing children.

The ASD group were delayed in visual accuracy compared to the control group across the age range tested. This is consistent with previous findings (e.g. Smith and Bennetto 2007), and is in agreement with the initial hypothesis. The fact that the ASD group in this sample appeared to be developing speech-reading skills with age is promising, and it would be interesting to investigate visual accuracy in older children with ASD to see whether it reaches the control group level at older ages, or whether the ASD group remain delayed into adulthood.

The lack of a linear relationship between chronological age and audiovisual integration in the typically developing children in the current study is not consistent with previous studies that have investigated the development of audiovisual integration, as these studies have demonstrated increased audiovisual integration in older children and adults compared to younger children (Dupont et al. 2005; Hockley and Polka 1994; Tremblay et al. 2007). There are methodological differences between the current study and previous studies which may underlie these different findings. Firstly, previous studies included younger children (as young as 4 years) than the current study. It may be that the fastest development in audiovisual integration occurs before the age of 7 years (the youngest age in the current study), making it harder to show age effects in the current sample. Secondly, previous studies mainly compared groups of children at a particular age to groups of older children or adults, rather than charting development across a wide age range, as was the approach in the current study. Other studies have also used French speakers (Dupont et al. 2005; Tremblay et al. 2007), in contrast to the native English speakers who participated in the current study. Previous research has demonstrated different kinds of McGurk effects in different languages (Sekiyama and Burnham 2008).

The participants with ASD in this sample showed a reliable deficit in audiovisual integration that could not entirely be explained by poorer visual accuracy. Such a deficit may be consistent with the mirror neurone theory of autism (Williams et al. 2001) and the temporal binding or impaired connectivity hypothesis of autism (Rippon et al. 2007). Mirror neurone theory suggests that particular cells in the human superior temporal sulcus (STS) labelled ‘mirror neurones’, which are activated during passive observation of another person performing an action, do not function properly in autism, and that these cells are also involved in audiovisual integration (Williams et al. 2004). Impairment in mirror neurones in the STS of individuals with ASD might also explain deficits in speech-reading, as extensive activation of the STS during speech-reading tasks has consistently been shown in neuroimaging studies (Calvert and Campbell 2003). There are reasons to suppose that mirror neurone systems may continue to develop well into adolescence (see Kilner and Blakemore 2007), which would be consistent with the results of the current study. Impaired connectivity theory (Rippon et al. 2007) suggests that reduced functional connectivity between cortical regions underlies the problems found in ASD, which could result in reduced ability to combine information between the auditory and visual cortices. Future research investigating the development of audiovisual integration using brain imaging techniques will help to expose the neural basis for audiovisual integration deficits in ASD, and to elucidate the roles of mirror neurones and impaired connectivity in such deficits.

This study also highlights the importance of studying the development of abilities across age. The youngest children with ASD in this study were significantly delayed in audiovisual integration compared to the youngest typically developing children, but performance improved with age in the ASD group, resulting in similar audiovisual integration scores by the oldest ages tested. For the study of developmental disorders, these findings demonstrate that abilities which are deficient at a young age can develop, and suggest that it is important to generate developmental trajectories before drawing conclusions about the presence of deficits or strengths (Karmiloff-Smith et al. 2004). This study supports the neuroconstructivist view of cognitive abilities as changeable, developing faculties rather than static, permanent functions within the brain (Karmiloff-Smith 1998).

The current study also has important clinical implications. The finding of delayed speech-reading and audiovisual integration in younger children with ASD suggests that these abilities could be targets for early intervention. Given the importance of audiovisual integration and speech-reading in face-to-face communication (e.g. Calvert et al. 1998), helping children with ASD to process face-to-face speech could have implications for their later communication and social abilities. Previous studies with small groups of participants (e.g. de Gelder et al. 1991) have suggested that training children with autism to speech-read improves visual accuracy and audiovisual integration, but further research with larger groups is needed to establish whether these effects are reliable, and whether any improvements have wider effects on communication.

Limitations of the current study include the relatively small sample sizes for linear regression analysis, although reliable regression models were obtained in most cases for the smaller ASD group, and the sample sizes are comparable with other published trajectory work (Thomas et al. 2009). It is possible that other factors underlie the poorer audiovisual integration performance in the ASD group, including lack of attention or problems with following instructions. However, the experimenter ensured that every child looked at the computer screen during the task and provided prompts to look at the screen where necessary. Given that the children in this study were high-functioning (and scored highly on the other test measures, the RSPM and BPVS), it is unlikely that they misunderstood the simple instructions, and none of them appeared to experience speech production problems (which might limit their ability to reproduce ‘what the lady said’ in the McGurk task). It is also not clear whether any of the ASD participants had previously received interventions aimed at improving their speech-reading or audiovisual integration performance, although this seems unlikely, as such interventions are not standard practice in the UK. The results also suggest that future studies should lower the youngest age tested in the control sample, to investigate whether younger typically developing children show clear development of audiovisual integration across age, in contrast to the current control sample. Finally, this study included almost exclusively high-functioning children with ASD, many of whom attended mainstream school. Further research is needed to establish whether the results of this study would be replicated in lower-functioning children with more severe symptoms of ASD.