Introduction

Language and communication impairments that characterize autism are so severe that they comprise a diagnostic criterion (American Psychiatric Association, 1994). At the same time, this disorder is associated with an extremely diverse language phenotype, ranging from relatively typical linguistic capacities to mutism and little functional communication (Boucher, 2003; Kjelgaard & Tager-Flusberg, 2001; Lord & Paul, 1997). Indeed, between 25 and 50% (Gillberg & Coleman, 2000; Klinger, Dawson, & Renner, 2002) of all diagnosed individuals fail to acquire functional language during their lifetime.

Although pragmatics has been identified as the only aspect of language that is universally and specifically impaired in autism (e.g., Lord & Paul, 1997), deficits in processing linguistic utterances for meaning (semantics) have commonly been reported as well. However, studies examining semantic processing in autism have varied widely in terms of specific semantic requirements, and have provided mixed results. Experiments utilizing reading tasks have indicated a weakened tendency to correctly disambiguate pronunciations of homographs in semantic contexts (Frith & Snowling, 1983; Happé, 1997; Jolliffe & Baron-Cohen, 1999; López & Leekam, 2003), although other studies have shown that performance is unimpaired when participants with autism are specifically instructed to focus on the semantic level of the presented stimuli (Happé, 1994; Jolliffe & Baron-Cohen, 1999; Snowling & Frith, 1986). Experiments utilizing unambiguous materials have shown that participants with autism make significantly more errors than controls in answering multiple choice and open-ended format questions about a previously read text (Jolliffe & Baron-Cohen, 1999, 2000; Norbury & Bishop, 2002). Research into memory function has found that individuals with autism fail to utilize semantic context to aid free recall of semantically related word lists, although free recall of semantically unrelated word lists is unimpaired (Bowler, Matthews, & Gardiner, 1997; Hermelin & O’Connor, 1967; Smith, Gardiner, & Bowler, 2007; Tager-Flusberg, 1991). Further studies have shown that the typical levels-of-processing effect, whereby semantic processing of verbal materials facilitates long-term memory better than shallow (phonological/perceptual) processing, is absent, and that long-term memory resulting from perceptual processing is superior in autism relative to controls (Toichi & Kamio, 2002). 
In a similar vein, Mottron, Morasse, and Belleville (2001) found that whilst controls utilized semantic cues at recall, participants with autism were able to utilize both semantic and phonological cues. Although this pattern of findings led the authors to propose that individuals with autism may preferentially attend to the sound of words rather than their meanings, Smith et al. (2007) showed that participants with Asperger syndrome were impaired at recalling phonologically related words, suggesting that this was not the case.

Within the pragmatic domain of language, abnormalities in prosody are common in autism (e.g., Baltaxe & Simmons, 1985). Examples of expressive abnormalities include dull, robotic, singsong, bizarre, and stilted qualities of speech, encompassing deviances in rhythm, intonation, and pitch production. Although receptive studies are sparse, findings indicate that difficulties in interpreting both affective and linguistic prosody are characteristic in autism (Kujala, Lepistö, Nieminen-von Wendt, Näätänen, & Näätänen, 2005; McCann & Peppé, 2003; Paul, Augustyn, Klin, & Volkmar, 2005a; Peppé, McCann, Gibbon, O’Hare, & Rutherford, in press; Rutherford, Baron-Cohen, & Wheelwright, 2002). These findings are particularly intriguing when considered within the context of studies reporting proficient or enhanced musical pitch and pitch contour processing in autism (e.g., Heaton, Hermelin, & Pring, 1998; Heaton, 2003, 2005; Mottron, Peretz, & Ménard, 2000). Prosody and music share significant acoustic features, such as similar fundamental frequency and temporal variations of similar period. The pattern of sparing and deficit across these two domains in autism is therefore of considerable clinical and theoretical interest.

Two theoretical models that specifically address the question of uneven cognitive development in autism highlight the implications of atypical information processing. Amongst these accounts the weak central coherence (WCC) theory (Frith, 1989; Happé, 1999; recently updated in Happé & Frith, 2006) has explicitly addressed semantic processing difficulties in autism. This theory proposes that because the cognitive style of individuals with autism is not driven by central coherence, the typical propensity to process language both for meaning and in context is diminished. Instead, they show a relative advantage over typical individuals in featural or surface-biased (perceptual) processing of stimuli. Thus, the WCC predicts strengths, such as those observed in musical pitch processing (e.g., Heaton, 2003), as well as weaknesses, such as those seen in the interpretation of prosodic cues (e.g., McCann & Peppé, 2003). However, studies examining semantic processing have shown that WCC can be overcome when participants are specifically alerted to the semantic level of stimuli presented in experimental tasks (e.g., Snowling & Frith, 1986), and it may be the case that the underlying semantic processing capacity is intact, but it is not the “default” or primary processing mode.

Alternatively, the theory of enhanced perceptual functioning (EPF) (Mottron & Burack, 2001; recently updated in Mottron, Dawson, Soulières, Hubert, & Burack, 2006) proposes that neural networks underpinning perceptual processing are “over-specialized” and predispose locally oriented and EPF in autism. The finding that individuals with autism, unlike typical controls, benefit from phonological cues at retrieval in memory tasks (Mottron et al., 2001) has been cited as evidence of enhanced processing of low-level perceptual information. However, although perception (e.g., phonological processing) may play an unusually dominant role in language processing in autism, higher-order functions, such as the ability to process language for meaning, are assumed to be unimpaired. Thus, whilst cognitive processing in individuals without autism is characterized by mandatory higher-order processing, e.g., linguistic over perceptual dominance in language tasks, those with autism are able to regulate the perceptual versus higher-order control more flexibly.

The following experiment was designed to test whether children with autism show a preference for a linguistic interpretation (semantics) or a perceptual interpretation (intonation) of speech stimuli. A hypothesis that can clearly be derived from the WCC and EPF theories is that a strong propensity to process information perceptually will hamper higher-level cognitive functions in autism. However, no studies have directly addressed the possibility that perceptual processing abnormalities, i.e., a perceptual-level bias, might contribute to the speech and language abnormalities observed in autism. Given that findings from studies into phonological memory function have been inconclusive and that research has shown that participants with autism fail to benefit from semantic relations (semantic category versus unrelated words) in memory tests, this question is of considerable significance. The rationale and aims of the study are: (1) As linguistic interpretation involving meaning arguably requires higher-level cortical processing than that involving the recognition of intonation patterns, the levels-of-processing principle in autism in relation to speech processing (cf. Mottron et al., 2001; Smith et al., 2007; Toichi & Kamio, 2002) will be tested. (2) Happé (1999, p. 219) has proposed that “WCC characterizes the spontaneous approach or automatic processing preference of people with autism, and is thus a cognitive ‘style’ best captured in open-ended tasks.” Coherence within the verbal-semantic domain will therefore be tested. Children will be presented with a quasi-open-ended paradigm, in which speech stimuli will contain competing perceptual and linguistic information, and can thus be processed at either the perceptual or linguistic level.

Methods

Participants

For this experiment, 56 children were recruited from schools in Southwest England. They were a subset of children taking part in studies by Järvinen-Pasley et al. (accepted for publication). Twenty-eight children with a formal diagnosis of autistic disorder (AD) or Asperger disorder (AS) according to DSM-IV (APA, 1994) or ICD-10 (World Health Organization, 1993) criteria were recruited from two specialist educational establishments for children with autism. The diagnostic information was gathered from school files of documented medical diagnoses and clinical reports, and showed that each child had individually received a diagnosis, or had their diagnosis confirmed, by experienced clinicians within 2 years prior to conducting the study. No child had a diagnosis of Pervasive Developmental Disorder, Not Otherwise Specified, and 57% of the children had a diagnosis of AD and 43% had a diagnosis of AS. One child with AS had, at one stage, been diagnosed as having AD but the diagnosis had been amended to that of AS in that they exhibited autistic symptoms without a clinically significant delay in early language development. The children with a diagnosis of AD were classified as having autism accompanied by language delay. Children with Rett syndrome, Childhood Disintegrative Disorder or autism-related medical conditions, such as Fragile X syndrome, tuberous sclerosis, and neurofibromatosis, were not included in this study. Children in both the clinical and control groups were also excluded from participation if they had a diagnosis of any medical disorder (e.g., epilepsy). The selected children all met the following criteria: they had a mono-lingual English-speaking home environment and were Caucasian, had no sensorineural hearing impairment, and showed no evidence of neurological or medical abnormalities. About 71% of these children showed fluent use of spoken language. 
This information was gathered from the children’s teachers and was also noted by the experimenter in a pre-test conversation phase.

Control children were matched on an individual basis to those with autism for chronological age (CA), receptive vocabulary measured by the British Picture Vocabulary Scale (BPVS) (Dunn, Whetton, & Pintilie, 1997), and non-verbal intelligence (NVIQ) measured by the Raven Standard Progressive Matrices (RSPM) (Raven, Court, & Raven, 1992). Children were matched on the BPVS for two reasons: to ensure that participants in this study had a similar non-syntactic verbal level, and to ensure consistency with the research literature, in which numerous previous studies have matched participants on receptive vocabulary level as a substitute for overall language abilities (e.g., Mottron, 2004).

Control children were recruited from a mainstream primary school, a primary school for children with moderate learning difficulties (MLD), and a mainstream secondary school with a specialist unit for children with MLD. As 39% of children in the autism group had BPVS and RSPM standardized scores falling at least two standard deviations below the population mean (score ≤70), the same proportion of children in the control group had MLD. Only children whose learning difficulties were of non-specific origin were included as participants in this study, so as to avoid introducing a systematic bias of an accompanying disorder. No children were categorized as having language-specific disorders; instead, their language difficulties appeared to result from a combination of generally low cognitive abilities and environmental disadvantages. These children had no known history of neurological disorder or head injury. If there was any reason to suspect difficulties in social development for any control child (e.g., a sibling with an autistic spectrum disorder), participation was also ruled out. No children showed evidence of autistic-like behaviors or language use. All diagnostic information was again gathered from school personnel and school files of documented medical diagnoses and clinical reports, and the selected children all met the same criteria as specified for children with autism. About 61% of the control children were typically developing. They were characterized by their teachers as showing average academic ability and as having no special problems. These children had no history of a language, neurological or medical disorder, and scored within the normative range on the BPVS and RSPM. Children were also screened for musical training. Only those who had not undergone periods of extensive musical training, defined as having taken two or more years of individual music lessons, were included in the study.

Informed consent was obtained from the parents of all participating children. Ethical approval was obtained from the Research Ethics Committee of Goldsmiths College, University of London, which is in accordance with the guidelines of the British Psychological Society (BPS).

Table 1 shows the demographic characteristics of the two groups of children. No significant between-group differences in age, the BPVS, or the RSPM standardized scores are in evidence (t tests all p ≥ 0.743). Furthermore, the samples did not significantly differ in terms of gender ratio (chi square test p = 0.093).

Table 1 Characteristics of the two participant groups (SD; range, in parentheses)

Piloting

Linguistic picture-sentence pairs were piloted on typically developing children. Seven children of average academic ability were recruited from a mainstream primary school in the Greater London area. The mean age of this group was 10 years, 3 months.

Thirty-six sentences were individually piloted along with a visual display consisting of three different pictures. One of the pictures was linguistically related to the sentence (correct target) whilst the two others were distracter items. The pictures were selected from the Peabody Picture Vocabulary Test (PPVT) (Dunn & Dunn, 1981), and each depicted a possible scenario. The position of the target item was randomized across the stimuli. The sentences were five or six syllables in length, and were read by a native female English speaker. As the PPVT is a test of receptive vocabulary, in which the level of difficulty of the test items varies, none of the test words associated with the pictures was included in the stimulus sentences. Instead, the sentences were composed of words that occur frequently in spoken and written language, to control for the level of linguistic difficulty. They were selected using the Medical Research Council (MRC) Psycholinguistic database, Version 2.0 (Wilson, 1988), and had a mean Familiarity rating of 585, a mean Kučera Francis Frequency rating of 3,183, and a mean Brown Frequency rating of 686. Sentences never directly named any of the objects in the pictures, but rather referred to “situations.” For example, the linguistic choice options for the sentence “It’s dinner-time soon” were a picture of a woman peeling potatoes, a girl wrapping a present, or a boy climbing a fence. The stimuli were presented on a laptop computer, and each auditory sample was followed by a different visual display.

The children were tested individually in a quiet room in their own school. The experimenter told the child that s/he was going to hear some short sentences (pre-recorded onto the computer), and that they would see three pictures on the screen after each sentence. The child was asked to point to the picture that s/he thought best matched the sentence.

As the chance rate of responding correctly in this task was 0.33, any items that yielded lower than a 50% correct response rate across participants were eliminated. This resulted in 33 sentences. The 24 highest scoring sentences were selected as test stimuli (each receiving ≥95% correct response rates), and a further eight (each receiving ≥90% correct response rates) were used for training trials. Cronbach’s alpha was used to calculate the intra-class correlation co-efficients for the 24 items selected as test stimuli, which was at a mean of 0.96, with a minimum of 0.86, and a maximum of 1.0. Thus, the linguistic content in the experimental stimuli was well controlled for difficulty.
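
The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item labels and correct-response rates are invented and are not the pilot data, and the function name `select_items` and its parameters are hypothetical.

```python
def select_items(rates, n_test=24, n_train=8, floor=0.50,
                 test_min=0.95, train_min=0.90):
    """Split piloted items into test and training sets by correct-response rate.

    rates maps an item label to the proportion of pilot children who chose
    the correct picture for that item.
    """
    # Eliminate items answered correctly by fewer than half of the children.
    retained = {item: r for item, r in rates.items() if r >= floor}
    # Order the retained items from highest to lowest correct-response rate.
    ranked = sorted(retained, key=lambda item: retained[item], reverse=True)
    # Highest-scoring items that meet the test threshold become test stimuli.
    test = [item for item in ranked if retained[item] >= test_min][:n_test]
    # Remaining items that meet the training threshold become training items.
    rest = [item for item in ranked if item not in test]
    train = [item for item in rest if retained[item] >= train_min][:n_train]
    return test, train


# Invented rates for 36 hypothetical items (NOT the study's pilot data).
pilot_rates = {f"s{i:02d}": 1.0 for i in range(1, 26)}
pilot_rates.update({f"s{i:02d}": 0.9 for i in range(26, 34)})
pilot_rates.update({f"s{i:02d}": 0.4 for i in range(34, 37)})

test_items, training_items = select_items(pilot_rates)
```

With these invented rates, three items fall below the 50% floor, leaving 33 items from which the test and training sets are drawn.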

Training stimuli

Eight sentences, selected on the basis of the pilot study, were recorded directly onto a laptop computer. The speech samples were edited using the Praat speech editor (Boersma, 2001). The sentences were read by a native English-speaking female in such a way as to produce one of four pitch contours (ascending, descending, low-high-low, high-low-high). Visual inspection of the fundamental frequency (F0) curves was used to ensure that the contours were produced as intended; where necessary, sentences were re-recorded until the desired contours were obtained. The perceptual contours were piloted on a group of ten undergraduate psychology students to ensure that they were perceived as intended. Four training blocks, using the eight sentences, were then built on a laptop computer. Perceptual training block (a) included four sentences of which each conformed to a different pitch contour. The presentation of each sentence was followed by a visual display depicting the four contours. Perceptual training block (b) was constructed as described above, but used the remaining four sentences. Linguistic training block (a) included the same sentences that were used in perceptual block (a), but here each sentence was followed by a visual representation of the correct linguistic choice and an incorrect linguistic choice (for materials, see pilot study). Correspondingly, linguistic training block (b) included the same sentences that were used in perceptual training block (b). The position of the correct choice was randomized across the linguistic training trials.

Experimental stimuli

Twenty-four sentences, conforming to equal numbers of four distinct pitch contours, were used in the test. The sentences were selected as described in the pilot study. The order in which the pitch contours appeared in the sentences was randomized across test stimuli. Twenty-four visual response slides were then constructed, with each including the correct pitch contour symbol, an incorrect pitch contour symbol, the correct linguistic choice, and an incorrect linguistic choice. The two perceptual choices were located in opposite corners of the screen (so as to be diagonally opposed to each other), and with each consecutive slide the two response modalities swapped diagonals (see Fig. 1). The positioning of correct perceptual and linguistic targets was randomized across slides. During presentation each sentence was followed by a visual display.
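
The slide layout just described can be illustrated with a minimal Python sketch. The corner labels, option labels, function name `build_slides`, and the fixed random seed are all assumptions made for illustration; the sketch is not the procedure actually used to build the slides.

```python
import random

# The four screen corners grouped into the two diagonals (hypothetical labels).
DIAGONALS = [("top_left", "bottom_right"), ("top_right", "bottom_left")]


def build_slides(n_slides=24, seed=0):
    """Return one corner-to-option mapping per slide."""
    rng = random.Random(seed)
    slides = []
    for i in range(n_slides):
        # The perceptual (contour) pair and the linguistic (picture) pair
        # swap diagonals on each consecutive slide.
        perceptual = list(DIAGONALS[i % 2])
        linguistic = list(DIAGONALS[(i + 1) % 2])
        # Randomize which corner of each diagonal holds the correct option.
        rng.shuffle(perceptual)
        rng.shuffle(linguistic)
        slides.append({
            perceptual[0]: "correct_contour",
            perceptual[1]: "incorrect_contour",
            linguistic[0]: "correct_picture",
            linguistic[1]: "incorrect_picture",
        })
    return slides


slides = build_slides()
```

Each mapping places the two contour symbols on one diagonal and the two pictures on the other, so neither response modality is tied to a fixed screen position.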

Fig. 1 Examples of visual slides used in the experimental stimuli, for sentences (a) “I Like Growing Older,” and (b) “I will Lose my Job”

Procedure

The experiment was carried out at the various participating schools. Each child was tested individually in a quiet room, and all stimuli were presented via loudspeakers. The first author (A. J. P.) conducted all experimental testing. Training preceded the administration of the experimental task to ensure that all children could perform auditory to visual mapping. The order of presentation of training blocks was counterbalanced across participants, in a way that ensured that no two blocks belonging to the same response domain (perceptual/linguistic) were presented in succession. The child with autism and their matched control always received the training blocks in the same sequence. The four possible training sequences were: (1) perceptual training (a), linguistic training (a), perceptual training (b), and linguistic training (b); (2) linguistic training (a), perceptual training (a), linguistic training (b), and perceptual training (b); (3) perceptual training (b), linguistic training (a), perceptual training (a), and linguistic training (b); and (4) linguistic training (b), perceptual training (a), linguistic training (a), and perceptual training (b).
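
The four training sequences listed above can be written out and checked with a short Python sketch. The block codes ("P-a" for perceptual training (a), "L-b" for linguistic training (b), and so on) and both function names are hypothetical. The rule for assigning matched pairs to sequences is not stated in the text, so the simple rotation used here is an assumption.

```python
# The four training sequences, using hypothetical block codes:
# "P" = perceptual, "L" = linguistic, "a"/"b" = sentence set.
SEQUENCES = {
    1: ["P-a", "L-a", "P-b", "L-b"],
    2: ["L-a", "P-a", "L-b", "P-b"],
    3: ["P-b", "L-a", "P-a", "L-b"],
    4: ["L-b", "P-a", "L-a", "P-b"],
}


def domains_alternate(sequence):
    """True if no two consecutive blocks share a response domain."""
    return all(a[0] != b[0] for a, b in zip(sequence, sequence[1:]))


def assign_sequences(n_pairs):
    """Rotate matched pairs through the sequences (assumed assignment rule).

    Both members of a matched pair receive the same sequence, so one
    entry per pair is sufficient.
    """
    return {pair: SEQUENCES[pair % 4 + 1] for pair in range(n_pairs)}


assignment = assign_sequences(28)
all_valid = all(domains_alternate(s) for s in SEQUENCES.values())
```

Under this assumed rotation, the 28 matched pairs are spread evenly across the four sequences.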

For the perceptual training, the child was told that sentences could be said in different ways, so as to form differently sounding shapes, depending on how “high” the voice sounded. The child was then shown a visual display depicting the four possible pitch contour shapes on the laptop computer, and told that his or her task would be to point to the shape that s/he thought best matched each sound. A training block of four sentences, similar to those used in the actual experiment, was then played on the computer. If the child’s response was inaccurate, the experimenter corrected the child. In order to proceed to the second block of training, the child was required to make at least two correct judgments out of four stimuli in response to the contour information. If not, then a different block of sentences was played, until this criterion had been reached. Participants took as many practice trials as needed until reaching this criterion. About 71% of the children achieved this during the first trial, and the remaining children reached this on a second trial.

For the linguistic training block, the child was further told that each of the sentences told a little story that could be depicted by a picture, and now the experimenter was going to play the sentences again, but this time the child’s task would be to point to a picture that s/he thought best matched the meaning of each sentence. The child was shown a visual display depicting four different pictures taken from the PPVT on the laptop computer. If the child’s response was inaccurate, the experimenter corrected the child. In order to proceed to the second block of training, the child was required to make at least two correct judgments out of four stimuli in response to the linguistic information. If not, then a different block of sentences was played, until this criterion had been reached. Participants took as many practice trials as needed until reaching this criterion. About 70% of the children achieved this during the first trial, 20% achieved this during the second trial, and the remaining 10% of children reached this on a third trial; thus, no children were excluded from the study on the basis that they could not be trained. For training given in the above-described sequence, the remaining perceptual block would be followed by the remaining linguistic block. Once the training phase was completed, the experimenter told the child that s/he would hear more sentences and could either match them to a shape or a picture. One practice item was then played to familiarize the child with the actual test materials. The child was asked to respond in whichever manner seemed best to him or her. No feedback was given, and the experimenter recorded the children’s responses.

Results

The means and standard deviations for the number of perceptual and linguistic choices are shown in Table 2, for both the children with autism and their matched controls.

Table 2 Means and standard deviations for the response type choices for both the children with autism and their matched controls (Number of trials = 24)

As Figs. 2 and 3 below show, the distributions of scores varied widely between the two groups.

Fig. 2 Box plots of data distribution within the perceptual response choice category for the children with autism and their controls

Fig. 3 Box plots of data distribution within the linguistic response choice category for the children with autism and their controls

A group of three control participants produced statistically significant outlier scores. However, due to the quasi-open-ended design of the study, these children’s scores were included in the statistical analyses.

A two-tailed Mann–Whitney U-test was carried out on the perceptual choice data in order to compare performance between the two groups. It was only necessary to perform the analysis for one set of the choice data, as the response type categories were mutually exclusive. This analysis revealed a significant effect of group (Z = −4.52, p < 0.001), with the children with autism making significantly more perceptual choices compared with the control children. However, Wilcoxon tests showed that both groups of children provided significantly more linguistic than perceptual interpretations of the stimuli (Autism: Z = −2.21, p = 0.027; Control: Z = −5.42, p < 0.001). Bonferroni alpha corrected correlations were then carried out between age, intelligence, and response preference data. All correlations failed to reach significance for both groups of children (all ≤ 0.24), suggesting that neither age nor cognitive ability significantly contributed to the children’s response preferences.
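
For readers unfamiliar with the between-group comparison, the following Python sketch shows how a Mann–Whitney U statistic and its normal-approximation Z can be computed. It is a simplified illustration, not the analysis actually run: the statistical software used is not stated in the text, the function name `mann_whitney_z` is hypothetical, the sketch applies a tie correction but no continuity correction, and the example counts are invented rather than taken from the study. The sign of Z depends on which group is entered first.

```python
import math
from collections import Counter


def mann_whitney_z(x, y):
    """Return (U, z, two-tailed p) for sample x versus sample y.

    Tied values receive average ranks; z uses the normal approximation
    with a tie-corrected variance and no continuity correction.
    """
    counts = Counter(list(x) + list(y))
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    position = 0
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = position + (t + 1) / 2
        position += t
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # U for sample x: its rank sum minus the smallest possible rank sum.
    u = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    sd = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (u - n1 * n2 / 2) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


# Invented perceptual-choice counts out of 24 trials (NOT the study's data).
control_counts = [0, 1, 0, 2, 0, 13, 0, 1]
autism_counts = [10, 8, 12, 0, 14, 6, 9, 3]
u_stat, z_stat, p_value = mann_whitney_z(control_counts, autism_counts)
```

Because each child's perceptual and linguistic choices sum to the number of trials, running the same test on the linguistic counts would give the same result with the sign of Z reversed, which is why a single analysis suffices.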

The children’s response patterns with regard to the individual stimulus items were then explored in order to determine whether any response biases were in evidence for either group of children. This analysis was only carried out for the perceptual responses, since the linguistic and perceptual categories were mutually exclusive. This analysis showed that, for the autism group, the children’s responses ranged from seven to 14 perceptual preferences per stimulus item, whilst for the controls, the perceptual preferences ranged from zero to six responses per item, with two stimuli receiving no responses. This suggests that no strong response biases were present. In order to assess the temporal reliability of the linguistic items, Cronbach’s alpha was again used to calculate the intra-class correlation co-efficient for the responses of the control children, who primarily tended to respond linguistically. Mean Cronbach’s alpha was 0.84, with a minimum of 0.61, and a maximum of 1.0. Taken together with the mean Cronbach’s alpha values for the piloting phase, which excluded children with learning disabilities, this suggests that the internal consistency reliability of the linguistic test items was satisfactory.

In order to explore differences in response accuracy within both response type domains (perceptual/linguistic), correct perceptual and linguistic identification scores were converted into percentages correct out of total numbers of responses made. The means and standard deviations for these scores are shown in Table 3.
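
The conversion to percentage correct can be illustrated with a minimal Python sketch. The function name `percent_correct` and the example child are hypothetical. How children who made no responses in a domain were handled is not stated in the text, so returning `None` for such cases is an assumption.

```python
def percent_correct(n_correct, n_responses):
    """Percentage correct among the responses a child made in one domain."""
    if n_responses == 0:
        # Assumption: accuracy is treated as undefined when the child made
        # no responses in this domain.
        return None
    return 100.0 * n_correct / n_responses


# Hypothetical child over 24 trials: 9 perceptual choices (6 correct)
# and 15 linguistic choices (12 correct).
perceptual_accuracy = percent_correct(6, 9)
linguistic_accuracy = percent_correct(12, 15)
```

Scoring accuracy relative to the responses made, rather than relative to all 24 trials, keeps the accuracy measure separate from the preference measure.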

Table 3 Means and standard deviations for the % correct responses within the perceptual and linguistic choice domains for both the children with autism and their controls

As can be seen from Table 3, the children with autism provided more accurate perceptual interpretations than their matched control children. A two-tailed Mann–Whitney U-test confirmed that this difference was statistically significant (Z = −1.99, p = 0.023). Wilcoxon tests showed that the children with autism were equally accurate in their perceptual and linguistic judgments (Z = −0.017, p = 0.986), whilst controls showed significantly poorer accuracy in their perceptual than linguistic responses (Z = −2.22, p = 0.026). A Mann–Whitney U-test showed that linguistic accuracy of children with autism was not significantly different to that of their control children (Z = −0.70, p = 0.485).

Finally, Bonferroni alpha corrected correlations were carried out between the age, intelligence, and accuracy data. These showed that for the children with autism, accuracy within the linguistic domain correlated positively with receptive vocabulary [r (28) = 0.49, p = 0.05], and the accuracy of perceptual and linguistic interpretations was negatively associated with each other [r (24) = −0.73, p = 0.01], whilst no such association was in evidence for the controls [r (11) = −0.59, n.s.]. All other correlations failed to reach significance (all  0.11).

Discussion

The present experiment utilized a quasi-open-ended paradigm to test linguistic (semantic) and perceptual (intonation) speech processing preferences in children with autism and in control children matched for age, receptive vocabulary, and NVIQ. The principal aim of the study was to identify atypical speech processing biases that may contribute to the undercutting of higher-level language skills in children with autism. The main finding from the study was that, whilst children with autism provided significantly more perceptual interpretations of the stimuli than their controls, both groups preferentially responded to the linguistic content of the speech samples. However, the tendency to provide linguistic interpretations of the stimuli was significantly weaker in children with autism than in controls in that whilst the controls preferentially responded to the meaning of the stimuli on 94% of trials, this was true for only 65% of the autism group responses. An analysis of response accuracy within linguistic versus perceptual domains showed that children with autism provided significantly more accurate perceptual judgments than their matched controls. However, no group differences emerged on the linguistic judgment comparison and whilst children with autism made fewer linguistic responses than controls, these were as accurate. Within the control group, a subgroup of three children with MLD, whose scores were statistically significant outliers, provided a greater number of perceptual interpretations of the stimuli than the other control children. These children responded to the perceptual information on 17, 46, and 54% of trials. However, inspection of their accuracy data showed that they differed from their counterparts with autism in showing greater accuracy in linguistic interpretations and poorer identification of intonation contours. This suggests that superior processing of perceptual information in speech is specific to autism.

The current finding showing a linguistic speech processing preference in participants with autism may appear inconsistent with the WCC and EPF theories, which would variously predict that either featural/surface-biased information processing (Happé & Frith, 2006) or EPF (Mottron et al., 2006) is the “default” or preferred processing style in individuals with autism. However, studies testing WCC at the verbal-semantic level have shown that the otherwise robust tendency to process stimulus features disappears when participants are specifically instructed to focus upon the semantic level of the presented stimuli (e.g., Snowling & Frith, 1986). Similarly, although the EPF account proposes locally oriented and EPF in autism, higher-level processing capacities are assumed to be unimpaired. This is consistent with the current finding showing no between-group differences in linguistic accuracy. A further assumption of the theory is that individuals with autism are able to regulate perceptual versus higher-order cognitive processing flexibly, depending upon task demands. Whilst the current finding showing a lack of a robust speech processing bias in autism is consistent with the EPF theory, this principle is difficult to test experimentally. Open-ended tasks are rarely employed in cognitive research with participants with autism, and the current findings may have been influenced both by the design of the study, and by the fact that children were trained equally on both linguistic and perceptual components of the speech stimuli. Indeed, the task support hypothesis (Bowler et al., 1997) proposes that individuals with autism will be less impaired relative to controls in open-ended rather than in forced-choice tasks, as well as under conditions where training and prompts are provided. Such a design was utilized in the current study.

In the introduction, a number of research findings indicating that autism is characterized by a processing style in which perception dominates semantic processing were outlined (Hermelin & O’Connor, 1967; Mottron et al., 2001; Toichi & Kamio, 2002). This was the case for 25% of the current participants with autism. In contrast, only one control child with MLD preferentially responded to perceptual information for over one-half (54%) of the stimulus trials. Whilst linguistic pitch or intonation is the most significant of prosodic cues (Lieberman, 1960), and intonation contours are typically highly salient, those presented in the current experiment were not linguistically meaningful. Indeed, the controls appeared to largely ignore this information and preferentially respond to the meaning of the speech samples.

The current findings showing atypical processing of speech in autism are particularly important when considered within the context of widely documented abnormalities in prosodic processing and in sentence comprehension (e.g., Jolliffe & Baron-Cohen, 1999; McCann & Peppé, 2003). In the expressive domain, prosodic abnormalities, when present, are highly persistent and in evidence early in development (Simmons & Baltaxe, 1975). Importantly, these abnormalities constitute one of the most significant obstacles to social adjustment (Paul et al., 2005b). In typical development, increased awareness of the role of prosodic cues in speech is linked to developing communication and social skills. The current findings showing atypical processing of this type of auditory information raise important questions about language acquisition and speech perception in individuals with autism.

Research mapping language acquisition in typically developing infants has highlighted the importance of infants’ socially motivated interest in speech (Fernald, 1985; Fernald & Kuhl, 1987). In contrast are studies showing that social stimuli are markedly less salient for infants and children with autism (Dawson, Meltzoff, Osterling, Rinaldi, & Brown, 1998; Klin, 1991). Indeed, Kuhl, Coffey-Corina, Padden, and Dawson (2005) showed that neural mechanisms specialized for processing speech had failed to develop in pre-school children with autism who preferred a non-speech analogue to child-directed speech. A number of other neurological studies have reported speech-specific cortical activation abnormalities in individuals with autism (e.g., Boddaert et al., 2004; Gervais et al., 2004; Lepistö et al., 2005). A failure to learn to assign emotional and linguistic significance to, for example, perceptual cues in speech, as well as to attend to the linguistic meaning of utterances, may well be downstream effects of the early neglect of social/communicative cues.

In conclusion, the current findings showing enhanced awareness of perceptual information in speech, a weakened linguistic bias, and an unimpaired linguistic accuracy in participants with autism are largely consistent with both the WCC and EPF theories. In the present study, a heterogeneous sample of children with autism was tested. However, it was clear from the data analysis that the atypical speech processing demonstrated in the study did not distinguish between children with higher and lower cognitive ability. The speech processing abnormalities reported in the current study may impact differentially upon intellectually able and less able individuals, in the latter case by undercutting language acquisition, and in the former case by limiting higher social and academic expectations and achievements. If, as the results suggest, increased attention to the perceptual level of speech contributes to linguistic processing abnormalities in autism, a major research goal will be to inform the development of intervention techniques and educational approaches aimed at ameliorating the negative effects of such a tendency. For example, directing the individual’s attention to the linguistic meaning of utterances may result in enhanced semantic processing (cf. Jolliffe & Baron-Cohen, 1999; Snowling & Frith, 1986). Increased awareness of the communicative and social meanings of prosodic cues may be achieved, for example, by over-emphasis (cf. Paul et al., 2005a, b). This may enable listeners with autism to link acoustic variations in speech with specifically linguistic and pragmatic functions, thereby increasing access to meaning in speech. Furthermore, exercises in which the importance of directing attention to both perceptual features and linguistic content is made explicit may serve to improve the listener’s interpretation of communicative utterances. Future studies should focus upon elucidating interpretations of the current data. For example, it may be possible to determine the extent to which perceptual processing is influenced by linguistic relevance in autism by comparing performance with communicatively functional versus non-functional auditory-perceptual cues. Studies might also manipulate the level of linguistic complexity (semantic/pragmatic) and measure its effects upon perceptual processing. The current findings also highlight the need for experimental paradigms in which the perceptual and linguistic dimensions can be manipulated independently.