Introduction

Individuals with autism spectrum disorder (ASD) have impaired social interaction and communication, and restricted, repetitive and stereotyped patterns of behavior and/or interests. Symptom expression and severity in these core domains, however, vary considerably within the ASD population (Jones and Klin 2009). Indeed, clinical observations, parent report, and behavioral studies indicate a complex and highly variable phenotype across the spectrum (e.g., Charman et al. 2011; Geschwind and Levitt 2007; Lord et al. 2000, 2012; Volkmar et al. 1989). Observations of atypical brain activity in ASD are ubiquitous (Ameis et al. 2011; Boddaert et al. 2004; Brandwein et al. 2013; Cardinale et al. 2013; Courchesne et al. 1985; Dunn et al. 2008; Fan et al. 2012; Fishman et al. 2014; Frey et al. 2013; Green et al. 2013; Jemel et al. 2010; Kemner et al. 1995; Mak-Fan et al. 2013; Nair et al. 2013; Roberts et al. 2010; Russo et al. 2009, 2010; Wolf et al. 2008), and it is reasonable to assume that such atypicalities relate systematically to variance in the autism phenotype. Bridging biological processes to clinical phenotype is clearly essential to understanding the neurobiology of ASD. There is a growing literature probing how neural processing differences relate to symptoms (e.g., Campbell et al. 2010; Coutanche et al. 2011; Edgar et al. 2013; Elsabbagh et al. 2011; Hu 2012; Roberts et al. 2011; Russo et al. 2009), yet much work remains to arrive at a thorough understanding of these relationships.

Event-related potentials (ERPs) provide a direct measure of the brain’s response to sensory inputs. Clearly identifiable transitions in scalp-topography of the ERP reflect successive cortical phases of analysis (Leavitt et al. 2007; Picton et al. 1974). These evolving processing stages are observed at the scalp as a set of positive and negative deflections in the evoked response, commonly referred to as components, each reflecting coordinated activity within or across a network of cortical regions. The high temporal resolution of electrophysiological recordings (i.e., electroencephalogram or EEG) allows one to parse the response in terms of early cortical sensory registration, sensory-perceptual processing, and later cognitive stages of processing (Foxe and Simpson 2002; Lucan et al. 2010; Naatanen and Picton 1987). Studies using EEG recordings of brain activity reveal the presence of differences in auditory (e.g., Dunn et al. 2008; Kemner et al. 1995; Lepisto et al. 2005) and visual (e.g., Frey et al. 2013; Jemel et al. 2010) sensory-perceptual processing, as well as decreased integration of multisensory inputs (e.g., Brandwein et al. 2013; Russo et al. 2010) in ASD. Neurophysiological indices of sensory processing atypicalities may reflect neuropathology underlying clinical symptoms of ASD, and as such serve as strong candidates for biomarkers of clinical phenotype.

Unusual responses to sensory information have long been noted in ASD, and were in fact documented in some of the original descriptions of the disorder provided by Kanner (1943) and Asperger (1944). Recent studies indicate that a significant proportion of individuals with ASD have aberrant and pathological responses to sensory events (Ben-Sasson et al. 2009; Rogers and Ozonoff 2005), with estimates ranging from 45 to 90 % (Ben-Sasson 2011; Ben-Sasson et al. 2008; Leekam et al. 2007; Tomchek and Dunn 2007). These can involve a wide range of atypical reactions to stimulation including outright aversion to certain touches or sounds, indifference to other sounds, and obsession with particular types of visual stimulation. Families report that their routines and activities are significantly affected by their child’s sensory-related behaviors (Schaaf et al. 2011), and a recent study suggested that sensory over-responsivity in toddlers with autism is associated with increased maternal stress, and with disruptions in family life (Ben-Sasson et al. 2013). It has been proposed that sensory processing differences contribute to the social, cognitive, and repetitive behaviors and restricted interests associated with ASD (Baranek et al. 2006; Cascio et al. 2012; Mongillo et al. 2008; Mottron et al. 2006a; Schaaf et al. 2014), and even that failure to develop normal modulation and integration of sensory inputs is at the root of autism (i.e., ‘Sensory Integration Theory’ from Ayres 1979; Brock et al. 2002; Frith 1996; Happé 2005; Hermelin and O’Connor 1970; Hutt et al. 1964; Just et al. 2004b; Mottron et al. 2006b; Ornitz 1974; Ornitz et al. 1977).

The current investigation was designed to probe the clinical significance of EEG indices of auditory and visual sensory processing and integration. That is, how well do these neurophysiological responses predict the severity of core and associated symptoms of autism in a well-characterized sample of children and adolescents with ASD? We tested the hypothesis that the amplitude of auditory and visual sensory ERPs (e.g., the auditory N1-complex and the visual P1), and ERPs associated with multisensory integration (MSI), are systematically related to (a) autistic symptom severity and/or (b) visual/auditory sensitivities. Since behavioral measures were also available, we likewise tested the hypothesis that variance in reaction times to auditory, visual, and audiovisual stimuli, as well as a psychophysical index of MSI, are systematically related to (a) autistic symptom severity and/or (b) visual/auditory sensitivities. Analyses were focused on a purely clinical sample (see Brandwein et al. 2013 for a group level comparison between individuals with ASD and individuals with typical development), leveraging EEG and behavioral ASD data from Brandwein et al. (2013) (plus additional data that had been gathered from individuals with ASD in the intervening time). For dependent measures, EEG indices of early auditory and visual sensory processing and sensory integration (Brandwein et al. 2011, 2013; Di Russo et al. 2002; Foxe and Simpson 2002; Molholm et al. 2002; Naatanen and Picton 1987) were used. A measure of autism severity was derived from the Autism Diagnostic Observation Schedule (ADOS) (Lord et al. 1999), a semi-structured observation of the individual designed to measure both quantity and quality of social-communication skills, as well as stereotyped behaviors and restricted interests. To assess the presence of atypical reactions to visual and auditory sensory stimulation we relied on the Short Sensory Profile (SSP) (McIntosh et al. 1999a), a questionnaire on which parents/caregivers rate their child’s reactions, preferences, and tendencies when confronted with everyday sensory stimuli and situations.

Methods

Participants

Participants consisted of a purely clinical sample of fifty-two individuals with ASD (6–17 years, seven females). For the analysis of symptom severity, nine of these individuals were excluded because severity scores are not available for Module 4 of the ADOS (resulting in a total N = 43 for this analysis). For the analysis of visual/auditory sensitivities, six individuals were excluded due to missing SSP data (resulting in a total N = 46 for this analysis). The ADOS was administered and scored by a research-reliable psychologist or trainee, and an ASD diagnosis was confirmed with a developmental history and clinical judgment. Intellectual functioning was assessed using the Wechsler Abbreviated Scales of Intelligence (WASI, Wechsler 1999). Table 1 describes the intellectual makeup of the full sample. Supplemental Tables 1 and 2 characterize the sample separately for the analysis of autism symptom severity and visual/auditory sensitivities, respectively. Exclusionary criteria included a history of seizures (non-febrile) or head trauma, a performance IQ (PIQ) estimate below 80, or a known genetic disorder. The dataset was 69 % Caucasian, 12 % African American, 12 % Asian American, 4 % mixed race, 2 % Native American, and 2 % unspecified. Regarding maternal education, 11 % of mothers had a high school degree or less, 22 % had a bachelor’s, associate’s degree or some college education, and 19 % reported a graduate or professional degree. Audiometric threshold evaluation confirmed that all participants had normal hearing. Participants were formally screened for normal or corrected-to-normal vision using a Snellen eye test chart. Informed written consent was obtained from each participant’s parent or legal guardian prior to entering the study. Verbal or written assent was obtained from each participant. 
All procedures were approved by the Institutional Review Boards of the Albert Einstein College of Medicine, the City College and the Graduate Center of the City University of New York, and were in accord with the ethical standards laid out in the Declaration of Helsinki. Participants were recruited through the Human Phenotyping Core (a facility of the Rose F. Kennedy Intellectual and Developmental Disabilities Research Center), referrals from clinicians (primarily at the Albert Einstein College of Medicine), advertising, and at community health fairs.

Table 1 Range, mean, and standard deviations of age, verbal and performance IQ for all 52 participants

Procedure

Clinical assessments, including the ADOS, WASI, and Sensory Profile, were administered at the participant’s initial visit to the laboratory. On the following visit, participants performed a simple reaction time task while continuous EEG was recorded. On average, the two visits were 3 months apart. The parameters of the task and the ERP acquisition, processing and analysis procedures are briefly described here, and in more detail in Brandwein et al. (2013).

Audiovisual Simple Reaction Time Task

Participants performed a simple reaction time task consisting of three stimulus conditions presented with equal probability. The ‘auditory-alone’ condition was a 1,000-Hz tone (75 dB SPL; 5 ms rise/fall time) presented from a single speaker for 60 ms. The ‘visual-alone’ condition was an image of a red disc with a diameter of 3.2 cm (subtending 1.5° in diameter at a viewing distance of 122 cm), which appeared on a black background for 60 ms. The ‘audiovisual’ condition consisted of the ‘auditory-alone’ and ‘visual-alone’ stimuli presented simultaneously. Auditory stimuli were presented from a Hartman Multimedia JBL Duet speaker located centrally atop the computer monitor (a Dell Ultrasharp 1704FTP) from which the visual stimuli were presented. The three stimulus conditions were presented in random order with an inter-stimulus interval (ISI) that varied randomly between 1,000 and 3,000 ms. Stimuli were presented in blocks of 100 trials each, and participants completed between 9 and 11 blocks (with the vast majority completing 10 blocks). Participants were instructed to press a button on a response pad as quickly as possible when they saw the circle, heard the tone, or saw the circle and heard the tone together. The same response key was used for all three stimulus types. Breaks were encouraged between blocks to help maintain concentration and reduce restlessness or fatigue.

Behavioral Indices

Mean reaction times (RTs) were computed for each of the three stimulus conditions, so that we could assess how well these predicted the clinical measures of interest. Only trials with RTs falling within 2 standard deviations of an individual’s average RT were considered valid. Thus, the range of RTs accepted was determined at the individual participant level. Given the large age range and focus on a clinical population, significant intersubject variability in RT was expected. Using a 95 % cutoff to define the time window for acceptable trials rather than an absolute cutoff value allowed us to more accurately capture the range of RTs for each participant, an important factor in calculating the race model (described below). Hit rates, defined as the percent of trials on which a button press occurred within the individual’s specific RT range, were calculated for each participant and planned comparisons assessed for differences in hit rates across the three stimulus conditions. Planned comparisons between RTs for each of the three conditions tested for the presence of the ‘redundant signal effect’ (RSE), which, in this case, would indicate behavioral facilitation (i.e., faster RTs) to the multisensory condition compared to each of the unisensory conditions. Our psychophysical index of MSI, however, was based on Miller’s race model (Miller 1982), a stringent and established behavioral metric of MSI (e.g., Barutchu et al. 2009; Hughes et al. 1994; Maravita et al. 2008; Molholm et al. 2002; Neil et al. 2006). The race model assumes that an RSE can occur because the multisensory stimulus has two inputs to trigger a response (e.g., auditory and visual), and the fastest input wins. This in turn can lead to a faster mean RT to multisensory stimuli due to probability summation. Miller’s race model tests whether RT facilitation exceeds that predicted by probability summation. 
When the race model is violated (i.e., when RT facilitation is greater than that predicted by the race model), it can be assumed that multisensory RT facilitation is due to the interaction of the unisensory inputs during processing.
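The participant-level trimming and hit-rate definitions above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the function names are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np

def trim_rts(rts, n_sd=2.0):
    """Keep only RTs within n_sd standard deviations of this participant's own mean."""
    rts = np.asarray(rts, dtype=float)
    mean = rts.mean()
    sd = rts.std(ddof=1)  # assumption: sample SD; the paper does not specify
    keep = np.abs(rts - mean) <= n_sd * sd
    return rts[keep]

def hit_rate(rts, n_trials, n_sd=2.0):
    """Percent of presented trials with a response inside the participant-specific RT window."""
    return 100.0 * trim_rts(rts, n_sd).size / n_trials
```

Because the window is defined per participant, a slow but consistent responder keeps most of their trials, whereas an absolute cutoff would discard them.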

Miller’s race model (Miller 1982) is tested as follows: An upper limit is placed on the cumulative probability (CP) of a response at a given latency for redundant signals (i.e., the multisensory condition). For any latency, t, the race model holds when this CP value is less than or equal to the sum of the CP from each of the single target stimulus conditions (the unisensory stimuli). For each individual, the range of valid RTs was calculated for the three stimulus types (auditory-alone, visual-alone, and audiovisual) and divided into quantiles from the 5th to 100th percentile in 5 % increments (5, 10, …, 95, 100 %). Violations were expected to occur at quantiles representing the shorter RTs because this is when it was most likely that interactions of the visual and auditory inputs would result in the fulfillment of a response criterion before either source alone satisfied the same criterion (Miller 1982; Ulrich et al. 2007). A ‘Miller Inequality’ value is calculated by subtracting the value predicted by the race model from this cumulative probability value, and positive values represent the presence and amount of race model violation. It is important to note that failure to violate the race model is not evidence that the two information sources did not interact, but rather it places an upper boundary on RT facilitation that can be accounted for by probability summation. In the current study, we used maximum race model violation as the behavioral measure of MSI. Maximum race model violation is defined here as the largest ‘Miller inequality’ value across the first third of the distribution of RTs for each individual. To assess race model violation at the group level, the ‘Miller inequality’ value (from each participant, at each quantile considered) is submitted to a t test. The group is said to violate the race model at quantiles in which the t test was significant and the ‘Miller inequality’ value was positive.
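A minimal sketch of this test for one participant follows. The race model holds at latency t when CP_AV(t) ≤ CP_A(t) + CP_V(t), and the ‘Miller inequality’ value is the left side minus the right side. Several details are assumptions made for illustration only: the latencies are taken as percentiles of the pooled valid RTs, the bound is capped at 1, and the "first third" is read as the first 6 of the 20 quantiles. The function names are hypothetical.

```python
import numpy as np

def ecdf(rts, t):
    """Empirical cumulative probability of a response at or before each latency in t."""
    rts = np.sort(np.asarray(rts, dtype=float))
    return np.searchsorted(rts, t, side="right") / rts.size

def miller_inequality(rt_a, rt_v, rt_av, percentiles=np.arange(5, 101, 5)):
    """CP_AV(t) minus the race-model bound, at 20 latencies (5th to 100th percentile)."""
    pooled = np.concatenate([rt_a, rt_v, rt_av])
    t = np.percentile(pooled, percentiles)  # assumption: quantiles of the pooled RTs
    bound = np.minimum(1.0, ecdf(rt_a, t) + ecdf(rt_v, t))
    return ecdf(rt_av, t) - bound

def max_violation(rt_a, rt_v, rt_av):
    """Largest inequality value over the first third of the quantiles (fastest RTs)."""
    diff = miller_inequality(rt_a, rt_v, rt_av)
    return diff[: diff.size // 3].max()
```

A positive return value from `max_violation` indicates race model violation for that participant; a value at or below zero indicates facilitation that probability summation could account for.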

Electrophysiological Indices

ERP Acquisition

Continuous EEG was recorded from 70 scalp electrodes at a digitization rate of 512 Hz using the BioSemi ActiveTwo™ electrode system with an open pass-band from DC to 103 Hz. Continuous EEG was recorded referenced to a common mode sense (CMS) active electrode and a driven right leg (DRL) passive electrode (for a description of the BioSemi active electrode system referencing and grounding conventions, visit www.biosemi.com/faq/cms&drl.htm).

ERP Processing

Matlab was used for offline processing and analyses. A low-pass filter of 45 Hz with a slope of 24 dB/octave, and a high-pass filter of 1.6 Hz with a slope of 12 dB/octave were applied to each participant’s continuous EEG. To generate ERPs, the EEG was divided into 600 ms epochs (100 ms pre-stimulus to 500 ms post-stimulus onset) with baseline defined as −50 to +10 ms relative to stimulus onset. To ensure that participants were paying attention to the stimuli, only trials for which the participant made a response (button press) within a specific time window were included in the analysis.
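The epoching and baseline-correction step can be illustrated with a short sketch. The authors worked in Matlab; the Python below is only an illustration under stated assumptions: filtering is omitted, the array layout and function name are hypothetical, and the input is assumed to be already filtered data with stimulus onsets given as sample indices.

```python
import numpy as np

FS = 512  # digitization rate in Hz, as reported for the recording

def epoch_and_baseline(eeg, onsets, fs=FS, tmin=-0.100, tmax=0.500,
                       base=(-0.050, 0.010)):
    """Cut epochs around stimulus onsets and subtract the baseline mean.

    eeg: array of shape (n_channels, n_samples); onsets: sample indices.
    Returns (epochs, times) with epochs of shape (n_trials, n_channels, n_times).
    """
    start, stop = int(round(tmin * fs)), int(round(tmax * fs))
    times = np.arange(start, stop) / fs
    epochs = np.stack([eeg[:, o + start:o + stop] for o in onsets])
    in_base = (times >= base[0]) & (times <= base[1])
    # Subtract each trial's and channel's mean over the -50 to +10 ms baseline
    epochs = epochs - epochs[:, :, in_base].mean(axis=2, keepdims=True)
    return epochs, times
```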

Artifact Rejection

Electrode channels with amplitudes larger than ±120 μV during the epoch surrounding stimulus presentation were considered to have excessive electromuscular activity, including those resulting from large eye movements, and were interpolated on a trial-by-trial basis using the nearest-neighbor spline (Perrin et al. 1987, 1989). Channels with a standard deviation of <.5 μV across the block were interpolated on a block-by-block basis. Finally, if there were more than four bad channels in a trial, then the trial was rejected (i.e., no more than four channels were interpolated for any given trial). For a given condition, a minimum of 180 trials (with an average of about 250 trials) were included in each of the participant averages. Epochs were sorted according to stimulus condition and averaged for each participant. The resulting auditory, visual, and audiovisual ERPs were re-referenced to an average of all electrodes. For each participant, a “sum” waveform was created by summing together the auditory and visual ERPs (from the unisensory conditions), the purpose of which is described in the following section.
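The amplitude criterion and the trial-rejection rule can be sketched as below. This illustration only flags channels and decides which trials to keep; the spline interpolation of flagged channels and the block-level flat-channel check are not implemented. The function name and array layout are hypothetical.

```python
import numpy as np

def flag_and_reject(epochs, thresh_uv=120.0, max_bad=4):
    """Flag channels exceeding the amplitude criterion and reject over-contaminated trials.

    epochs: array of shape (n_trials, n_channels, n_times), in microvolts.
    Returns (bad, keep): bad[i, j] is True when channel j exceeds +/- thresh_uv
    on trial i; keep[i] is True when trial i has at most max_bad flagged channels.
    """
    bad = np.abs(epochs).max(axis=2) > thresh_uv
    keep = bad.sum(axis=1) <= max_bad
    return bad, keep
```

In a full pipeline, the channels flagged in `bad` on retained trials would be interpolated before averaging.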

ERP Analysis and Measures

EEG indices of early auditory and visual processing were based on the peak amplitudes of the auditory P1, N1a, N1b, and N1c and of the visual P1 and N1. The grand averaged ERPs across the full dataset for a given stimulus condition (auditory or visual) were used to identify the latency window and electrodes where the sensory evoked potential was largest (see Table 4). Automatic identification of the largest amplitude value within these timeframes and for these electrodes was then performed. Multisensory interactions were measured by comparing the sum of the responses to the auditory and visual unisensory conditions (the sum waveform) to the response to the multisensory audiovisual (AV) condition. This well-established and commonly used approach to measuring MSI (e.g., Brandwein et al. 2011; Foxe et al. 2000; Giard and Peronnet 1999; Molholm et al. 2002; Murray et al. 2005; Russo et al. 2010; Teder-Salejarvi et al. 2002) is based on the principle of superposition of electrical fields and nonlinear summation. Based on this general principle, any significant divergence between the sum and multisensory waveforms indicates that the auditory and visual inputs were processed differently when presented simultaneously versus when presented in isolation; i.e., that they interacted. EEG indices of MSI were based on the peak amplitudes of the MSI waveform (the difference between sum and multisensory responses) between 100 and 120 ms over fronto-central scalp, 100–130 ms over parietal scalp, and 180–210 ms over parieto-occipital scalp. These latencies and regions were defined by where and when ASD participants showed MSI in our earlier study (Brandwein et al. 2013).
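The two measurements described above, peak amplitude within a latency window and the sum-versus-multisensory difference wave, can be sketched as follows. The sign convention of the difference (sum minus multisensory) is an assumption, as the text only states that the MSI waveform is the difference between the two; the function names are hypothetical.

```python
import numpy as np

def peak_amplitude(erp, times, window, polarity):
    """Largest-amplitude value of a 1-D waveform inside a latency window.

    erp: waveform (e.g., averaged over the chosen electrodes); times in seconds;
    window: (start, end) in seconds; polarity: +1 for a positive peak, -1 for a negative one.
    """
    in_win = (times >= window[0]) & (times <= window[1])
    seg = erp[in_win]
    return seg.max() if polarity > 0 else seg.min()

def msi_waveform(erp_a, erp_v, erp_av):
    """Difference between the summed unisensory responses and the audiovisual response."""
    return (erp_a + erp_v) - erp_av  # assumption: sum minus multisensory
```

If the auditory and visual inputs were processed independently, `msi_waveform` would be flat apart from noise, so any reliable deflection in it is taken as evidence of interaction.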

Clinical Indices

Autism Symptom Severity

Severity scores were derived from ADOS raw total scores using the conversion table from Gotham et al. (2009). Severity scores are on a 1–10 point scale with higher numbers representing increased severity of autistic symptoms. A score under 4 is associated with a non-spectrum classification (Gotham et al. 2009). In this dataset severity scores ranged from 5 to 10. The distribution of scores within the sample is presented in Table 2.

Table 2 Distribution of ADOS severity scores and visual and auditory sensitivities (VAS) within the sample

Visual and Auditory Sensitivities

Visual/auditory sensitivity (VAS) scores were computed by mapping participants’ classification on the VAS scale of the SSP onto an ordinal 0–2 point scale such that 0 = ‘typical development’, 1 = ‘probable difference’, and 2 = ‘definite difference’. The distribution of VAS scores within the sample is presented in Table 2.

Statistical Analyses

Consideration of Participant Characteristics

Because certain demographic variables (e.g., maternal education, VIQ) have been shown, albeit inconsistently, to correlate with the expression and severity of ASD (Gotham et al. 2009; Howlin et al. 2004; Sell et al. 2012; van Eeghen et al. 2013a, b) and of sensory sensitivities (Ben-Sasson et al. 2009; Engel-Yeger et al. 2011; Gouze et al. 2009), participant characteristics (including age, verbal IQ, performance IQ, sex, maternal education, and race) were controlled for in these analyses. An initial correlation analysis assessed whether there were any significant relationships between the demographic characteristics of participants and the two clinical outcome measures. Participant characteristics shown to correlate with ADOS severity scores or VAS scores from the SSP were controlled for in the regression analyses by entering them into a hierarchical regression as ‘Step 1’ variables. The preliminary correlation analysis on this dataset showed that none of the participant characteristics considered were significantly related to ADOS severity scores. Verbal IQ (VIQ) was significantly related to VAS [r(46) = −.414, p < .01] such that a lower VIQ was associated with higher levels of visual and auditory sensitivities. To control for the potential effect of VIQ on predicting VAS scores, VIQ was entered in Step 1 of the hierarchical regressions.

Predicting Autism Severity

Two multiple linear regression analyses were conducted to assess the extent to which (1) neurophysiological measures (the auditory P1, N1a, N1b, N1c; the visual P1, N1, and the three multisensory responses), and (2) behavioral measures (RTs for the three conditions and maximum race model violation), can predict autism symptom severity as measured by ADOS severity scores. In the first regression, the nine ERP measures of auditory and visual processing and MSI were entered into a multiple linear regression as independent variables with severity scores as the dependent variable. In the second regression analysis, the four behavioral measures were entered into a multiple linear regression as independent variables with severity scores on the ADOS as the dependent variable. The R² associated with the linear combination of the independent variables was used to evaluate the extent to which neurophysiological and behavioral measures of auditory and visual sensory processing and integration were associated with autistic symptom severity. The importance of individual ERP components and behavioral response patterns was considered by examining their relative contribution to the variance in autism symptom severity scores.

Predicting Visual and Auditory Sensitivities

Two hierarchical regression analyses were performed to assess the extent to which (1) neurophysiological measures (the auditory P1, N1a, N1b, N1c; the visual P1, N1, and the three multisensory responses), and (2) behavioral measures (RTs for the three conditions and maximum race model violation), are associated with VAS scores on the SSP, above and beyond that predicted by VIQ. For both regression analyses, VIQ was entered in Step 1, as a preliminary analysis indicated that increased visual/auditory sensitivities were correlated with a lower VIQ. The nine ERP measures were entered in Step 2 of the regression examining neurophysiological predictors, and the four behavioral measures were entered in Step 2 of the regression assessing behavioral variables. The change in R² resulting from the addition of the ERP and behavioral variables was used to evaluate the extent to which these experimental indices are associated with VAS once VIQ is controlled for. The importance of individual ERP components and of reaction time patterns was considered by examining their relative contribution to the variance in VAS.
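The two-step logic can be illustrated with a small sketch that fits the Step 1 model, then the full model, and reports the change in R². It uses ordinary least squares via NumPy and the textbook F-change formula; it is not the statistical package the authors used, and the function names are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an ordinary least-squares fit of y on X (intercept added here)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def r2_change(X_step1, X_step2, y):
    """Change in R-squared, and its F statistic, when Step 2 predictors are added.

    X_step1 and X_step2 are 2-D arrays of shape (n_participants, n_predictors).
    """
    r2_1 = r_squared(X_step1, y)
    r2_full = r_squared(np.column_stack([X_step1, X_step2]), y)
    n = len(y)
    k_add = X_step2.shape[1]
    k_full = X_step1.shape[1] + k_add
    f_change = ((r2_full - r2_1) / k_add) / ((1.0 - r2_full) / (n - k_full - 1))
    return r2_full - r2_1, f_change
```

For the neurophysiological analysis, `X_step1` would hold VIQ (one column) and `X_step2` the nine ERP peak amplitudes; for the behavioral analysis, `X_step2` would hold the four behavioral measures.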

Results

Behavioral Findings

Mean hit rates and RT for the full dataset are presented in Table 3. As expected based on Brandwein et al. (2013), mean hit rates were highest for the AV condition and lowest for the visual-alone condition (auditory compared to the visual: t(51) = 4.428, p < .01; auditory compared to AV: t(51) = 2.579, p < .05; visual compared to AV: t(51) = 6.008, p < .01). As a group, mean RTs were fastest to the AV stimuli and slowest to the visual-alone stimuli (auditory compared to the visual: t(51) = 3.605, p < .01; auditory compared to AV: t(51) = 13.978, p < .01; visual compared to AV: t(51) = 15.045, p < .01). Also as expected, at the group level the race model was not significantly violated.

Table 3 Mean and standard deviation of hit rate and reaction time for each of the three stimulus conditions for all 52 participants

Results of Regression Analyses

As described in the methods, we entered the mean RT for the three stimulus conditions and maximum race model violation (i.e., the ‘Miller inequality’ value) into each of the regression models as independent variables. The linear combination of these four behavioral variables did not predict ADOS severity scores [F(4, 38) = .326, p > .05] or VAS scores [after controlling for the effects of VIQ, R² change = .197, F(5, 40) = 4.674, p > .05].

Electrophysiological Findings

Clear auditory and visual responses were readily observable (see Fig. 2 and Supplemental Figure 1 for ERPs representing the composite signal from the electrodes used in the analyses). The auditory evoked potential was characterized by the typical P1–N1 complex with a fronto-centrally focused positivity (P1) around 75 ms followed by a negativity (N1b) around 115 ms. Over lateral scalp regions, the auditory evoked potential included negativities that peaked around 75 ms (N1a) and 180 ms (N1c). The visual evoked potential was characterized by a large positivity (P1) over occipital areas peaking at about 150 ms, and a large bilateral negativity (N1) over lateral occipital areas that peaked around 225 ms. AV interactions, as indicated by divergence between the multisensory (AV) and the sum (A + V) waveforms, were observed over fronto-central and parietal scalp between 100 and 150 ms, and bilaterally over parieto-occipital scalp between 180 and 210 ms. See Table 4 for a breakdown of the electrodes and latency windows used in the analyses.

Table 4 The latency windows and electrodes corresponding to the nine ERP predictors included in the regression analyses

Results of Regression Analyses

As described in the methods, we entered the peak amplitudes of the nine ERP measures (auditory P1, N1a, N1b, and N1c and the visual P1 and N1, and three multisensory responses) into each of the regression models as independent variables. The linear combination of the nine ERP measures was significantly related to ADOS severity scores, F(9, 33) = 2.928, p = .011. Approximately 44 % (R² = .444) of the variance of autistic symptom severity in the sample can be accounted for by the linear combination of ERP measures (Fig. 1). Table 5 presents the relative strength of the individual predictors. The auditory N1a and N1b were the strongest unisensory ERP predictors of autism symptom severity. The negative correlation between the N1a and severity scores suggested that a smaller N1a (i.e., a more positive amplitude value) was associated with less severe symptoms of autism (lower severity scores). The positive correlation between the N1b and severity scores indicated that a larger N1b (i.e., a more negative amplitude value) was associated with less severe symptoms of autism (lower severity scores). MSI between 100 and 130 ms over parietal scalp significantly contributed to the variance observed in autism symptom severity, with larger amplitude MSI effects (the difference between the sum and multisensory responses) associated with less severe symptoms of autism. In contrast, the linear combination of the nine ERP measures did not account for a significant proportion of the variance in VAS scores after controlling for the effects of VIQ (R² change = .151, F(10, 35) = 1.675, p > .05).
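As a consistency check on the reported statistics, the overall F value of a linear regression can be recovered from R², the number of predictors k, and the sample size n as F = (R²/k) / ((1 − R²)/(n − k − 1)). The short sketch below applies this standard formula to the values reported for the ADOS severity analysis (R² = .444, k = 9, n = 43); the function name is hypothetical, and the result depends on the rounded R² given in the text.

```python
def f_from_r2(r2, k, n):
    """Overall F statistic for a linear regression with k predictors and n observations."""
    df1, df2 = k, n - k - 1
    f_value = (r2 / df1) / ((1.0 - r2) / df2)
    return f_value, df1, df2

f_value, df1, df2 = f_from_r2(0.444, 9, 43)
print(f"F({df1}, {df2}) = {f_value:.3f}")  # prints "F(9, 33) = 2.928"
```

The recovered value agrees with the reported F(9, 33) = 2.928.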

Fig. 1

Scatterplot displaying the relationship between autism symptom severity (y-axis), and the linear combination of nine ERP peaks that measure auditory, visual, and audiovisual processing (x-axis). Severity scores are derived from the ADOS and range from 1 to 10 with higher scores indicating increased symptom severity. Each point represents a single value for a participant. The p value associated with the R² of .444 is .012

Table 5 The bivariate and partial correlations of the ERP predictors with autism symptom severity

Schematic Representation of ERP Effects

For descriptive purposes only, we performed a median split of the ERP data as a function of the participant’s autism severity score. This yielded an ‘ASD-moderate’ group with severity scores between 5 and 7 (N = 22) and an ‘ASD-severe’ group with severity scores between 8 and 10 (N = 21). The two groups did not differ significantly in age or estimated PIQ or VIQ. The split allowed us to visualize the above effects in terms of the ERP response, albeit in summary form. Waveforms were generated to illustrate the three ERP effects significantly related to autism symptom severity: the auditory N1a, the auditory N1b, and the parietally focused MSI peak (between 100 and 130 ms). Further, to visualize how the waveforms of the ‘ASD-moderate’ and ‘ASD-severe’ groups compared to those of TD children, we generated mean ERPs from a group of age- and PIQ-matched typically developing (TD) children (taken from a database of TD children run on the exact same paradigm/procedures). Waveforms for these three groups (Fig. 2a) show that the peak amplitude of the auditory N1a in the ‘ASD-moderate’ group is midway between the peak amplitude of the auditory N1a in the ‘ASD-severe’ and TD groups. The auditory N1b component (Fig. 2b) in the ‘ASD-moderate’ group is very similar to, and in fact overlapping with, that of the TD group. The auditory N1b is strikingly smaller in the ‘ASD-severe’ group. The parietally focused MSI peak between 100 and 130 ms is largest in the TD group and smallest in the ‘ASD-severe’ group, with the peak amplitude of the ‘ASD-moderate’ group falling midway between the severely autistic and the TD children (Fig. 2c).

Fig. 2

Mean ERPs for the ASD-severe, ASD-moderate, and typically developing groups. a, b The three groups’ responses to the auditory-alone condition, with dashed ellipses indicating the component of interest (the auditory N1a and N1b). c A measure of audiovisual integration, represented by a difference wave (explained in the text) over parietal scalp, with a dashed ellipse to indicate the response window of interest. Traces represent the composite signal from adjacent electrodes, the locations of which are indicated on the head models

Discussion

This study revealed a significant relationship between neural indices of early auditory and visual processing and the severity of autistic symptoms, in a group of children and adolescents with ASD. A particularly robust relationship was observed between severity of autism and basic auditory processing and audiovisual integration. In contrast, our EEG indices and reaction time data did not predict visual/auditory sensitivities, as assessed by parent responses on the SSP.

A Role for Impaired Auditory Processing in the Severity of Autism

The strongest neurophysiological predictors of autistic symptom severity were the auditory N1a and auditory N1b. The N1 response reflects early sensory processing, and is associated with neural activity largely focused in auditory cortices in the temporal lobe (Naatanen and Picton 1987; Ruhnau et al. 2011; Scherg et al. 1989). Interestingly, post mortem studies reveal that typical neural patterning during development in these very same regions is disrupted in ASD (Casanova et al. 2002; Stoner et al. 2014). Further, there is considerable evidence from converging methods for impaired auditory processing in ASD (Boddaert et al. 2004; Bruneau et al. 1999; Courchesne et al. 1985; Ferri et al. 2003; Martineau et al. 1984; Oades et al. 1988; Roberts et al. 2010; Samson et al. 2011). Previous findings on the auditory N1 response in ASD, however, have been highly variable. For example while Bruneau et al. (1999) found smaller N1b amplitude in 4–8 year old children with ASD, Oades et al. (1988) found that the N1b was larger and had a shorter latency in a sample of 5–17 year olds with ASD. Other groups, in contrast, report the absence of significant N1b amplitude differences in children with ASD (Dunn et al. 2008; Ferri et al. 2003; Lincoln et al. 1995; Martineau et al. 1984). Such inconsistencies in the literature are undoubtedly due in part to differences in participant characteristics (e.g., age, cognitive level of functioning, language function) and experimental parameters (e.g., inter-stimulus interval, and active vs. passive tasks; see Dunn et al. 2008) across studies. The current findings make clear, however, that symptom severity is another key factor accounting for variance in basic auditory processing in ASD. It is reasonable to assume that this reflects a relationship between neuropathology in auditory cortices and the degree of autistic symptoms. 
We would also surmise that the magnitude of neuropathology in auditory cortices is indicative of neuropathology in other affected brain areas. To the extent that this is the case, associations between auditory responses (as indexed by neurophysiological recordings) and autism severity do not necessarily mean that auditory cortex is the only region involved. Nevertheless, it is worthwhile and valid to consider the possible role of auditory dysfunction in autism.

The auditory N1 typically shows developmental changes over childhood (Ceponiene et al. 2002; Gomes et al. 2001b; Ponton et al. 2000; Tonnquist-Uhlen et al. 2003). In young children (under ~9 years of age), the N1a is most prominent at lateral electrode sites, whereas response amplitude at these sites diminishes with increasing age (Gomes et al. 2001b; Tonnquist-Uhlen et al. 2003). The fronto-centrally focused N1 (N1b), a prominent early negative-going response in adults, is small or under some circumstances undetectable in young children (Ceponiene et al. 2002; Gomes et al. 2001b; Ponton et al. 2000), and reaches adult-like levels by about 15–16 years of age (Mahajan and McArthur 2012; Pang and Taylor 2000; Ponton et al. 2000). In contrast, the amplitude of the lateral N1 (a and c) decreases with increasing age (Gomes et al. 2001a; Tonnquist-Uhlen et al. 2003). In light of these documented changes in N1 morphology, the current finding that increased severity of autism is associated with a larger lateral N1 (i.e., N1a) and a smaller fronto-central N1 (i.e., N1b) is consistent with the notion of immature responses to auditory stimuli during early stages of cortical processing. It is possible that differences in the microstructure of neural patterning, as has been observed in post-mortem morphological studies (Casanova et al. 2002; Stoner et al. 2014), would lead to immature auditory responses. It is important to point out that age is unlikely to account for the current differences. Autism severity, as measured by the ADOS, is derived in such a way as to be relatively independent of age, and in our preliminary processing of the data, autism severity and age did not correlate. Further, when the ASD participants were equally split based on whether the participant fell into the “moderate autism” or “severe autism” category, there was no significant difference in age (means of 10.8 and 10.4 years respectively).

We also examined the relationship between indices of early visual processing (visual P1 and N1 response amplitudes), and autism symptom severity and SSP scores. In contrast to the auditory N1, these neurophysiological metrics of visual processing did not have significant predictive value for our outcome measures. A number of studies report differences in early visual processing in individuals with ASD compared to age- and IQ-matched healthy controls (Brandwein et al. 2013; Frey et al. 2013; McPartland et al. 2011; Vlamings et al. 2010). As such, we cannot rule out that neurophysiological responses to different visual stimuli, or stimuli presented to more peripheral locations (Frey et al. 2013), would reveal such relationships. Indeed, we expect that such relationships exist. It is further possible that different and/or more specific clinical variables would hold a stronger relationship to the neurophysiological measures of visual processing that we investigated here. Alternatively, visual processing deficits may be less variable across the spectrum and therefore not hold strong predictive value with regard to clinical symptoms in ASD.

A Role for Impaired Multisensory Processing in the Severity of Autism

An additional neurophysiological predictor of autistic symptom severity was found in an early MSI response over parietal scalp (in the 100–130 ms post-stimulus time window). Although the functional role of this MSI response remains to be unraveled, the fact that it is smaller in individuals with more severe autism supports the thesis that deficits in MSI are associated with the core symptoms of autism (Brandwein et al. 2013; Foss-Feig et al. 2010; Foxe et al. 2013; Stevenson et al. 2014b; Woynaroski et al. 2013). Neuroimaging studies indicate that brain connectivity is abnormal in ASD (Courchesne and Pierce 2005; Just et al. 2004a; Muller et al. 2011; Supekar et al. 2013), and it has been speculated that this has implications for the integrity of MSI. There is now substantial evidence for impaired multisensory processing in autism from both behavioral (Brandwein et al. 2013; Collignon et al. 2013; Foss-Feig et al. 2010; Foxe et al. 2013; Kwakye et al. 2011; Stevenson et al. 2014a, b) and EEG (Brandwein et al. 2013; Magnee et al. 2011; Murphy et al. 2014; Russo et al. 2010) studies. It is a reasonable assumption that suboptimal integration of multisensory inputs early in development would have cascading effects on the development of both language and social skills. For example, in typical development, early language learning involves combining incoming visual (lip movements) and auditory (speech sounds) information (Kuhl and Meltzoff 1982; Teinonen et al. 2008). We and others have shown the ability to benefit from such multisensory inputs during speech perception to be significantly impaired in autism (e.g., Foxe et al. 2013; Stevenson et al. 2014a). Regarding the development of non-linguistic social skills, emotion and speaker intention are communicated through multisensory signals such as facial expressions and changes in prosody of the speech signal (Ethofer et al. 2006). 
As such, we would expect impaired integration to also have implications for the development of social communication. It is also possible that individuals with ASD are more reliant on redundant sensory inputs to learn social cues than are typically developing individuals, perhaps due to poorer attunement to social cues. In this case, a reduced ability to benefit from multisensory inputs may compound existing deficits, exacerbating the severity of symptoms seen in ASD. In addition, we and others have proposed that the integration of multisensory inputs is essential to the orderly grouping of information that enters through the separate sensory systems (e.g., Molholm et al. 2004; Stein and Meredith 1990). Accordingly, deficits in MSI may lead to experiences of a disorganized sensory environment and ‘sensory overload’, which in turn may lead to withdrawal and defensive sensory behaviors.

Neurophysiological Measures of Sensory Processing and Their Relationship to Reported Visual/Auditory Sensitivities

Whereas our neurophysiological indicators of auditory and visual processing and integration were good predictors of autism severity, these metrics failed to show a systematic relationship with participants’ auditory and visual sensitivities, as rated by their parents on the SSP. It is tempting to interpret this finding as lack of evidence for a neurophysiological relationship with auditory and visual sensitivities in ASD. However, such an interpretation is premature for a number of reasons. One is that here we only consider relatively early latency sensory processing, whereas sensory processing in later time-frames might be more relevant for reported sensory sensitivities. Additionally, we must consider the limitations of using an indirect, parent-report measure to quantify visual/auditory sensitivities. The SSP was chosen as an outcome measure because it is currently the most commonly used scale of sensory processing in research. Nevertheless, it is far from ideal. Like most parent-report measures, the SSP is problematic because parents can be strongly influenced by the symptoms they believe to be related to their child’s disorder (Dahlgren and Gillberg 1989), as well as by their own personal experiences with sensory stimuli. The construct validity of the SSP is based on the finding that children who scored lower on the SSP (indicating more abnormal behaviors) had more abnormal physiological responses (as measured by electrodermal responses) to repeated sensory stimulation (McIntosh et al. 1999a, b). However, the sample size was small and the relationship between SSP scores and electrodermal responses was non-specific (i.e., scores did not differentiate between hyper- or hypo-responsive electrodermal responses).
In addition, the psychometric properties of the SSP are based on a small sample size (N = 117) across a large age range (3–17 years), and the normative data are not age- or IQ-specific, which would seem particularly important given the known influences of age and cognitive level on sensory responses and behaviors (Crane et al. 2009; Kern et al. 2006). A final issue to be considered is that the VAS section of the SSP focuses on over-responsivity to visual stimuli and sounds (sample question: “holds hands over ears to protect ears from sound”). It may be that our experimental variables represent a different aspect of auditory and visual processing abnormalities, one that is not captured by the questions in the SSP.

While the shortcomings of the SSP limit the conclusions that can be drawn from the current dataset, these issues highlight the need for an improved measure of sensory symptoms. The Sensory Integration and Praxis Tests (SIPT) (Ayres 1989) is considered the gold standard tool for assessing sensory integration and praxis (Schaaf et al. 2014). This well-standardized, reliable and valid measure of sensory symptoms has been used in children with ASD (Schaaf et al. 2014) and is valuable in that it involves direct observation of the child by a trained professional. However, this measure is not optimal for the current study as it does not include assessments of auditory and visual symptoms, the mainstay of this investigation. One promising tool that may prove valuable for characterizing sensory symptoms is the SensOR Assessment, an examiner-administered performance evaluation that measures sensory over-responsivity across seven domains (including auditory and visual) (Schoen et al. 2008). Unfortunately, the SensOR is in development and is not yet standardized. Establishing a valid measure of sensory symptoms is important not only for research purposes, but also for use by clinicians. This is especially the case in light of the inclusion of sensory symptoms in the DSM-5 criteria for ASD.

Finally, we also note that for both regression analyses, RTs in response to unisensory stimulation and RT facilitation as assessed by race model violation did not predict clinical symptomatology. This may indicate that the early-latency ERPs examined here are more proximal to the underlying neurobiology of ASD than are RT data.
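
For readers unfamiliar with the measure, race model violation is commonly assessed by comparing the cumulative distribution of multisensory RTs against the sum of the two unisensory cumulative distributions; multisensory responses that are faster than this bound allows are taken as evidence of integration. The following is a minimal sketch of that comparison under the standard form of the inequality. It is not the analysis code used in this study, and the function names, probe latencies, and RT values are hypothetical and for illustration only.

```python
from bisect import bisect_right


def ecdf(rts, t):
    """Empirical cumulative probability: fraction of RTs at or below t (ms)."""
    ordered = sorted(rts)
    return bisect_right(ordered, t) / len(ordered)


def race_model_violation(rt_a, rt_v, rt_av, probes):
    """For each probe latency t, return CDF_AV(t) minus the race-model bound.

    The bound is min(1, CDF_A(t) + CDF_V(t)). A positive difference means the
    audiovisual responses were faster than the bound allows (a violation).
    """
    differences = []
    for t in probes:
        bound = min(1.0, ecdf(rt_a, t) + ecdf(rt_v, t))
        differences.append(ecdf(rt_av, t) - bound)
    return differences


# Hypothetical reaction times in milliseconds (not data from this study).
rt_a = [320, 340, 360, 380, 400]    # auditory-alone trials
rt_v = [330, 350, 370, 390, 410]    # visual-alone trials
rt_av = [250, 260, 270, 300, 310]   # audiovisual trials

diffs = race_model_violation(rt_a, rt_v, rt_av, probes=[300, 350, 400])
for t, d in zip([300, 350, 400], diffs):
    print(f"t = {t} ms: difference = {d:.2f}")
```

In practice the difference is usually evaluated at quantiles of each participant's RT distribution and then tested across participants; the fixed probe latencies above are a simplification.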

Conclusions

In conclusion, the current investigation reveals a relationship between neurophysiological indices of basic sensory processing and clinical measures of autism symptom severity. Clinical diagnosis is currently made on the basis of behavioral characteristics and symptoms, which can be highly subjective and often require a tremendous amount of clinical expertise. On the other hand, biomarkers (whether they are genetic, neuroanatomical, or in this case neurophysiological) can be measured objectively and systematically. Biomarkers may prove invaluable in sub-grouping this incredibly heterogeneous disorder, and in aiding the development of targeted, individualized interventions that are tailored for maximum efficacy based on the individual’s specific strengths and weaknesses. While there is much ground to be covered in terms of identifying biomarkers of ASD, the hope is that combining robust neurophysiological indices of basic sensory processing with well-established clinical measures of autism will help get us closer to this point.