Introduction

The Autism Diagnostic Observation Schedule (ADOS; Lord et al. 1999) is a semi-standardized assessment instrument for diagnosing individuals suspected of having an autism spectrum disorder (ASD). The original ADOS (Lord et al. 1999) comprised four modules, administered according to the individual’s verbal language ability. Each module consisted of structured activities and materials that allowed an examiner to observe an individual’s social interactions, communication, and play or creative use of materials at differing chronological ages and developmental levels. ADOS tasks became more conversational and less material-oriented in the higher modules because of the more sophisticated language required for administration of Modules 3 and 4 (Lord et al. 1999). Behaviors coded through the ADOS were converted to an algorithm score, based on research indicating that these behaviors were more indicative of ASD, with higher scores meaning that more ASD-related symptoms were present. Diagnostic algorithm raw scores range from 0 to 28. The revised ADOS, the ADOS-2 (Lord et al. 2012), provided new diagnostic algorithm computations for Modules 1 through 3. In addition, the ADOS-2 update provided newly created standardized calibrated severity scores (CSS; also referred to as comparison scores in Modules 1–3) for the same modules. These scores, based on the ADOS-2 total algorithm raw score, range from 1 to 10, with individuals with ASD ranging from 6 to 10, and were developed with empirical evidence to provide better sensitivity and specificity of diagnostic results (Gotham et al. 2009, 2007). The CSS were created to provide a measure of the severity of an individual’s autistic-like symptoms during the ADOS-2 relative to other individuals with ASD, standardizing symptom severity across modules regardless of chronological age and language level. The CSS were also proposed as a measure of change in an individual’s autistic symptomatology over time.
These updates were created to assist with making the diagnosis of ASD (American Psychiatric Association 2013) more effective and accurate. At the time of the ADOS-2 publication in 2012, CSS for Module 4 were not created due to the small sample sizes available for adults at this level.

Subsequent to ADOS-2 publication, Hus and Lord (2014) created a revised algorithm and CSS values for verbally fluent adults who would be tested using Module 4. These revisions were developed in order to increase sensitivity and specificity of diagnosis on this module, and to assist with score comparison with modules 1 through 3 on the ADOS-2. Pugliese et al. (2015) validated Hus and Lord’s new algorithm and CSS through a multi-site study. These results demonstrated increased sensitivity of the new ADOS-2, Module 4 algorithm within adults without intellectual impairment, but found varying sensitivity for subgroups within their sample. That is, when using the Hus and Lord algorithm compared to the previous ADOS algorithm, Pugliese and colleagues found better sensitivity and specificity in females compared to males and in individuals at or above age 16 years rather than individuals under 16 years. Lower specificity was found in individuals with higher verbal IQ scores.

As currently published, the ADOS-2 Modules 1–3 have one CSS based on the total combined score of the ADOS-2 algorithm domains Social Affect (SA) and Restricted and Repetitive Behaviors (RRB); the current ADOS-2, Module 4 does not include a CSS. Recent publications using the ADOS-2 have supported the use of two CSS values that separately assess social affect (SA-CSS) and restricted and repetitive behaviors (RRB-CSS; Esler et al. 2015; Hus et al. 2014). Like those developed for ADOS-2 Modules 1–3, the newly created Module 4 CSS by ASD domain provided a better measure of autism severity than the raw total score, and were less influenced by individual characteristics (Gotham et al. 2009; Hus et al. 2014).

Previous research comparing ADOS (Lord et al. 1999) subdomain scores with the Social Responsiveness Scale (SRS; Constantino and Gruber 2005) total parent-report score found correlations ranging from r = .31 for social and communication to r = .36 for RRB (Constantino et al. 2007). Similarly, small correlations were found between ADOS scores and adaptive behavior functioning, as measured by the Vineland Adaptive Behavior Scales (VABS; Sparrow et al. 1984), at two clinical sites (Klin et al. 2007). These results suggest only weak associations between levels of autism symptomatology, as measured by the ADOS, and levels of adaptive functioning, as measured by the VABS.

To date, little research has been conducted to determine how these newly created CSS align with other standardized measures commonly used during a comprehensive ASD evaluation. Here we examined the psychometric properties of the ADOS-2 Module 4 and asked whether the CSS correlate with standardized assessments of social aptitudes in ASD, such as the Social Responsiveness Scale, 2nd edition (SRS-2; Constantino and Gruber 2012) and the Adult Autism Spectrum Quotient (AQ; Baron-Cohen et al. 2001). We also asked whether the CSS correlate with non-social measures, namely general comorbid symptoms as indexed by the Symptom Checklist-90 (SCL-90; Derogatis 1994) and intellectual quotient (IQ). The SCL-90 provides a Global Severity Index (GSI) covering somatization, obsessive–compulsive, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation and psychoticism. These symptoms can be associated with ASD but do not represent core symptoms of ASD.

We hypothesized that the ADOS-2 Overall Total CSS would be associated with social measures, as indexed by SRS-2 and AQ scores, and would not be related to the non-social measures of IQ and SCL-90. These relationships were hypothesized because the core social deficits in ASD are maintained across all ranges of intellectual functioning and degrees of comorbid conditions.

Method

Participants

Forty males with ASD participated in the study (M age = 26 years, 11 months; SD = 5 years, 1 month). Participants were recruited from a clinical database and adult treatment program at the Emory Autism Center in Decatur, GA, as well as through community-based town hall meetings held to explain study criteria. To be eligible for participation in this project, individuals had to meet the following inclusion criteria: (a) being male; (b) having a previous diagnosis on the autism spectrum; (c) being between 18 and 45 years old; (d) having a previous IQ score greater than 60; and (e) having a parent or primary caregiver available to provide information about developmental history. Males were recruited as part of a larger project investigating the role of oxytocin in social cognition. Only men were selected because oxytocin can have different effects in women depending on their hormonal cycle, and because of the low probability of recruiting a sufficient number of women with autism (approximately 1 female for every 4–5 males diagnosed with ASD). Clinical assessments were completed as part of an eligibility and screening evaluation for the larger research project on oxytocin and social cognition at Emory University School of Medicine (not all data are presented in this paper). ASD diagnosis was confirmed using the ADOS-2, Module 4 and the Autism Diagnostic Interview-Revised (ADI-R; Rutter et al. 2003). We analyzed the SA-CSS, RRB-CSS, and Overall Total-CSS. Certified psychologists and specialists conducted the diagnostic tests. All assessors reached research reliability on the ADOS-2 and ADI-R prior to the start of this study. Written informed consent to participate in the larger study, and specifically to undergo the assessment procedures described here, was obtained from each participant, following procedures approved by the Institutional Review Board (#IRB00064623).
Participants completed the IQ test (Wechsler Abbreviated Scale of Intelligence, 2nd edition, WASI-II, Wechsler and Zhou 2011) to determine a brief IQ estimate. For the IQ, we computed standard scores of Full Scale-4 (FSIQ4), Full Scale-2 (FSIQ2), Verbal Comprehension (VC) and Perceptual Reasoning (PR). Participants and at least one parent completed the SRS-2. Table 1 provides a more detailed sample description.

Table 1 Summary of Sample (n = 40) Standardized Assessment Results

In order to have a global measure of social functioning for the analyses in this paper, and given that the participants’ and parents’ reports were highly correlated, we computed a final SRS-2 score for each participant by averaging the participant’s and parent’s reports. This combined SRS-2 score served as a proxy for overall social behavior, which enabled closer comparison to the ADOS-2, Module 4 CSS. Participants also completed the self-report AQ, which consists of 50 statements; respondents rate how strongly they agree or disagree with each statement. This questionnaire aims to investigate social deficits and autism diagnosis in individuals with preserved intellectual capacities. Participants also completed the SCL-90, a 90-item self-report questionnaire designed to evaluate a broad range of psychological problems and symptoms. It yields global indices of severity or distress, and its primary dimensions assess somatization, obsessive–compulsive, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation and psychoticism. The GSI from this test provides general information on participants’ comorbid symptoms.

Data Analysis

In line with the a priori hypothesis, we performed two separate dimensional analyses, one for the social and one for the non-social measures. For the social dimensional analysis, we computed Pearson correlations between the ADOS-2 Overall Total-CSS and the AQ and SRS-2 T scores. For the non-social dimensional analysis, we computed Pearson correlations between the ADOS-2 Overall Total-CSS and the full-scale IQ and the GSI of the SCL-90. Post-hoc analyses were performed only on significant dimensional results; these examined the sub-dimensions of the ADOS-2 CSS (social affect and repetitive behavior scores) and the WASI-II composite scores (VC and PR). Data were analyzed using IBM SPSS Statistics 24 software. For the dimensional analysis, we employed a Bonferroni-corrected alpha of 0.05/5 = 0.01 as the threshold for statistical significance.
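
The correlation and thresholding steps described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the SPSS procedure used in the study; the function names, the example scores, and the example p-values are all hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance threshold for a family of n_tests comparisons."""
    return family_alpha / n_tests

# Hypothetical scores for five participants (NOT study data).
css = [6, 7, 8, 9, 10]
fsiq = [110, 105, 100, 95, 90]
r = pearson_r(css, fsiq)

# Hypothetical p-values for a family of five tests (NOT study results).
example_p = {"test_1": 0.004, "test_2": 0.031, "test_3": 0.458,
             "test_4": 0.780, "test_5": 0.907}
alpha = bonferroni_alpha(0.05, len(example_p))  # 0.05 / 5
flags = {name: p < alpha for name, p in example_p.items()}
print(r, alpha, flags)
```

With five tests, a result must reach p < .01 to be declared significant, so a nominally significant p-value between .01 and .05 would not survive the correction.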

Results

Correlations Between ADOS-2, Module 4 Calibrated Severity Scores and Social Measures

Pearson correlations showed no significant relationship between the ADOS-2 Overall Total-CSS and the final computed SRS-2 scores (r = −.019, p = .907, Fig. 1c). The ADOS-2 Overall Total-CSS also did not correlate with the AQ score (r = −.121, p = .458, Fig. 1b). These results are not in line with our hypothesis. Of note, the SRS-2 total score was highly and positively correlated with the AQ score (r = .595, p < .001). We also examined the correlation between the ADOS-2 Overall Total-CSS and the SRS-2 scores separately for the self-report and the parents’ report. Pearson correlations showed no significant relationship between the ADOS-2 Overall Total-CSS and the SRS-2 self total score (r = −.105, p = .526, Fig. 2a) or the SRS-2 parents total score (r = −.066, p = .710, Fig. 2b). As mentioned above, SRS-2 self and parent total scores were highly correlated (r = .438, p = .01).

Fig. 1 Absence of correlation between ADOS-2, Module 4 Overall Total-CSS and (a) GSI scores, (b) AQ scores, and (c) SRS-2 total averaged scores

Fig. 2 Relationship between ADOS-2, Module 4 Overall Total-CSS and (a) SRS-2 self-report total score and (b) SRS-2 other-report total score for 40 adult males with ASD

Correlations Between ADOS-2, Module 4 Calibrated Severity Scores and Non-Social Measures

For this analysis, we found that the ADOS-2 Overall Total-CSS correlated negatively with the full-scale IQ score (FSIQ2: r = −.34, p = .031, Fig. 3a). In other words, patients with a higher ADOS-2 Overall Total-CSS (i.e., higher severity) showed lower overall intellectual performance. Interestingly, the correlation between the ADOS-2 Overall Total-CSS and the FSIQ4 was weaker and did not reach significance (p > .05). In a further sub-dimensional analysis of the ADOS-2 CSS sub-domains, we found that the ADOS-2 SA-CSS specifically correlated with the IQ scores (r = −.446, p = .004, Fig. 3b), whereas the ADOS-2 RRB-CSS did not (r = −.075, p = .645). Furthermore, the ADOS-2 SA-CSS was negatively correlated with verbal IQ (r = −.443, p = .004, Fig. 3c) but not with performance IQ (r = −.244, p = .129). We did not find any significant correlation between the ADOS-2 Overall Total-CSS and the SCL-90 GSI (r = −.046, p = .78).
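
For readers who wish to relate a reported r to its p-value, the following Python sketch applies the standard t-test for a Pearson correlation, t = r·sqrt(n − 2)/sqrt(1 − r²) with n − 2 degrees of freedom. It assumes that every correlation was computed on all 40 participants, which the text does not state explicitly, and it integrates the t density numerically rather than reproducing the SPSS routine; all function names are hypothetical.

```python
import math

def t_from_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(t * t / df))

def two_sided_p(r, n, steps=2000):
    """Two-sided p-value for r, via Simpson's rule on [0, |t|].

    `steps` must be even for Simpson's rule.
    """
    df = n - 2
    upper = abs(t_from_r(r, n))
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)

# Assumption: n = 40 for each correlation (no missing data).
p_fsiq2 = two_sided_p(-0.34, 40)
p_sa = two_sided_p(-0.446, 40)
print(round(p_fsiq2, 3), round(p_sa, 3))
```

Under that assumption, r = −.34 corresponds to t ≈ −2.23 on 38 degrees of freedom and a two-sided p of roughly .03, in line with the value reported above for the FSIQ2.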

Fig. 3 Relationship between ADOS-2, Module 4 CSS and IQ measures in 40 adult males with ASD: (a) non-social dimensional analysis of ADOS-2, Module 4 Overall Total-CSS and FSIQ2; (b) ADOS-2, Module 4 SA-CSS and FSIQ2 score; and (c) ADOS-2, Module 4 SA-CSS and verbal and performance IQ scores

Discussion

Our findings indicate that the Overall Total-CSS of the ADOS-2, Module 4 does not correlate with a standardized questionnaire of social responsiveness in adult males with ASD. Nor does it correlate with another standardized questionnaire-based social measure, the AQ. However, the Overall Total-CSS and the total IQ score show a negative correlation that approaches significance. In particular, there is a significant negative correlation between the SA-CSS and the verbal comprehension IQ sub-score. This suggests that expressive language ability in ASD might play an important role in the severity of social symptoms displayed during the ADOS-2. Individuals with more severe language delay might display more impaired social aptitudes during the ADOS-2. These results suggest that the ADOS-2 Overall Total-CSS depends to some degree on participants’ general intellectual function. Thus, verbal intellectual ability appears to interact with severity scores of the core deficits of ASD.

Surprisingly, the ADOS-2 CSS did not correlate with the standardized social assessments that are used in the ASD literature to assess social aptitudes: it correlated with neither the AQ nor the SRS-2 total scores. These results could be due to the fact that the SRS-2 and AQ do not specifically assess severity of symptoms or, alternatively, that the ADOS-2 CSS do not necessarily reflect the severity of the core social symptoms of ASD.

Interestingly, the CSS did not correlate with the GSI of SCL-90. This observation suggests that severity of the core deficits of ASD, as estimated by the Overall-CSS, does not merely reflect the severity of other non-core comorbid symptoms frequently associated with ASD.

The development of the CSS for the newly revised ADOS-2 was a first attempt to standardize the ADOS-2 assessment based on an individual’s age and expressive language ability (Lord et al. 2012). More research is needed to study the limitations of the CSS in terms of their relationship to patients’ intellectual aptitudes and, in particular, their verbal comprehension skills. Although not analyzed in this study, it is possible that verbal IQ can be used as a covariate in the analysis of the CSS to adjust for the potential influence of intellectual capacity in determining severity of ASD. Future research efforts should investigate verbal IQ co-variation to determine the amount of variance in the ADOS-2, Module 4 CSS that can be explained by core symptoms of ASD.
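
One simple way such a covariate adjustment could be explored is a first-order partial correlation, which removes the linear contribution of verbal IQ from the association between the CSS and a second measure. The Python sketch below is offered only as an illustration of that idea; it was not part of the present analysis, the function name is hypothetical, and the three input correlations are invented values rather than study results.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after partialling out covariate z.

    Uses the first-order formula
    (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("covariate is perfectly correlated with x or y")
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical correlations (NOT study results):
r_css_social = -0.10  # CSS with a social questionnaire score
r_css_viq = -0.40     # CSS with verbal IQ
r_social_viq = 0.20   # social questionnaire score with verbal IQ

adjusted = partial_correlation(r_css_social, r_css_viq, r_social_viq)
print(round(adjusted, 3))
```

When the covariate is unrelated to both variables, the partial correlation equals the raw correlation; the more strongly verbal IQ relates to both, the more the adjusted value can differ from the unadjusted one.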

Limitations and Future Directions

These results should be viewed as preliminary given the moderate sample size, as well as the ascertainment criteria, which limited the sample to males with an IQ higher than 60. Replications with individuals who have IQ scores below 60 will be important to see the effects of verbal intellectual ability on the ADOS-2 severity ratings. Also important will be investigations of adults who are suspected of having an ASD but who do not already have a formal diagnosis. These investigations will be able to determine the range of severity ratings obtained in a broader range of individuals with a suspected, but unconfirmed, ASD diagnosis.

Future studies using the ADOS-2 CSS as a measure of change due to intervention may need to control for verbal ability in adults with ASD. To optimize use of the ADOS-2 CSS as a measure of change, researchers may need to control for intellectual ability in intervention studies that target core ASD symptoms. For example, a common treatment approach for adults with ASD, cognitive behavior therapy, relies heavily on verbal abilities for participation (Eack et al. 2013; Gaus 2007; Wong et al. 2014). The present results show a negative relationship between verbal IQ and SA severity, and it would be interesting to see whether participation in such treatment changes this relationship after controlling for participants’ pre-treatment verbal abilities.

Future research should examine ADOS-2 severity ratings across genders. Research using larger and more diverse samples, including females with ASD, is needed to evaluate these results further and to determine whether these findings are specific to males with ASD who have an IQ of 60 or greater. This is especially important given the results reported by Pugliese and colleagues (2015), who found that use of the revised algorithms and CSS developed by Hus and Lord (2014) produced better sensitivity and specificity in females compared to males, as well as in individuals 16 years of age and older. Lower specificity was also reported for individuals with higher verbal IQ scores.

Our findings provide valuable information on how the ADOS-2 CSS tend to correlate with intellectual capacities and do not correlate with general comorbid symptoms. The CSS were not correlated with the main social measures that are known to assess social aptitudes in ASD. More research is needed to better assess the value of this score and to study the implications of verbal intellectual comprehension in the determination of the severity of ASD core symptomatology.