Introduction

With greater public awareness of autism spectrum disorders (ASD) over the past several years, professionals have become more adept at identifying the early symptoms of these disorders; as a result, we have witnessed a decrease in the age at first diagnosis (Mandell et al. 2005). While autism alone can now be reliably identified as early as age 2, diagnoses of Pervasive Developmental Disorder-Not Otherwise Specified (PDD-NOS) and atypical autism are less stable until around age 3 (Lord et al. 2006). Data collected on a group of 8-year-olds indicated that the median age of earliest ASD diagnosis ranged from 48 to 61 months (Wiggins et al. 2006). Yet it is clear that age at diagnosis is also related to type of ASD diagnosis. Results from one large, internet-based survey in which parents were queried about the ages at which their children received first ASD diagnoses found that children with autism were diagnosed at an average age of 3.4 years, those with PDD-NOS at 4.2 years, and those with Asperger’s syndrome at 7.5 years (Goin-Kochel et al. 2006). Other literature suggests that whether and when a child is diagnosed with an ASD depends, in part, on the child’s race, with factors such as socioeconomic status and degree of cognitive and language impairment also influencing identification and age at diagnosis (Mandell et al. 2007, 2010).

The increasing incidence of ASD diagnoses in recent years, along with detection at earlier ages, has prompted various academies of medicine to reconsider their practice guidelines when it comes to screening for neurodevelopmental disabilities. The American Academy of Pediatrics (AAP) now recommends routine developmental screening at 9-, 18-, and 24- or 30-month visits, regardless of whether a concern or risk has been identified (American Academy of Pediatrics et al. 2006). Johnson et al. (2007) further describe a specific screening algorithm for the 18- and 24-month visits, which includes use of a standardized, ASD-specific screening tool. Similar practice guidelines have been established by the American Academy of Neurology (Filipek et al. 2000) and the American Academy of Child and Adolescent Psychiatry (Volkmar et al. 1999). Generally speaking, each of these practice guidelines promotes routine developmental screening, described as Level 1, conducted by pediatricians or other general service providers. Within Level 1, specific screenings recommended include developmental instruments (e.g., the Ages and Stages Questionnaire, BRIGANCE Screens, Social Communication Questionnaire), audiologic assessment, and lead screening. If a child fails screening at Level 1, a comprehensive diagnostic assessment would be implemented at Level 2. Level 2 screening/assessments may involve parents, pediatricians, geneticists, neurologists, speech-language pathologists, audiologists, child psychiatrists, occupational therapists, physical therapists, and educators.

Screening Issues

Screening for ASDs in this manner is a time- and cost-effective method of identifying those children who should be evaluated further. Unfortunately, many physicians may be failing to incorporate this practice into their routines (dosReis et al. 2006); however, to be fair, many families may not consistently adhere to this well-baby-visit schedule and, instead, only go to the doctor when their children are sick. This obviously limits the effectiveness of broader screening programs. Thus, children who are ultimately diagnosed as being on the spectrum can be missed at earlier ages.

Screening efforts, regardless of the child’s age, require the use of appropriate measures that reliably and validly capture whether an individual is functioning outside of the normal range. Good screening tools are judged by how well they identify cases who truly have the condition in question while simultaneously weeding out the ones who do not have the condition. As discussed in Sattler (2008), screening instruments should be evaluated in terms of their reliability and validity. One type of validity, criterion-related validity, is based on how a test correlates with a specified criterion or outcome measure. Particularly important to criterion-related validity are sensitivity (the true positive rate) and specificity (the true negative rate). Judgments about what counts as adequate sensitivity and specificity are subjective and are influenced by the perceived consequences of failing to identify a problem or condition when one exists (e.g., missing a diagnosis of an ASD when one is present). Sensitivity rates of 70–80% or higher have been recommended for universal screening, or what might be considered Level 1 screening (Glascoe 2005; Kamphaus et al. 2010). However, there is an inverse relationship between sensitivity and specificity for a given instrument: as the cutoff is adjusted to increase sensitivity, specificity decreases. Generally speaking, greater emphasis is placed on maximizing sensitivity so as to miss fewer possible cases.
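
To make the trade-off between sensitivity and specificity concrete, the following minimal Python sketch computes both rates at several cutoffs. The scores are hypothetical values invented purely for illustration (they are not drawn from any study cited here), and the function name and the convention that a score at or above the cutoff counts as a positive screen are assumptions of this sketch.

```python
# Hypothetical screening scores, invented for illustration only.
asd_scores = [9, 12, 14, 16, 18, 21]    # children who truly have the condition
non_asd_scores = [2, 4, 5, 8, 11, 13]   # children who do not


def sensitivity_specificity(cases, controls, cutoff):
    """Return (sensitivity, specificity), treating score >= cutoff as a positive screen."""
    true_positives = sum(score >= cutoff for score in cases)
    true_negatives = sum(score < cutoff for score in controls)
    return true_positives / len(cases), true_negatives / len(controls)


# Lowering the cutoff catches more true cases but flags more non-cases.
for cutoff in (15, 12, 9):
    sens, spec = sensitivity_specificity(asd_scores, non_asd_scores, cutoff)
    print(f"cutoff {cutoff}: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy example, moving the cutoff from 15 down to 9 raises sensitivity from .50 to 1.00 while specificity falls from 1.00 to .67, which is the pattern described above.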

Parents are undoubtedly considered the primary source for obtaining developmental and behavioral data about their children, and there exists a substantial body of peer-reviewed literature related to the use of parent-completed ASD screening instruments (e.g., Allen et al. 2007; Charman et al. 2007; Eaves et al. 2006; Witwer and LeCavalier 2007). However, there is growing appreciation for the contributions that teachers can make to the screening and diagnostic processes. Ozonoff et al. (2005) specifically advocate the inclusion of educators within a multidisciplinary team approach where data are assimilated using multiple methods, from a variety of informants who interact with the child in different contexts. In this situation, teachers can be ideal reporters because they have the unique opportunity to observe children throughout the day, in different situations, and have comparative knowledge about the social-skills development of typically developing children (Ehlers et al. 1999). Unfortunately, the literature on ASD-screening instruments designed for use in school-based settings is scant. This suggests a missed opportunity in the broader screening process, for it is not uncommon for children to enter school without having received any type of systematic screening. This may often be the case when symptoms of ASD are mild, subthreshold, or attributed to other conditions, such as attention-deficit disorder or obsessive–compulsive disorder. In addition, because of the importance of early intervention, if a student has not been identified before he/she enters grade school, it is essential for school professionals to be alert to the signs of ASD and use effective tools for screening when concerns arise.
Gathering reliable and quantifiable teacher information is therefore an integral and essential part of the identification of children with ASD; yet, educators’ contributions to this process are compromised by the lack of information about which tools best capture their unique perspectives and how to use these instruments.

ASD Screening Instruments for Teachers

One way to screen for ASDs in school-age children is through teacher nomination. Given limited resources, teacher nomination may be a highly feasible and economical way to screen students who may have behaviors consistent with ASD. Hepburn et al. (2008) compared the use of classroom-based teacher nominations in screening children for ASD in a general-education setting with results from the Autism Spectrum Screening Questionnaire (Ehlers et al. 1999). The nomination form provided to teachers listed characteristics of ASD and instructed each teacher to nominate the two students in his or her classroom who best fit those characteristics. In screening over 1,300 children, the proportion of overall agreement between the two methods was 93–95%, depending on how the nomination was conducted (i.e., “forced” vs. “unforced” choices in nominating two students). Further validation by clinical evaluations would be warranted, but the nomination procedure appeared to be a time-efficient, cost-effective method of screening for ASD in the schools. One downside to the nomination procedure, however, is that, compared to a more universal screening, it has the potential to overlook students who may be exhibiting more subtle or subthreshold symptoms.

Apart from the nomination procedure, a few objective measures have been created and validated for use by teachers to assist in screening for ASD. One instrument noted previously, the Autism Spectrum Screening Questionnaire (ASSQ; Ehlers et al. 1999), is a 27-item measure designed for parents and teachers that screens for high-functioning autism and Asperger’s syndrome in children and adolescents. The ASSQ was found to have high test–retest reliability over a 2-week period for both parents (r = .96; n = 86; p < .0001) and teachers (r = .94; n = 65; p < .0001), as well as a higher-than-expected correlation between parent and teacher ratings (r = .66; n = 105; p < .0001). Ehlers et al. recommend cutoff scores of 13 with parents and 11 with teachers (yielding true positive rates of 91 and 90%, respectively); however, these cutoffs yielded fairly large false-positive rates (23 and 42%, respectively), identifying students with social difficulties that may be related to other disorders. The ASSQ has been used in at least four studies with teachers, examining the prevalence of ASD (Kadesjo et al. 1999; Mattila et al. 2009; Posserud et al. 2006; Webb et al. 2003). However, as noted in Reilly et al. (2009), the ASSQ has been criticized for its low specificity and has been used primarily for research purposes.

Another instrument commonly used to screen for ASD is the Social Responsiveness Scale (SRS; Constantino and Gruber 2005), a 65-item questionnaire designed to be completed either by a parent (SRS parent report) or a teacher (SRS teacher report). It is appropriate for use with individuals who are between the ages of 4 and 18 years and queries behaviors observed during the past 6 months. SRS items cover dimensions of communication, social interaction, and repetitive/stereotyped behaviors that are associated with ASD. Respondents rate the frequency of observed behaviors on a scale of zero (not true) to four (almost always true). Item scores are totaled and a severity score is calculated. Strong correlations of .82 (mother) and .91 (father) have been reported between parent- and teacher-reported SRS scores, and this instrument has also demonstrated good reliability (coefficient alphas above .93 for parent and teacher ratings; Constantino and Gruber 2005). With regard to specificity, Constantino et al. (2007) found that a t-score of >60 on the SRS teacher report was consistent with a clinically identified ASD diagnosis in 90% of cases. Lee et al. (2010) examined the utility of the SRS parent and teacher forms in an assessment battery for ASD. They observed very high positive predictive value for the SRS teacher form, but relatively low sensitivity compared to the SRS parent form. This may be an indication that threshold scores were too high. Additionally, results may have been influenced by the lower response rate from teachers in this study. In similar work, Kanne et al. (2009) used the Child Behavior Checklist (CBCL; Achenbach 1991a), the Teacher Report Form (TRF; Achenbach 1991b), and the SRS to compare parent and teacher ratings of psychiatric-symptom severity among children with ASD and their siblings. 
They found that, while differences existed between parent and teacher report in their ratings of psychiatric symptoms, there was stronger agreement between informant ratings for the sibling group than the ASD group. The authors interpreted this finding to suggest that environmental context may differentially affect individuals with ASD; therefore, using ratings from both parents and teachers may help to get a more complete picture of functioning for this unique population. Although the SRS teacher form has been used in genetics research studies (e.g., Campbell et al. 2010; Duvall et al. 2007), to our knowledge, there are no other studies that have investigated the utility of the SRS teacher report as a broader screening measure.

The Social Communication Questionnaire (SCQ; Rutter et al. 2003), formerly known as the Autism Screening Questionnaire (Berument et al. 1999), is a 40-item measure designed to screen for ASD in individuals 4 years and older who have a mental age of at least 24 months. It is based on the Autism Diagnostic Interview-Revised (ADI-R; Rutter et al. 2003) algorithm items and solicits information about reciprocal social interaction, language/communication, and repetitive and stereotyped behaviors. There are two versions of the SCQ: the SCQ Lifetime Form (SCQ-Lifetime), designed to be completed by a person with knowledge of the individual’s developmental history, including the time between the ages of 4 and 5, and the SCQ Current Form (SCQ-Current), which includes similar queries with a focus on present behavior, specifically during the past 3 months. The authors recommend a cutoff score of 15 for the SCQ-Lifetime (with no specific recommendation for the SCQ-Current cutoff score) and that children who meet or exceed this threshold receive a comprehensive ASD evaluation. According to the manual, the SCQ-Current produces results that can be helpful in treatment planning, educational intervention, and measurement of change in symptoms over time.

The SCQ has been investigated as a parent-report screener in several research studies, typically among younger children (ages 2–6) but in some cases among school-age children (ages 5–16) (Allen et al. 2007; Chen et al. 2009; Corsello et al. 2007; Eaves et al. 2006; Johnson et al. 2010; Lee et al. 2007; Wiggins et al. 2007). While the authors of the SCQ recommend a cutoff score of 15 (Rutter et al. 2003), results from these studies suggest that the optimal cutoff score for the SCQ may vary depending on the age of the child and other characteristics (e.g., ASD severity), which has direct implications for the sensitivity and specificity of the measure. For example, Corsello et al. (2007) found that sensitivity in discriminating ASD from non-ASD on the SCQ-Lifetime increased from .71 to .82 when the cutoff was lowered from 15 to 12. Similarly, Allen et al. (2007) found that a cutoff score of 11 on the SCQ-Lifetime yielded a sensitivity of .93 among their sample of 2- to 6-year-olds; for those between the ages of 3 and 5, sensitivity was 1.00. Again, though, these investigations only used the SCQ as a screener completed by parents. We are not aware of any studies that have specifically examined the utility of the SCQ-Current as a screener completed by teachers.

Purpose of Current Study

While there has been an emphasis on early screening and identification of ASD, there continues to be a gap between ASD-screening research and clinical application in school settings (Noland and Gabriels 2004). Screening and identification of children with ASD (particularly those with average intellectual functioning) often do not occur until the elementary years (Mandell et al. 2005), signaling the need for effective screening procedures within the educational system to potentially identify these children earlier. As noted by Noland and Gabriels (2004), there is scant information about screening programs within schools for children with ASD; they referenced only one major school-based, ASD-screening project conducted by Smith and Brees (1983) that provided training and coordinated a screening process for a rural public school district in Iowa. There clearly continues to be a need to identify appropriate ASD-screening instruments for use within school settings.

It has been well established that the SCQ-Lifetime demonstrates strong psychometric properties (Berument et al. 1999; Charman et al. 2007; Rutter et al. 2003); however, while the SCQ-Current was originally developed as a screening instrument (Rutter et al. 2003), we are not aware of any empirical research that has been conducted to determine the appropriateness of its use as such with teachers. Given results from the Kanne et al. (2009) study, it is possible that, as with the SRS, the SCQ-Current may highlight contextual effects as being more salient in the ASD versus sibling group. Additionally, the utility of the SRS teacher report as a brief screening measure has been investigated in one study (Constantino et al. 2007), but this finding has not been replicated. With this in mind, the goal of the present study was to examine the utility of both the SCQ-Current and SRS measures as teacher-reported screening instruments for school-age children with an ASD.

Method

Participants

The current sample contained 3,375 children from 1,655 different families who participated in the Simons Simplex Collection (SSC; http://sfari.org/simons-simplex-collection). The SSC is a multi-site, genetic research project in North America that includes families with only one child with an ASD and no other first- through third-degree relatives with ASD or suspected ASD (i.e., simplex families; Fischbach and Lord 2010). Data are also collected on the unaffected, full-biological siblings of these children to serve as related controls. Among the 3,375 children in the current study, 1,663 were the identified probands with ASD (86.5% male), while the remaining 1,712 were the unaffected siblings (47.8% male) of these children. Most were white (79.4%) and of non-Hispanic ethnicity (90%). In terms of annual household income, 14.6% came from homes making <$20,000–$50,000; 42.5% came from homes making between $51,000 and $100,000; and 42.9% came from homes making $101,000–>$161,000. Because data collection for the SSC occurred in stages, the ages at which various instruments/assessments were completed on children varied somewhat. Given that our focus was on ASD-screening measures, the average ages at which these particular instruments were completed for the sample are provided in Table 1. Siblings were significantly older than children with ASD by an average of 8.13–16.13 months, depending on the measure, although effect sizes for age differences were negligible, ranging from .01 to .03. These age differences were largely an artifact of inclusion/exclusion criteria set forth by the SSC in that probands could not be older than 18 years (range = 4–17 years, 11 months) but siblings could (range = 4 and older).

Table 1 Descriptive statistics and ages at administration (in years) for the SCQ-Current, SCQ-lifetime, SRS teacher report, and SRS parent report

Data for the SSC were collected through a multi-site network, consisting of 12 university-based study sites. Families at each site were recruited through a variety of means, including autism and developmental disability clinics, community events (e.g., autism walks), media advertisements, and parent-support-group meetings. Once interested families completed the initial eligibility screen, they were given/mailed a packet of questionnaires to complete on all participating family members; this packet contained the SCQ-Current and the SRS parent form that parents were asked to complete on probands and siblings. Permission was also sought to contact their children’s teachers to obtain the SCQ-Current and SRS teacher forms on probands and siblings. When families consented, these instruments were then mailed to their children’s teachers, with instructions for them to complete the forms and mail them back to the study team. Families who remained eligible throughout the initial stages of the study were invited to come to their assigned study site to participate in in-person evaluations with the proband and screening for the designated sibling (described below). Additional information about how families were recruited into the SSC and the procedures for collecting data can be found in Fischbach and Lord (2010).

A wide array of data were collected on both probands and siblings, and specific measures selected for the current study are described below. In order to be included in the SSC, probands necessarily met criteria for an ASD (autism, Asperger’s syndrome, PDD-NOS) based on scores from standardized diagnostic instruments and on clinical opinion. Because a diagnosis or suspicion of ASD in a sibling was considered exclusionary for participation in the SSC, all siblings were screened in person, by trained clinicians, for ASD and additional exclusionary psychiatric conditions; all were determined not to meet criteria for an ASD or display evidence of the broader autism phenotype.

Measures

Participants in the SSC were administered several phenotypic measures. For the purposes of this study, the following proband measures were examined: the parent-completed Autism Diagnostic Interview-Revised (ADI-R; Rutter et al. 2003), the Autism Diagnostic Observation Schedule (ADOS; Lord et al. 2009), the Social Responsiveness Scale (SRS; Constantino and Gruber 2005) teacher and parent reports, the SCQ-Lifetime (completed by the parent), and the SCQ-Current (completed by a teacher) (Rutter et al. 2003). For all probands and siblings, the SCQ-Lifetime and the SRS parent report were completed first, prior to administration of the ADI-R and ADOS. Usually, this was done by mail so that families completed these forms about their children and mailed them back to the study team at their assigned site, prior to their in-person appointment. Both the ADI-R and the ADOS were administered by two different, research-reliable clinicians at the study site where a given family participated. A calibrated severity score (CSS) was computed for each proband’s ADOS result (Gotham et al. 2009). The CSS transforms an individual’s ADOS results into a metric used to gauge autism severity; this allows comparison of outcomes across different ADOS modules.

Data examined from unaffected siblings included teacher ratings on the SCQ-Current and SRS teacher report, as well as parent ratings on the SCQ-Lifetime and SRS parent report. For both probands and unaffected siblings, completion of teacher-rating forms was voluntary; thus, only those children who had SCQ-Current and SRS teacher reports entered into the SSC database were included in final analyses. All probands and siblings who had teacher data also had parent data on the SCQ-Lifetime and the SRS parent report.

Procedure

Permission to analyze data for the current study was granted by (a) the Baylor College of Medicine’s Institutional Review Board and the University of Houston’s Committee for the Protection of Human Subjects and (b) the Simons Foundation. SSC data for the project were downloaded from the SFARI website (http://sfari.org/sfari-base) and imported into the SAS® analytical program, version 9.2, and SPSS 18 for statistical analysis. For analytical purposes, all probands were assigned a value of 1 to indicate their ASD-case status (because, by definition, they must have an ASD to be included in the study) and all siblings were assigned a value of 0 to indicate their non-ASD-control status (because, again, by definition, they could not be suspected of/diagnosed with an ASD and be included in the SSC).

Statistical Analyses

Descriptive statistics were computed for scores on the aforementioned instruments. As noted previously, teachers voluntarily completed the SCQ-Current and SRS teacher report, and sometimes parents did not give permission for teachers to report on their children, so these data were not available for every child in the SSC. Additionally, protocol changes during SSC data collection led to missing measures for some families who had already completed their participation. One-way analyses of variance (ANOVAs) were used to establish whether ages and scores for each measure differed between probands with ASD and unaffected siblings.

Pearson correlations were calculated to determine potential associations among the SCQ-Current, SRS teacher report, SCQ-Lifetime, SRS parent report, domain scores on the ADI-R, domain scores on the ADOS, and the CSS from the ADOS. Receiver operating characteristic (ROC) curve analysis was conducted to identify the cutoff scores that maximize sensitivity and specificity. Additionally, positive predictive value (PPV) and negative predictive value (NPV), as well as false-positive and false-negative rates, were calculated.
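
For readers less familiar with these classification statistics, the block below gives their conventional definitions in terms of a 2 × 2 table that crosses screening result with true case status. The symbols TP, FP, TN, and FN (counts of true positives, false positives, true negatives, and false negatives) are notation introduced here for illustration; the text does not spell out formulas, so this should be read as the standard textbook formulation rather than a record of the exact computations performed.

```latex
\begin{align*}
\text{Sensitivity} &= \frac{TP}{TP + FN}, &
\text{Specificity} &= \frac{TN}{TN + FP}, \\[4pt]
\text{False-negative rate} &= \frac{FN}{TP + FN} = 1 - \text{Sensitivity}, &
\text{False-positive rate} &= \frac{FP}{TN + FP} = 1 - \text{Specificity}, \\[4pt]
\text{PPV} &= \frac{TP}{TP + FP}, &
\text{NPV} &= \frac{TN}{TN + FN}.
\end{align*}
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of true cases in the sample being screened, so values obtained in a sample with many cases will not carry over directly to settings where the condition is rarer.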

Results

Descriptive statistics for the SCQ-Current, SRS teacher report, SCQ-Lifetime, and SRS parent report are presented in Table 1. In each case, proband scores were significantly higher than unaffected sibling scores, and effect sizes were quite large, ranging from .42 to .77. Probands’ average CSS on the ADOS was 7.38 (SD = 1.73, range = 4–10).

Pearson correlations among the SCQ-Current, SRS teacher report, SCQ-Lifetime, SRS parent report, ADOS subdomain scores, CSS, and ADI-R subdomain scores are provided in Table 2. Most correlations were moderate, though nearly all were statistically significant. Of particular interest were the correlations between (a) the SCQ-Current and SRS teacher report, (b) both teacher-report and parent-report screening measures, and (c) the SCQ-Current, SRS teacher report, and the ADOS subdomain scores. There was a large, strong correlation of .73 between the SCQ-Current and the SRS teacher report, demonstrating good convergent validity. However, associations between teacher and parent ratings were more modest, at .32 and .35 for the SCQ-Current with the (a) SCQ-Lifetime and (b) SRS parent report, respectively. Similar associations were observed between the SRS teacher report and (a) the SRS parent report, at .27, and (b) the SCQ-Lifetime, at .22. Correlations between the SCQ-Current and each ADOS domain, including the CSS, were moderate, ranging from .27 to .46. These were slightly larger than comparable correlations between ADOS scores and the SRS teacher report, which ranged from .25 to .38. However, they were more than twice as large as those between ADOS scores and either the SCQ-Lifetime (range = .11–.21) or the SRS parent report (range = .09–.16).

Table 2 Pearson correlations among teacher-rated screening tools, parent-rated screening tools, and scores on the ADOS and ADI-R

Based on the applied cutoff scores for each instrument (from the manual or existing literature), 37.9, 39.9, 34.4, and 41.7% of the collective sample exceeded thresholds on the SCQ-Current, SCQ-Lifetime, SRS teacher report, and SRS parent report, respectively.

Receiver operating characteristic (ROC) curve analysis was applied to data gathered on the SCQ-Current (teacher) and SCQ-Lifetime (parent) forms, as well as the SRS teacher and parent reports. ROC analysis is used to identify the cutoff score on each measure that jointly maximizes sensitivity and specificity. The ROC curves display the diagnostic accuracy of each measure across all possible cut scores by plotting sensitivity on the y-axis against one minus specificity on the x-axis. Table 3 displays the corresponding Area Under the Curve (AUC) for each measure. The AUC indicates the ability of the tests to correctly classify those individuals with and without an ASD; among the current sample, the parent- and teacher-reported SCQ and SRS measures were able to correctly classify 89–99% of youth. Table 4 provides the cutoff scores associated with the highest sensitivities and specificities for each measure. Youden’s index was computed to assist in identifying the cut score that maximizes sensitivity and specificity for each measure. Because some investigators may find it helpful to know the sensitivity, specificity, false-negative rate, false-positive rate, positive-predictive value, and negative-predictive value for these instruments within the current sample when the original, recommended cutoff scores were applied, we included these statistics in Table 5.
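
As an illustration of how Youden’s index can be used to pick a cut score, the minimal Python sketch below scans every observed score as a candidate cutoff and keeps the one that maximizes J = sensitivity + specificity − 1. The scores are hypothetical values invented for this example, and the function name and the rule that a score at or above the cutoff is a positive screen are assumptions of the sketch; it is not a reproduction of the SAS or SPSS procedures used for the analyses reported here.

```python
# Hypothetical screening scores, invented for illustration only.
proband_scores = [10, 12, 14, 16, 18, 21]   # known ASD cases
sibling_scores = [2, 4, 5, 8, 9, 13]        # known non-ASD controls


def youden_optimal_cutoff(cases, controls):
    """Return (cutoff, J, sensitivity, specificity) for the cutoff with the largest J.

    J = sensitivity + specificity - 1; a score >= cutoff counts as a positive screen.
    If several cutoffs tie, the lowest one is kept.
    """
    best = None
    for cutoff in sorted(set(cases) | set(controls)):
        sens = sum(score >= cutoff for score in cases) / len(cases)
        spec = sum(score < cutoff for score in controls) / len(controls)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


cutoff, j, sens, spec = youden_optimal_cutoff(proband_scores, sibling_scores)
print(f"cutoff = {cutoff}, J = {j:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Because J weights sensitivity and specificity equally, a screening program that prioritizes missing fewer cases might deliberately choose a lower cutoff than the one J selects.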

Table 3 Corresponding AUC for each ROC curve
Table 4 Maximum sensitivity, specificity score, false-negative rate, false-positive rate, positive predictive value, and negative-predictive value for teacher and parent screening tools based on ROC curve analysis
Table 5 Sensitivity, specificity, false-negative rate, false-positive rate, positive-predictive value, and negative-predictive value for teacher and parent screening tools

Discussion

The purpose of this study was to examine the utility of teacher ratings on the SCQ-Current and the SRS teacher form as a way to screen for ASD in school-aged children. Overall, the two instruments had a strong, significant correlation, demonstrating good convergent validity. In examining the teacher ratings from the SCQ and SRS, both forms were significantly associated with ratings from parents on the same instruments, as well as with the scores from the parent-reported ADI-R and clinician-observed behaviors on the ADOS. In fact, some of the strongest correlations were observed between ADOS domain scores and teacher ratings on both the SCQ-Current and the SRS teacher form, and these were considerably larger than comparable correlations between the ADOS domain scores and parent ratings. This suggests that teacher report is more consistent with current, clinical observation; however, to be fair, the SCQ-Lifetime (completed by parents) queries behaviors that the child has ever exhibited and/or displayed specifically between the ages of 4 and 5. It is not uncommon, then, for past behaviors to be different from current behaviors. Therefore, the stronger associations noted between teacher reports on the SCQ/SRS and clinician observations on the ADOS may result, in part, from the fact that these instruments focused on current behavior. However, the SRS parent form did query current behaviors, and these correlations were less than half as large as comparable correlations between ADOS-domain and SRS teacher-form scores. This finding lends support to the possibility that (a) behaviors exhibited by the children with ASD are contextually related and may be more congruent across educational and clinical settings and (b) teachers and clinicians are seeing and reporting these behaviors similarly.

Correlations between the SCQ-Lifetime and all ADI-R domains were considerably stronger than those between the SRS parent report and the ADI-R domains; this makes sense, as the SCQ was derived from items on the ADI-R that had the most discriminant validity, and parents completed both of these instruments. Similar correlations were observed between the SCQ-Current and the ADI-R domains when compared against those between the SRS teacher report and the ADI-R domains. Again, part of this may result from the fact that ADI-R scores were based on historical information about the children with ASD, whereas the SCQ-Current and the SRS teacher report ask about current behaviors, and children can appear very different over time. However, part of this may also result from differences between informant groups, regardless of the measure. Finally, ADI-R and ADOS scores were only modestly correlated, but this, again, may be a function of both (a) the ADI-R focusing on historical behaviors while the ADOS concerns current behaviors and (b) the instruments being completed by different informants who are reporting on behaviors exhibited in different contexts.

Based on the data obtained from the ROC curve analysis, cutoff scores were derived to maximize sensitivity and specificity for each measure. In examining the AUC for the four instruments, all demonstrated a significant difference (compared to chance) in accurately identifying those individuals with ASD. For example, the SCQ-Lifetime form’s AUC of .995 indicates that the score of an individual with ASD, when randomly compared to an individual without ASD, would be larger 99.5% of the time (Zweig and Campbell 1993). The SRS parent form was able to accurately identify 98.8% of all participants. An AUC of .5 indicates the measure is no better than chance and that there is no relationship between the measure and diagnostic category. As seen in Table 3, ratings from parents on the SCQ and SRS had the highest AUC values, but teacher ratings were also highly acceptable. Teacher ratings from the SCQ-Current were able to accurately classify 89.7% of youth, while teacher ratings from the SRS correctly identified approximately 93.5%.
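
The pairwise interpretation of the AUC described above can be checked directly: the AUC equals the proportion of all (case, non-case) pairs in which the case has the higher score, with ties counted as half. The short Python sketch below does this by brute force on hypothetical scores invented for illustration; the function name is likewise an assumption of the example.

```python
# Hypothetical screening scores, invented for illustration only.
asd_scores = [10, 12, 14, 16, 18, 21]
non_asd_scores = [2, 4, 5, 8, 9, 13]


def auc_by_pairs(cases, controls):
    """AUC as the share of (case, control) pairs where the case scores higher.

    Tied pairs contribute one half, matching the usual rank-based definition.
    """
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


print(f"AUC = {auc_by_pairs(asd_scores, non_asd_scores):.3f}")
```

When the two groups have identical score distributions, this quantity is .5, which corresponds to the chance-level performance noted above.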

Based on the ROC curve, cutoff scores were generated that maximized the sensitivity and specificity for each instrument. As noted by Silberglitt and Hintze (2005), the ROC curve analysis provides flexibility in that rules can be rewritten to specifically increase or decrease sensitivity and specificity related to the purpose of the screening/assessment. Given the reciprocal relationship between sensitivity and specificity, the clinician must make a judgment as to the importance of accurate classification. Decisions must be made regarding the relative impact of failing to identify a condition (possibly delaying diagnosis and treatment services when needed) and identifying a condition when it is actually not present (considerable stress on individuals and resources needed to further assess when unnecessary). Actuarial techniques or statistical prediction rules (such as ROC) are tools that may aid clinicians in improving diagnostic decision making (see Swets et al. 2000).
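The trade-off described above can be sketched in a few lines. The example assumes a weighted sum of sensitivity and specificity as the selection rule; with equal weights this ranks cutoffs the same way as Youden's J. This criterion is an assumption made for illustration, not necessarily the rule used in the present analysis, and the function names, the `sensitivity_weight` parameter, and the scores are likewise hypothetical.

```python
def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Screen positive when score >= cutoff; return (sensitivity, specificity)."""
    true_positives = sum(score >= cutoff for score in case_scores)
    true_negatives = sum(score < cutoff for score in control_scores)
    return (true_positives / len(case_scores),
            true_negatives / len(control_scores))


def best_cutoff(case_scores, control_scores, sensitivity_weight=0.5):
    """Return the observed score that maximizes a weighted sum of
    sensitivity and specificity (assumed criterion, for illustration)."""
    candidates = sorted(set(case_scores) | set(control_scores))

    def weighted_accuracy(cutoff):
        sens, spec = sensitivity_specificity(case_scores, control_scores, cutoff)
        return sensitivity_weight * sens + (1 - sensitivity_weight) * spec

    return max(candidates, key=weighted_accuracy)


# Hypothetical screener totals, invented for illustration only.
asd_scores = [22, 18, 15, 9, 7]
sibling_scores = [3, 5, 9, 1, 2]

balanced = best_cutoff(asd_scores, sibling_scores)
favor_specificity = best_cutoff(asd_scores, sibling_scores, sensitivity_weight=0.1)
print(balanced, sensitivity_specificity(asd_scores, sibling_scores, balanced))
print(favor_specificity,
      sensitivity_specificity(asd_scores, sibling_scores, favor_specificity))
```

Shifting the weight toward specificity moves the selected cutoff upward in this toy data, so fewer non-cases screen positive but more cases are missed.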

Parent report on the SCQ-Lifetime and SRS yielded excellent sensitivity and specificity that were higher compared to findings in the extant literature (e.g., Allen et al. 2007; Chen et al. 2009; Corsello et al. 2007; Eaves et al. 2006; Johnson et al. 2010; Lee et al. 2007; Wiggins et al. 2007). Based on the current sample and analysis, the recommended cutoff score for the parent SCQ-Lifetime was 7 or 8, which is much lower than the cutoff score of 15 recommended by the manual. One explanation for this relates to the higher-functioning nature of the sample and, more specifically, the original inclusion/exclusion criteria for the SSC, which required that children with ASD have nonverbal IQ scores ≥60 if between the ages of 4 and 5, ≥40 if between the ages of 5 and 8, and a mental age of at least 36 months if older than 8. These criteria influenced recruitment efforts, as participating sites targeted higher functioning cases so as to be assured children would meet these criteria. The IQ requirements for the SSC were lowered as of October 2009, but the majority of cases in the current dataset were enrolled under the original criteria. This means that our sample was a relatively higher functioning group, which suggests that lowering cutoff scores on these screening measures may be necessary in order to correctly capture higher functioning individuals with ASD.

Presently, there are no other published data recommending a cutoff score for the SCQ-Current when completed by teachers. The current analysis recommends a cutoff of 12; however, this should be viewed as tentative and requires further replication. A cutoff of 12 may well be lower than what is obtained in further investigations with different samples.

Regarding the SRS, the current analysis indicated a cutoff score of 59–61 for the SRS parent form, which is similar to the recommended cutoff score of 60 reported by Constantino and colleagues (2007). In examining the SRS teacher report, data from this study indicate an optimal cutoff score of 54, which is slightly lower than the cutoff of 60 recommended by Constantino et al. (2007). Although the SRS teacher report appears to have more diagnostic utility than the teacher-reported SCQ-Current, its usefulness as a screener within the educational setting may be limited, given the length of the instrument.

Overall, when comparing screening properties between the teacher-reported SCQ-Current and SRS teacher report, the SRS teacher form appears to be the superior screening measure. Not only were sensitivity and specificity greater, but the false negative/positive rates and the positive/negative predictive values were more desirable. Some of this could be attributed to differences between the two instruments. The SCQ-Current was based on an existing diagnostic measure (ADI-R) and has fewer items, with responses dichotomized as yes/no, whereas the SRS has many more items and uses a 4-point Likert-type scale to capture responses, so it represents more of a continuum for ASD symptoms/behaviors. Additionally, many items on the SCQ-Current are either (a) not appropriate for older children because they are at developmental levels where these behaviors would not be expected to occur (e.g., “Does she/he ever spontaneously join in and try to copy the actions in social games, such as The Mulberry Bush or London Bridge is Falling Down?” “Does she/he play any pretend or make-believe games?”) or (b) not appropriate to expect within the student–teacher relationship (e.g., “Does she/he ever offer to share things other than food with you?” “Does she/he ever try to comfort you if you are sad or hurt?”). This means that many children are scoring on these items, but these scores may not be clinically meaningful (i.e., children score out because they do not do the behavior, which is appropriate for their age or relationship with the teacher).
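The screening properties compared above all derive from the same four cell counts of a 2 × 2 classification table. A minimal sketch, using invented counts rather than results from this study, shows how each is computed; the function name and dictionary keys are hypothetical.

```python
def screening_metrics(true_pos, false_pos, true_neg, false_neg):
    """Compute standard screening statistics from 2 x 2 table counts."""
    cases = true_pos + false_neg          # children who have the condition
    non_cases = true_neg + false_pos      # children who do not
    return {
        "sensitivity": true_pos / cases,
        "specificity": true_neg / non_cases,
        "false_negative_rate": false_neg / cases,
        "false_positive_rate": false_pos / non_cases,
        "positive_predictive_value": true_pos / (true_pos + false_pos),
        "negative_predictive_value": true_neg / (true_neg + false_neg),
    }


# Hypothetical counts, invented for illustration only.
metrics = screening_metrics(true_pos=45, false_pos=10, true_neg=40, false_neg=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on the proportion of cases in the sample, so values obtained in a sample with many affected children would be expected to differ in a general school population.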

However, in addition to having strong psychometric properties, the most useful universal (or Level 1) screening instruments should be feasible for use at the population level. Thus, the SCQ-Current form, with 40 items and a yes/no format, may be more feasible for screening larger groups (e.g., classroom- or school-wide). In contrast, if an individual teacher has concerns about one particular child, the SRS teacher report may be more appropriate to use. Further research should examine the feasibility, utility, and acceptability of using these screening measures with teachers, as well as the development of additional tools that may be even more suitable for widespread screening in the educational setting.

In determining appropriate cutoff scores, additional research is warranted in investigating where the lines should be drawn among specific subgroups of children (e.g., based on level of verbal or cognitive ability, age, etc.). Previous studies with parents completing the SCQ (Corsello et al. 2007; Eaves et al. 2006; Johnson et al. 2010; Lee et al. 2007; Wiggins et al. 2007) have highlighted this need. Given the importance of early intervention and that, based on the current findings, higher cutoff scores may miss higher-functioning children with ASD, greater emphasis on sensitivity (>.80) may be desirable. Hypothetically, a combination of age, cognitive functioning, gender, and/or type of school setting (preschool, elementary, secondary) may influence the sensitivity of the measure.

Future research may wish to examine the feasibility and benefits of screening for various age groups. While there is an emphasis on early screening and identification, the benefits of targeted screening within schools are currently unknown. Many pre-school and school-age students from culturally and linguistically diverse and/or lower socioeconomic backgrounds may have less access to screening and evaluation. There may be diminishing returns on screening for older children/adolescents, though targeted screening may be warranted. Further study is recommended for additional school screening measures, as well. Analyses using item response theory methods, hierarchical linear modeling methods, or receiver operating characteristic (ROC) curves may shed light on these potential interactions while ascertaining cutoff scores that would maximize sensitivity for different subgroups of children (e.g., by sex, age, or ethnicity), as well as the appropriateness of certain items for use with teacher screeners (e.g., items from the SCQ-Current and SRS teacher form). While parents may be better raters of their children’s anxious or depressive symptoms (Klein et al. 2005; Silverman and Ollendick 2005), teachers have been found to be more accurate raters of attention-deficit/hyperactivity disorder (Tripp et al. 2006) and have been shown to add different yet important information about how children with ASD may differentially behave in the classroom (Kanne et al. 2009). However, more information is needed regarding comparison of ratings of ASD symptoms by teachers and parents.

Limitations

The inclusion/exclusion criteria for the SSC were very strict, and this likely affects generalizability of our findings to the general population. Probands were carefully selected to have an ASD, and siblings were specifically screened to rule out ASD and exclusionary psychiatric conditions. As noted by Riccio and Reynolds (2003), children with multiple problems, other than ASD, are likely to have elevated scores at times on the SCQ and SRS, given symptoms that may be similar to ASD, which may decrease the percentage of correctly classified cases (e.g., poor social relationships/skills in children with depression and attention-deficit/hyperactivity disorder). Further population-based studies would help to clarify appropriate sensitivity and specificity of the SCQ and SRS teacher forms in more diverse samples.

It should be noted, as well, that many children with ASD in the SSC had received diagnoses in the autism spectrum prior to their participation; only a portion received a first-time ASD diagnosis as a result of the SSC evaluation. It is possible, then, that many teachers were aware of their students’ ASD diagnostic status and were biased in answering their screening forms to be consistent with this diagnosis. However, a strength of this study was the confirmation of ASD diagnoses using standardized assessment, as well as screening of siblings to rule out ASD in this group. This helped to ensure that the children indicated as having ASD truly met diagnostic criteria for this condition and that siblings as controls truly did not.

Conclusion

The optimal levels of sensitivity/specificity for a given measure are dependent on (a) the nature of the screening and risk/benefit ratio of under-identifying conditions (lower sensitivity/higher specificity) versus over-identification and (b) the additional resources needed for comprehensive evaluation of a student who, in the end, does not have an ASD. Higher sensitivity is desirable when (a) the condition being screened is serious and should not be missed, (b) the condition is treatable, and (c) false positive results do not lead to serious psychological or economic trauma to the individual. Knowing that early detection for ASD can lead to treatments and supports that improve functioning and quality of life, there is value in determining where sensitivity for such measures can be enhanced.

The current study extends our knowledge about specific measures appropriate for completion by teachers in screening for ASD among school-age children. It is the first known study to examine the SCQ-Current and the SRS teacher report as teacher-informed screening instruments. As Noland and Gabriels (2004) noted, more information is needed regarding school systems’ capabilities to provide effective screening and identification of ASD. Safran (2008) stated that there appears to be an underrepresentation of children with ASD within special education, given the current prevalence of ASD in the United States population. While there may be many reasons for this disparity (e.g., other eligibility categories are used, some are home-schooled or enrolled in private/specialized schools, error in epidemiological estimates, differences in diagnostic criteria, etc.), inadequate screening and diagnosis may be a significant contributor and warrant further examination in schools.