Introduction

In late 2009, a report by the CDC estimated that the prevalence of autism spectrum disorders (ASDs) had risen to 1 in 110, an increase from previous estimates of 1 in 150 (Autism and Developmental Disabilities Monitoring Network Surveillance Year 2000 Principal Investigators; CDC 2009). This widely publicized study served as a reminder of a very real health concern that has attracted increasing attention in both medical and public spheres. The origins and epidemiology of autistic conditions remain poorly characterized, however, with ASD definitions continually evolving as understanding of the disorders improves. Autism historically has been diagnosed by an “all-or-none” approach, using strict categorical criteria to define the presence or absence of a disorder. Recent studies suggest, however, that a different approach may be warranted (Constantino et al. 2000, 2010; Constantino and Todd 2003; Piven et al. 1997a; Spiker et al. 2002).

Researchers today believe that ASDs represent one end of a larger spectrum of quantitative impairment that is continuously distributed in the general population. This idea is based on studies of non-ASD groups who nonetheless show autistic social impairment at subthreshold levels. Family members of individuals clinically diagnosed with ASDs, for example, frequently demonstrate such milder autistic phenotypes (Constantino et al. 2010; Gamliel et al. 2007, 2009; Piven et al. 1997b; Spiker et al. 2002; Toth et al. 2007). Individuals with non-ASD developmental disorders also have been shown to present with subthreshold social and communicative deficits. This co-occurrence is best established for ADHD: a number of studies have demonstrated ADHD-like traits among individuals with ASD (Gadow et al. 2006; Goldstein and Schwebach 2004; Holtmann et al. 2007; Lee and Ousley 2006; Yoshida and Uchiyama 2004), and another group of studies has shown the reverse, with increased autistic traits among children diagnosed with ADHD (Carpenter Rich et al. 2009; Clark et al. 1999; de Boo and Prins 2007; Jensen 2001; Jensen et al. 2001; Mulligan et al. 2009; Nijmeijer et al. 2009). Social and communicative deficits characteristic of autism have also been shown to occur at elevated rates among children with other psychiatric conditions including mood disorders (Pine et al. 2008; Towbin et al. 2005), hyperkinetic disorder (Santosh and Mijovic 2004), conduct/oppositional defiant disorder (Gilmour et al. 2004; Mulligan et al. 2009), and specific language impairment (Conti-Ramsden et al. 2006; Leyfer et al. 2008; Loucas et al. 2008).

Reflecting this changing and expanded view of autism, the not-yet-published DSM-5 has proposed elimination of the existing categories within the Pervasive Developmental Disorder diagnosis in favor of an all-encompassing title of “autism spectrum disorders” (American Psychiatric Association 2010). This revision also proposes to invoke a diagnosis of social communication disorder for children with autistic traits who do not meet full criteria for an ASD. The Social Responsiveness Scale (SRS) is an instrument that characterizes the quantitative impairments in social communication and repetitive behavior/restricted interests that define the autistic syndrome, and provides a more subtle characterization of individual symptoms than is possible using traditional classification systems. This tool thus has potential for assessing autistic traits in large, population-based studies, as well as for affirming previous findings of subtle autistic traits in select groups.

A further advantage of the SRS is its potential for use in cross-cultural studies. The SRS is particularly well suited for this type of investigation because it is an easily and rapidly administered questionnaire that takes only about 15–20 min to complete. It does not need to be administered by a specially qualified health professional and thus can be more easily incorporated into existing local assessment protocols. Moreover, many institutions may lack the resources to administer time-intensive autism diagnostic scales such as the ADOS (Autism Diagnostic Observation Schedule) and ADI-R (Autism Diagnostic Interview-Revised), significantly limiting their ability to compare local investigations of ASDs with standardized values. Prior to applying the SRS to international investigations, however, this measure must be appropriately translated and its psychometric properties tested in a variety of settings. To our knowledge, the only such investigation to date is a 2008 study of over 1400 European individuals that found good cross-cultural validity of the German-language scale (Bölte et al. 2008).

The purpose of this study was to investigate the cross-cultural validity of the Social Responsiveness Scale in a Taiwanese population. The objectives were (1) to report the psychometric properties of the Chinese Mandarin SRS, and (2) to compare the results of data collections conducted in Taiwan, Germany, and the US.

Method

Participants

Participants in the clinical study groups were the parents and other primary caretakers of children aged 4–6 years referred for developmental evaluation at the Taipei Veterans General Hospital and Taipei Municipal Gan-Dau Hospital in Taipei, Taiwan from July 2009 to April 2010. Taipei Veterans General Hospital is a large-scale medical center and Taipei Municipal Gan-Dau Hospital is a regional hospital. Study participants, in both clinical and control samples, mainly resided in Taipei City and County and represented a wide range of socioeconomic status (SES). All were covered by Taiwan’s National Health Insurance Program (NHIP), a program implemented in 1995 that provides all permanent residents and citizens with mandatory comprehensive medical care coverage. By the end of 2008, 99.48% of the population was enrolled in the program (Department of Health [DOH] 2012). Well-child care and developmental evaluation for children with behavioral concerns are both covered by the insurance program.

Caretakers of a total of 307 children consented and participated in the study. They fell into five study groups: four clinical groups and one typical control group. The ASD group (n = 32) included diagnoses of autism, Asperger syndrome, or pervasive developmental disorder-not otherwise specified (PDD-NOS). Fifty-one children were included in the ADHD group, which included inattentive, hyperactive-impulsive, and combined types. Fifty-one children were also placed in the developmental delay (DD) group, which primarily included children with speech/language, motor, or learning delays. The last clinical group included children diagnosed with combined ADHD and DD (n = 33). The control group consisted of parents or primary caretakers of 140 children aged 4–6 years enrolled in two local kindergartens. These children were without known developmental or medical conditions, as determined by a brief questionnaire completed by parents/caretakers.

Procedures

Diagnoses of children in the clinical study groups (i.e. ASD, ADHD, DD, and ADHD + DD groups) were made according to Diagnostic and Statistical Manual of Mental Disorders-IV (American Psychiatric Association 2000) and ICD-10 criteria. A team of experienced clinicians, including child psychiatrists, child neurologists, psychologists, physiatrists, and general pediatricians, came to a consensus diagnosis based on all available information obtained during assessment. This information included clinical interviews with both caretakers and children; psychometric testing, such as the Peabody Developmental Motor Scales-II (Folio and Fewell 2000), the Movement Assessment Battery for Children (Henderson and Sugden 1992), the Wechsler Preschool and Primary Scale of Intelligence-Revised (Wechsler 1967), and the Peabody Picture Vocabulary Test-Revised (Dunn and Dunn 1981); and multiple questionnaires filled out by caretakers, such as the Child Behavior Checklist (Achenbach 1966; Achenbach and Edelbrock 1981) and a Modified Screening Tool for Autism (Robins et al. 2001).

Participants completed the Chinese Mandarin version of the SRS while in the waiting room of the Developmental Delay clinic. Clinicians were blinded to the SRS scores of the children in the clinical study groups when making diagnoses. IQ testing of the clinical participants was performed by experienced psychologists and research assistants.

SRS scores of children in the control group were assessed using questionnaires completed by their parents and caretakers at home. Teachers at participating kindergartens sent participants both an assessment questionnaire designed to determine whether the child had an existing neuropsychiatric diagnosis and a copy of the Chinese Mandarin SRS. Caretakers completed the questionnaires at home and later returned them to the kindergarten teachers. IQ testing was performed by an experienced psychologist who visited the kindergartens and met individually with participating children.

Measures

Social Responsiveness Scale (SRS)

The SRS is a 65-item questionnaire filled out by parents or teachers and designed to give a quantitative assessment of autistic traits across a wide spectrum in children aged 4–18 years. By assigning numerical values to these traits, the SRS is able to provide a more nuanced characterization of an individual’s autistic impairment than categorical diagnosis alone. This quality is especially important when evaluating individuals with a PDD-NOS diagnosis, as a wide range of symptom severity exists within this group, as well as when assessing those with subclinical autistic traits. The SRS can be implemented in large populations, which allows it to be standardized across different settings and against different norms and subgroups, such as by gender, age, or survey respondent.

The SRS can be used as a diagnostic tool, distinguishing clinically significant ASDs from varying levels of social impairment in other psychiatric disorders, and as a population-level screening instrument. Initial studies using the scale have shown a continuous distribution of autistic traits in the general population, and significantly elevated scores among children diagnosed with ASDs, in line with prior research findings comparing the scale with other diagnostic tools (Constantino and Todd 2003; Constantino et al. 2007). In US and European samples, children with other psychiatric disorders also generated higher SRS scores compared to controls as predicted by prior studies, but the SRS total score was able to reliably distinguish these other conditions from ASDs (Bölte et al. 2008; Constantino et al. 2000; Pine et al. 2006; Reiersen et al. 2007). Specific psychometric properties of the US and German studies can be seen in Table 1.

Table 1 Comparison of psychometric properties in German and English language SRS

IQ Test

Full-scale IQ was assessed individually by experienced psychologists using an abbreviated Mandarin-language IQ test, “The Easy-and-Quick Intelligence Scale for Children” (English translation of the title) (Wang 1999). The scale was designed in Taiwan to identify and assess children aged 4–8 years for intellectual disability, developmental delay, or other special education needs. It consists of six subtests (Vocabulary, Copying, Quantitative, Assembling, Memory for Words, and Matrices) and yields verbal IQ, performance IQ, and full-scale IQ scores. It is a pencil-and-paper test that takes 30–40 min per child. A national norm based on this test has been established from a sample of 476 young children in Taiwan aged four to seven. The reliability coefficients of the scale, indicated by split-half correlation (0.91) and Cronbach’s alpha (0.88), are satisfactorily high. Its validity, indicated by correlations with language arts (0.70), math (0.71), and the Cognitive Abilities Test (0.76), is also high. Test scores differ clearly among the four age groups.

Data Analysis

One-way analysis of variance (ANOVA), with the Scheffé test to adjust for multiple comparisons, was carried out to compare SRS scores and IQ scores between study groups. Statistical significance was defined as a two-tailed p value of <0.05. The ability of the SRS to predict diagnostic category at each of the cutoffs was examined using receiver operating characteristic (ROC) curves. The area under the curve (AUC) is expressed as a percent, with 100% indicating perfect prediction and 50% indicating chance prediction. The sensitivity and specificity of the SRS were estimated at each of three different cutoffs (see Table 5).
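As an illustration of how an AUC of this kind can be read, the following minimal sketch computes a rank-based (non-parametric) AUC: the proportion of case-comparison pairs in which the case has the higher score, with ties counted as one half. The function name and the scores are hypothetical and are not taken from the study data; the software actually used for the analysis is not specified in the text.

```python
def auc_rank_based(case_scores, comparison_scores):
    """Non-parametric AUC: share of (case, comparison) pairs in which the
    case scores higher, counting tied pairs as one half."""
    wins = 0.0
    for case in case_scores:
        for other in comparison_scores:
            if case > other:
                wins += 1.0
            elif case == other:
                wins += 0.5
    return wins / (len(case_scores) * len(comparison_scores))


# Hypothetical SRS total raw scores, invented purely for illustration.
asd_scores = [110, 95, 88, 72, 130]
adhd_scores = [60, 45, 88, 52, 70]

auc = auc_rank_based(asd_scores, adhd_scores)
print(f"AUC = {auc * 100:.0f}%")  # 100% = perfect separation, 50% = chance
```

With identical score distributions in the two groups, this function returns 0.5, which corresponds to the chance level described above.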

Results

Table 2 shows selected demographics of the study population. Males were more highly represented in the clinical study groups than in the control group (74 vs. 54%), especially in the ASD group, where 94% of the children were male. The ASD group was not significantly different from the other four study groups in age. No significant difference was found in full-scale IQ between the ASD group (mean 88.29 ± 21.62) and the other three clinical study groups (ADHD mean 97.09 ± 14.05, DD mean 79.88 ± 16.63, ADHD + DD mean 87.03 ± 15.50); however, typical controls had significantly higher full-scale IQ than children in all four clinical groups. Maternal respondents predominated in all groups including controls, ranging from 80 to 88% of the participants in each of the diagnostic categories.
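The group comparisons reported above rest on one-way ANOVA. As a minimal sketch of the underlying arithmetic, the function below computes the F statistic as the ratio of between-group to within-group mean squares. The function name and the scores are hypothetical; the Scheffé post hoc step and the p value both require an F-distribution critical value and are not shown here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three groups, invented purely for illustration.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(toy_groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```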

Table 2 Selected demographic characteristics, and IQs by study groups

Generally, the total and subscale scores of the SRS successfully distinguished the ASD group from the other four study groups; the scores also differentiated the other three clinical study groups (i.e. ADHD, DD, ADHD + DD) from the typical control group, with p values all <0.0001 (two-tailed) (see Table 3). The mean total SRS score of the ASD group was 99.31 ± 25.83, compared to 55.35 ± 21.11 (ADHD), 56.92 ± 24.95 (DD), and 62.33 ± 30.74 (ADHD and DD combined). Internal consistencies of the subscales, measured by Cronbach’s alpha, are presented by study group in Table 3. Overall, the coherence within the full scale was good to excellent (Cicchetti 1994), especially for the clinical groups (alphas ranged 0.87–0.94). Alphas for the subscales varied across study groups, with Social Communication (ranged 0.83–0.88 for clinical groups) and Autistic Mannerism (ranged 0.73–0.90 for clinical groups) having higher alphas, and Social Awareness having lower alphas (ranged 0.43–0.55). SRS score analysis by respondent (father versus mother versus other caretaker) was not conducted because data from non-maternal sources were limited.
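For readers unfamiliar with the internal consistency measure used above, the following minimal sketch computes Cronbach’s alpha from an item-by-respondent matrix using the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name and the item responses are hypothetical and are not drawn from the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: one list of item scores per respondent (equal lengths)."""
    k = len(responses[0])                       # number of items
    item_columns = list(zip(*responses))        # regroup scores by item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical responses of four caretakers to a two-item subscale.
toy_responses = [[1, 1], [2, 3], [3, 2], [4, 4]]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}")
```

Alpha approaches 1 when items rise and fall together across respondents, and falls when items vary independently of one another, which is one possible reading of the lower values observed for the Social Awareness subscale.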

Table 3 Descriptive statistics of SRS scales and internal consistency alpha by study group

Cross-Cultural Comparison

Raw total and subgroup SRS scores obtained from studies in the United States, Germany, and Taiwan are shown in Table 4. There was no significant difference in raw and subgroup scores between the three countries for children diagnosed with ASDs or ADHD/ADD. Mean SRS score for children with ASD in the Taiwan sample was 99.31 ± 25.83, compared to 107.2 ± 30.2 (n = 271) in the US sample and 102.3 ± 31.8 (n = 160) in the German sample. US but not German total scores for typical controls were significantly greater than total scores for comparable individuals in Taiwan (p < 0.001).
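The text does not state which test was used for the between-country comparisons. As one possible way to check a comparison of this kind from published summary statistics alone, the sketch below computes a Welch t statistic and its approximate degrees of freedom for the Taiwan and US ASD groups. It assumes that the ± values are standard deviations and takes the Taiwan group size (n = 32) from the Participants section; the function name is hypothetical.

```python
from math import sqrt


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom,
    computed from group means, standard deviations, and sample sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of mean 1
    v2 = sd2 ** 2 / n2  # squared standard error of mean 2
    t_stat = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# Taiwan ASD group versus US ASD group, using the figures quoted in the text
# (assumption: the +/- values are standard deviations).
t_stat, df = welch_t_from_summary(99.31, 25.83, 32, 107.2, 30.2, 271)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

The resulting statistic would then be compared against a t distribution with the computed degrees of freedom to obtain a p value.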

Table 4 Comparison of SRS total raw scores in Taiwan, US, and Germany

Sensitivity, Specificity, and ROC

To assess the ability of the SRS to discriminate ASDs from other developmental disorders (i.e. ADHD, DD, and ADHD + DD), the four clinical study groups were analyzed for sensitivity, specificity, and ROC (Table 5; ROC curves in Figs. 1, 2, 3, 4). SRS scores were compared with clinical diagnoses to evaluate the scale’s sensitivity and specificity. The SRS manual recommends using a cutoff score of 85 for evaluating children in high-risk clinical settings, and of 70 for screening lower-risk general population groups. Using the higher cutoff of 85, the SRS had a sensitivity of 66% and a specificity of 89% for detecting all ASDs among children in the clinical sample. Optimal cutoffs for screening in higher-risk populations and for clinical classification were also identified based on the sum of sensitivity and specificity. Specifically, we selected the optimal cutoff for screening as the one with the highest value of sensitivity + specificity among cutoffs <85 (cutoffs favoring sensitivity are desirable so that screening can be more inclusive), and the optimal cutoff for clinical classification as the one with the highest value of sensitivity + specificity among cutoffs ≥85 (cutoffs favoring specificity are desirable so that clinical classification can be more specific). Based on this selection rule, the suggested optimal cutoff is 65 for screening (sensitivity 94% and specificity 70%) and 87 for clinical classification (sensitivity 66% and specificity 90%). The levels of sensitivity and specificity can be read as <70% = Poor, 70–79% = Fair, 80–89% = Good, and 90–100% = Excellent (Cicchetti et al. 1995).
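The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: scores at or above a cutoff are treated as screen-positive, ties in sensitivity + specificity are resolved in favor of the lowest cutoff, and the function names and scores are hypothetical.

```python
def sensitivity_specificity(asd_scores, other_scores, cutoff):
    """Assumption: a score at or above the cutoff counts as screen-positive."""
    sensitivity = sum(s >= cutoff for s in asd_scores) / len(asd_scores)
    specificity = sum(s < cutoff for s in other_scores) / len(other_scores)
    return sensitivity, specificity


def best_cutoff(asd_scores, other_scores, candidates):
    """Candidate with the highest sensitivity + specificity.
    Assumption: ties are broken in favor of the lowest cutoff."""
    def rank(cutoff):
        sens, spec = sensitivity_specificity(asd_scores, other_scores, cutoff)
        return (sens + spec, -cutoff)
    return max(candidates, key=rank)


# Hypothetical SRS total raw scores, invented purely for illustration.
asd = [62, 70, 75, 88, 90, 95, 104, 118]
other = [35, 42, 48, 51, 55, 58, 61, 86]

screening_cutoff = best_cutoff(asd, other, range(50, 85))   # cutoffs < 85
clinical_cutoff = best_cutoff(asd, other, range(85, 121))   # cutoffs >= 85
print(screening_cutoff, sensitivity_specificity(asd, other, screening_cutoff))
print(clinical_cutoff, sensitivity_specificity(asd, other, clinical_cutoff))
```

Splitting the candidate range at 85 mirrors the rule in the text: the lower range yields a cutoff that favors sensitivity, and the upper range yields one that favors specificity.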

Table 5 Comparison of sensitivity, specificity, and area under ROC using different cutoffs to discriminate ASD from other developmental disorders in Taiwan, US, and Germany
Fig. 1

Receiver operating characteristic (ROC) curve of ASD versus ADHD. AUC: 0.904 (0.841, 0.967), under the non-parametric assumption

Fig. 2

Receiver operating characteristic (ROC) curve of ASD versus ADHD and ADHD + DD combined. AUC: 0.894 (0.837, 0.952), under the non-parametric assumption

Fig. 3

Receiver operating characteristic (ROC) curve of ASD versus all other clinical groups combined. AUC: 0.879 (0.821, 0.937), under the non-parametric assumption

Fig. 4

Receiver operating characteristic (ROC) curve of ASD versus controls. AUC: 0.997 (0.000, 1.000), under the non-parametric assumption

Discussion

Psychometric Properties

As shown in prior US and German studies, the SRS was able to distinguish Taiwanese children with ASDs from typical controls as well as from individuals with other psychiatric diagnoses. This observation held both for the total SRS score and each of the subscale scores at a highly significant p < 0.0001, except in only one case where significance was still obtained at a slightly higher p value. This result supports prior research findings of the scale’s diagnostic and discriminant validity, and also adds the first evidence verifying the tool’s cross-cultural validity in an East-Asian population. The results of this study favor the appropriateness of the SRS as a tool in global investigations of autistic traits, especially in locations where more extensive diagnostic instruments are not available or where time and convenience are limiting factors.

The results of this study also confirmed previous research findings of subclinical autistic traits in non-ASD psychiatric populations. While significantly lower than average SRS scores in individuals clinically diagnosed with ASD, mean SRS scores of children in the ADHD + DD, ADHD alone, and DD alone groups were still notably higher than those of typical controls. The quantitative nature of the scale readily detects these sub-threshold, but nonetheless impairing, social and behavioral deficits. This quality supports a role of the SRS beyond serving as a screening or diagnostic tool. Among populations with identified psychiatric conditions, the scale can help identify those individuals with co-existing autistic tendencies who may benefit from special intervention.

Notably, this study found no significant difference in IQ among the four clinical samples except between those diagnosed with ADHD and those with DD. A significant difference was found between typical controls and each of the four clinical groups. This finding bears further discussion given the common finding in the literature of lower intellectual functioning among those with ASD (Bölte et al. 2010). In this study, the lack of a significant IQ difference between those with ASD and those with other clinical diagnoses such as ADHD was likely a reflection of the small sample size available for the ASD group. The standard deviation for IQ in the ASD group was greater than 20 points, which may have masked underlying differences in IQ between that group and the ADHD group, the latter of which would be expected to function at a higher intellectual level.

Cross-Cultural Validity

Overall, mean SRS scores in different diagnostic categories were similar between individuals from Taiwan, the US, and Germany. Some modest differences should be noted, however. Raw SRS scores of typical controls in Taiwan were significantly lower than in the US but comparable to those observed in Germany, especially to scores from German mother-raters. Raw SRS scores for the ASD clinical group were slightly lower in the Taiwan sample compared to the US and German samples, and scores for the ADHD group were higher than in the US, though in these two clinical groups the differences did not reach statistical significance. This variability may suggest an influence of culture and language on SRS results. Studies have shown, for example, that Taiwanese parents demand more obedience or compliance from their children and are more directive than American parents (Jose et al. 2000). Children are expected to follow what they have been told or advised by parents or grandparents, rather than express their own opinions. Proper behavior and shyness are viewed as less deviant or problematic than hyperactivity in this culture, which can potentially delay detection of some social development concerns. In addition, while academic learning and performance are highly valued in Taiwanese culture, social skills are not equally emphasized. The lower raw scores in the Taiwan study groups compared to the US and German groups can thus perhaps be explained, in part, by these cultural values regarding social development. Appropriate cross-cultural application of the scale in the future will therefore benefit from using control data specific to the population being studied.

Sensitivity and specificity analyses of raw SRS scores in the US, German, and Taiwan studies were largely similar across the three research groups when using a cutoff score of 85. Optimal cutoffs for screening and for clinical diagnosis differed between the three locations. Future researchers conducting either population-based or clinical research using the SRS will benefit from choosing the appropriate cutoff scores for the particular cultural or geographic group they are working with.

Study Limitations

The overall sample size in this study was modest, with 167 clinical participants (including 32 diagnosed with an ASD) and 140 typical controls. Maternal respondents were highly represented in the study (85%), especially among the clinic population. This skew likely had an influence on SRS score results, given that US studies have shown a difference in scoring between mother, father, and teacher raters. Moreover, in the Taiwan study 24 children were scored either by a non-parental caretaker or by an unspecified caretaker. This is not uncommon in Taiwan, where both parents often work, leaving primary-care responsibilities to a third party such as a grandparent. It is unknown how the inclusion of such respondents, who were not described in the original US validation literature, may have affected SRS score results. As with mother responders, male children were highly represented in this Taiwan study, especially in the clinical groups. Given previous findings of higher SRS scores for male versus female individuals, this skew may have influenced the average SRS scores obtained in this study. This study is also limited by its lack of test–retest reliability or inter-rater reliability measures. Due to time and resource constraints, we were unable to recall study participants for re-testing and were also unable to obtain SRS results from non-parent/caretaker respondents such as teachers. Finally, this study only included children aged 4–6 years, a limitation that has important implications for the applicability of its results to children of other ages.

At the time of data collection, few Taiwanese clinicians were trained and research-reliable in the use of standard US diagnostic instruments such as the ADOS and ADI-R. Furthermore, many other commonly used diagnostic questionnaires or scales had not yet been translated or adapted into Chinese Mandarin or had not yet been validated in an Asian population. This limitation required our study to use clinical judgment, based on DSM-IV, as the method for establishing diagnosis, without the aid of standardized diagnostic tools used in other SRS validity studies. Because clinical diagnoses are by nature somewhat subjective, it is possible that some of the discrepancies observed between SRS score and ASD diagnosis were due to variability in how children were assessed and diagnosed in the clinical setting. All diagnoses in this study were made by highly experienced clinicians; thus this concern should not have a significant impact on the study’s findings.

Directions for Future Research

Future investigations of autistic traits in this Asian population should focus on capturing data from a wider demographic range. Particular emphasis should be placed on including groups of older children and female children, both of which were limited in this study. For the purposes of establishing a baseline understanding of autism prevalence in Taiwan, larger population-based screening might be implemented using either the SRS or another cross-culturally validated screening tool that would yield results comparable to data collected with the same tools in other countries. At the time of this writing, such an investigation is underway in southern Taiwan, using a combination of the SRS, ADOS, and ADI-R to screen the general population.

To further validate the SRS as a screening and diagnostic tool in Taiwan, future studies should also obtain data from non-mother respondents, including fathers and teachers. US data have shown significant differences in SRS scores provided by parents versus teachers, while the German study found differences in scores from fathers versus mothers. Additionally, future investigations should obtain test–retest reliability data on the Mandarin SRS to assess the temporal stability of participant scores. With more complete information, normalized SRS values for the Taiwan population can be established and used for comparison in clinical and research investigations.