Introduction

There has been a significant increase in the number of children with autism spectrum disorder (ASD) including among children with ASD without intellectual disability (ID) who currently constitute approximately two-thirds of those diagnosed (Baio et al. 2018). Disparities in the prevalence of ASD by sex have long been documented, with the most recent data indicating a 4:1 male-to-female ratio for the broad ASD population and an even greater disparity among those with ASD without ID (Baio et al. 2018). The substantial prevalence discrepancy has led researchers to examine sex differences for a range of phenotypic variables (e.g., diagnostic symptoms, adaptive skills, etc.; Giarelli et al. 2010; Harrop et al. 2015; Rodgers et al. 2019). These studies have sought to inform assessment and treatment practices, as well as identify potential causal mechanisms for ASD and the male–female prevalence discrepancy (Kuusikko et al. 2008; May et al. 2014, 2016).

Beyond the core diagnostic symptoms, individuals with ASD have been found to exhibit a range of comorbid psychiatric symptoms that can further interfere with daily functioning (e.g., Gadow et al. 2005; Volker et al. 2010). Elevations in comorbid symptoms including ADHD-related symptoms have been reported in these studies despite the exclusionary parameters for a comorbid diagnosis of ADHD contained in the DSM-IV and DSM-IV-TR (APA 1994, 2000) that were frequently used to enroll participants. The common occurrence of comorbid symptoms in ASD has prompted studies into potential sex differences in externalizing and internalizing symptoms. Although the research is somewhat limited, the majority of studies have tested sex differences in these symptoms using functionally-heterogeneous samples (variable cognitive levels), with fewer studies using more functionally-homogeneous samples (ASD without ID; Mandy et al. 2012).

Studies of sex differences in externalizing symptoms using functionally-heterogeneous samples with ASD have yielded inconsistent results. For example, a large-scale review of behavioral records found higher levels of parent-rated externalizing symptoms (aggression, hyperactivity, and inattention) for boys with ASD compared to girls with ASD (Giarelli et al. 2010). In contrast, Frazier et al. (2014) found higher levels of parent-rated externalizing behaviors for child and adolescent girls with ASD relative to affected boys (total externalizing problems and irritability). Still others have found no sex differences on a range of externalizing symptoms in children and adolescents with ASD (e.g., oppositional behaviors, disruptive behaviors, conduct problems, hyperactivity/inattention, aggression; Brereton et al. 2006; Mandy et al. 2012; Postorino et al. 2015). Contradictory findings have also been reported for sex differences in internalizing symptoms, with some data indicating higher levels of parent-rated emotional problems for female youth with ASD compared to male youth with ASD (Mandy et al. 2012) and other data indicating no sex differences in internalizing symptoms (depression, anxious/depressed, withdrawn) for youth with ASD (Brereton et al. 2006; Postorino et al. 2015). Several authors have asserted that the contradictory findings are likely associated with the wide and variable range of cognitive/functional levels in samples that can mask important differences, and that studies are needed using functionally-homogeneous ASD samples (Lai et al. 2011; Mandy et al. 2012).

Given the potential effect(s) of cognitive level on results, some researchers have begun to test sex differences in comorbid symptoms in more homogeneous samples consisting of youth specifically with ASD without ID. These studies have also produced inconsistent findings (Oswald et al. 2016). For example, when examining sex differences in parent-rated externalizing symptoms, Worley and Matson (2011) reported no differences in tantrum behaviors or conduct problems for children and adolescents with ASD without ID. Holtmann et al. (2007) found female children and adolescents with ASD without ID had higher parent-rated inattention problems than males with ASD without ID; however, females and males did not differ on delinquent or aggressive behaviors. May et al. (2014) found no sex differences in ratings of inattention or aggression symptoms for children with ASD without ID, but males had more symptoms of hyperactivity. In a subsequent study, May et al. (2016) also found more symptoms of hyperactivity/impulsivity, as well as inattention, for boys with ASD without ID than girls with ASD without ID. Studies of internalizing symptoms have also produced inconsistent findings; however, no studies were identified indicating more internalizing symptoms for male youth than female youth with ASD without ID. To illustrate, several studies found no sex differences in parent-rated anxiety, depression, and/or withdrawal symptoms for children and adolescents with ASD without ID (Holtmann et al. 2007; Kuusikko et al. 2008; Solomon et al. 2012; Worley and Matson 2011); however, Solomon et al. (2012) also reported that when they examined only the subgroup of adolescents in their sample (ages 12–18 years), the females had higher levels of anxiety than males (no differences were reported for depression symptoms among the adolescents). May et al. (2014) also found higher parent ratings of social anxiety in girls with ASD without ID than affected boys, and Oswald et al. (2016) reported higher parent-rated depression symptoms for adolescent females than adolescent males with ASD without ID (anxiety symptoms were comparable).

Although these studies provide important information on potential sex differences in comorbid symptoms in youth with ASD without ID, the findings have yet to render clear conclusions and a number of variables and limitations have likely contributed to the disparate results (Solomon et al. 2012). For example, the studies of youth with ASD without ID described had very small samples of females (ranging from 12 to 32) and many included both children and adolescents. The small samples and inclusion of youth from broad age ranges were likely associated with significant difficulty recruiting sufficient numbers of females due to the male–female prevalence disparity which is even greater among youth with ASD without ID (Harrop et al. 2015; May et al. 2016). However, broad age ranges can obscure sex differences (Worley and Matson 2011). Many studies also failed to control the statistical error rate despite conducting numerous comparisons, although some applied corrections to control family-wise error rates. A number of studies utilized matched samples (age and/or IQ) which is important given the significant challenges in enrolling females with ASD without ID (Frazier et al. 2014; Lai et al. 2011; Oswald et al. 2016). Lastly, some of the variability in results may be related to the scores used in the analyses as some studies used norm-referenced standard scores and others used raw scores. Although standard scores may assist in determining severity and/or the clinical range of those in the sample, they may also mask potential sex differences (Solomon et al. 2012); studies should include both types of scores (Frazier et al. 2014). This study addressed these limitations and was conducted to contribute to the research by testing sex differences in comorbid symptoms using a relatively large and matched sample of girls and boys with ASD without ID and a narrower age range. 
The study utilized a multivariate approach and statistical corrections to control experiment-wise error and the analyses were conducted for both standard scores and raw scores. Given the highly discrepant findings in the existing studies, no specific hypothesis was evident for externalizing symptom levels but it was anticipated that boys in the sample would not receive significantly higher ratings of internalizing symptoms than girls in the sample.

Method

Participants

The total sample comprised 80 children, ages 6–12 years, with ASD without ID, including 40 girls and 40 boys matched on age and IQ. Data for this study were derived from databases of prior clinical studies testing various psychosocial treatments for children with ASD without ID. Recruitment for those trials was done via public announcements. Specifically, recruitment flyers were distributed by public school personnel and local clinicians (counselors, psychologists, etc.) to parents of potential participants in the community. Interested parents then contacted a study coordinator to learn about the studies and eligibility requirements.

Eligibility for the studies was determined using a multiple-gate screening procedure. Initially, parents submitted documentation of a prior clinical diagnosis of autism, Asperger’s, or Pervasive Developmental Disorder-Not Otherwise Specified (PDDNOS), as well as prior psychological and special education reports. All the children received their diagnoses from 2002 to 2012 per the DSM-IV-TR (APA 2000). Next, the documentation and reports were independently reviewed by two senior members of the research team using a standardized checklist documenting prior IQ scores (if available; minimum IQ score of 70) and evidence of social/social-communication impairments and circumscribed and repetitive behaviors and interests; consensus between the two reviewers indicating that the criteria were met was required. Children meeting criteria then participated in a formal assessment session that included cognitive testing using a 4-subtest short-form of the Wechsler Intelligence Scale for Children – Fourth Edition (WISC-IV; Wechsler 2003) and informal observations of their symptoms, skills, and behaviors. The WISC-IV 4-subtest short-form consisted of the Vocabulary, Similarities, Block Design, and Matrix Reasoning subtests. The short-form composite score had an internal consistency reliability of 0.95 and correlated 0.92 with the Full Scale IQ of the complete test. The methods described by Tellegen and Briggs (1967) were used to calculate the composite reliability, correlation with the full test, and deviation quotient formula based on standardization information in the test manual. Following the formal assessment session, each complete file was again independently reviewed by two senior research team members using the standardized checklist and consensus was required that the child met inclusion criteria (i.e., ages 6–12 years, WISC-IV short-form IQ > 70, and clinical consensus supporting the prior diagnosis) to be enrolled in the trials. 
One of the studies included an additional exclusionary criterion involving a history of psychosis (per parent report). Other comorbid diagnoses were not assessed as part of the original studies, no specific data were collected on those, and there were no specific exclusionary criteria for other comorbid diagnoses.
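The short-form composite calculations attributed above to Tellegen and Briggs (1967) can be illustrated with a minimal sketch. It assumes the formulas as they are commonly presented for subtest scaled scores (M = 10, SD = 3) and a deviation quotient metric (M = 100, SD = 15). The subtest intercorrelations and reliabilities below are hypothetical placeholders, not values from the WISC-IV manual, and the function names are illustrative only.

```python
import math

SCALED_SCORE_MEAN = 10.0  # mean of a subtest scaled score
SCALED_SCORE_SD = 3.0     # standard deviation of a subtest scaled score


def composite_sd(n_subtests: int, sum_intercorrelations: float) -> float:
    """SD of the sum of n scaled scores, given the sum of all pairwise r values."""
    return SCALED_SCORE_SD * math.sqrt(n_subtests + 2.0 * sum_intercorrelations)


def deviation_quotient(sum_scaled: float, n_subtests: int,
                       sum_intercorrelations: float) -> float:
    """Convert a sum of scaled scores to a deviation quotient (M = 100, SD = 15)."""
    sd = composite_sd(n_subtests, sum_intercorrelations)
    return 100.0 + 15.0 * (sum_scaled - SCALED_SCORE_MEAN * n_subtests) / sd


def composite_reliability(subtest_reliabilities: list[float],
                          sum_intercorrelations: float) -> float:
    """Reliability of an equally weighted composite of standardized subtests."""
    n = len(subtest_reliabilities)
    numerator = sum(subtest_reliabilities) + 2.0 * sum_intercorrelations
    return numerator / (n + 2.0 * sum_intercorrelations)


# Hypothetical inputs: four subtests, six pairwise correlations of 0.50 each,
# and a reliability of 0.90 for every subtest (placeholders, not manual values).
HYPOTHETICAL_SUM_R = 6 * 0.50
HYPOTHETICAL_RELIABILITIES = [0.90, 0.90, 0.90, 0.90]

if __name__ == "__main__":
    print(deviation_quotient(50, 4, HYPOTHETICAL_SUM_R))
    print(composite_reliability(HYPOTHETICAL_RELIABILITIES, HYPOTHETICAL_SUM_R))
```

The sketch covers only the deviation quotient and composite reliability; the correlation between the short form and the full test requires the complete subtest correlation matrix and is not shown.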

The matched samples for this study were predominantly Caucasian, had mean IQ scores in the average range, and had comparable parent education levels. No significant differences were found between the girl and boy samples on major demographic characteristics (see Table 1 for demographics and results of between-groups demographic comparisons). The number of girls and boys taking a psychotropic medication was also very similar across the groups (15 of 40 girls and 17 of 40 boys), with ADHD medication by far the most common in both groups (11 of the 15 girls and 12 of the 17 boys). Data on comorbid symptoms used in this study were collected between 2008 and 2017 and, as noted, all the children received their diagnoses per the DSM-IV-TR, were recruited from the community, and were enrolled in a psychosocial (clinical) treatment study.

Table 1 Demographic characteristics of the study samples

Measure

Behavior Assessment System for Children, Second Edition, Parent Rating Scales (BASC-2 PRS)

The BASC-2 PRS (Reynolds and Kamphaus 2004) is a standardized multidimensional rating scale used to assess a range of clinical symptoms in order to assist with differential diagnosis, intervention planning, and outcome monitoring. The BASC-2 PRS is available for three age groups, with this study utilizing the Child (6-to-11 years; PRS-C) and Adolescent (12-to-21 years; PRS-A) forms to assess externalizing and internalizing symptoms. The BASC-2 has consistent scales across age levels, which “provides a basis for consistent interpretation of scales” (Reynolds and Kamphaus 2004, p. 2). Both forms include nine clinical behavior scales (i.e., Aggression, Anxiety, Attention Problems, Atypicality, Conduct Problems, Depression, Hyperactivity, Somatization, and Withdrawal); this study used the Hyperactivity, Conduct Problems, and Aggression scales to assess externalizing symptoms and the Depression and Anxiety scales to assess internalizing symptoms. Parents rate each item on a 4-point frequency scale from 0 (Never) to 3 (Almost always), and item scores are summed and converted to standard T-scores (M = 50, SD = 10). Higher scores on the clinical scales indicate more problematic symptoms/behaviors. Clinical scale T-scores between 41 and 59 are considered average, scores between 60 and 69 are classified as at-risk, and scores ≥ 70 are classified as clinically significant (Reynolds and Kamphaus 2004).
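As an illustration of the interpretive ranges just described, the following minimal sketch maps a clinical scale T-score to its descriptive classification. The function name is hypothetical, and the label returned for scores below 41 is a placeholder because that range is not classified in the text. The sketch does not reproduce the raw-score-to-T-score conversion.

```python
def classify_clinical_t_score(t_score: int) -> str:
    """Return the interpretive range for a BASC-2 clinical scale T-score.

    Cut points follow the ranges stated in the text: 41-59 average,
    60-69 at-risk, and 70 or above clinically significant.
    """
    if t_score >= 70:
        return "clinically significant"
    if t_score >= 60:
        return "at-risk"
    if t_score >= 41:
        return "average"
    # Scores of 40 or below are not classified in the text; placeholder label.
    return "below average range (not classified here)"


if __name__ == "__main__":
    for score in (45, 63, 72):  # hypothetical T-scores
        print(score, classify_clinical_t_score(score))
```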

Coefficient alpha reliabilities for the PRS-C clinical scales used in this study reportedly ranged from 0.83 to 0.88 and for the PRS-A ranged from 0.81 to 0.87. Validity evidence supporting the grouping of externalizing or internalizing scales is reflected in moderate intercorrelations between scales within each grouping. Intercorrelations between the externalizing behavior scales (i.e., Hyperactivity, Aggression, and Conduct Problems) ranged from 0.67 to 0.76 for the PRS-C and 0.72 to 0.78 for the PRS-A and between the two internalizing behavior scales (i.e., Depression and Anxiety) was 0.54 for the PRS-C and 0.59 for the PRS-A. Concurrent validity was supported in moderate-to-high correlations between the BASC-2 scales used in this study and comparable clinical scales on other established rating scales (Reynolds and Kamphaus 2004). Additionally, studies have shown that the BASC-2 is sensitive to externalizing and internalizing symptoms in children and adolescents with ASD without ID when compared to typically-developing peers (Lopata et al. 2010; Volker et al. 2010).

Procedures

The treatment trials that generated the data used in this study were approved by the Institutional Review Board and conducted according to the approved procedures (including attainment of written parental consent and child assent). For each trial, parents completed a battery of pretreatment (baseline) measures that included the BASC-2 PRS. Once completed and returned, each protocol was immediately examined for any errors (e.g., items with multiple responses, omitted items, etc.) and promptly reviewed with the parent to correct the error(s). All protocols were scored by research assistants using the BASC-2 ASSIST Plus computer scoring software, which includes a second entry check for accuracy. Protocol and demographic data were initially entered into the study database by a research assistant and independently checked by a second research assistant, with any discrepancy resolved by a third member of the team.

Data Analysis Plan

Several statistical procedures were used to examine the externalizing and internalizing symptoms of girls and boys in the sample. Initially, descriptive statistics were calculated for the girl and boy samples including demographic data (Table 1) and scores on each of the externalizing and internalizing scales (both standard T-scores and raw scores; Table 2). Next, one-sample t-tests were calculated using the standard scores to compare the symptom levels of girls and boys separately against the BASC-2 normative estimates. These comparisons were conducted to characterize the symptom levels and assist with subsequent interpretation of the principal tests of sex differences in symptoms. A Bonferroni correction was applied for each sex-based set of comparisons (girl sample adjusted alpha ≤ 0.01 [i.e., 0.05/5 comparisons] and boy sample adjusted alpha ≤ 0.01 [i.e., 0.05/5 comparisons]). Effect size estimates (Cohen’s d) were also calculated.
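The normative comparisons described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it computes a one-sample t statistic against the normative T-score mean of 50, a Cohen's d, and the Bonferroni-adjusted alpha. Standardizing d by the sample standard deviation is an assumption, since the text does not state which standard deviation was used. The example scores are hypothetical, and the p-value lookup is omitted.

```python
import math
import statistics

NORMATIVE_MEAN = 50.0  # BASC-2 T-score normative mean stated in the text


def one_sample_t(scores: list[float], mu: float = NORMATIVE_MEAN):
    """Return (t statistic, Cohen's d, degrees of freedom) for scores vs. mu."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample SD (n - 1 denominator)
    t_stat = (mean - mu) / (sd / math.sqrt(n))
    cohens_d = (mean - mu) / sd  # assumption: standardized by the sample SD
    return t_stat, cohens_d, n - 1


def bonferroni_alpha(family_alpha: float, n_comparisons: int) -> float:
    """Per-comparison alpha that holds the family-wise rate at family_alpha."""
    return family_alpha / n_comparisons


if __name__ == "__main__":
    hypothetical_t_scores = [55.0, 60.0, 65.0, 70.0]  # illustrative only
    t_stat, d, df = one_sample_t(hypothetical_t_scores)
    print(t_stat, d, df)
    print(bonferroni_alpha(0.05, 5))  # five scales per sex group
```

With this choice of standardizer, t equals d multiplied by the square root of the sample size, which offers a quick consistency check between the two statistics.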

Table 2 BASC-2 PRS means and standard deviations for standard scores and raw scores by group

The primary research questions involving sex differences in externalizing and internalizing symptoms were tested using Multivariate Analysis of Variance (MANOVA). Two separate MANOVAs were calculated for the standard scores (one for externalizing symptoms and one for internalizing symptoms) and two for the raw scores (one for externalizing symptoms and one for internalizing symptoms). In this study, the Hyperactivity, Aggression, and Conduct Problems scales comprised the externalizing behavior sets and the Anxiety and Depression scales comprised the internalizing behavior sets. To control the experiment-wise error rate at 0.05, each MANOVA was tested at the adjusted alpha < 0.0125 (i.e., 0.05/4 MANOVA tests). Assumptions of the MANOVA models (outliers, normality, linearity, and homogeneity of variance–covariance matrices) were assessed and all were met. Linear correlations among the externalizing scales and among the internalizing scales were all of moderate magnitude (Table 3). MANOVA tests were done using Pillai-Bartlett trace, and follow-up univariate F tests were calculated. Partial eta squared effect sizes were also calculated (Table 4).
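Two of the quantities in this plan can be illustrated with a short sketch: the adjusted alpha for the four MANOVA tests and the partial eta squared for a follow-up univariate F test. The partial eta squared formula is the standard conversion from an F statistic and its degrees of freedom. The F value in the example is hypothetical, and the degrees of freedom (1 and 78) assume two groups and 80 children with complete data.

```python
def adjusted_alpha(experiment_alpha: float, n_tests: int) -> float:
    """Per-test alpha that controls the experiment-wise error rate."""
    return experiment_alpha / n_tests


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from a univariate F test.

    Equivalent to SS_effect / (SS_effect + SS_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    print(adjusted_alpha(0.05, 4))  # four MANOVA tests
    # Hypothetical F value; df assume 2 groups and N = 80 (1 and 78).
    print(partial_eta_squared(1.0, 1, 78))
```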

Table 3 BASC-2 PRS scale correlations for standard scores and raw scores
Table 4 Multivariate and univariate results for sex comparisons for externalizing and internalizing symptoms

Results

Initial tests compared the externalizing and internalizing symptom levels of girls and boys in the sample separately against the BASC-2 normative estimates. Results of the one-sample t-tests yielded significantly higher externalizing symptom levels for girls and boys in the sample on the Hyperactivity (girls t = 8.20, p < 0.01, d = 1.38 and boys t = 6.75, p < 0.01, d = 1.22) and Aggression (girls t = 2.93, p = 0.01, d = 0.46 and boys t = 3.83, p < 0.01, d = 0.66) scales but no differences on the Conduct Problems (girls t = 1.29, p = 0.21, d = 0.22 and boys t = 0.45, p = 0.66, d = 0.08) scale. Results also indicated significantly higher internalizing symptoms for girls and boys in the sample for the Anxiety (girls t = 3.67, p = 0.01, d = 0.67 and boys t = 3.01, p = 0.01, d = 0.57) and Depression (girls t = 4.73, p < 0.01, d = 0.91 and boys t = 5.11, p < 0.01, d = 0.96) scales. For the four scales on which significant differences were found, the effect sizes were generally medium-to-large in magnitude.

Potential sex differences for the externalizing and internalizing symptom scales were first tested using the standard scores. Results of the separate MANOVA analyses revealed no significant multivariate effect of sex for externalizing or internalizing symptoms, indicating similar levels of parent-reported symptoms across the sex groups (Table 4). Given the absence of a significant multivariate effect for either the externalizing symptoms or internalizing symptoms tests, the univariate F tests are reported but the significance tests are not interpreted. A review of the univariate effect sizes, however, indicated small-to-negligible effects for each of the individual externalizing and internalizing scales. For the same tests using the raw scores, the MANOVA analyses also indicated no significant multivariate effect of sex for either externalizing symptoms or internalizing symptoms. As such, the univariate F tests are reported but the significance tests are not interpreted. Consistent with results based on the standard scores, the raw score univariate effect sizes were small-to-negligible.

Discussion

Prior research has suggested that children with ASD experience a range of comorbid symptoms. Investigations into sex differences in these symptoms have yielded contradictory results both within and between studies (Solomon et al. 2012). A number of factors may have contributed to the inconsistent findings, such as the testing of comorbid symptoms in cognitively-/functionally-heterogeneous and/or broad age-range samples, which can mask important sex differences (Mandy et al. 2012; May et al. 2016; Worley and Matson 2011). The significantly smaller number of females with the diagnosis, especially among those with ASD without ID, has also resulted in small female samples in the existing studies (Harrop et al. 2015; May et al. 2016). This study aimed to contribute to the research by testing sex differences in a relatively large sample of girls (ages 6–12 years) specifically with ASD without ID compared to age- and IQ-matched boys with ASD without ID.

To assist with the interpretation of the sex-based comparisons, the externalizing and internalizing symptom levels of the girls and boys in the sample were initially compared to population estimates. Results reflected significantly elevated comorbid symptoms in two of the three externalizing symptom areas (hyperactivity and aggression) and both internalizing areas (anxiety and depression). These findings are largely consistent with results of prior studies of parent-rated externalizing and internalizing symptoms in children with ASD without ID (e.g., Gadow et al. 2005; May et al. 2016; Oswald et al. 2016), including studies using the BASC-2 (Lopata et al. 2010; McDonald et al. 2016; Solomon et al. 2012; Volker et al. 2010). These results continue to suggest increased vulnerability for externalizing and internalizing symptoms; however, the elevations were not uniform across the scales. Specifically, the mean Hyperactivity and Depression scores fell in the at-risk range, whereas Aggression and Anxiety scores fell in the mid-upper average range. Although the normative comparisons suggested vulnerability for these symptoms, the clinical interpretive ranges suggest that the risk might be somewhat greater for hyperactivity and depression symptoms among children with ASD without ID.

The primary purpose of this study was to assess sex differences in parent-rated comorbid symptoms and results revealed similar levels of overall externalizing and internalizing symptoms between girls and boys with ASD without ID in the sample. The lack of overall differences was found for both the standard and raw score comparisons, and was also indicated in the small-to-negligible effect sizes at the individual scale level. This reflects a consistent pattern of similarity in symptom levels for girls and boys in the sample. Interpreting the current findings relative to other studies is a challenge due to a range of methodological differences and/or mixed findings within a given symptom category in other studies (e.g., differences between individual externalizing symptom scales). Despite these challenges, the current findings are consistent with those of Holtmann et al. (2007) and Worley and Matson (2011) who found similar levels of parent-reported aggression and conduct problems in females and males (ranging from 4 to 20 years) with ASD without ID, as well as with May et al. (2014) who reported no sex differences in aggression among 7–12 year olds with ASD without ID. In contrast, May et al. (2014, 2016) found elevated symptoms of hyperactivity in 7–12 year old boys compared to girls with ASD without ID. Although the reason(s) for the differences in results between the current study and May et al. (2014, 2016) are unknown given the similarity in age and functional level of the samples, their studies were conducted with Australian samples and used a hyperactivity measure that more directly tested ADHD symptoms.

Consistent with the larger research base, the current study did not find males with ASD without ID had elevated internalizing symptoms (anxiety and depression) relative to females with ASD without ID. In fact, the current findings are in line with many studies that found no sex differences for children and/or adolescents with ASD without ID (e.g., Holtmann et al. 2007; Kuusikko et al. 2008; Solomon et al. 2012; Worley and Matson 2011). Despite this consistency, two studies found adolescent females exhibited elevations in a specific internalizing symptom (e.g., depression, Oswald et al. 2016; anxiety, Solomon et al. 2012) and May et al. (2014) found elevated social anxiety in girls with ASD without ID compared to boys. Again, some of the differences in results might be a function of differences in study characteristics (e.g., the age of the samples, country of origin of the samples, use of different measures and specificity of the measures [social anxiety], etc.).

Given the methodological strengths and consistent pattern of findings in this study, the results may have some practical implications. One possible implication derives from the finding that the comorbid externalizing and internalizing symptom levels did not differ significantly between girls and boys with ASD without ID in the sample. This suggests that clinicians might not necessarily have to enter the assessment process anticipating different patterns of elevations based on sex. Further, the lack of sex differences should not be misconstrued as a lack of vulnerability for comorbid symptoms for these children. In this study, both girls and boys with ASD without ID showed significantly elevated symptoms of hyperactivity, aggression, anxiety, and depression when compared to population estimates. This suggests that clinicians, parents, and teachers should be cognizant of the susceptibility to externalizing and internalizing symptoms for many children with ASD without ID (Brereton et al. 2006; May et al. 2014). As such, it may be advisable for clinicians to include a broad screening measure of comorbid symptoms as part of any assessment of children with ASD without ID. Significant elevations in symptoms might then warrant a more in-depth diagnostic assessment to determine the presence of a comorbid diagnosis. In addition to assessment implications, results suggest that supplemental treatment (e.g., cognitive-behavioral, behavioral) might be needed to address the comorbid symptoms (Kuusikko et al. 2008; Oswald et al. 2016). This is often not the primary focus of interventions given the severe impairment caused by the ASD diagnostic symptoms and the need to enhance social functioning; however, these comorbid symptoms can further impair daily functioning and hinder other intervention efforts. Although both girls and boys appeared equally susceptible to comorbid symptoms, it is important to interpret those findings within the context of the sample. 
Specifically, the lack of sex differences found in this study could be characteristic of this particular sample, which was enrolled in psychosocial treatment studies and/or due to the overall elevated symptoms for this particular sample and therefore might not be representative of the larger ASD population.

The current findings represent an important step in understanding comorbid symptoms specifically in children with ASD without ID. Although this study had several strengths (e.g., relatively large matched sample with ASD without ID, narrow age and cognitive inclusion parameters, testing of sex differences using both standard and raw scores, statistical adjustments to control experiment-wise error, etc.), several limitations warrant mention. A primary limitation involved the characteristics of the sample. While utilizing narrow age (6–12 years) and cognitive inclusion parameters was considered a methodological improvement over prior studies (May et al. 2016; Worley and Matson 2011), those criteria limit the generalizability of the findings to others outside those parameters. The sample was also predominantly Caucasian which further restricts generalizability. This study was also limited by the use of parent raters only. Teacher reports may provide additional insight into potential sex differences in externalizing and internalizing symptoms within structured educational settings. Beyond these, there are limitations inherent in the use of rating scales to assess comorbid symptoms. For example, rating scales are based on parents’ perceptions and their ratings may be influenced by potential biases. In addition, rating scales only yield information on symptom levels/severity and are not sufficient for a diagnosis of a comorbid disorder. As such, it is unknown how many of the girls and boys in the sample would have met full criteria for a comorbid diagnosis. It is also important to note that the participants in this study were diagnosed using the DSM-IV-TR which precluded a comorbid diagnosis of ADHD for those with autism (APA 2000). 
The DSM-5 (APA 2013) has removed this restriction and encourages the identification of comorbid diagnoses (including ADHD) when present, and it has placed autism, Asperger’s, and PDDNOS under the heading of ASD, which could affect the results and make-up of samples in future comorbidity studies. Another limitation involved the fact that all the children were participants in specific psychosocial treatment studies, which might affect the generalizability of the findings. The study also utilized existing (retrospective) data, and future prospective studies may want to use the effects found in this study to inform their sample sizes. Further, although this study had one of the largest samples of girls with ASD without ID, it was nonetheless limited. Considering these limitations, future research should seek to replicate the current findings using larger and more racially/culturally diverse samples. Given the need to study such phenomena in functionally-homogeneous samples, future research might also examine sex differences in symptoms for younger and/or older youth with ASD without ID. Longitudinal studies will also be useful in documenting sex differences in the developmental trajectory of comorbid symptoms from childhood through adolescence. Finally, studies might benefit from the use of a diagnostic measure to determine the presence of a comorbid diagnosis (not simply symptom levels). It is clear that ongoing studies are needed as comorbid symptoms constitute a significant barrier to daily functioning of youth with ASD without ID.