Introduction

Despite changes in diagnostic classification systems over time to broader and more inclusive definitions of autism and the ‘autism spectrum’, the preponderance of males with the diagnosis has remained remarkably consistent at a ratio of 2.5–4:1 (Autism and Developmental Disabilities Monitoring Network 2007; Bryson et al. 1988; Ehlers and Gillberg 1993; Fombonne 2008; Ritvo et al. 1989; Scott et al. 2002; Yeargin-Allsopp et al. 2003). Although a sex ratio favoring males has long been recognized (indeed, 8 of the 11 children in the original Kanner (1943) case series were boys), the mechanisms underlying this phenomenon remain poorly understood. An increasing number of rare variants involving genes on the X chromosome has been reported in association with ASD, including neuroligins 3 and 4 (Blasi et al. 2006; Jamain et al. 2003; Vincent et al. 2004), PTCHD1 (Filges et al. 2011; Noor et al. 2010), TMLHE (Celestino-Soper et al. 2011) and MeCP2 (LaSalle and Yasui 2009), as well as a small proportion of cases occurring in association with fragile-X syndrome (Rogers et al. 2001). Other mechanisms have been proposed, including testosterone-related effects on brain development in pre- and post-natal life (Auyeung et al. 2010; Baron-Cohen et al. 2005) and epigenetic effects on social cognition related to paternally-imprinted X-linked genes (Skuse 2000). Notably, in an analysis of four pooled multiplex samples, a somewhat lower sex ratio was observed for later-born children with ASD than for first affected children with ASD (2.9:1 and 4.7:1, respectively; Jones et al. 1996). The recent finding that male:female ratio decreases with increasing paternal age (Anello et al. 2009) may contribute to such differences. 
Regardless of the etiology, awareness of the elevated sex ratio in ASD might influence clinicians’ expectations, both by identifying boys as an ‘at-risk’ group and, conversely, by lowering the ‘index of suspicion’ for girls, potentially leading to under-identification of girls with ASD (Nichols et al. 2008).

Sex differences in cognitive abilities have also been reported in numerous studies, particularly those conducted prior to the implementation of ICD-10 and DSM-IV diagnostic criteria. These studies report that girls with ASD tend to have lower mean IQ and higher rates of severe intellectual disability compared to boys (Lord and Schopler 1985; Lord et al. 1982; Tsai and Beisler 1983; Volkmar et al. 1993). Accordingly, cognitive ability has often been found to have a moderating effect on reported sex ratio in ASD. The ratio of males to females has been reported to be <2:1 among children in the lower end of the IQ distribution (IQ < 50–55; Lord and Schopler 1985; Lotter 1966; Tsai and Beisler 1983; Volkmar et al. 1993; Wing 1981; Yeargin-Allsopp et al. 2003), but as high as 8:1 among children with ASD with average or above-average intellectual function (Scott et al. 2002). However, two recent studies, both of which focused on very young children diagnosed with ASD, have reported less pronounced sex differences in cognitive abilities. Carter et al. (2007) compared a sample of 68 boys and 22 girls with ASD diagnosed between the ages of 18 and 33 months and subsequently recruited to a longitudinal study. These toddlers had similar composite scores on the Mullen Scales of Early Learning (MSEL) but somewhat different profiles, with girls showing a relative strength in visual reception skills, and boys, a relative strength in language skills. Hartley and Sikora (2009) compared 157 boys and 29 girls with ASD (mean age 35 months) who had been diagnosed at a tertiary care clinic, and found no differences between boys and girls with ASD on any of the MSEL subscales. Both sexes showed relative strengths in fine motor and visual reception compared to language skills (Hartley and Sikora 2009). Furthermore, Banach et al. (2009) detected sex differences in cognitive level in females from single incidence (i.e., simplex) but not multiple incidence (i.e., multiplex) families. 
Spiker et al. (2001) also found no difference in cognitive level between males and females with ASD from multiplex families. Thus, in contrast to prior reports, recent studies suggest less consistent sex differences in cognitive profiles, particularly in multiplex families.

There are also conflicting data on whether boys and girls with ASD differ in symptom severity. IQ differences may confound comparison of symptoms between boys and girls with ASD (Pilowsky et al. 1998; Volkmar et al. 1993), as symptom severity is generally correlated with degree of cognitive impairment (Szatmari et al. 1996). However, even in studies controlling for IQ, findings have been inconsistent. Pilowsky et al. (1998) reported no sex differences in ASD symptoms on the Autism Diagnostic Interview—Revised (ADI-R; Lord et al. 1994) and Childhood Autism Rating Scale (CARS; Schopler et al. 1980) in an age- and IQ-matched sample of 19 males and 19 females ranging from 3 to 30 years of age. Holtman et al. (2007) also reported similar scores on the ADI-R and Autism Diagnostic Observation Schedule (ADOS; Lord et al. 2000) in a high-functioning sample of 2- to 21-year-olds (23 males and 23 females), although females were reported to have higher levels of comorbid psychopathology (e.g., attention problems). However, other studies have reported sex-related differences in ASD symptom severity and profiles, even when controlling for IQ. Lord et al. (1982) reported higher levels of restricted and repetitive behaviors in boys within a sample of 475 3- to 8-year-olds with ASD. McLennan et al. (1993) also reported that boys have more repetitive behavior symptoms than girls, as well as higher levels of social and communicative impairment, in a sample of 6- to 36-year-olds (n = 42). In contrast, Carter et al. (2007) and Hartley and Sikora (2009) report small but statistically significant increases in communication symptoms on the ADOS in girls compared to boys with ASD. These latter two studies differ from the earlier studies by their focus on much younger children, raising the possibility of sex differences in ASD symptom trajectories, and whether different symptom domains influence clinical detection and subsequent referral of boys and girls with ASD.

Recent longitudinal studies of infants at increased risk of ASD (younger siblings of children with the diagnosis; hereafter, ‘high-risk infants’) have generated new insights about the earliest signs of ASD and strategies for earlier diagnosis (Zwaigenbaum et al. 2009). Prospective research with high-risk infants also provides unique opportunities to examine sex differences in ASD rates and in cognitive and symptom profiles among children identified at an early age. Among the initial group of high-risk infants from our multi-site prospective Canadian cohort who were later diagnosed with ASD (Bryson et al. 2007), only 5 of 9 were male, but these numbers were far too small to draw any conclusions. As 3-year outcome data accumulated, we were interested in examining sex differences among diagnosed and non-diagnosed children. Hence, the main objectives of this study were to: (1) compare rates of ASD diagnoses among male and female high-risk infants followed to age 3, and (2) compare cognitive and adaptive functioning and ASD symptom severity in boys and girls with confirmed diagnoses. Non-diagnosed siblings and a low-risk comparison group were also included to distinguish general sex differences and those related to familial risk from those specific to ASD. Our study is unique in its focus on children with ASD from a narrowly defined age group (i.e., all assessments completed at age 3), its ascertainment from a high-risk cohort (i.e., not determined by clinical referral), and its inclusion of appropriate high- and low-risk comparisons.

Methods

Participants

Infant siblings of children with ASD (hereafter, ‘high-risk’ or HR infants) were recruited to a prospective study of early development in autism from four multidisciplinary autism diagnostic and treatment centers in Canada, including the Glenrose Rehabilitation Hospital in Edmonton, McMaster Children’s Hospital in Hamilton, The Hospital for Sick Children in Toronto, and the IWK Health Centre in Halifax, and from clinicians in the surrounding regions. The study was approved by the Research Ethics Boards at the four participating centers, and parents gave written informed consent for their children to participate. A total of 319 HR infants have been followed to age 3 years. Diagnosis of ASD in the older sibling (i.e., the proband) was confirmed by review of the clinical diagnostic report using DSM-IV-TR criteria, which in most cases included the ADOS (Lord et al. 2000), and in some cases, the ADI-R (Lord et al. 1994). A comparison group of 129 low-risk (LR) infants with no known 1st, 2nd or 3rd degree relatives with ASD was recruited from the same geographic areas as the sibling sample. All participants were born at 36–42 weeks gestation, had a birth weight greater than 2,500 g, and had no genetic or neurological disorders. Participants were assessed at age 6 (if possible), 12, 18, 24 and 36–42 months (hereafter, ‘3 years’). To maximize diagnostic stability, outcome data reported in this paper are limited to the 3-year assessment.

Measures: High and Low-Risk Infants

Mullen Scales of Early Learning (MSEL; Mullen 1995)

The MSEL consists of four scales: Visual Reception, Receptive Language, Expressive Language and Fine Motor (a fifth scale that measures gross motor development is only administered with children younger than 30 months). An Early Learning Composite (ELC) can be calculated based on scores from these four scales for children aged 0–69 months. Inter-rater and test–retest reliability of the MSEL are excellent. Thus, the MSEL was selected as it could be used to assess high- and low-risk infants longitudinally beginning at age 6 months, which was the goal of the over-arching research program (Zwaigenbaum et al. 2005).

Vineland Adaptive Behavior Scales (VABS; Sparrow et al. 1984)

This is a semi-structured parent interview designed to assess adaptive behaviour in four subdomains—Communication, Daily Living, Socialization, and Motor skills, outlined by typical developmental milestones that are anchored to specific ages. The scale has excellent reliability and concurrent validity, and is sensitive to impairments experienced by children with ASD (Volkmar et al. 1993; Carter et al. 1998).

Autism Diagnostic Observation Schedule (ADOS; Lord et al. 2000)

The ADOS uses standardized activities and ‘presses’ to elicit communication, social interaction, imaginative use of play materials, and repetitive behaviors, allowing the examiner to observe the occurrence or non-occurrence of behaviors important to the diagnosis of ASD (Lord et al. 1989). Inter-rater reliability of the ADOS is excellent (Lord et al. 2000). The scoring algorithm was recently revised to optimize discrimination of ASD from other developmental disabilities and is organized into two domains, Social Affect (including Communication and Social items), and Restricted Repetitive Behaviors (Gotham et al. 2007). The ADOS consists of four modules, each of which is appropriate for individuals of differing language levels (Module 1 = minimal or no language, Module 2 = regular use of non-echoed 3-word phrases, Module 3 = child with fluent language; and Module 4 = adolescent or adult with fluent language), the first three of which were used to assess participants in this study. To optimize comparability across modules (and thus, across language levels), an overall ADOS severity metric score was calculated, as recommended by Gotham et al. (2009).

Autism Diagnostic Interview-Revised (ADI-R; Lord et al. 1994)

The ADI-R is an investigator-directed interview used to elicit information about social development, verbal and non-verbal communication skills and repetitive, stereotyped interests and behaviors required to make an ICD-10 or DSM-IV diagnosis of autism. The questions are designed to distinguish qualitative impairments from developmental delays, identifying behaviors that would be considered deviant at any age and examining current and most abnormal behaviors for those strongly influenced by maturational age. The ADI-R discriminates well between autism and other forms of developmental disability, and inter-rater reliability is excellent (Lord et al. 1994). Criteria for ASD proposed by Risi et al. (2006) were used for this study; that is, ADI-R scores were (1) within the autism range for social and within 2 points of the autism range for communication, or (2) within the autism range for communication and within 2 points of the autism range for social, or (3) within 1 point of the autism range on both social and communication domains.
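The three-part Risi et al. (2006) rule described above can be written as a short decision function. The following is a minimal sketch only: it assumes that higher ADI-R domain scores indicate more impairment, so that "within the autism range" means a score at or above the domain cutoff, and the cutoff values used in the usage lines are illustrative placeholders that should be replaced with the published algorithm cutoffs. The function and parameter names are hypothetical and are not taken from the study.

```python
def within(score: int, cutoff: int, margin: int = 0) -> bool:
    """True if the score reaches the cutoff or falls short of it by at most `margin` points."""
    return score >= cutoff - margin


def meets_risi_asd(social: int, communication: int,
                   social_cutoff: int, communication_cutoff: int) -> bool:
    """Apply the three alternative ADI-R conditions for ASD described in the text."""
    # (1) autism range for social, within 2 points of the range for communication
    rule1 = within(social, social_cutoff) and within(communication, communication_cutoff, 2)
    # (2) autism range for communication, within 2 points of the range for social
    rule2 = within(communication, communication_cutoff) and within(social, social_cutoff, 2)
    # (3) within 1 point of the autism range on both domains
    rule3 = within(social, social_cutoff, 1) and within(communication, communication_cutoff, 1)
    return rule1 or rule2 or rule3


# Illustrative cutoffs only (assumed values, not taken from this paper).
SOCIAL_CUTOFF, COMMUNICATION_CUTOFF = 10, 8
print(meets_risi_asd(10, 6, SOCIAL_CUTOFF, COMMUNICATION_CUTOFF))  # condition (1) applies
print(meets_risi_asd(8, 7, SOCIAL_CUTOFF, COMMUNICATION_CUTOFF))   # no condition applies
```

Passing the cutoffs as arguments keeps the rule itself separate from the age- and language-dependent thresholds of the scoring algorithm.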

Diagnostic Procedure

At 3 years of age (mean 38.1 months; SD = 2.5), an independent diagnostic evaluation of each participant was conducted by an expert clinician blind to assessments from previous study visits. ASD diagnoses were assigned using DSM-IV-TR criteria, based on the best judgment of the clinician (developmental pediatrician, child psychiatrist or clinical psychologist, all with at least 10 years of diagnostic experience), taking into account information from the concurrent ADI-R and ADOS and assessments of cognitive, language and adaptive skills. Some children with a clinical diagnosis of ASD had sub-threshold algorithm scores on the ADOS and/or ADI-R, but met DSM-IV-TR criteria based on expert review of all available data. This approach is consistent with current best practice (Baird et al. 2011), and informed by a solid evidence base indicating that both a structured interview (such as the ADI-R) and an interactive assessment (such as the ADOS) are essential to ASD diagnoses, but that neither, alone or in combination, trumps clinical judgment, particularly in this age group (Kim and Lord 2012).

Measures: Probands

Confirmation of ASD diagnosis in the probands was based on a review of the child’s clinical diagnostic report by the local site investigator. Measures were the same as those obtained for HR and LR infants, with the exception that cognitive skills were assessed using the Stanford–Binet Intelligence Scales: Fifth Edition (SB5; Roid 2003) rather than the MSEL. The SB5 is a standardized test of cognitive abilities designed to assess individuals from 2 years of age through adulthood (whereas the MSEL is standardized to only 69 months). The SB5 is made up of two sub-scales with five subtests in each; one measures nonverbal abilities and the other, verbal skills. Together these provide a Full Scale IQ (FSIQ). We determined that the Abbreviated Battery IQ (ABIQ) scale, which consists of two routing subtests: one nonverbal (Object Series/Matrices) and one verbal (Vocabulary), is highly correlated with the FSIQ (r = .95, based on 63 children with ASD; Coolican et al. 2008). Thus, cognitive skills in probands were measured using the SB5 ABIQ.

Analyses

Descriptive statistics were generated for probands, HR infants (stratified by 3-year diagnostic status: ASD or non-ASD) and LR infants regarding sex ratio, ASD symptoms, and cognitive and adaptive skills. Rates of 3-year ASD diagnoses among male and female high-risk infants were compared using chi-square tests (χ²) and then further stratified by the sex of the proband (i.e., older sibling with ASD). Sex differences in ASD rates are reported as relative risk rather than simple sex ratio, as the HR group contained unequal numbers of boys and girls. We also assessed whether ASD rates were related to age of recruitment, to assess potential effects of participation biases (e.g., related to later-emerging parental concerns). Next, cognitive and adaptive functioning (indexed by the MSEL and VABS subscale scores, respectively) and ASD symptoms (indexed by the ADOS severity index and ADI-R subscale scores) were compared in boys and girls with ASD, as well as HR and LR participants not diagnosed with ASD. Separate MANOVAs were used to compare MSEL and VABS scores. The VABS was added later to our assessment protocol, so was available for only about two-thirds of our sample. MANCOVA was used to compare ADOS and ADI-R scores in boys and girls with ASD, using the MSEL Early Learning Composite as a covariate to adjust for potential confounding between ASD symptoms and cognitive level. Finally, we assessed whether ASD symptom severity and cognitive skills were correlated within families for probands and HR siblings, and whether similar correlations were observed whether the HR sibling was a boy or a girl.

Results

Sample Description

Descriptive statistics for probands, high-risk infants (stratified by 3-year diagnosis: ASD or non-ASD) and low-risk infants are summarized in Table 1. Proband mean Stanford-Binet ABIQ was 85.3 (SD = 23.9) and Vineland ABC was 70.3 (SD = 19.9). The sex ratio among probands was 4.5:1. Details regarding 3-year ASD symptoms, cognitive and adaptive scores of high-risk and low-risk infants are provided below. Resource constraints limited our ability to complete comprehensive diagnostic and developmental assessments on every proband; thus, SB5 and ADOS data are limited to 139 of 319 probands. There were no differences between families of probands with and without these data with respect to proband sex [81.3 vs. 82.2 % male, respectively; χ² (1, N = 319) = .045, p = .83], sex of the younger sibling [55.4 vs. 55.0 % male; χ² (1, N = 319) = .005, p = .94], or 3-year outcomes in the younger siblings, including rates of ASD diagnoses [25.9 vs. 27.2 %; χ² (1, N = 319) = .070, p = .79], ADOS severity scores [mean = 3.55, SD = 2.7 vs. mean = 3.45, SD = 2.6; t(317) = 0.35, p = .73], MSEL ELC scores [mean = 105.9, SD = 22.6 vs. mean = 102.4, SD = 21.3; t(312) = 1.40, p = .16] and VABS ABC scores [mean = 90.5, SD = 15.9 vs. mean = 88.1, SD = 14.9; t(247) = 1.21, p = .23].

Table 1 Sample characteristics

Of 319 HR infants followed to age 3 years, 218 were initially assessed at age 6 months (68.3 % of the total sample), and 101 at 12 months. Rates of ASD diagnoses were higher in HR infants initially assessed at 12 months (35 of 101; 34.7 %) than in those initially assessed at 6 months (50 of 218; 22.9 %); χ² (1, N = 319) = 4.85, p = .028. However, similar proportions of boys and girls were initially assessed at 6 months (65.9 and 71.3 %, respectively; χ² (1, N = 319) = 1.07, p = .30), so recruitment age is unlikely to confound comparisons between boys and girls on clinical outcomes.
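The recruitment-age comparison above can be checked by hand from the reported counts. The sketch below assumes an uncorrected Pearson chi-square test on the 2 × 2 table (35 of 101 versus 50 of 218 diagnosed); the text does not state whether a continuity correction was applied, so this is one plausible reading rather than a record of the authors' procedure. The function name is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: first assessed at 12 months, first assessed at 6 months.
# Columns: ASD at age 3, no ASD at age 3.
chi2, p = chi_square_2x2(35, 101 - 35, 50, 218 - 50)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 4.85, p = 0.028
```

Under these assumptions the reported statistic and p-value are reproduced.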

Recurrence Rates and Relative Odds for ASD by Sex of the Proband and Younger Sibling

Of 319 high-risk infants, 85 (27.0 %; 95 % CI = 22.1–31.9 %) received a clinical diagnosis of ASD at age 3, including 57 of 176 boys (32.4 %) and 28 of 143 girls (19.6 %); χ² (1, N = 319) = 6.62, p = .01 (see Table 2). The relative risk of ASD diagnosis in boys compared to girls was 1.65 (95 % CI = 1.11–2.45). Only 1 of 129 children in the low-risk comparison group was diagnosed with ASD (a boy; as a single case, he was excluded from further analyses and is not discussed further in this paper). Relative risk (RR) of ASD in boys versus girls was significantly higher than 1.0 in families with male probands (RR = 1.83; 95 % CI = 1.18–2.85), but not in families with female probands (RR = 1.06; 95 % CI = 0.43–2.58). However, there were far fewer female proband families (as per the 4.5:1 sex ratio among probands), limiting power to formally compare relative risk of ASD in male versus female HR siblings by proband sex.
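As a worked check of the overall relative risk, the sketch below computes RR with a 95 % confidence interval on the log scale (the standard large-sample approach). The choice of interval method is an assumption, since the text does not say how the interval was obtained. The denominators are taken as 176 boys and 143 girls, which sum to the 319 HR infants and reproduce the reported percentages of 32.4 % and 19.6 %. The function name is hypothetical.

```python
import math


def relative_risk(cases_a: int, n_a: int, cases_b: int, n_b: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Relative risk of group A versus group B with a log-scale confidence interval."""
    rr = (cases_a / n_a) / (cases_b / n_b)
    # Large-sample standard error of log(RR).
    se_log = math.sqrt(1 / cases_a - 1 / n_a + 1 / cases_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Boys: 57 diagnosed of 176; girls: 28 diagnosed of 143.
rr, lower, upper = relative_risk(57, 176, 28, 143)
print(f"RR = {rr:.2f} (95% CI {lower:.2f}-{upper:.2f})")  # RR = 1.65 (95% CI 1.11-2.45)
```

Reporting relative risk rather than a raw ratio of affected boys to affected girls accounts for the unequal numbers of boys and girls in the cohort.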

Table 2 ASD outcome rates by sex of proband and infant sibling

Cognitive and Adaptive Function at 3 Years

Cognitive data from the MSEL were available on all participants at 36–42 months except for one female sibling with ASD. The VABS was added to our assessment protocol 3 years after the launch of our study, and thus was available for only about two-thirds of our sample (see Table 3 for details). ADOS and ADI-R data were available for all participants.

Table 3 Cognitive and adaptive function at 36–42 months by diagnostic status and sex

MANOVAs were completed for the MSEL (4 subscales) and VABS (3 subscales), examining main effects for outcome group (i.e., HR siblings diagnosed with ASD, HR siblings not diagnosed with ASD, and non-diagnosed LR comparison infants) and sex, and possible group × sex interactions (see Table 3). As expected, there were group effects for the MSEL (Wilks’ Lambda = 0.73, F(8, 852) = 18.0, p < .001, partial η² = 0.14) and VABS (Wilks’ Lambda = 0.70, F(6, 702) = 22.7, p < .001, partial η² = 0.16), and for each subscale on post hoc testing, mainly reflecting lower scores in the HR siblings diagnosed with ASD. Despite this main effect, the HR siblings diagnosed with ASD were a relatively high-functioning group, with a mean MSEL ELC standard score of 83.4 (SD = 24.1) and VABS ABC standard score of 78.2 (SD = 13.3; not shown in Table 3). The MANOVAs also indicated main effects for sex in both the MSEL (Wilks’ Lambda = 0.97, F(4, 426) = 3.78, p = .005, partial η² = 0.03) and VABS (Wilks’ Lambda = 0.97, F(3, 351) = 3.40, p = .018, partial η² = 0.03). Post hoc analyses indicated that sex differences were limited to the MSEL Fine Motor subscale (F(1, 429) = 13.0; p < .001, partial η² = 0.03), and two VABS subscales, Socialization (F(1, 353) = 6.82; p = .009, partial η² = 0.02), and Daily Living Skills (F(1, 353) = 6.57; p = .011, partial η² = 0.02). On all 3 of these subscales, girls scored higher than boys in each of the 3 outcome groups (i.e., HR-ASD, HR-non-ASD and LR). There were no group × sex interactions on the MSEL or VABS in the primary analyses, or on any subscale, reflecting relatively small differences between boys and girls in all 3 groups (i.e., not specific to the HR-ASD group). Effect size estimates within the HR-ASD group (i.e., mean differences between boys and girls divided by pooled SD, with positive values indicating higher scores in girls) ranged from −0.20 (for VABS Communication) to +0.36 (for MSEL Fine Motor; see Table 3 for details).
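The effect size described above (mean difference divided by pooled SD, positive when girls score higher) can be sketched as follows. The scores in the usage lines are invented for illustration and are not study data; the function names are hypothetical, and the pooled SD is assumed to be the usual degrees-of-freedom-weighted form.

```python
import math
from statistics import mean, stdev


def pooled_sd(a: list[float], b: list[float]) -> float:
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return math.sqrt(pooled_var)


def effect_size(girls: list[float], boys: list[float]) -> float:
    """Standardized mean difference; positive values indicate higher scores in girls."""
    return (mean(girls) - mean(boys)) / pooled_sd(girls, boys)


# Hypothetical subscale scores, for illustration only.
girls_scores = [40, 45, 50, 55]
boys_scores = [38, 42, 46, 50]
print(round(effect_size(girls_scores, boys_scores), 2))
```

Swapping the two arguments flips the sign, which is why the sign convention has to be stated alongside the estimates.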

ASD Symptom Scores at 3 Years

As anticipated, the MSEL Early Learning Composite was correlated with ASD symptom severity, indexed by the ADOS severity metric (r = −.51, p = .001). Thus, MANCOVA was used to assess for sex differences in ASD symptoms, using the ADOS severity score and the 3 ADI-R domain scores as the dependent variables, outcome group and sex as independent variables and the MSEL Early Learning Composite as a covariate. There were overall main effects for group (Wilks’ Lambda = 0.47, F(8, 848) = 48.5, p < .001, partial η² = 0.31) and sex (Wilks’ Lambda = 0.94, F(4, 423) = 6.27, p < .001, partial η² = 0.06) on the MANCOVA. Post hoc examination of specific ASD symptom indices revealed sex differences on the ADOS severity metric (F(1, 426) = 7.91; p = .005, partial η² = 0.02), and the ADI-R Communication (F(1, 426) = 19.5; p < .001, partial η² = 0.04) and Social (F(1, 426) = 3.95; p = .049, partial η² = 0.01) domains (see Table 4). However, there was no significant group × sex interaction in the main analysis or on post hoc comparisons on the ADOS severity metric and ADI-R subscales. Effect size estimates generally reflected modest elevation in symptom scores in boys compared to girls across all 3 groups (see Table 4). Within the HR-ASD group, effect size estimates indicated slightly higher symptom levels in boys (e.g., ES = 0.22 for ADOS severity metric), with the exception of repetitive behavior symptoms as measured by the ADI-R (ES = −0.14; see Table 4).

Table 4 ASD symptoms at 36-42 months by sex and diagnostic status

Correlations Between Probands and HR Siblings

Table 5 summarizes correlations between probands and HR siblings (at age 3) as a group, and limited to families in which the HR sibling was diagnosed with ASD, with respect to cognitive function and ASD symptom severity. The ADOS was available on 126 probands, and cognitive data on 120 probands. ADOS severity metric scores were not significantly correlated in probands and HR siblings (r = .035, p = .70). Cognitive function, indexed by the SB5 ABIQ in probands and the MSEL ELC in HR siblings, was modestly correlated in probands and HR siblings (r = .33, p < .001). Similar correlations were found in families in which the HR sibling was diagnosed with ASD (r = .33) as those in which the HR sibling was not diagnosed with ASD (r = .29).

Table 5 Correlations between probands and high-risk infants on ASD symptoms and cognitive scores

Discussion

This study of 3-year-olds with ASD ascertained from a high-risk cohort (i.e., with affected older sibling) yielded three intriguing findings. First, ASD rates were only modestly higher in boys than girls, with relative risk of 1.65 and an associated 95 % confidence interval of 1.11–2.45, just outside the lower end of the male-to-female ratio of 2.5–4:1 reported in almost all clinically referred and epidemiologic samples (Autism and Developmental Disabilities Monitoring Network 2007; Bryson et al. 1988; Ehlers and Gillberg 1993; Fombonne 2008; Ritvo et al. 1989; Scott et al. 2002; Yeargin-Allsopp et al. 2003). Second, although there were sex differences in cognitive and adaptive skills levels and in ASD symptom severity in the combined HR and LR sample, the magnitude of these differences did not vary by diagnostic outcome. That is, differences between boys and girls with ASD within the HR group generally mirrored differences between boys and girls in the non-ASD HR and LR groups. Finally, for the subgroup of families with available proband assessment data, there were modest correlations between probands and HR siblings with respect to cognitive level but not ASD symptom severity. Sibling-proband correlation on cognitive level was evident regardless of whether the sibling was diagnosed with ASD or not, and did not appear to be influenced by the sex of the sibling within the ASD subgroup. These findings raise interesting questions about how ASD cases identified from high-risk cohorts may differ from clinically referred children, and how genetic differences between simplex and multiplex families may influence sex-related differences in phenotypic expression.

Several factors may have contributed to greater similarity in ASD rates and cognitive and adaptive levels in boys and girls than reported in previous studies. First, by definition, all children diagnosed with ASD in our high-risk cohort belong to multiplex families. There is growing evidence that genetic mechanisms underlying vulnerability to ASD differ between simplex and multiplex families. For example, functionally significant copy number variants are identified in 7–10 % of children with ASD from simplex families, but in only 2–3 % of children with ASD from multiplex families (Christian et al. 2008; Itsara et al. 2010; Marshall et al. 2008, although also see Pinto et al. 2010). Moreover, in a previous analysis of four pooled multiplex samples, the sex ratio of later-born children with ASD was somewhat lower than that for first-born affected children with ASD (2.9:1 and 4.7:1, respectively; Jones et al. 1996). The recent finding that male:female ratio decreases with increasing paternal age (Anello et al. 2009) may contribute to such differences. Furthermore, both Banach et al. (2009) and Spiker et al. (2001) reported no cognitive difference between males and females with ASD from multiplex families.

However, the modest sex differences in cognitive functioning identified in our sample may not simply reflect a focus on multiplex families. Three recent studies involving mainly single-incidence families reported minimal or no sex differences in overall cognitive functioning (Carter et al. 2007; Hartley and Sikora 2009; Mandy et al. 2011). Carter et al. (2007) identified sex differences in cognitive profile on the MSEL in their toddler sample, with a relative strength in non-verbal cognitive skills in girls, and a relative strength in language skills in boys, whereas Hartley and Sikora (2009) reported similar MSEL profiles in boys and girls. Mandy et al. (2011) reported that boys and girls ascertained from a clinic for children with ‘higher functioning’ ASD had similar mean IQs on both verbal and non-verbal measures. Thus, our study adds to growing evidence that the substantial sex differences in cognitive functioning reported in the ASD literature 15–20 years ago (i.e., prior to the implementation of ICD-10 and DSM-IV) may not be reflected in children diagnosed using current criteria. Notably, the mean IQs of earlier cohorts were much lower than those of more recent cohorts (including ours), suggesting that current criteria target a higher proportion of higher-functioning children.

Our high-risk design may have influenced the relative ascertainment of boys and girls with ASD. Identification of children with ASD did not depend on clinical referral; rather, all children in the high-risk cohort were assessed. The relatively high cognitive level (mean MSEL Early Learning Composite = 83.4), lack of sex differences in mean cognitive level, and modest sex ratio in diagnosed cases combine to suggest that we may have identified an unexpected number of higher functioning girls with ASD. Some authors have suggested that girls with ASD, particularly those with average intelligence, may be less likely to be identified clinically than boys, due to milder social and communicative symptoms, relatively intact symbolic play skills, and less obvious atypicality of their obsessional interests (Kopp and Gillberg 1992; Nyden et al. 2000). Indeed, a recent secondary analysis of population-based data of 8-year-olds from the Autism and Developmental Disabilities Monitoring Network from 13 sites in the US found that girls with ASD were less likely to have received a community diagnosis than boys with ASD, despite meeting diagnostic criteria on independent educational and developmental service file review (Giarelli et al. 2010). Under-ascertainment in that study was particularly pronounced in girls with IQs above 70. An interesting possibility is that some higher-functioning girls from our HR sample who were diagnosed at age 3 years will no longer meet criteria for ASD as they get older, leaving fewer girls (hence a higher male:female ratio), lower mean IQ among remaining girls, and lower overall recurrence rate based on diagnostic status at an older age. Planned follow-up will help to determine whether cognitive profiles and diagnostic classifications vary in stability between HR boys and girls as they age, which may in turn change estimates of sex ratio for ASD.

Ozonoff et al. (2011), in a previous pooled analysis of 664 HR infant siblings from 12 sites of the Baby Siblings Research Consortium (including about 200 HR infants from our sample), reported a relative risk of ASD in boys versus girls of 2.8 (95 % CI: 1.9–4.0). This study, which focused on factors influencing recurrence rates of ASD, did not include data on non-diagnosed HR infants, nor did it examine possible sex differences in ASD symptoms or cognitive functioning in HR infants with ASD (no data were provided on adaptive function). Criteria for ASD diagnoses (i.e., consensus best estimate plus ADOS criteria met) reported by Ozonoff et al. (2011) were slightly different than those utilized in our study, in part because ADI-R data were not available at all sites. Although not specifically discussed in the paper, it is interesting to note that 84.2 % of the probands in the BSRC sample were male (i.e., sex ratio = 5.3), in contrast to the relative risk in later born males versus females of 2.8, consistent with the sex ratio difference in first- versus second-affected children within multiplex families reported by Jones et al. (1996). No other prospective studies of high risk infants by individual research groups have examined relative risk of ASD nor differences in ASD symptoms and cognitive functioning.

Notably, 27 % of 3-year-olds in our younger sibling cohort were diagnosed with ASD, which is a surprisingly high rate, although in keeping with the recent pooled recurrence risk estimate of 18.7 % from the BSRC study (Ozonoff et al. 2011). Although examining overall rates of ASD was not a primary objective of this study, the observed rates warrant further discussion. Several factors may have contributed to higher than expected rates of ASD. First, parents’ decisions to participate may be influenced by level of concern. As such, high-risk infants already displaying early signs of ASD may be over-represented in our sample, biasing estimates of recurrence upwards. Landa and Garrett-Mayer (2006) expressed similar caution when reporting rates of clinical ASD diagnoses in a comparable cohort. Indeed, we found a difference in ASD rates between children initially assessed at 6 months and 12 months (22.9 and 34.7 % respectively; χ² (1, N = 319) = 4.85, p = .028). Our previous work has shown that developmental concerns are rarely present by 6 months in high-risk infants (Bryson et al. 2007; Zwaigenbaum et al. 2005); hence ASD rates in 6-month recruits are likely less susceptible to participation bias. Second, even though the 3-year assessments were blind to prior study data, the fact that examiners knew the nature of the sample could have introduced a bias that inflated rates of clinical ASD diagnoses. That being said, similar expectation biases presumably operate in studies involving clinically-referred children. Third, our ASD case definition allowed for children who were sub-threshold on the ADI-R or ADOS to be clinically diagnosed if they met DSM-IV-TR criteria (based on the clinical judgment of the expert clinician). If children who were sub-threshold on either the ADOS or ADI-R scoring algorithms were excluded, 48 of 319 (15.0 %) would meet criteria for ASD, which is still higher than the oft-quoted estimates of 5–8 % (Sumi et al. 2006; Szatmari 1999; Ritvo et al. 1989). Moreover, most of these sub-threshold children (34 of 37) were sub-threshold on the ADI-R alone. The ADI-R has lower sensitivity for ASD in children 3 years and younger than in older age groups (Ventola et al. 2006), and thus would tend to exclude some children for whom subsequent assessment would confirm ASD. It is also important to acknowledge that published recurrence rates from studies completed in the 1980s (e.g., Ritvo et al. 1989) were based on diagnostic classification systems (i.e., DSM-III) that used a narrower definition of the autism construct. In our sample, 30 of 319 high-risk sibs (9.4 %) met strict criteria for autistic disorder as defined by DSM-IV and confirmed using the ADI-R and ADOS. Although these criteria are not identical to those outlined in DSM-III, this rate is much closer to the recurrence rate of 8.6 % reported by the Ritvo et al. (1989) UCLA-University of Utah epidemiologic study. Population-based epidemiologic studies are needed to provide accurate estimates of recurrence rate based on current case definitions for ASD. However, it is reasonable to anticipate that the broader diagnostic ‘spectrum’ construct will generate higher recurrence rates than in the past, in the same way as overall prevalence rates have increased over the same time period (Fombonne 2008).
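For readers who wish to check the figures in the preceding paragraph, the following minimal sketch recomputes the two alternative recurrence rates (48 of 319 and 30 of 319) and the p-value implied by the reported chi-square statistic of 4.85 on 1 degree of freedom. It uses only the numbers stated in the text; the helper names are ours and purely illustrative, and the sketch does not reproduce the underlying contingency table, which is not given here. For 1 degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(x / 2)).

```python
import math


def percent(cases: int, total: int) -> float:
    """Return cases / total expressed as a percentage."""
    return 100.0 * cases / total


def chi2_p_value_df1(statistic: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)), valid for df = 1 only.
    """
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# Rates under the stricter case definitions described in the text
print(f"ADI-R and ADOS thresholds met: {percent(48, 319):.1f} %")
print(f"Strict autistic disorder:      {percent(30, 319):.1f} %")

# p-value implied by the reported test statistic, chi-square(1) = 4.85
print(f"p = {chi2_p_value_df1(4.85):.3f}")
```

These calculations return 15.0 %, 9.4 % and p = .028, in agreement with the values reported above.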

We acknowledge several limitations to our study. First, although ascertainment of ASD cases involved the largest high-risk cohort reported to date, the actual number of children with ASD was still relatively small. Thus, our estimates of relative risk of ASD in boys compared to girls are associated with a wide confidence interval, and we may have failed to detect sex differences in cognitive profiles and ASD symptoms due to modest power. Further replication by other research groups, and ideally, pooling of data across comparable high-risk cohorts will be needed. Nonetheless, this is the first study to examine sex differences in children with ASD ascertained entirely through a longitudinal study of high-risk infants, and is comparable in size to previous studies of much more heterogeneous samples (Holtman et al. 2007; McLennan et al. 1993; Pilowsky et al. 1998). Moreover, our study is among the first to examine sex differences not only in an ASD cohort, but also in comparable non-ASD HR and LR cohorts, and to examine the specificity of such differences with respect to cognitive level and ASD symptoms. Second, additional follow-up is needed to confirm the stability of both the relative odds for ASD in male and female high-risk infants and the overall recurrence rate. Previous longitudinal cohort studies (Charman et al. 2005; Lord et al. 2006) suggest that children rarely change diagnostic classification from ASD to non-ASD after age 3. However, we cannot assume that these findings will generalize to children ascertained from a high-risk cohort. Of particular interest will be the diagnostic stability of children, especially girls, with milder ASD symptoms (particularly those who were sub-threshold on the ADI-R or ADOS at age 3) and/or relatively intact language and cognitive skills. Third, cognitive assessment at later ages may identify sex differences in later-emerging problem-solving and abstract reasoning abilities that may go undetected on the MSEL at 3 years, again emphasizing the importance of ongoing follow-up.

In summary, the relative risk of an ASD diagnosis in boys compared to girls in our longitudinal high-risk cohort was 1.65, which is lower than expected from previous clinical and epidemiologic samples. Modest sex differences in symptom severity and cognitive and adaptive abilities were detected in the HR and LR sample as a whole, none of which were specific to the ASD group. The modest relative risk of ASD in boys, combined with the relatively high cognitive level, suggests that we are identifying an unanticipated number of higher functioning girls with ASD in our high-risk cohort. These findings raise interesting questions about how children with ASD from high-risk cohorts may differ from those who are clinically referred. Additional follow-up will be needed to confirm the stability of both the sex ratio and overall ASD rate in our high-risk cohort, and replication of these findings should be attempted in similar samples.