Introduction

Autism spectrum disorders (ASD) represent a diverse set of neurodevelopmental disorders with a wide range of presentations. DSM-5 criteria, supported by recent investigations (Frazier et al. 2010, 2012; Mandy et al. 2012), define ASD as a broad category with two symptom dimensions—social communication/interaction (SCI) and restricted/repetitive behavior (RRB). Specific genetic variations contributing to ~15–20 % of ASD cases have been identified, primarily through relationships with genetic syndromes (Abrahams and Geschwind 2008), copy number variation (Levy et al. 2011; Sanders et al. 2011a; Sebat et al. 2007), and small-scale gene-disrupting variants (Iossifov et al. 2012; Jiang et al. 2013; Neale et al. 2012; O’Roak et al. 2011, 2012; Sanders et al. 2012; Schaaf et al. 2011). It is possible that, as emerging bioinformatics technologies are applied and refined, a substantial proportion of idiopathic cases will harbor causative variation. However, at present, the etiology of the majority of ASD cases remains unknown. Epigenetic (Gregory et al. 2009; Hu et al. 2006) and environmental effects (Newschaffer et al. 2002) may also contribute to ASD. Epidemiological studies have identified a host of risk factors, including increased parental age (Grether et al. 2009; Reichenberg et al. 2006; Shelton et al. 2010), neonatal complications (Bilder et al. 2009; Gardener et al. 2011; Schmidt et al. 2012), and environmental exposures (McCanlies et al. 2012; Windham et al. 2006). However, these findings have tended to result in small increases in the risk of ASD (Newschaffer et al. 2007, 2002), are difficult to replicate (Croen et al. 2005; Holmes et al. 2003; Ip et al. 2004), or may operate via increased risk of germline mutation in offspring (Zhao et al. 2007). As a result, the presence, type, and magnitude of environmental influences on the development of ASD remain uncertain.

Clarifying etiologic factors is crucial for guiding future research. Twin studies have the potential to inform the relative contributions of genetic and environmental factors to ASD etiology (Ronald and Hoekstra 2011). Early twin studies supported a strong genetic etiology to ASD (Bailey et al. 1995; Folstein and Rutter 1977; Ritvo et al. 1985; Steffenburg et al. 1989). More recent diagnostic concordance studies have confirmed high monozygotic concordance for ASD (>88 %), but also higher than previously appreciated dizygotic (>30 %) and sibling (>15 %) concordance rates (Ozonoff et al. 2011; Rosenberg et al. 2009; Taniai et al. 2008). Observations of high dizygotic concordance have raised the prospect of non-trivial environmental contributions to ASD. A recent study by Hallmayer et al. (2011), using a large, carefully ascertained ASD-affected twin sample, identified a substantial environmental contribution to ASD diagnosis (~58 %), although other studies have found minimal shared environmental effects (Bailey et al. 1995; Lichtenstein et al. 2010). Population and community-based quantitative trait studies conducted over the last 12 years have generally supported very strong heritability (40–87 %), with a modest shared environment component (0–32 %) (Ronald and Hoekstra 2011). Literature discrepancies likely reflect measurement, statistical, and sampling differences across studies. For example, Hallmayer et al. examined a liability threshold model for categorically-defined ASD using an affected twin sample. In contrast, population and community studies have modeled quantitatively-assessed autism symptoms, and presumably included only a minority of ASD-affected twin pairs distributed at the extreme trait levels (Constantino and Todd 2000, 2003, 2005; Ronald et al. 2006, 2005, 2006; Skuse et al. 2005; Stilp et al. 2010). The present study clarifies these methodological factors by simultaneously evaluating both quantitatively-assessed symptoms and categorically-defined ASD, using DeFries-Fulker regression and liability threshold models, and focusing on a large ASD-affected twin sample that also includes non-ASD twin pairs.

The two predominant behavioral genetic approaches to examining ASD etiology—assessing autism symptoms in the population versus concordance of ASD diagnoses in clinical samples—mirror two distinct views of ASD. The first viewpoint proposes that autism symptoms are best represented dimensionally (Constantino 2009), with broad autism traits being intermediate between severe and typical symptom levels. In the dimensional model, differences between typical and ASD symptom levels are a matter of degree (i.e., no distinct ASD category is present). The majority of data from population studies of quantitatively-assessed autism traits support a dimensional conceptualization (Ronald and Hoekstra 2011). In these studies, heritability estimates are generally consistent across typical and extreme symptom levels (Lundstrom et al. 2012; Ronald et al. 2006) and independent genetic effects influence each symptom domain (Robinson et al. 2012; Ronald et al. 2006, 2008).

The second viewpoint is that ASD represents a natural symptom category with related SCI and RRB sub-domains (Frazier et al. 2012; Mandy et al. 2012). In this model, ASD-affected and unaffected individuals show qualitative differences in symptom levels and relatives with broad phenotypic traits (Losh et al. 2009) could be considered sub-threshold for ASD. The categorical view of ASD has been supported by recent empirical investigations of autism symptoms (Frazier et al. 2010, 2012) and a population study of toddler twin pairs that found higher heritability for a more extreme threshold (Stilp et al. 2010). Understanding the genetic and environmental architecture of ASD will be crucial to resolving these views and speeding the search for etiologic mechanisms.

The present study evaluated genetic and environmental influences using the largest ASD-affected twin sample ascertained to date. Specific aims were: (1) to characterize and compare the heritable and environmental components of quantitatively-assessed autism symptoms and categorically-defined ASD, (2) to determine whether the magnitude of heritability estimates is consistent between extreme (group heritability) and typical symptom levels (individual differences heritability), and (3) to estimate the magnitude of common and independent heritable influences on SCI and RRB symptom domains.

Methods

Participants

Twin pairs with an ASD-affected member (ASD twins) and pairs without an ASD-affected member (non-ASD twins) were selected from the Interactive Autism Network (IAN) registry (IAN Data Export ID: IAN_DATA_2011-08-01). Data from 568 twin pairs (1,136 youth), including 471 ASD-affected and 97 non-ASD pairs, were available. The proportion of twins in IAN (6.9 %) was higher than population expectations (1.8–3.2 %) (Martin et al. 2012), consistent with observations of increased rates of ASD in twins (Zachor and Ben Itzchak 2011). The proportion of monozygotic twins in IAN (22.5 %) was similar to rates identified in the population (12–20 %) (Bortolus et al. 1999) and the proportion of dizygotic opposite sex twins in IAN (42.2 %) was only slightly lower than population expectation (~50 %). Twin zygosity was reported by caregivers and higher-order multiple births were excluded. ASD was defined by collapsing specific DSM-IV-TR diagnoses following current epidemiologic surveillance protocols maintained by the U.S. Centers for Disease Control and Prevention (Rice 2009). Online Supplement 1 provides a detailed description of the IAN registry and clinical ASD diagnoses.

Informed consent was obtained from parents/guardians before entry into IAN. Use of IAN data for the present study was reviewed and approved by the institutional review board of the Cleveland Clinic.

ASD Measures

Autism symptom data were collected using the Social Communication Questionnaire (SCQ) (Rutter et al. 2003) and the Social Responsiveness Scale (SRS) (Constantino and Gruber 2005). The SCQ is a dichotomously keyed rating scale tapping DSM-IV-TR symptoms. The SRS is a 65-item, ordinally-scaled (1 = “not true” to 4 = “almost always true”) questionnaire that provides a quantitative assessment of autism traits. Convergence of findings from these measures increases confidence that results are not simply due to the measurement scale or item content.

Categorical ASD was defined in three ways: (1) caregiver-reported clinical ASD diagnoses, (2) SCQ total raw score ≥15, and (3) SRS total T-score ≥70. Quantitative autism symptoms were assessed using total raw scores on the SCQ, total T-scores on the SRS, and SCI and RRB domain scores derived from each instrument. Total raw scores for the SCQ and T-scores for the SRS were computed based on the published scoring (Constantino and Gruber 2005; Rutter et al. 2003). SCI and RRB domain scores were computed following published latent structure modeling (Frazier et al. 2012) and guidance from proposed DSM-5 criteria (American Psychiatric Association—DSM-5 Development 2011). SCI and RRB domain scores are useful for determining whether these domains are driven by common or unique etiologic factors (Online Supplement 2).

Statistical Approach

Descriptive statistics and Pearson chi-square analyses characterized the sample and compared zygosity groups (MZ = monozygotic, DZSS = dizygotic same sex, DZOS = dizygotic opposite sex). Possible differences between the SCQ and SRS sub-samples and the full IAN twin sample were evaluated using independent samples t-tests comparing twin pairs with complete data (SCQ 333 pairs, SRS 179 pairs) versus pairs with one or both twins having missing data (SCQ 235 pairs, SRS 389 pairs).

To examine etiologic influences on quantitatively-assessed autism symptoms, DeFries-Fulker (DF) regression analyses (Cherney et al. 1992; DeFries and Fulker 1985) of the total and domain-specific scores were computed in extreme sub-samples (>97th and >99th population percentiles) and in ASD-diagnosed twin pairs (Online Supplement 3). Basic and augmented DF models estimated: twin similarity independent of zygosity (B1), group heritability (\( h_{g}^{2} \); B2), shared environment (\( c^{2} \); B3), the difference between group heritability and individual differences heritability (\( h_{g}^{2} - h^{2} \); B4), and individual differences heritability (\( h^{2} \); B5). DF models were computed following procedures described by Stevenson (1992), using a zygosity-specific mean transformation (DeFries and Fulker 1988), consistent with recent studies (Robinson et al. 2011, 2012). Using these transformed scores, it is theoretically possible to obtain heritability estimates greater than 1.0 if the difference in regression to the mean for MZ and DZ cotwins is very large. DF regression analyses were used rather than model fitting (Purcell and Sham 2003) for direct comparison to other studies using DF models (Lundstrom et al. 2012). Because the non-ASD group was small (k = 49 pairs) and extreme DF analyses were under-powered, this group was not included in the above extremes analyses. Instead, we only included the non-ASD twin pairs when examining changes in group heritability across increasingly extreme scores using quantile regression DF analyses (Logan et al. 2012). These analyses have better power because quantile estimates are based on the full sample.
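The logic of the basic DF model with a zygosity-specific transformation can be illustrated with a minimal sketch. It is not the analysis code used in this study. It assumes that each score is expressed as a deviation from the population mean and divided by the proband-mean deviation for that zygosity group, so that both proband means equal 1. Under that assumption, the B2 coefficient of the regression of co-twin score on proband score and coefficient of relationship (1.0 for MZ, 0.5 for DZ) reduces to twice the difference between the transformed MZ and DZ co-twin means. The function names and the toy scores are hypothetical.

```python
from statistics import mean


def transform(scores, pop_mean, proband_mean):
    """Deviation from the population mean, scaled by the proband-group deviation."""
    return [(s - pop_mean) / (proband_mean - pop_mean) for s in scores]


def df_group_heritability(mz_pairs, dz_pairs, pop_mean):
    """Estimate group heritability from (proband, co-twin) score pairs.

    After the zygosity-specific transformation both proband means equal 1,
    so the basic DF coefficient B2 equals 2 * (MZ co-twin mean - DZ co-twin mean).
    """
    cotwin_means = {}
    for label, pairs in (("MZ", mz_pairs), ("DZ", dz_pairs)):
        probands = [p for p, _ in pairs]
        cotwins = [c for _, c in pairs]
        cotwin_means[label] = mean(transform(cotwins, pop_mean, mean(probands)))
    h2g = 2.0 * (cotwin_means["MZ"] - cotwin_means["DZ"])
    return h2g, cotwin_means


# Invented toy scores, for illustration only (not data from this study).
mz = [(20, 18), (24, 22), (22, 20)]
dz = [(20, 10), (22, 12), (24, 14)]
h2g, means = df_group_heritability(mz, dz, pop_mean=5.0)
print(round(h2g, 3), {k: round(v, 3) for k, v in means.items()})
```

Because nothing constrains the transformed co-twin means, a very large MZ-DZ difference in regression to the mean yields an estimate above 1.0, as noted above.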

Common and unique genetic influences between SCI and RRB domains were estimated using a modified version of the basic DF model (Ronald et al. 2006; Stevenson et al. 1993). In these analyses, the proband’s SCI score was used to predict the co-twin’s RRB score and vice versa. Bivariate and group heritability estimates were used to compute genetic correlations following Knopik et al. (1997). Genetic correlations (rg) evaluate the proportion of heritable influences that are common across SCI and RRB domains.
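The genetic correlation step can be sketched as follows. The formula is our reading of the bivariate DF approach cited above (Knopik et al. 1997) and should be treated as an assumption, since the exact expression is not reproduced in the text: the product of the two cross-construct (bivariate) coefficients is divided by the product of the two univariate group heritabilities, and the square root is taken. The function name and input values are hypothetical.

```python
import math


def genetic_correlation(b_xy, b_yx, h2g_x, h2g_y):
    """Genetic correlation from bivariate and univariate DF group heritabilities.

    b_xy: bivariate estimate, proband selected on x, co-twin measured on y.
    b_yx: bivariate estimate, proband selected on y, co-twin measured on x.
    h2g_x, h2g_y: univariate group heritabilities for x and y.
    """
    ratio = (b_xy * b_yx) / (h2g_x * h2g_y)
    if ratio < 0:
        raise ValueError("negative ratio: genetic correlation is undefined")
    return math.sqrt(ratio)


# Hypothetical inputs, not estimates from this study.
rg = genetic_correlation(b_xy=0.90, b_yx=0.95, h2g_x=1.00, h2g_y=0.95)
print(round(rg, 3))
```

With estimates that are subject to sampling error, the ratio can exceed 1, so values of rg slightly above 1.0 are possible in practice.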

To estimate etiologic influences on categorically-defined ASD, we first computed probandwise concordances and 95 % CI (Davie 1979) for MZ and DZ twins using clinical diagnosis, SCQ ≥ 15, and SRS ≥ 70 cutoffs. Next, liability threshold model parameters were calculated using maximum likelihood estimation (Online Supplement 4). These models were highly similar to those of Hallmayer et al. (2011) with two exceptions. First, ascertainment was assumed to be complete (Online Supplement 5); all ASD-affected twins were probands (Rosenberg et al. 2009). Second, only MZ and DZSS probandwise concordances estimated model parameters. Opposite sex dizygotic pairs were not used. Sex-limited models with combinations of additive genetic, shared environment, and unique environment/error (ACE models) were estimated. It has been argued that a dominant transmission pattern exists in at least a subset of ASD cases (Zhao et al. 2007). Therefore, we also fit a model with additive genetic, dominant genetic, and unique environment (ADE). The model with all four components is not identifiable and cannot be estimated from the data. Standard errors and 95 % CI were calculated using a bootstrap approach with 1,000 re-samplings.
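As an illustration of the first step, the sketch below computes a probandwise concordance under complete ascertainment, where every affected twin counts as a proband, so a concordant pair contributes two probands. The percentile bootstrap interval is included only to illustrate resampling of twin pairs. In this study, concordance intervals followed Davie (1979) and the bootstrap was applied to the liability threshold model parameters, so this is not the authors' procedure. Function names, the seed, and the pair counts are hypothetical.

```python
import random


def probandwise_concordance(n_concordant, n_discordant):
    """Proportion of probands whose co-twin is also affected: 2C / (2C + D)."""
    probands = 2 * n_concordant + n_discordant
    if probands == 0:
        raise ValueError("no affected pairs")
    return 2 * n_concordant / probands


def bootstrap_ci(n_concordant, n_discordant, n_resamples=1000, seed=0):
    """Percentile 95 % interval obtained by resampling pairs with replacement."""
    rng = random.Random(seed)
    pairs = [1] * n_concordant + [0] * n_discordant  # 1 = concordant pair
    estimates = []
    for _ in range(n_resamples):
        sample = rng.choices(pairs, k=len(pairs))
        c = sum(sample)
        estimates.append(probandwise_concordance(c, len(sample) - c))
    estimates.sort()
    lower = estimates[n_resamples * 25 // 1000]
    upper = estimates[n_resamples * 975 // 1000 - 1]
    return lower, upper


# Invented counts, for illustration only.
print(probandwise_concordance(30, 20))
print(bootstrap_ci(30, 20))
```

The pair, rather than the individual twin, is the resampling unit because the two members of a pair are not independent observations.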

Results

Sample Description

Table 1 presents demographic and clinical characteristics separately by zygosity group. Consistent with the larger IAN registry, twins were disproportionately described as white/non-Hispanic with highly educated parents. MZ twin pairs tended to have slightly less educated parents (Cohen’s d = .28), possibly reflecting greater research engagement in caregivers with MZ twins. There were no significant zygosity differences in the proportion of ASD-affected twin pairs, age, race, or parent age. Not surprisingly, male twins were more prevalent in the MZ and DZSS groups, consistent with autism sex ratios (Centers for Disease Control and Prevention 2012). SCQ and SRS completion rates were lower in the MZ group. This raises the possibility that some MZ pairs may have missing questionnaire data for one twin due to strong phenotypic similarity (Rosenberg et al. 2009) and may result in more conservative estimates of heritability. Levels of autism symptoms on the SCQ and SRS were congruent with previous reports in ASD-affected and non-ASD youth on these measures (Chandler et al. 2007; Constantino and Todd 2003) and reflect a broad range of autism severity. SCQ and SRS sub-samples were highly similar to the larger IAN twin sample and missing SCQ and SRS data did not significantly influence results (Online Supplement 3).

Table 1 Sample characteristics of 568 twin pairs by zygosity

Quantitatively-Assessed Autism Symptoms

Table 2 presents SCQ and SRS raw scores, z-scores, and transformed scores for extreme groups. Inspection of the MZ and DZ co-twin means reveals substantial regression to the population mean in the DZ co-twins relative to MZ co-twins (Fig. 1). Table 3 presents extreme group intra-class correlations and basic DF model results. Intra-class correlations were consistently 1.5–2 times higher in MZ relative to DZ twins, suggesting substantial genetic influences in the ASD and extreme scoring groups. Group heritability estimates were large and highly significant for both SCQ (B2 = .92–1.07, p < .001) and SRS (B2 = 1.01–1.20, p < .001) when extreme sub-samples and ASD-affected twins were selected. The very large extreme group heritability estimates were plausible, with confidence intervals spanning 1.0. As expected, these values are almost exactly twice the difference of the transformed co-twin scores in Table 2. The reason for these large values is even more clearly seen in the large regression to the mean in DZ co-twins but not MZ co-twins (Fig. 2) (DeFries and Fulker 1988). Removing DZOS pairs only slightly altered group heritability estimates (SCQ \( h_{g}^{2} \) = .71–.88; SRS \( h_{g}^{2} \) = .94–1.13). Further constraining heritability estimates to the MZ cotwin mean continued to produce very high heritability estimates (SCQ \( h_{g}^{2} \) = .52–.73; SRS \( h_{g}^{2} \) = .70–.97).

Table 2 Untransformed and transformed MZ and DZ co-twin means and standard deviations for SCQ and SRS scores, across extreme group and non-ASD twins
Fig. 1 SCQ total raw and SRS total T-scores (M ± 95 % CIs) from MZ and DZ probands with extreme scores (≥97th %ile) and their co-twins

Table 3 Extreme group correlations, twin similarity, and group heritability for SCQ and SRS total scores
Fig. 2 Increases in group heritability across levels of quantitatively-assessed autism symptoms

To examine changes in heritability across the continuum of scores in this sample, quantile regression DF analyses were computed. These analyses were powered to detect moderate differences in heritability (≥.50) across score levels. Results indicated large increases in group heritability across levels of each quantitative symptom measure using quantile regression (Fig. 2). Group heritability estimates increased dramatically in the range of recent ASD prevalence estimates (~1 %). For the augmented DF model, shared environment (B3) and individual differences heritability (B5) coefficients tended to be small and non-significant for most sub-samples (Online Supplement 6). Table 4 presents results for domain-specific DF models. The same pattern of findings emerged for SCI and RRB domains. Group heritability was very strong and statistically significant for extreme sub-samples. Individual differences heritability and shared environment were weaker and non-significant (Online Supplement 7).

Table 4 Extreme group correlations, twin similarity, and group heritability for SCI and RRB symptom domains

Table 5 presents cross-construct DF models. Bivariate genetic correlations were very high and statistically significant across each domain and sub-sample. In each case, these correlations approached group heritability estimates from basic DF analyses. Genetic correlations were very high (SCI-RRB rg = .84–.99).

Table 5 Twin similarity and bivariate heritability for cross-construct analyses

Categorically-Defined ASD

Online Supplement 8 presents probandwise concordances across the three definitions of categorical ASD. MZ concordances were substantial across definitions (range = .50–.88). DZSS and DZOS concordances were smaller and variable (DZSS range = .17–.54, DZOS range = .22–.48), similar to those seen in a smaller subset of IAN (Rosenberg et al. 2009). Table 6 presents ACE model estimates for additive genetic, shared environment, and individual environment/error influences on categorical ASD. Results indicated very high shared environment estimates. Small, but significant, additive genetic estimates (0.21, 95 % CI 0.15–0.28) were seen for caregiver-reported ASD diagnosis. Additive genetic effects and the associated confidence intervals increased slightly for SCQ ≥ 15 (0.26, 95 % CI 0.16–0.39) and were larger still for SRS ≥ 70 (0.35, 95 % CI 0.20–0.56). Results for the ADE models (not shown) had implausible negative estimates of the dominance component, which is expected in twin models if a shared environmental component is present in the data.

Table 6 ACE model results for caregiver-reported ASD diagnosis, SCQ ≥ 15, and SRS ≥ 70

Discussion

To our knowledge, this study represents the largest sample of ASD-affected twin pairs that simultaneously includes both quantitative and categorical approaches to autism measurement. Four major findings emerged: (1) extreme levels of quantitatively-measured autism symptoms were strongly heritable with no significant shared environment; (2) less extreme autism symptom levels showed lower heritability; (3) SCI and RRB symptoms had high genetic correlations, indicating that extreme scores on these domains are driven by common genetic sources; and (4) liability threshold model estimates of additive genetic effects tended to be much lower, but varied depending on model selection (i.e., ACE vs. AE vs. CE models). Each of these results has substantial implications for etiologic models of autism.

High extreme group heritability is consistent with previous quantitative symptom studies (Constantino and Todd 2005; Hoekstra et al. 2007; Ronald et al. 2006, 2005, 2006, 2008; Skuse et al. 2005). However, heritability was smaller at less extreme symptom levels. While this contrasts with the majority of previous investigations (Lundstrom et al. 2012; Robinson et al. 2012), a recent study of autism in toddlers also identified stronger genetic effects at more extreme symptom levels (Stilp et al. 2010), and a population study of language ability reported greater heritability for impaired language levels (Spinath et al. 2004). There are several plausible explanations for the differences between the present study and the majority of population studies. The present study likely under-estimated individual differences heritability due to rater contrast. Caregivers of ASD-affected children may rate unaffected siblings as uniformly unimpaired. With this caveat in mind, it is important to note that the present study is methodologically quite different from previous population studies and therefore complements rather than contradicts them. Quantile regression analyses in the current investigation were not estimating heritability of autism symptoms in the population, but rather in ASD-affected and non-ASD twin pairs, as discussed further in Online Supplement 5. Even using extreme cutoffs, previous population studies sampled mostly non-ASD twin pairs. For example, when selecting scores ≤1 %ile on a screening measure (Lundstrom et al. 2012; Robinson et al. 2011), fewer than half the twin pairs in this extreme sub-sample will have a member with categorically-defined ASD, even if all ASD-affected children score high. In the more likely scenario, where some ASD-affected twins have less extreme scores, the ability to detect differences in group heritability across increasingly more extreme scores is further diminished. This implies that previous population twin samples are more likely to identify continuity of heritability across score levels and may be under-powered to detect extreme group heritability explicitly attributable to ASD. In the present study, extreme symptom levels consisted exclusively of ASD-affected pairs.

Common genetic effects were the primary drivers of SCI and RRB symptoms in the present study. This also contrasts with previous population quantitative symptom studies, where genetic correlations tended to be more modest (Robinson et al. 2012; Ronald et al. 2006). Again, this difference may reflect the presence of a large number of ASD-affected twin pairs in the present study. It is possible that SCI and RRB symptoms are separately influenced by a combination of polygenic, common environmental, and/or individual differences factors at typical population levels; but that powerful, pleiotropic effects drive extreme symptom levels. Findings of greater heritability at extreme symptom levels and common genetic effects across domains jointly point toward a categorical model of ASD. In this model, a qualitatively distinct symptom pattern is generated by strong, pleiotropic genetic influences. The categorical model has gained momentum from strong diagnostic stability across childhood (Chawarska et al. 2009; Lord et al. 2006; Moss et al. 2008) and empirically-driven symptom structure studies (Frazier et al. 2010, 2012; Ingram et al. 2008). It is also consistent with recent findings of powerful genetic effects, such as CNVs (Glessner et al. 2009; Levy et al. 2011; Sanders et al. 2011b; Sebat et al. 2007) and rare gene-disrupting mutations (Iossifov et al. 2012; Jiang et al. 2013; O’Roak et al. 2011; Schaaf et al. 2011; Vaags et al. 2012), driving a non-trivial proportion of phenotypic variance in ASD. To date, these stronger genetic effects have been identified in a minority of ASD cases (10–25 %). It is possible that, with emerging whole genome and bioinformatics technologies, a higher proportion of cases with mono- or oligogenic effects leading to autism will be identified. 
It is also conceivable that the remaining genetic effects are polygenic with a prominent phenotypic threshold effect or that the strong heritability observed in this study represents gene-environment interaction effects.

Intuitively, given the above findings for quantitatively-assessed autism symptoms, ACE model analyses of categorically-defined ASD would be expected to produce similarly large genetic effects. However, the opposite was observed. Categorically-defined ASD had low heritability and much higher estimates of shared environment. This finding was consistent with the recent Hallmayer study, which used similar corrections for ASD prevalence and proband ascertainment. However, there are several reasons to be very skeptical when interpreting results from liability threshold models of categorically-defined ASD. First, estimates from the liability threshold model are heavily influenced by small changes in categorical ASD classifications. For example, a subtle bias increasing ASD diagnosis would, in turn, increase DZ concordance, yielding smaller heritability and larger shared environment effects. Online Supplement 8 demonstrates the effect for only a handful of misclassifications using concordances obtained from Hallmayer and colleagues (Hallmayer et al. 2011). Second, higher shared environment estimates for categorical ASD may reflect correlated error within twin pairs. It is likely that clinical diagnoses are correlated within a twin pair for reasons beyond concordance. Diagnoses were often generated by the same evaluator/parent combination, and this may be true in other twin samples as well, even exerting influence when standardized instruments are administered. While correlated classification errors within twin pairs should theoretically be equally problematic for MZ and DZ pairs, in practice, the effect is likely to differentially increase DZ concordance and shared environment effects. This is because categorical ASD concordances are already quite high for MZ pairs—causing a ceiling effect, while DZ concordances have greater room to increase.

Third, in these data, liability threshold models produced unrealistically low estimates of unique environment and measurement error (often <3 %). While it is possible to have a categorical phenotype strongly influenced by common environmental effects and not influenced by individual environment or error, a difference this large seems unlikely. This odd behavior of the threshold model is the result of assumptions about the underlying liability distribution. For example, it has been noted that fitting a model which assumes an underlying major gene effect to the data may yield entirely different results than the standard model, which assumes multivariate normality (Kidd and Cavalli-Sforza 1973). Similarly, it has been shown that numerous realistic scenarios for the distribution of the underlying liability can lead to large asymptotic biases, while estimates from small samples may be biased even if the model assumptions are met (Benchek and Morris 2013). If, as we have argued, ASD is more of a categorical concept that is heavily influenced by highly penetrant alleles or a polygenic threshold, then there is no reason to trust that the classic ACE liability model will yield valid results because the liability distribution will differ strongly from multivariate normal. Furthermore, liability threshold results may be influenced by the combination of ascertainment and prevalence parameters in model estimation. Modest changes in these values can influence model results. On these grounds, ACE liability modeling studies of categorical ASD, including the present results, should be viewed cautiously.

Classical twin studies assume that common environments affect monozygotic and dizygotic twins equally (Rijsdijk and Sham 2002). There is certainly reason to suspect that some of the known environmental risk factors for ASD may be differentially present in dizygotic twins. For example, there is evidence that maternal age (Sandin et al. 2012) and the use of assisted reproductive technology (Zachor and Ben Itzchak 2011) are risk factors for both dizygotic twinning and for ASD. Such factors would disproportionately increase the concordance rates among dizygotic twins, leading to lower estimates of heritability. Perhaps more importantly, it is likely that many of these so-called “environmental” risk factors act by increasing the mutational load. There is evidence that even monozygotic twins are not truly genetically “identical” (Bruder et al. 2008). Thus, even the relatively few cases of non-concordant monozygotic twins may be explained by genetic differences between the twins. It follows that a trait may be strongly genetically determined, but not particularly heritable.

Future behavioral genetic investigations of categorical ASD should consider the limitations of liability threshold models in the planning stages. Crucial factors will include consideration of ascertainment, prevalence, and recruitment of very large samples yielding greater precision (see Online Supplement 5 for additional discussion). These studies will also need to implement diagnostic procedures that accurately classify cases across the full autism spectrum and are administered by clinicians who are blinded to other family members, reducing the potential for biased concordance. While this work will be labor-intensive and expensive, these methodological improvements are needed to advance our understanding of etiologic influences on categorical ASD.

Additional limitations of the present study included reliance on caregiver-reported ASD clinical diagnoses and zygosity, missing data for SCQ and SRS quantitative symptom measures, and the use of non-ASD twin pairs from ASD-affected families. Clinical ASD diagnoses are not as reliable and valid as diagnoses based on gold-standard semi-structured interviews or observational instruments. However, these gold-standard measures were often used as part of the diagnostic process and available data suggest that clinical diagnoses in IAN are quite accurate (Lee et al. 2010). Parent-reported zygosity is valid (Rietveld et al. 2000) and the proportions of MZ and DZ twins in this study were consistent with expectation and did not suggest a bias toward parents misclassifying DZ twins as MZ twins. Even if zygosity misclassifications existed, they should have decreased heritability estimates.

Missing questionnaire data were also potentially problematic. Fortunately, SCQ and SRS sub-samples did not markedly differ in terms of demographic and clinical features from the total sample and results were highly similar across the SCQ and SRS. This is comforting because the SCQ is derived from diagnostic criteria whereas the SRS is a quantitative trait instrument. Future work should include clinician ratings to exclude the possibility that findings are influenced by rater perspective. Lastly, the IAN registry does not include twins from families unaffected by ASD and higher SES families are over-represented. The latter factor may artificially inflate heritability estimates in DF models. Future work will need to include twin pairs from families without ASD and use samples more representative of SES in the broader population.

In spite of these limitations, the national scope and large number of ASD-affected twin pairs in the IAN registry provided a unique opportunity for evaluating etiologic influences on autism. Results supported strong heritability of extreme (clinical) levels of quantitatively-assessed autism symptoms, but also raise the possibility that extreme levels of symptoms may be substantially influenced by highly penetrant pleiotropic alleles or threshold effects rather than a graded polygenic transmission. The current view of ASD genetics is a complex confluence of etiologies, including unique transmission patterns (de novo vs. inherited), different thresholds for males and females (Szatmari et al. 2012), and distinct mixtures of high and low penetrance genetic, epigenetic, and environmental influences. To assist in teasing out the relative importance of all these factors, large family studies using careful ascertainment methods are needed. Family pedigree designs measuring quantitative symptoms and categorical ASD can simultaneously evaluate additive genetic, dominant genetic, shared environment, and unique environment effects to clarify the relative importance of these distinct etiologic influences. Ultimately, behavioral genetics approaches will be most powerful when combined with comprehensive genomic studies that include sequence interrogation and gene expression.