Introduction

Autism spectrum disorders (ASDs) are characterized by behavioral deficits in social relatedness and communication, along with the presence of restricted interests and repetitive behaviors (RRBs). The autism spectrum, as per DSM-IV-TR, encompasses several diagnoses that share these symptoms, including autistic disorder (AD), pervasive developmental disorder-not otherwise specified (PDD-NOS), and Asperger’s disorder (American Psychiatric Association 2000).

Early identification and intervention are key in improving the prognosis of children with ASDs, and intervention significantly lowers long-term societal costs (Chasson et al. 2007). Diagnosis has been shown to be reliable and stable in children ages 18–24 months (Baron-Cohen et al. 1996; Charman et al. 2005; Moore and Goodson 2003; Lord 1995; Stone et al. 1999). Instruments such as the Checklist for Autism in Toddlers (CHAT) (Baron-Cohen et al. 2000), Screening Tool for Autism in 2-year-olds (STAT) (Stone et al. 2008), and Modified Checklist for Autism in Toddlers (M-CHAT) (Robins et al. 2001) and its revision, the M-CHAT-R (Robins et al. 2014), screen for ASDs in children as young as 16 months. However, some children in that age range are expected to have developmental age equivalents below 12 months, given the developmental delays commonly observed in children with ASDs. Estimated rates of cognitive impairment or intellectual disability in individuals with ASDs range from 24 percent (Chakrabarti and Fombonne 2001) to 70 percent (Fombonne 2003). Specifically, in Fombonne’s (2003) review of 32 epidemiological studies, 30 percent of individuals with ASDs presented with mild to moderate cognitive impairment, while 40 percent were severely to profoundly impaired.

The reliability of measuring IQ in children with ASDs has been scrutinized because the social communication impairments and related behavioral issues common to ASDs may confound the testing process (e.g., limited comprehension of instructions, lack of imitative and reciprocal interactive skills, preoccupations or repetitive behaviors, distractibility, lack of a desire to please). There is also a degree of diagnostic overlap between ASDs and intellectual disability (ID), as both are characterized by cognitive, adaptive, and social skills deficits and often involve challenging and stereotyped behaviors (Matson and Shoemaker 2009). Despite this, Osterling et al. (2002) demonstrated that children who were later diagnosed with an ASD could be distinguished behaviorally from typically developing children and those with ID by 12 months of age. Additionally, contemporary research has demonstrated adequate stability of nonverbal intelligence in children with ASDs (Lord and Schopler 1989a, b; Sigman and Ruskin 1999).

Despite previous research demonstrating stability of ASDs and ID, even in young children, the validity of autism diagnoses in children with concurrent low mental age, defined here as age equivalents below 12 months in all cognitive domains, has not yet been demonstrated. For example, the Autism Diagnostic Interview-Revised (ADI-R) is only valid for children with a mental age above 24 months (Rutter et al. 2005). The Autism Diagnostic Observation Schedule (ADOS) has a toddler module that allows for diagnostic assessment in children as young as 12 months (Luyster et al. 2009), but not at developmental levels below that point. If autism screening is implemented at 16 months or earlier (see First Year Inventory (Reznick et al. 2007) and Infant-Toddler Checklist (Wetherby et al. 2008), which are designed to screen in children as young as 12 months, and Early Screening of Autistic Traits (Swinkels et al. 2006), designed for use in 14-month-olds), developmental levels may not permit valid diagnostic assessment of autism symptoms. For example, the M-CHAT (Robins et al. 2001) assesses the presence of some behaviors that Inada, Kamio, and Koyama (2010) suggest are usually present in typical development by 12 months, but others that do not usually appear until 15 months of age. Thus, if the M-CHAT, or another measure, were used to screen a child with a mental age below 12 months, these behaviors could be absent on the basis of developmental delay alone, and this could potentially affect the validity of such a screener to detect an ASD, producing false positives. In addition to global delays, these children might show restricted and repetitive behaviors (RRBs) common in persons with ID (Oliver et al. 2012), further leading to invalid autism diagnoses. Matson and Shoemaker (2009) call for research that validates autism diagnostic procedures in children with intellectual impairment.

Current Study

Data collection began before the publication of DSM-5; therefore, DSM-IV-TR criteria (APA, 2000) were used for AD and PDD-NOS. To our knowledge, this is the first study to follow children with the ASD-low MA presentation, which we define as having (1) sufficient symptoms of an ASD to warrant at least a DSM-IV-TR PDD-NOS diagnosis and (2) receptive and expressive language and nonverbal reasoning age equivalent scores below a 12 month level. Children were initially evaluated at approximately 24 months of age and were then reevaluated about 2 years later to determine whether they retained an ASD diagnosis and to assess autism severity and degree of cognitive and adaptive progress.

Specifically, we predicted that:

  1. Children who initially met criteria for ASD-low MA would show significantly smaller developmental gains across time than children with AD or PDD-NOS.

  2. A smaller percentage of children with an initial diagnosis of ASD-low MA would remain on the autism spectrum compared to those who initially met criteria for AD or PDD-NOS, as their early symptoms and delays might be due to global delay rather than an ASD. Furthermore, we predicted that children with an initial diagnosis of AD would have the highest diagnostic stability. This hypothesis was based on the AD children achieving a mental age by which prosocial and reciprocal communication behaviors are generally apparent and exhibiting RRBs at the time of initial evaluation, as well as the fact that children with PDD-NOS demonstrate lower levels of impairment than children with AD (Cohen et al. 1986; Sevin et al. 1995) and are more likely to lose their autism diagnosis over time (Anderson et al. 2014; Lord et al. 2006; Sutera et al. 2007; Helt et al. 2008; Berry 2009).

  3. The AD group would demonstrate the most severe autism symptoms at both time points and the least change in symptom severity across time compared to the PDD-NOS and ASD-low MA groups. This hypothesis was based on the fact that children diagnosed with AD on initial evaluation already showed the full syndrome, and their symptoms could not be attributed simply to global delay.

Methods

Participants

Participants were 219 children drawn from a larger, federally funded investigation of the early detection of ASDs. In order to be included in the current study, participants (1) screened positive on the M-CHAT (Robins et al. 2001) or M-CHAT-R (Robins et al. 2014) between the ages of 16 and 30 months, (2) were evaluated through the research project and diagnosed with an ASD (i.e., AD, PDD-NOS, or ASD-low MA) at the approximate age of 2 years (Time 1), and (3) returned for a follow-up evaluation at the approximate age of 4 years (Time 2). Exclusion criteria included significant sensory impairments (e.g., blindness) or deficits in motor functioning (e.g., severe cerebral palsy) that would impact a child’s ability to complete testing. Given that the larger study from which participants were drawn aimed to develop an autism-specific screening measure, children were also excluded from the current project if they had a prior diagnosis of an ASD by a qualified person. Validating an autism-specific screening tool using a sample of children who already have a diagnosis of an ASD is not appropriate, as a prior diagnosis may impact parents’ approach to screening questions, and screening is not needed if a diagnosis has already been established.

Demographic Data

Demographic information is presented in Table 1. The mean age of children in the ASD-low MA group at Time 1 was significantly lower than that of the other groups, by about 2.5 months. Consistent with autism prevalence data, males predominated, and there were no significant differences in gender between the AD, PDD-NOS, and ASD-low MA groups. There were no significant differences between groups on ethnicity or income.

Table 1 Participant demographics by group

Diagnostic Criteria

DSM-IV-TR diagnostic criteria for AD and PDD-NOS were used. In addition, to meet study criteria for either of these two diagnoses, at least one of the child’s age equivalent scores on the Mullen Scales of Early Learning (MSEL) visual reception, receptive language, or expressive language scales must have been at or above a 12 month level.

Children received a research diagnosis of ASD-low MA if they presented with at least one symptom in the social cluster other than a failure to develop peer relationships appropriate to the child’s developmental level, at least one symptom from the communication cluster, and/or at least one symptom in the RRBs cluster. Additionally, these children received age equivalent scores below 12 months on the MSEL visual reception, receptive language, and expressive language scales, as well as on the Vineland Adaptive Behavior Scales (VABS) communication and socialization domains (see Table 2).

Table 2 Mean (SD) age equivalent scores by group

At Time 1, 111 children received a diagnosis of AD, 83 received a diagnosis of PDD-NOS, and 25 received a diagnosis of ASD-low MA. To assess differences among these three groups on follow-up, participants were coded into six groups based on Time 2 diagnosis: AD, PDD-NOS, ASD-low MA, developmental delay, other diagnosis, and no diagnosis. Participants were also coded based on whether their Time 2 diagnosis was an ASD diagnosis (i.e., AD, PDD-NOS, or ASD-low MA) or not (i.e., developmental delay, other diagnosis, or no diagnosis). Criteria for all diagnoses are listed in the Appendix.

Procedures

Participants were screened with either the M-CHAT (Robins et al. 2001) or M-CHAT-R (Robins et al. 2014) at their pediatrician’s office or through their early intervention (i.e., non-ASD services) provider between the ages of 16 and 30 months. Children who screened positive on the initial questionnaire, as well as the follow-up phone interview, were offered a free developmental and diagnostic evaluation, which was conducted by a licensed psychologist or a developmental-behavioral pediatrician and a clinical psychology doctoral student. Most evaluations took place at the research team’s university clinic, and families who did not have transportation were provided with a free taxi service. In some cases, study staff traveled to conduct evaluations at participating pediatric offices in two large towns with a high proportion of low income patients. Diagnoses were based on clinical best estimate judgment of symptoms from observation, developmental history, and testing data on the Autism Diagnostic Observation Schedule (ADOS), Childhood Autism Rating Scale (CARS), MSEL, and VABS.

Children became eligible for a follow-up evaluation (Time 2) at age 42 months or older and were recontacted by letter or telephone. This evaluation included the same measures and diagnostic procedures as Time 1.

Measures

Mullen Scales of Early Learning (MSEL)

The MSEL (Mullen 1995) is a standardized test of cognitive ability for children between birth and age 68 months. The visual reception, fine motor, expressive language, and receptive language scales were administered in this study. Each scale yields an age equivalent score, which reflects the child’s current developmental level. The MSEL was normed on a nationally representative sample of 1849 children (48.7% female, 51.3% male) and is a frequently used measure of developmental level and cognitive functioning in both typically developing children and children with developmental delays. It demonstrates satisfactory internal consistency and test–retest reliability (Mullen 1995).

Age equivalents for each group are shown in Table 2. Because the three groups differed in chronological age by several months, a developmental quotient (DQ) was calculated (Table 3). The decision to use a DQ (i.e., ratio IQ, using the formula mental age equivalent divided by chronological age, multiplied by 100) instead of a traditional deviation-type standardized score was based upon other studies of children with low mental ages showing that standardized scores have restricted ranges in this population (DiLavore et al. 1995; Kanne et al. 2011; Sallows and Graupner 2005; Schopler et al. 1989).
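The ratio DQ described above can be sketched in a few lines. This is a minimal illustration rather than the analysis code used in the study; the function name and the example values are hypothetical.

```python
def developmental_quotient(age_equivalent_months: float,
                           chronological_age_months: float) -> float:
    """Ratio DQ: mental age equivalent divided by chronological age, times 100."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return age_equivalent_months / chronological_age_months * 100


# Hypothetical child: a 12-month age equivalent at 24 months of age.
example_dq = developmental_quotient(12, 24)
print(example_dq)  # 50.0
```

A DQ of 100 would indicate an age equivalent equal to chronological age, which is why the ratio remains interpretable when groups differ in chronological age.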

Table 3 Mean (SD) developmental quotient scores by group

To estimate development between time points, Time 1 age equivalent scores for each MSEL scale were subtracted from corresponding Time 2 age equivalent scores. This difference can be considered the amount of mental growth made between evaluations. The number of months between Time 1 and Time 2 evaluations was then calculated for each child, and a ratio (i.e., mental growth divided by time elapsed) was used to assess rate of progress (Table 4). A growth rate >1 indicates that a child made more rapid progress than expected, a rate of 1 indicates expected progress, and a rate <1 indicates slower progress than expected. For example, if a child made the equivalent of 6 months’ growth in receptive language over the course of a year, his growth rate would be 0.5 (i.e., half the growth expected in typical development), whereas if he made 18 months of mental growth over the course of a year, his growth rate would be 1.5 (i.e., faster growth than expected in typical development). This value is often called the learning rate (Howard et al. 2005; Eikeseth et al. 2012; Klintwall et al. 2015), although Bagnato and Neisworth (1980) called it the intervention efficiency index. Klintwall et al. (2015) summarized the advantages of using this value to gauge developmental progress: it can be compared even when different time periods have elapsed or when children are of similar mental age but different chronological age; if a child makes no progress, the Learning Rate equals 0 rather than showing the decrease one would see with standard scores; and it is easier to explain to parents and teachers, especially if graphed. Similar methods of indexing developmental progress have been used previously in the literature, particularly when evaluating a child’s developmental gains in response to early intervention (Bagnato and Neisworth 1985; Shonkoff et al. 1992; Wolery and Dyk 1985).
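The growth-rate calculation can be illustrated with the two worked cases from the paragraph above. The sketch below is not the study's analysis code; the function name and the Time 1 age equivalent of 10 months are hypothetical, chosen only so that the gains equal 6 and 18 months over a 12-month interval.

```python
def learning_rate(time1_age_equivalent: float,
                  time2_age_equivalent: float,
                  months_elapsed: float) -> float:
    """Mental growth (Time 2 minus Time 1 age equivalent) divided by months elapsed."""
    if months_elapsed <= 0:
        raise ValueError("months elapsed must be positive")
    mental_growth = time2_age_equivalent - time1_age_equivalent
    return mental_growth / months_elapsed


# 6 months of growth over 12 months: half the expected progress.
print(learning_rate(10, 16, 12))  # 0.5
# 18 months of growth over 12 months: faster than expected progress.
print(learning_rate(10, 28, 12))  # 1.5
# No change in age equivalent: the rate is 0 rather than negative.
print(learning_rate(10, 10, 12))  # 0.0
```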

Table 4 Mean (SD) developmental progress between evaluations by group

Vineland Adaptive Behavior Scales, Interview Edition (VABS)

The VABS (Sparrow et al. 1984) is a standardized parent report interview that assesses a child’s adaptive skills. It includes the domains of communication, daily living, socialization, and motor skills. The VABS has well-established reliability and validity (Sparrow et al. 1984) and is frequently used with varied clinical populations (Klin et al. 1997). For the range of ages included in the current study sample, the VABS demonstrates high internal consistency and test–retest reliability (Sparrow et al. 2005).

Age equivalents are shown in Table 2. The same methods used to calculate DQ scores and growth between evaluations for MSEL scales were used to create similar scores for VABS socialization and communication domains, whose content is relevant to autism (Tables 3, 4).

Autism Diagnostic Observation Schedule—Generic (ADOS)

The ADOS (Lord et al. 2000) is a semi-structured assessment designed to measure symptoms of ASDs. Only behaviors viewed during test administration are scored on this measure. The ADOS includes four modules, one of which is administered based on the child’s expressive language level and chronological age. The current study used Modules 1 and 2. All children in the current study met the autism spectrum cut-off on the ADOS, in addition to meeting DSM-IV-TR diagnostic criteria as determined by a senior clinician.

Childhood Autism Rating Scale (CARS)

The CARS (Schopler et al. 1980) is a behavior rating scale that consists of 15 items measuring behaviors associated with an ASD. The scale is based on a clinician’s direct observation of the child, incorporating parent report of behaviors. The CARS yields a numerical score of autism symptom severity. This score can be used to label a child’s symptoms as non-autistic, mild, moderate, or severe. In addition to these established cut-offs, Chlebowski et al. (2010) defined a cut-off score for PDD-NOS.

Data Analytic Plan

A series of univariate analyses of variance (ANOVAs) were conducted to determine differences on DQs between the AD, PDD-NOS, and ASD-low MA groups at each time point (Table 3), as well as in the overall level of developmental gains made between the two evaluations (Table 4). Power analyses revealed sufficient power (i.e., power = .80, α = 0.05) to detect small to medium effect sizes (i.e., η² = 0.24–0.29, depending on MSEL/VABS domain).

We also evaluated differences in diagnostic stability among the AD (n = 111), PDD-NOS (n = 82), and ASD-low MA (n = 25) groups with Chi-square tests, comparing the three groups by (1) diagnostic outcome, with all six Time 2 diagnostic possibilities included (Table 5), and (2) diagnostic outcome, differentiating ASD versus non-ASD (Table 6). When comparing all six diagnostic outcomes, power analyses revealed sufficient power (i.e., power = .80, α = 0.05) to detect a medium to large effect size (ϕ = 0.27), whereas comparison of autism spectrum versus non-ASD diagnoses revealed sufficient power to detect a small to medium effect (ϕ = 0.19).

Table 5 Diagnostic stability by group: all diagnostic outcomes at time 2
Table 6 Diagnostic stability by group: ASD versus non-ASD diagnostic outcomes at time 2

Finally, a mixed-model ANOVA was conducted to investigate the relationship between initial diagnosis and Time 1 and Time 2 autism symptom severity as measured by the CARS (Table 7). For all analyses, participants with missing data were excluded on a case-by-case basis. Power analyses revealed sufficient power (i.e., power = .80, α = 0.05) to detect a small to medium effect size (i.e., η² = 0.22).

Table 7 Mean (SD) ASD severity on CARS by group

Results

Differential Attrition

Three hundred eighty-nine children were diagnosed with an ASD at Time 1. Of these, 44 percent (n = 171) declined, or were unable to be contacted for, a Time 2 evaluation. Specifically, 47 percent (n = 100) of children diagnosed with AD, 43 percent (n = 61) with PDD-NOS, and 29 percent (n = 10) with ASD-low MA at Time 1 did not return for a Time 2 evaluation. When compared to children who returned at follow-up, those who did not return did not significantly differ on Time 1 age [t(387) = −1.751, p = .081], diagnosis [χ²(2, N = 389) = 4.655, p = .098], autism severity, as measured by the CARS [t(372) = 1.400, p = .162], or nonverbal IQ, as measured by MSEL visual reception [t(321) = 0.417, p = .677]. However, the two groups did significantly differ on ethnicity [χ²(6, N = 389) = 28.817, p < .001] and gender [χ²(1, N = 389) = 4.607, p = .032]. Those who returned at Time 2 were more likely to be White and male. See Table 8 for a summary of differential attrition data.

Table 8 Differential attrition

Developmental Growth by Group

Age equivalents and DQs on MSEL scales and VABS domains at both time points are shown in Tables 2 and 3. In general, across domains, the ASD-low MA group performed lower on measures of cognitive and adaptive functioning, though children in the PDD-NOS group demonstrated somewhat higher language ability and social and play skills than those in both other groups (Table 3). Developmental growth rate, or Learning Rate, as previously explained, is shown by group in Table 4. We predicted that children with an initial diagnosis of ASD-low MA would show significantly smaller developmental gains across time than children with AD or PDD-NOS.

Overall, growth rates differed significantly by group in all domains except VABS expressive language, which showed only a trend. Post-hoc LSD analyses were performed for pairwise comparisons except where Levene’s test indicated that the assumption of equal variances was violated, in which case a Games–Howell post-hoc analysis was conducted.

A post-hoc LSD analysis revealed significant differences in MSEL visual reception progress between the AD and ASD-low MA (p = .007) and the PDD-NOS and ASD-low MA (p < .0005) groups, but not the AD and PDD-NOS groups (p = .129) (Table 4). The ASD-low MA group only demonstrated approximately half of the progress that was expected from the time elapsed, while the other two groups, especially children with PDD-NOS, made progress close to what was expected.

The mean growth rates for MSEL fine motor showed the same pattern, with significant differences in progress between the AD and ASD-low MA (p < .0005) and the PDD-NOS and ASD-low MA (p < .0005) groups, but not the AD and PDD-NOS groups (p = .74). Again, the ASD-low MA group made less progress than the other two groups.

On MSEL expressive language, a post-hoc LSD analysis revealed significant differences in progress between the AD and ASD-low MA (p = .001) and the PDD-NOS and ASD-low MA (p < .0005) groups, but not the AD and PDD-NOS groups (p = .33). Again, the ASD-low MA group exhibited the least progress over time.

The mean growth rates for MSEL receptive language were also significantly different by group, F(1, 162) = 11.37, p < .0005. A Games–Howell post-hoc analysis revealed significant differences in progress between the AD and ASD-low MA (p < .0005) and the PDD-NOS and ASD-low MA (p < .0005) groups, but not the AD and PDD-NOS groups (p = .467). The AD and PDD-NOS groups showed progress close to what would be expected for the time elapsed, while children in the ASD-low MA group made only about one-third as much progress as expected.

VABS expressive language showed a trend for an overall group difference; a post-hoc LSD analysis revealed significant differences in expressive language progress between the AD and ASD-low MA (p = .026) and the PDD-NOS and ASD-low MA (p = .04) groups, but not the AD and PDD-NOS groups (p = .82), with the ASD-low MA group making only about a quarter of the progress expected.

VABS receptive language growth was significantly different by group [F(1, 171) = 4.78, p = .01]. A post-hoc LSD analysis revealed significant differences in progress between the AD and ASD-low MA (p = .026) and the PDD-NOS and ASD-low MA (p = .002) groups, but not the AD and PDD-NOS groups (p = .18), again with the ASD-low MA group making the least developmental progress over time.

VABS interpersonal relationships growth also differed by group [F(1, 166) = 6.94, p = .001]. A Games–Howell post-hoc analysis revealed significant differences in interpersonal relationships progress between the AD and ASD-low MA (p < .0005) and the PDD-NOS and ASD-low MA (p < .0005) groups, but not the AD and PDD-NOS groups (p = .42). The AD and PDD-NOS groups made less progress here than in language domains, and did not differ from each other, while the ASD-low MA group made almost no measurable progress.

The mean mental growth rates for VABS play and leisure also differed by group. A Games–Howell post-hoc analysis revealed significant differences in play and leisure progress between the AD and ASD-low MA (p = .001) and the PDD-NOS and ASD-low MA (p = .001) groups, but not the AD and PDD-NOS groups (p = .99), with slower progress than in language domains for the AD and PDD-NOS groups, and almost no progress for the ASD-low MA group.

The mean mental growth rates for VABS coping showed an overall significant group difference [F(1, 116) = 8.49, p < .0005]. A Games–Howell post-hoc analysis revealed significant differences in progress between the AD and ASD-low MA (p = .002) and the PDD-NOS and ASD-low MA (p < .0005) groups. The AD and PDD-NOS groups also differed (p = .025). The PDD-NOS group made close to expected progress in this domain, while the AD group made only about half the expected progress, and the ASD-low MA group showed almost no progress.

Data for VABS daily living are not presented here, as they do not reflect core autism symptoms. Even so, all of the subdomains in daily living showed the same pattern outlined above, namely that progress did not differ between AD and PDD-NOS groups, but was lower than both of these in the ASD-low MA group.

Diagnostic Stability: All Outcomes at Time 2

To assess the prediction that fewer children initially diagnosed with ASD-low MA would remain on the autism spectrum, but would instead show non-autistic global delays, a Chi-square analysis was conducted to compare possible differences in the diagnostic stability of the AD (n = 111), PDD-NOS (n = 83), and ASD-low MA (n = 25) groups, when considering all six Time 2 diagnostic outcomes.

Outcomes for children with each diagnosis at Time 1 are shown in Table 5. This initial Chi-square analysis revealed that there was an overall difference in diagnostic stability based on Time 1 diagnosis [χ²(10, N = 218) = 74.83, p < .0005, Phi = 0.586]. Only one of the ASD-low MA children moved off the spectrum, contrary to prediction, and most of them either stayed in the significantly delayed ASD-low MA group or moved into the AD category.

Diagnostic Stability: ASD versus Non-ASD Outcomes at Time 2

We also predicted that diagnostic stability (i.e., ASD vs. non-ASD) would be highest for the group of children initially diagnosed with AD. Of the 111 children who received an AD diagnosis at Time 1, 96 (86.5%) received an ASD diagnosis at Time 2, while 15 (13.5%) received a non-ASD diagnosis. Of the 82 children who received a PDD-NOS diagnosis at Time 1, 60 (73.2%) received an ASD diagnosis at Time 2, while 22 (26.8%) received a non-ASD diagnosis; this group therefore had the most positive outcome, defined as transitioning off the autism spectrum. Of the 25 children who received an ASD-low MA diagnosis at Time 1, 24 (96%) received a diagnosis of an ASD at Time 2, while 1 (4%) received a non-ASD (but still developmentally delayed) diagnosis. Chi-square analysis revealed a significant difference between the AD, PDD-NOS, and ASD-low MA groups in their likelihood to remain on the autism spectrum at Time 2 [χ²(2, N = 218) = 9.35, p = .009, Phi = 0.207] (see Table 6).
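The omnibus ASD versus non-ASD comparison can be checked against the counts reported in this section. The sketch below is an illustration using only the Python standard library, not the software used for the study's analyses; the function and variable names are hypothetical. It computes the Pearson Chi-square statistic for the 3 × 2 table, the effect size as the square root of the statistic divided by N, and the p value using the closed form that holds only when there are 2 degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, N) for a contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof, n


# Rows: AD, PDD-NOS, ASD-low MA at Time 1; columns: ASD, non-ASD at Time 2.
counts = [[96, 15], [60, 22], [24, 1]]
statistic, dof, n = chi_square_independence(counts)
phi = math.sqrt(statistic / n)
# For exactly 2 degrees of freedom, the upper-tail probability is exp(-x / 2).
p_value = math.exp(-statistic / 2)
print(round(statistic, 2), dof, round(phi, 3), round(p_value, 3))
# 9.35 2 0.207 0.009
```

With other degrees of freedom the closed-form p value does not apply, and a statistics library would be needed for the upper-tail probability.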

Follow-up Chi-square analyses were then conducted to identify the significant differences between pairwise groups on whether they retained a diagnosis of an ASD or received a non-ASD diagnosis at Time 2. The AD and PDD-NOS groups were significantly different, as participants from the AD group (86.5%) were more likely to receive an ASD diagnosis at Time 2 than those in the PDD-NOS group (73.2%) [χ²(1, N = 193) = 5.39, p = .026, Phi = 0.167]. The 86.5 percent of the AD group and the 96 percent of the ASD-low MA group remaining on the spectrum were not significantly different. The PDD-NOS and ASD-low MA groups were significantly different [χ²(1, N = 107) = 5.92, p = .013, Phi = −0.235].

Therefore, the prediction that children initially diagnosed with AD would show greater diagnostic stability than those in the PDD-NOS group was borne out. However, the prediction that a significant number of the ASD-low MA children would move from an ASD diagnosis to global developmental delay, because their apparent autism at Time 1 was due to global delays, was not confirmed. Instead, all but one of these children remained on the autism spectrum. Furthermore, most of these children moved into the AD category, and five were still functioning below a 12 month cognitive level at Time 2.

Autism Severity

CARS scores at Time 1 and 2 are shown in Table 7. A one-way ANOVA was performed to compare the AD, PDD-NOS, and ASD-low MA groups using the change in CARS total score between time points. The mean change in CARS score was significantly different among groups [F(2, 197) = 9.12, p < .0005]. The AD and ASD-low MA scores are in the mild-to-moderate autism range, although the AD group improved to the lower end of this range, whereas the ASD-low MA group worsened slightly. The PDD-NOS group improved slightly but remained within the PDD-NOS range (Chlebowski et al. 2010). A post-hoc LSD analysis revealed significant differences in CARS score change between the AD and ASD-low MA (p < .0005) and the AD and PDD-NOS (p = .003) groups, but not the PDD-NOS and ASD-low MA (p = .089) groups. Therefore, although the AD group showed strong stability, as predicted, by largely remaining on the autism spectrum, its members did show more symptomatic improvement than the other groups. The severe disability in the ASD-low MA group very likely contributes to the observed pattern, as do the relatively milder autism symptoms of the PDD-NOS group.

Discussion

The goal of the current study was to examine developmental growth, diagnostic stability, and autism symptom severity, over 2 years, in children with initial diagnoses of AD, PDD-NOS, and ASD-low MA. Overall, the ASD-low MA group demonstrated less developmental growth over time (i.e., slower progress) than the AD and PDD-NOS groups. Furthermore, results suggested high stability of autism spectrum diagnoses in children with low MA. Children in the ASD-low MA group also showed consistent and severe autism symptoms over the 2-year study period, whereas children in the AD group exhibited some improvement in symptom severity from Time 1 to Time 2, and children in the PDD-NOS group had milder symptoms at their initial evaluation. Taken together, our findings provide clear support for the diagnosis of ASDs in very young children and those with cognitive delays and are important for understanding the expected developmental trajectory of these children.

Cognitive DQs on the MSEL generally showed improvement in the AD and PDD-NOS groups, with limited gains in the ASD-low MA group. However, developmental progress ratios below one suggested that all participants made less progress in mental age than what was expected based on the actual change in chronological age between evaluations. Growth that was 75 percent or more of what would be expected from the time elapsed was shown by the AD and PDD-NOS groups in MSEL visual reception, receptive language, and expressive language, and VABS expressive and receptive language, and by the PDD-NOS group in VABS coping. No group showed this much progress in MSEL fine motor, VABS interpersonal relationships, or play and leisure, and the ASD-low MA group did not show this much progress in any cognitive or adaptive domain.

The slower progress on the MSEL fine motor subdomain could be a result of its design. This scale relies on a child having a level of receptive language proficiency high enough to understand basic verbal cues, as well as the basic skills (e.g., eye contact, reciprocal interaction, imitation) to attend to a visual demonstration and replicate and sequence the observed actions correctly. Additionally, fine motor tasks may not be as much of a focus of autism-specific intervention services as are language and social skills. This hypothesis, however, would not explain the slower progress on VABS interpersonal relationships and play. One possibility is that these items are tapping into core autism impairments that are more difficult to ameliorate, even with intervention. Additionally, the VABS play and leisure subscale is largely composed of items addressing play with peers; it is possible that lack of interest in peers, also a core symptom of ASDs, is difficult to improve. Since the children in the current study were all approaching preschool age at Time 2, it is also likely that their intervention services between evaluations had focused on basic communication and socialization (e.g., eye contact), and thus perhaps they had not yet been treated in a setting with other children.

The ASD-low MA group had quite low growth ratios for all measured skill areas; they did make some progress on the cognitive and communication items, but almost none on the VABS socialization domains. This finding indicates just how limited developmental progress is for children in this diagnostic category, as these children on average made only a quarter to one-half of the progress expected of them between evaluations, even in cognitive and adaptive domains where they made the greatest gains. Thus, our first prediction, that the ASD-low MA group would make the least developmental progress over time, was strongly confirmed.

Our second prediction was that the three groups would differ in diagnostic stability, with the AD group having the highest percentage of children remaining on the autism spectrum at follow-up, and the ASD-low MA group the lowest. Both when considering all six specific Time 2 diagnoses, and when considering only autism versus non-ASD outcomes, the AD group did show a higher rate of diagnostic stability than the PDD-NOS group. Two-thirds of the AD group maintained their diagnosis, with most of the rest improving to PDD-NOS, and only roughly 10 percent losing an ASD diagnosis. In the PDD-NOS group, about one-third kept that diagnosis, about one-third increased in severity to AD (likely due to the emergence of RRBs later in the developmental period), and another third moved off the autism spectrum. Approximately half of the children in this last third still showed other developmental delays at Time 2. In the ASD-low MA group, contrary to predictions, all but one of the children maintained an autism spectrum diagnosis, with most meeting criteria for AD, 20 percent continuing to display characteristics of ASD-low MA, one child improving to PDD-NOS, and one child moving off the autism spectrum but still showing significant developmental delays. Thus, our second prediction regarding diagnostic stability was only partially supported, suggesting the validity of early (i.e., 2-year-old) autism diagnoses in children with concurrent low mental age.

We also predicted that the AD group would demonstrate the most severe ASD symptoms at both time points and exhibit the least change in symptom severity across time, compared to the PDD-NOS and ASD-low MA groups. Contrary to our prediction, the AD group showed the most improvement in autism severity as measured by the CARS, although their average score was still solidly within the autistic range. The PDD-NOS group showed significantly less improvement than the AD group; however, at both time points, they were still on average in the PDD-NOS range, with lower severity scores than the AD group. The ASD-low MA group had the most severe scores at Time 1 and showed minimal improvement. Thus, our third prediction regarding symptom severity was only partially supported, again supporting the validity of the diagnosis of ASDs in children with low mental age.

These findings suggest that ASDs can in fact be detected accurately, even in the presence of co-occurring low mental age, and that the symptoms of autism are severe and highly stable across time in this subgroup. This high degree of symptom stability provides further support to the legitimacy of assigning an autism spectrum diagnosis in the presence of low mental age, suggesting that the observed behaviors are in most cases indicative of an ASD, and not solely a product of significantly delayed developmental (i.e., cognitive and adaptive) milestones.

A high proportion of children in the AD and PDD-NOS groups also received an autism diagnosis at follow-up, though our findings were consistent with prior research indicating that children with PDD-NOS are more likely to lose their diagnosis over time. The elimination of the PDD-NOS diagnosis, and the introduction of the requirement of RRBs in DSM-5, might result in a number of these children failing to meet DSM-5 autism spectrum disorder diagnostic criteria (Barton et al. 2013) and thereby missing out on necessary intervention services. The greater improvement shown by the AD group might be attributable to their showing the full syndrome at age 2 years and receiving more intensive, targeted intervention, resulting in some amelioration of deficits. However, there were almost no significant differences in learning rate on cognitive or adaptive measures between the AD and PDD-NOS groups, suggesting that although they may differ somewhat in symptom severity, they do not show notable differences in cognitive or adaptive growth. In contrast, the ASD-low MA group differed significantly from the other two autism groups in level of functioning and growth rate on every variable.

Limitations

Several limitations must be considered. While the overall sample size is large compared to many studies of ASDs, the ASD-low MA subgroup was small (n = 25), which limited statistical power for some analyses.

Additionally, across all three groups, a large minority (44%) of children who received a Time 1 evaluation declined, or could not be contacted for, a Time 2 evaluation. Differential attrition analyses indicated that children who returned for follow-up were more likely to be male and, perhaps more significantly, White, than those who did not return at Time 2. As a result, our study may not fully capture the developmental patterns shown in children from racial and ethnic minority backgrounds, or those of lower income. Further, it is quite possible that the parents of non-returning children had fewer concerns about their child’s development at the time a follow-up evaluation was offered than the parents of children who did return at Time 2. If true, this trend may have biased our data, as our sample would be composed of children who were more likely to demonstrate delays or symptoms of an ASD at a follow-up evaluation. Of note, compared to children initially diagnosed with AD and PDD-NOS, a smaller percentage (29%) of those in the ASD-low MA group failed to return for follow-up, perhaps suggesting that parents of these children were more concerned, which may be a product of this group’s combined deficits in autism symptoms and cognitive functioning.

Another limitation to consider is the time frame of the study. Our longitudinal study was limited to data collected at ages 2 and 4 years. As a result, we do not have potentially informative data regarding long-term outcomes for these children. Furthermore, given the recruitment years, DSM-IV-TR criteria were used to diagnose participating children. Since ICD-10 criteria are similar to those of DSM-IV-TR, the results of a parallel study using ICD-10 are likely to be comparable to the current results. However, our findings may not generalize to a study using DSM-5.

Finally, we were unable to account for the likely impact of intervention on the diagnostic and developmental outcomes of children in the current study. Through participation in the larger study, each participant in our sample received an autism spectrum diagnosis at a relatively young age (i.e., on average, around age 2 years) and was subsequently referred for intensive autism-specific intervention services. At follow-up, parents were asked to report on a developmental history form whether or not their child had received some type of intervention, but specific information regarding the type and structure of services, as well as intensity, frequency, and duration, was either unavailable or not detailed enough to allow for meaningful evaluation of the impact of these services on the child’s progress over time. Therefore, we were unable to directly assess whether a child’s improvement across time was related to the intervention services they received between evaluations, increased parental understanding of their child’s deficits, or any other factors that may have contributed to a child’s progress.

Future Directions

These results support the position that ASD-low MA is a distinct autism subtype, insofar as its developmental growth and diagnostic stability are consistently different from those of AD and PDD-NOS. The developmental progress rates and autism symptom improvement in the AD and PDD-NOS groups suggest that, despite their delays and symptomatology at initial evaluation, the intensive intervention services that these groups likely received after their initial evaluation helped to address these delays and provided an opportunity to make improvements. However, the very low rate of developmental progress over time, and the stability of autism spectrum symptomatology, in the ASD-low MA group signal the severity of this particular subtype, which may merit more intensive services. These findings may also suggest that children with low mental age struggle to engage with traditional autism-specific intervention services, thus limiting opportunities for developmental gains.

There is a clear need for replication of these results to ensure the reliability and validity of diagnosing ASDs in the presence of low mental age. If replicated, such findings would allow clinicians greater confidence in diagnosing ASDs regardless of mental age at the time of evaluation. Longitudinal studies should also follow these children into later childhood, when they are more likely to receive a comorbid diagnosis of ID (i.e., at age 6 years and older), to examine whether their autism symptoms continue to remain stable over time.