One of the serious difficulties faced by youth who meet criteria for attention-deficit/hyperactivity disorder (ADHD) is poor academic achievement (DeShazo Barry et al. 2002; DuPaul et al. 2001; Faraone et al. 1993; Frick et al. 1991; Lonigan et al. 1999; McGee et al. 1986; Rapport et al. 1999; Zentall et al. 1994). Although academic difficulties are one of the primary justifications for treating ADHD, much remains to be learned about the nature of the academic deficits of children with ADHD. Four issues are particularly important in considering the current data base regarding the academic deficits of children and adolescents with ADHD.

First, because children who meet criteria for ADHD have lower intelligence scores than the population mean on average (August and Garfinkel 1990; Faraone et al. 1993; Frazier et al. 2004; Szatmari et al. 1990), and because intelligence is highly correlated with standardized measures of reading and mathematics (Vanderwood et al. 2001), it is necessary to take intelligence into account when studying the academic achievement of children with ADHD. Similarly, because children with ADHD tend to exhibit more symptoms of psychopathology than children without ADHD, and these other forms of psychopathology may be associated with academic deficits, it also is important to consider co-occurring psychopathology and other confounds when assessing relations between ADHD and academic achievement.

Second, it is important to determine if each of the subtypes of ADHD is associated with academic deficits to the same extent. This is important because there is evidence that academic deficits are more strongly associated with inattention symptoms than with hyperactivity–impulsivity (Fergusson and Horwood 1995; McGee et al. 1986; Morrison et al. 1989; Rabiner, Coie and The Conduct Problems Prevention Research Group 2000). Because the DSM-IV subtypes of ADHD are based on levels of inattention and hyperactivity–impulsivity, it is important to examine differences in academic achievement among the different subtypes of ADHD.

Third, when studying academic achievement relative to intelligence, it is important to consider learning disabilities. Although the diagnostic construct of learning disability is controversial and not well defined (Stanovich 2005; Stanovich and Stanovich 1996; Stuebing et al. 2002), it is possible that some children with ADHD who underachieve relative to intelligence should be viewed as having a learning disability. That is, any average underachievement among children with ADHD might be the result of a higher prevalence of learning disability among children with ADHD instead of an influence of ADHD on achievement, per se. This possibility is plausible, as more children who meet criteria for ADHD have been found to have discrepancy scores (i.e., academic achievement scores subtracted from intelligence scores) consistent with a diagnosis of learning disability (DeShazo Barry et al. 2002).

Fourth, it is important to distinguish between cross-sectional and longitudinal studies of relations between ADHD and academic deficits. Cross-sectional studies address differences between children who meet criteria for ADHD and comparison children at a particular point in time. For example, a comparison of ADHD and normal preschoolers found that young children with ADHD scored significantly lower on a test of preacademic skills (DuPaul et al. 2001). Similarly, a cross-sectional study based on the first assessment wave of the present sample reported that children diagnosed with ADHD at 4–6 years of age had significantly lower academic achievement than comparison children after controlling for intelligence, co-occurring symptoms of other forms of psychopathology, and other confounds (Lahey et al. 1998).

In contrast, longitudinal studies are able to address the predictive validity of ADHD. That is, they are able to determine whether ADHD predicts academic deficits at later points in time. For example, in a school sample of 7–16 year olds at baseline, Rapport et al. (1999) found that teacher ratings of attention problems and hyperactivity predicted lower academic achievement in reading and mathematics 3–4 years later when controlling for both intelligence and teacher-rated conduct problems. Similarly, Rabiner et al. (2000) found that teacher ratings of attention problems, but not impulsivity–overactivity, predicted reading achievement in fifth grade in a school sample, controlling for early reading achievement and intelligence. Fergusson and Horwood (1995) examined the relation between adult ratings of attention problems and academic achievement in children from 10 to 12 years of age. They found some evidence that attention problems caused academic deficits, but no evidence that lower academic achievement caused attention problems.

Only one previous study has examined the predictive validity of the diagnosis of ADHD when intelligence was controlled, however. Fischer et al. (1990) conducted an assessment of the adolescent outcomes of a group of children given an experimental diagnosis of hyperactivity prior to the publication of DSM-III-R when they were 4–12 years of age. Controlling for an estimate of intelligence and for maternal education, the hyperactive children had lower academic achievement in reading and mathematics than control children in adolescence. This supports the predictive validity of that experimental version of the diagnosis of ADHD by suggesting that it is associated with long-term academic deficits. No previous study has examined the predictive validity of ADHD in terms of future academic achievement using DSM-IV symptoms, however.

The present paper addresses the predictive validity of the DSM-IV symptoms for ADHD when they are applied to younger children using data from a longitudinal sample of children who met ADHD symptom criteria and one-setting impairment at 4–6 years of age in 1995. A recent report based on the same study reported that the diagnosis of ADHD was highly stable across years and predicted impairment in a number of domains of functioning (Lahey et al. 2004), but that report did not include data on academic achievement. Although Lahey et al. (2005) reported that the overall ADHD diagnosis remained stable over time, there was considerable instability in the subtypes of ADHD across time. In fact, the distinction between the hyperactive–impulsive and combined subtypes disappeared over time, as most children in the former group shifted to the latter group. The distinction between the inattentive and combined subtypes also demonstrated instability, with children shifting from one to another over time. Children who met criteria for the inattentive type of ADHD at school entry were very likely to continue to exhibit symptoms and functional impairment over time, and were very likely to continue to meet criteria for ADHD, even though they met criteria for other subtypes of ADHD as they progressed through elementary school.

Based on previous studies using ratings of attention problems and the diagnosis of ADHD, we hypothesize that children who met modified criteria for ADHD in this sample will have lower academic achievement in both reading and mathematics than comparison children at 4–6 years of age, but significant impairment will be present in a school sample for those subtypes who exhibit significant problems with inattention (predominantly inattentive and combined subtypes) and not the predominantly hyperactive–impulsive subtype.

Materials and Methods

Participants

Participants were 255 children, 125 of whom were recruited as ADHD probands and 130 of whom were recruited as comparison children, who were 3 years–10 months through 7 years–0 months old at initial recruitment. Two cohorts of 130 ADHD participants were initially recruited during November through May of Years 1 and 2 of the study in Chicago and Pittsburgh. In Chicago, children presenting to a university child psychiatry clinic with complaints of inattention and/or hyperactivity were recruited. In Pittsburgh, 42% of the children who met symptom criteria for ADHD were recruited from a university child psychiatry clinic, while the remainder was recruited through flyers distributed at schools and newspaper advertisements. Children who were recruited through clinic referral did not differ significantly from children recruited through advertising on any demographic or impairment measure (Lahey et al. 1998). All children were enrolled in structured educational programs: 36% preschool, 43% kindergarten, 21% first grade, and 1% second grade.

Potential participants were eligible only if they lived with their biological mother and if they did not exhibit pervasive developmental disorders, psychosis, or clear neurological disorders. Four children who were recruited through advertisements were declared ineligible because their parent stated that they had been diagnosed with pervasive developmental disorder, mental retardation, or seizure disorder. The sample of ADHD probands was therefore 125.

Comparison children (N = 130) were recruited from the same schools as the probands or from schools that served similar neighborhoods. None had ever been referred for services for mental health problems, but they were not excluded if they met criteria for a psychiatric diagnosis other than ADHD. Comparison children were selected from among those who volunteered to match the probands on the basis of gender, ethnicity, and age. Two children who met symptom criteria for ADHD in the wave 1 diagnostic assessment were recruited as controls and 12 children who were recruited as probands did not meet symptom criteria for ADHD. These children were included in the modified ADHD and comparison groups, respectively. This was viewed as the most conservative method of treating these participants, as it tends to increase similarities between the modified ADHD and comparison groups. Of the 315 eligible participants originally recruited from clinics and advertisements, 259 parents (82%) gave informed consent, and all children gave oral assent to participate. The demographic characteristics of the sample are presented in Table 1.

Table 1 Characteristics in the year 1 assessment of children who met criteria for ADHD (with impairment in at least one setting) in year 1 and comparison children

Child intelligence was estimated using the Stanford–Binet Intelligence Scale Short Form (Thorndike et al. 1986). In order to provide the best estimate of intelligence in this young age range, the children were tested in both waves 1 and 2 and their scores averaged. Four children were excluded from the present analysis because of a mean intelligence score below 70.
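The averaging and exclusion rule described above can be sketched as follows. This is only an illustration: the child identifiers and scores are hypothetical, and the sketch assumes that both the wave 1 and wave 2 scores are available for every child, which the text does not state.

```python
from statistics import mean

IQ_CUTOFF = 70  # from the text: children with a mean score below 70 were excluded


def mean_intelligence(wave1_score, wave2_score):
    """Average the wave 1 and wave 2 intelligence estimates."""
    return mean([wave1_score, wave2_score])


def is_excluded(wave1_score, wave2_score, cutoff=IQ_CUTOFF):
    """Return True when the averaged score falls below the cutoff."""
    return mean_intelligence(wave1_score, wave2_score) < cutoff


# Hypothetical children: (wave 1 score, wave 2 score)
children = {"child_a": (72, 66), "child_b": (68, 74), "child_c": (95, 101)}

# Keep only children whose two-wave mean is at or above the cutoff
retained = {
    child_id: mean_intelligence(*scores)
    for child_id, scores in children.items()
    if not is_excluded(*scores)
}
print(retained)
```

Note that a child can score below 70 in a single wave and still be retained, because the rule is applied to the two-wave mean rather than to either score alone.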

Longitudinal Assessments

Seven yearly diagnostic assessments were conducted over 8 years (no assessment was conducted in year 5). In each assessment wave, all measures were obtained during one visit to the clinic by trained lay interviewers who were blind to clinic-referred or comparison group status of participants. On several occasions, children were tested by individuals who were not blind to the children’s group status because of scheduling complications. In order to control for potential problems, whether the examiner was blind to the child’s status was entered as a covariate in all analyses (see Results). Interviews with children and mothers were conducted concurrently by two separate interviewers. Retention in the present sample was high, as the portion of children assessed for academic achievement in each wave was 99, 95, 92, 90, 85, and 84% for children in the comparison group, 100, 92, 87, 86, 86, and 81% for ADHD-Combined type, 100, 100, 92, 92, 88, and 92% for ADHD-predominantly hyperactive–impulsive type, and 100, 100, 93, 93, 86, and 71% for ADHD-predominantly inattentive type. Many children who were not assessed in one wave were assessed again in subsequent waves, so the above figures are an underestimate of the percent of children retained overall in the sample. The percent of children assessed either in wave 7 or in wave 8 was 87% for the comparison group, 90% for combined type, 100% for predominantly hyperactive–impulsive type, and 86% for the predominantly inattentive subtype. The groups did not differ significantly (at the 0.05 level) on retention rates for waves 7 or 8. Furthermore, there were only 26 children (10.2% of the sample) who were assessed in neither wave 7 nor wave 8; these children did not differ from the assessed children on any demographic characteristics.

Measures

Measures of diagnostic criteria

The NIMH Diagnostic Interview Schedule for Children (Shaffer et al. 1993) was administered to the biological mother. The DISC-2.3 obtained information on DSM-III-R diagnostic criteria for ADHD, oppositional defiant disorder (ODD), conduct disorder (CD), anxiety disorders, mood disorders, and tic disorders during the previous 6 months. In addition, a module from the DSM-IV Field Trials (Lahey et al. 1994) that contained questions about DSM-IV symptoms of the disruptive behavior disorders that were not in DSM-III-R was also administered. Youth self-report information on DSM-III-R CD, major depression, and dysthymia during the previous 6-month period was also obtained during the years 6 and 7 assessments using the DISC. Teacher report on disruptive behavior disorder symptoms was obtained using the DSM-IV version of the DBD Rating Scale (Pelham et al. 1992). Symptoms were counted as present if they were endorsed by either the child’s mother on the DISC 2.3 or the child’s teacher on the DBD Rating Scale, or both; if both parents and teachers endorsed a symptom, it was counted only once.
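The "either informant" rule for counting symptoms amounts to a set union, as the following minimal sketch shows. The symptom labels are hypothetical placeholders, not item names from the DISC or the DBD Rating Scale.

```python
def combine_informants(mother_endorsed, teacher_endorsed):
    """Return the symptoms endorsed by either informant.

    A set union counts a symptom endorsed by both informants only once.
    """
    return set(mother_endorsed) | set(teacher_endorsed)


# Hypothetical symptom labels for one child
mother = {"inattention_1", "inattention_2", "hyperactivity_1"}
teacher = {"inattention_2", "inattention_3"}

present = combine_informants(mother, teacher)
print(len(present))  # "inattention_2" is endorsed by both but counted once
```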

Children were said to meet modified criteria for ADHD if they met DSM-IV symptom criteria, the age of onset criterion, and were said to be impaired in at least one setting in wave 1. The new DSM-IV requirement of cross-situational impairment was not used for four reasons, however. First, as we have stated previously (Lahey et al. 2004; Lahey, Pelham et al. 2005), it is not clear why children who otherwise meet criteria for ADHD, and are seriously impaired in one setting, would not be eligible for the diagnosis of ADHD. No other DSM-IV disorder limits treatment to children who are impaired in multiple settings. Second, the two dimensions of ADHD symptoms are associated with different types of impairment, with inattention being more strongly associated with school problems and hyperactivity–impulsivity being more associated with home problems (Lahey et al. 1994; Lahey and Willcutt 2002). That is, one would only expect the combined type to be clearly associated with impairment across settings. Therefore, a study of all three subtypes of ADHD would be biased by requiring cross-situational impairment because children who met criteria for the predominantly inattentive and hyperactive–impulsive subtypes would be eliminated if they exhibited impairment only in the setting most associated with their pattern of symptoms. Stated differently, the children who meet criteria for the inattentive and hyperactive–impulsive subtypes who exhibit cross-situational impairment are likely to be ones who just miss meeting criteria for the combined subtype. By requiring impairment in only one setting, we have included all children who met symptom criteria for any subtype and who exhibit some impaired functioning.

Third, because the great majority (79%) of the participants in this sample were in preschool or kindergarten at the time of their diagnostic assessment in year 1, they might be expected to exhibit impaired functioning in fewer settings than they will when they grow older and demands on them increase at school and home. Indeed, we have shown that 78% of children in this sample who met symptom criteria for ADHD and exhibited impairment in only one setting in year 1 exhibited impairment in two settings in later assessments at older ages (Lahey et al. 2004). Thus, we believe that this criterion may be particularly problematic for younger children. Fourth, we have shown that children who met symptom criteria for ADHD and exhibited impairment in one setting in the first year of this study continued to exhibit significantly greater impairment than did comparison children over 3 years (Lahey et al. 2004). This suggests that impairment in a single setting in younger children is neither transient nor insignificant.

Two sources of information were available on impairment. First, parents were asked in the DISC interview if their child’s ADHD symptoms caused problems at home or with friends. Second, parents and teachers both completed the Impairment Rating Scale (Fabiano et al. 2006), in which responders are asked to rate the child’s need for treatment on a 7-point scale. A visual scale is rated, the endpoints of which range from “No problem; definitely does not need treatment” to “Extreme problem; definitely needs treatment,” in several areas of functioning and overall. Mean scores on all teacher-rated items were used to assess school impairment, while parent ratings were used for home impairment. Test–retest stability (different teachers 1 year apart) for the six IRS scales is r = 0.39 to 0.63 (p < 0.001). For the purpose of determining modified criteria for ADHD, children were said to be impaired (1) if parents reported problems at home and/or with peers, or if parents rated ≥3 on any IRS scale (excluding the school setting); or (2) if parents reported problems at school on the DISC, or if teachers rated ≥3 on any IRS scale. According to this definition, 125 children met modified criteria (symptom criteria plus impairment in one setting) for ADHD in wave 1. Four children who met symptom criteria for ADHD but exhibited neither home nor school impairment were dropped from all analyses. Of the six children in the inattentive subtype group who were impaired in one setting, five were impaired only in school and one was impaired only at home. One was in preschool, three were in kindergarten, and two were in first grade.
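The two-part impairment definition and the modified criteria can be expressed as a short decision rule. The sketch below follows the ≥3 threshold rule stated above; the function and argument names are hypothetical, and the example ratings are invented for illustration.

```python
IRS_THRESHOLD = 3  # from the text: a rating of 3 or higher on any IRS scale


def home_impaired(parent_disc_home_or_peers, parent_irs_nonschool):
    """Setting (1): parent-reported problems at home and/or with peers on the
    DISC, or any parent IRS rating (school scale excluded) at or above 3."""
    return parent_disc_home_or_peers or any(
        rating >= IRS_THRESHOLD for rating in parent_irs_nonschool
    )


def school_impaired(parent_disc_school, teacher_irs):
    """Setting (2): parent-reported problems at school on the DISC, or any
    teacher IRS rating at or above 3."""
    return parent_disc_school or any(
        rating >= IRS_THRESHOLD for rating in teacher_irs
    )


def meets_modified_criteria(symptom_criteria, age_of_onset, home, school):
    """Modified criteria: symptom criteria, age of onset, and impairment in
    at least one setting."""
    return symptom_criteria and age_of_onset and (home or school)


# Hypothetical child: no parent-reported problems, one elevated teacher rating
home = home_impaired(False, [0, 1, 2])
school = school_impaired(False, [1, 4, 2])
print(home, school, meets_modified_criteria(True, True, home, school))
```

Under the cross-situational DSM-IV requirement, the final condition would instead be `home and school`, which is the stricter definition examined in the follow-up analyses.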

Cognitive ability and academic achievement

Intelligence was estimated in waves 1 and 2 using the standard Short Form of the Stanford–Binet Intelligence Scale, Fourth Edition (Thorndike et al. 1986). Academic achievement was assessed during each wave using the Letter-Word Identification, Applied Problems, and Dictation scales from the Woodcock–Johnson Psychoeducational Battery (Woodcock 1977). The Letter-Word Identification subtest is a measure of alphabet knowledge and word reading skills (including sight vocabulary). The task involves asking young children to name letters of the alphabet. Older children are asked to read increasingly difficult words. This task is robustly correlated with longer, more comprehensive measures of reading ability (Fuchs et al. 2001; Torgesen et al. 1997; Wolf and Katzir-Cohen 2001). The Dictation subtest is a measure of basic writing and spelling skills, and involves asking young children to write letters of the alphabet, and older children to spell words read aloud. This task involves spelling skills, which tap into both reading vocabulary and decoding skills. Finally, the Applied Problems subtest taps a range of basic mathematics and arithmetic skills, including counting, addition, and subtraction.

Longitudinal Data Analysis

Academic achievement in reading and mathematics was compared during waves 1–8 for children in the comparison group and children who met criteria for each of the subtypes of ADHD using longitudinal linear regression in generalized estimating equations (GEE; Harden and Hilbe 2003). GEE models the average value of the outcome variable for each subset of individuals who share the same value of the predictor variable. Because GEE estimates averages, and not the entire distribution of values, it is less restricted by distributional assumptions than other approaches to longitudinal data analysis. GEE allows specification of a within-person correlation structure to account for within-person correlations in the outcome variable over time. In all present analyses, an autoregressive correlation structure was specified. All statistical tests used the z-statistic and were based on the robust (“empirical”) standard error because it adjusts for dispersion and minimizes the effect of incorrect specification of the within-person covariance structure.
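Two pieces of this analysis lend themselves to a small illustration: the shape of a first-order autoregressive working correlation matrix, and the conversion of a z-statistic to a two-sided p-value. The sketch below is not the fitting procedure itself. It assumes equally spaced assessments (the skipped wave is ignored) and a hypothetical correlation parameter `rho`, which in a real GEE fit is estimated from the data.

```python
import math


def ar1_correlation(n_waves, rho):
    """First-order autoregressive working correlation: the correlation
    between two waves decays as rho raised to the lag between them."""
    return [
        [rho ** abs(i - j) for j in range(n_waves)]
        for i in range(n_waves)
    ]


def z_statistic(beta, robust_se):
    """Coefficient divided by its robust (empirical) standard error."""
    return beta / robust_se


def two_sided_p(z):
    """Two-sided p-value for a z-statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical rho; four waves are shown to keep the matrix readable
for row in ar1_correlation(4, 0.6):
    print([round(value, 3) for value in row])

# A z of -2.28, as reported for one comparison in the Results, gives a
# two-sided p-value between 0.02 and 0.03
print(round(two_sided_p(-2.28), 4))
```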

Results

Because the comparison children were recruited to approximately match the probands demographically, the groups were similar in terms of sex and race distribution. As shown in Table 1, however, the four groups differed significantly on age, because children in the predominantly inattentive group were significantly older than the children in the comparison and other modified ADHD groups in wave 1, as in previous studies of other samples. The four groups differed in terms of intelligence and the number of symptoms of ODD, CD, and internalizing disorders, indicating the need to control these variables when assessing the functional impairment associated with ADHD. Furthermore, ADHD combined and predominantly inattentive children had lower family incomes than children in the comparison group, indicating the need to control for family income as well. Finally, children in all modified ADHD subtype groups had more mother-reported symptoms of internalizing disorders; this was controlled in subsequent analyses.

Preliminary Longitudinal Analyses

Because the groups differ on some child characteristics and demographic variables, preliminary analyses were first conducted to select control variables for the longitudinal analyses of the response variables of academic achievement. The preliminary longitudinal model for each response variable included time (assessment waves), intelligence, methodologic variables (cohort, site, and whether the examiner in each wave was blind to the child’s diagnosis), demographic characteristics (age in wave 1, sex, family income, and race-ethnicity), child characteristics (number of symptoms of oppositional defiant disorder and conduct disorder, and number of symptoms of anxiety and depression), and whether the child had received psychoactive medication or psychosocial treatment in the 12 months prior to each assessment. Control variables were retained in the final models for each response variable if they were significant at p < 0.10. Time was included in all models to allow correct interpretation of group differences. For reading, intelligence, total family income, the number of internalizing symptoms (anxiety and depression) in wave 1, and whether the examiner was blind in each wave to the diagnosis (treated as a time-varying covariate) were retained as control variables. For mathematics, the control variables were intelligence, the number of internalizing symptoms in wave 1, and race-ethnicity. Race-ethnicity was specified using dummy variables to compare two groups of children classified by their mothers as African American or other race-ethnicity to children classified as Non-Hispanic white. For dictation scores, intelligence, site, the child’s age in wave 1, and total family income were retained as control variables. When either family income or race-ethnicity was significant in these preliminary models, however, both variables were retained in the final model to model these sociodemographic variables in context. In each final model, group-by-time interactions were tested, but as noted below, such interactions were significant at the p < 0.05 level only for dictation scores.
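The retention rule for control variables can be summarized as a simple filter. The following sketch uses hypothetical variable names and hypothetical p-values; it shows only the selection logic described above, not the preliminary models that would produce the p-values.

```python
RETENTION_ALPHA = 0.10  # from the text: retained if significant at p < 0.10
LINKED_SOCIODEMOGRAPHIC = {"family_income", "race_ethnicity"}
ALWAYS_INCLUDED = {"time"}  # time was included in all models


def select_controls(p_values):
    """Choose control variables for a final model from preliminary p-values."""
    kept = {name for name, p in p_values.items() if p < RETENTION_ALPHA}
    # If either sociodemographic variable is retained, keep both
    if kept & LINKED_SOCIODEMOGRAPHIC:
        kept |= LINKED_SOCIODEMOGRAPHIC
    return kept | ALWAYS_INCLUDED


# Hypothetical preliminary p-values for one response variable
preliminary = {
    "intelligence": 0.001,
    "family_income": 0.04,
    "race_ethnicity": 0.40,
    "site": 0.55,
    "sex": 0.30,
}
print(sorted(select_controls(preliminary)))
```

In this hypothetical case race-ethnicity is kept despite its p-value of 0.40, because family income met the retention threshold.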

Planned Comparisons

Figure 1 presents reading achievement scores for the four groups. In all figures, achievement scores are presented as z-scores normalized within each wave and residualized for intelligence scores. This gives comparison children scores near zero in all waves and compares the three modified ADHD subtype groups to the comparison group. Planned comparisons were conducted comparing each subtype group with the comparison group. When intelligence and the other covariates were controlled, children who met modified criteria for the inattentive subtype had significantly lower reading achievement scores over the 8 year period than comparison children, β = −7.00, z = −2.28, p < 0.03, but children who met modified criteria for the combined subtype (β = 0.24, z = 0.14, p = 0.89) and the predominantly hyperactive–impulsive subtype (β = 3.26, z = 1.40, p = 0.16) did not differ significantly on reading scores from comparison children. Effect sizes were calculated by taking the difference between group means of test scores for each kind of academic achievement in the year 6, 7, and 8 assessments, residualized for intelligence scores, and dividing by the pooled variance. For those children who had missing data for one or more of the last 3 waves, mean scores were calculated based on the available waves of data. The effect size for the significant difference in reading between the inattentive and comparison groups was Cohen’s d = −1.16. The number of internalizing symptoms reported by the parent in wave 1 also predicted reading test scores over 8 years, even when the subtypes of ADHD and all covariates were in the model, β = −0.41, z = −1.98, p < 0.05. The difference in the mean of residualized reading scores in years 6, 7, and 8 between children with 0 versus ≥5 internalizing symptoms also was large (d = 0.94).
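The effect-size calculation can be sketched as follows, using made-up residualized scores for the last three assessments. The sketch follows the conventional definition of Cohen's d, which divides the mean difference by the pooled standard deviation (the square root of the pooled variance); whether the original computation matched this convention exactly is an assumption. Missing waves are marked with `None`, and each child's mean is taken over the available waves.

```python
import math
from statistics import mean, variance


def mean_of_available(scores):
    """Mean over non-missing waves; None marks a missing assessment."""
    observed = [s for s in scores if s is not None]
    return mean(observed) if observed else None


def cohens_d(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)


# Hypothetical residualized scores in the year 6, 7, and 8 assessments
inattentive = [[-1.0, -1.2, None], [-0.8, None, -0.6], [-1.4, -1.0, -1.2]]
comparison = [[0.2, 0.0, 0.1], [-0.3, None, -0.1], [0.4, 0.5, 0.3], [0.0, -0.1, None]]

inattentive_means = [mean_of_available(child) for child in inattentive]
comparison_means = [mean_of_available(child) for child in comparison]
print(cohens_d(inattentive_means, comparison_means))
```

A negative d indicates that the first group scored lower than the second, matching the sign convention of the values reported above.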

Fig. 1

Reading achievement over 8 years expressed as z-scores (M = 0, SD = 1) normalized within assessment waves and residualized on covariates for children given the diagnosis of the combined subtype, the predominantly hyperactive–impulsive (Hyper–Imp) subtype, and the predominantly inattentive subtype of attention-deficit/hyperactivity disorder (ADHD) and for non-ADHD comparison children. Values on the x-axis refer not to grade level in school but to assessment years, from wave 1 (ages 4–6) to wave 8 (ages 11–13). Assessments in wave 5 were not conducted.

In order to understand the impact of covarying the child characteristics of intelligence and wave 1 internalizing symptoms on these planned comparisons, they were removed from the model one at a time. Removing only wave 1 internalizing symptoms from the model did not change the pattern of findings, as only the inattentive type exhibited lower reading scores than the comparison group, β = −8.60, z = −2.90, p < 0.005. Removing only intelligence from the model, however, resulted in both the combined type, β = −4.77, z = −2.74, p < 0.01, and the inattentive type, β = −11.27, z = −3.80, p < 0.0001, exhibiting lower reading scores than the comparison group.

As shown in Fig. 2, children who met modified criteria for the inattentive subtype had lower mathematics achievement scores than did comparison children, β = −6.49, z = −3.34, p < 0.001, Cohen’s d = −1.30, but children who met modified criteria for the predominantly hyperactive–impulsive subtype (β = 0.40, z = 0.18, p = 0.36) did not. In addition, children who met modified criteria for the combined subtype received marginally lower mathematics test scores than comparison children, β = −2.55, z = −1.92, p = 0.055. The number of internalizing symptoms reported by the parent in wave 1 also predicted mathematics test scores over time, even when the subtypes of ADHD and all covariates were in the model, β = −0.37, z = −2.15, p < 0.04. The difference in the mean of residualized mathematics scores in years 6, 7, and 8 between children with 0 versus ≥5 internalizing symptoms was considerable (d = 1.42).

Fig. 2

Mathematics achievement over 8 years expressed as z-scores (M = 0, SD = 1) normalized within assessment waves and residualized on covariates for children given the diagnosis of the combined subtype, the predominantly hyperactive–impulsive (Hyper–Imp) subtype, and the predominantly inattentive subtype of attention-deficit/hyperactivity disorder (ADHD) and for non-ADHD comparison children

When only wave 1 internalizing symptoms were removed from the model, the inattentive subtype had lower mathematics achievement scores than did comparison children, β = −7.86, z = −3.95, p < 0.0001, and children who met modified criteria for the combined type did as well, β = −3.74, z = −2.86, p < 0.005. The 12% increase in β for the combined type group when internalizing symptoms were removed was not large, but sufficient to reach traditional levels of statistical significance. When only intelligence scores were removed from the model for mathematics, both the inattentive subtype, β = −13.20, z = −5.28, p < 0.0001, and the combined subtype, β = −10.07, z = −5.57, p < 0.0001, had lower mathematics achievement scores than comparison children.

As shown in Fig. 3, children who met modified criteria for the inattentive subtype had lower dictation scores than comparison children, β = −6.90, z = −2.86, p < 0.005, Cohen’s d = −1.46, but children who met modified criteria for the combined type, β = −2.34, z = −1.75, p = 0.08, and the predominantly hyperactive–impulsive subtype, β = 0.83, z = 0.38, p = 0.70, did not at the 0.05 level. For dictation, there was a significant interaction with time, β = 1.21, z = 2.66, p < 0.01, reflecting a steeper increase in dictation scores in the hyperactive–impulsive group than the comparison group, controlling for intelligence and the other covariates. The apparent interaction in the opposite direction with time for the inattentive group did not reach conventional levels of significance, β = −0.89, z = −1.83, p = 0.07. When intelligence scores were dropped from the model, the group-by-time interactions were significant for both the inattentive type, β = −0.99, z = −1.98, p < 0.05, and the hyperactive–impulsive type, β = 1.91, z = 2.62, p < 0.01. These reflect steeper increases in dictation scores among the hyperactive–impulsive type than the comparison group and steeper decreases in dictation scores among the inattentive group than the comparison group when intelligence was not controlled.

Fig. 3

Dictation scores over 8 years expressed as z-scores (M = 0, SD = 1) normalized within assessment waves and residualized on covariates for children given the diagnosis of the combined subtype, the predominantly hyperactive–impulsive (Hyper–Imp) subtype, and the predominantly inattentive subtype of attention-deficit/hyperactivity disorder (ADHD) and for non-ADHD comparison children

The significant associations between varying levels of parent-reported internalizing (anxiety and depression) symptoms in wave 1 and reading and mathematics test scores over time are illustrated in Figs. 4 and 5, respectively. Children who were reported to exhibit more internalizing symptoms in year 1 had markedly lower reading and mathematics scores, controlling for intelligence and the other covariates.

Fig. 4

Reading achievement over 8 years expressed as z-scores (M = 0, SD = 1) normalized within assessment waves for children reported by their mothers to exhibit different ranges of numbers of anxiety and depression symptoms during the first assessment

Fig. 5

Mathematics achievement over 8 years expressed as z-scores (M = 0, SD = 1) normalized within assessment waves for children reported by their mothers to exhibit different ranges of numbers of anxiety and depression symptoms during the first assessment

Post Hoc Comparisons

In addition to the planned comparisons reported above, a number of post hoc comparisons were conducted in an exploratory spirit in the same regression models to generate hypotheses for future research. Controlling all variables in the corresponding model noted above, children who met modified criteria for the inattentive subtype had significantly lower reading achievement scores than children who met criteria for the combined subtype, β = −7.24, z = −2.52, p < 0.02, and the predominantly hyperactive–impulsive subtype, β = −10.26, z = −2.93, p < 0.005. Similarly, children who met modified criteria for the inattentive subtype had significantly lower mathematics test scores than children who met modified criteria for the combined subtype, β = −3.93, z = −2.14, p < 0.04, and the predominantly hyperactive–impulsive subtype, β = −6.89, z = −2.56, p < 0.02. In addition, children who met modified criteria for the inattentive subtype had significantly lower dictation test scores than children who met modified criteria for the predominantly hyperactive–impulsive subtype, β = −7.35, z = −2.62, p < 0.01, but differed only marginally from the combined subtype, β = −4.56, z = −1.93, p = 0.053.

Follow-up Analyses

Because we required impairment in only one setting in the first year for the diagnosis of ADHD, rather than the two settings required by DSM-IV, the models above were also re-estimated with impairment in two settings required. The numbers of children who met criteria for each subtype in the first assessment were smaller: combined (N = 73), hyperactive–impulsive (N = 15), and inattentive (N = 8). Using this more restrictive definition of the subtypes, children who met criteria for the combined subtype did not differ from comparison children in reading scores over the 8 years, β = −2.58, z = −0.75, p = 0.45, but they did exhibit lower mathematics scores, β = −7.27, z = −3.61, p < 0.0005, and lower dictation scores than comparison children, β = −5.36, z = −2.13, p < 0.04. No other subtype differed from the comparison group at p < 0.05 using this definition of ADHD.

We further wished to explore whether the eight children in the inattentive subtype group who experienced impairment in two or more settings performed differently from the six children with impairment in only one setting. Analyses indicated that there were no significant differences at the p < 0.05 level between these two groups of children on the math and dictation scores. However, there was a significant difference between the groups on reading scores, z = 3.72, p < 0.001, with children impaired in only one setting having lower reading scores than children impaired in two settings.

We also used a continuous symptom variable to predict long-term academic achievement by conducting the same analyses with the dimensions of inattention and hyperactivity/impulsivity symptoms at wave 1 entered continuously after controlling for intelligence and all other covariates. This analysis explored whether the interaction between wave 1 symptoms of inattention and hyperactivity/impulsivity (endorsed by parent, teacher, or both) predicted academic achievement. The interaction between inattention and hyperactive/impulsive symptoms predicted reading scores over time significantly, β = 0.18, z = 2.18, p = 0.03. This interaction reflects the finding that inattention predicts reading scores better when hyperactivity/impulsivity is low. For math, the interaction was not significant, but inattention symptoms predicted math scores, β = −0.84, z = −2.84, p = 0.004, at all levels of hyperactivity/impulsivity symptoms. For dictation, the interaction was also not significant, and there was a nonsignificant trend toward inattention symptoms predicting dictation scores, β = −0.61, z = −1.88, p = 0.06, at all levels of hyperactivity/impulsivity.
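
One way to read the reading interaction is through conditional (simple) slopes. The sketch below assumes a model of the form reading = b0 + b1·inattention + b2·hyperactivity + b3·inattention·hyperactivity, in which the slope for inattention at a given hyperactivity/impulsivity level is b1 + b3·hyperactivity. Only the interaction coefficient (0.18) comes from the text. The inattention coefficient used here is hypothetical, because the reading main effect is not reported in this section, and treating hyperactivity/impulsivity as an uncentered count of 0 to 9 symptoms is likewise an assumption.

```python
def inattention_slope(b_inattention, b_interaction, hyperactivity):
    """Conditional slope of reading on inattention at a given level of
    hyperactivity/impulsivity, for a model with a product interaction term."""
    return b_inattention + b_interaction * hyperactivity


B_INTERACTION = 0.18   # interaction coefficient reported in the text
B_INATTENTION = -2.0   # HYPOTHETICAL main effect, chosen only for illustration

# Slope of inattention at several (assumed uncentered) symptom counts.
for hi in (0, 3, 6, 9):
    slope = inattention_slope(B_INATTENTION, B_INTERACTION, hi)
    print(f"hyperactivity/impulsivity = {hi}: inattention slope = {slope:.2f}")
```

With a negative inattention coefficient and a positive interaction coefficient, the conditional slope moves toward zero as hyperactivity/impulsivity increases, which is the pattern described above: inattention is most strongly related to reading when hyperactivity/impulsivity is low.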

Discussion

The present study examined the academic performance of children first diagnosed with modified criteria for ADHD at 4–6 years of age by following them over 8 years. The present findings did not support our hypothesis that children with significant problems with inattention (i.e., both the predominantly inattentive and combined subtypes) would have difficulties with academic achievement over time when intelligence and other partial confounds were controlled. Rather, only children in the predominantly inattentive subtype exhibited problems with academic underachievement when intelligence and other partial confounds were controlled. Children who met modified criteria for the predominantly hyperactive–impulsive and combined subtypes of ADHD at 4–6 years of age did not have lower academic test scores than the comparison children over time, relative to intelligence and other control variables.

These findings provide evidence in support of the predictive validity of the distinction between the predominantly inattentive subtype of ADHD and the other subtypes when children meet symptom criteria and are impaired in at least one setting at 4–6 years of age. It is not clear what this difference between the subtypes of ADHD means in taxonomic terms, however. It is possible that the predominantly inattentive subtype exhibits problems with inattention that are more severe than those experienced by children with the combined subtype in their impact on academic learning. On the other hand, the present findings raise the possibility that younger children who meet criteria for the inattentive type have qualitatively different problems (Milich et al. 2001). For example, it is possible that the inattentive group has serious deficits in academic skill learning that lead to their symptoms of inattention. That is, they may cease to attend to academic work because they cannot learn the academic skills rather than failing to learn because they are inattentive. If so, meeting symptom criteria for the inattentive subtype of ADHD at 4–6 years of age may be indicative of learning disability rather than ADHD. Although differentiating between these two explanations for the present findings will be difficult, it should be a priority for future research with children who show early problems with inattention symptoms. It will be particularly important to compare the results of the present study with those of children recruited at later ages, as the young children in the present sample may not be representative of all children with ADHD.

Interestingly, of the six children who were impaired in only one setting in the predominantly inattentive subtype, five of them were impaired only in school. This finding, along with the fact that children with impairment in only one setting had lower reading scores than children with impairment in two settings, further indicates that early problems with inattention and early academic deficits in the school setting are difficult to disentangle, and may suggest that for these children, the inattention deficits are truly indicative of early learning problems. However, as these children were quite young when recruited into the study, it is likely that parents did not have an opportunity to develop a concern about problems with inattention, in that parents notice hyperactive and impulsive behavior at home more readily due to the more salient nature of those behaviors in the home setting.

As mentioned in the introduction, the present study is an exploration of the predictive validity of ADHD symptoms as measured in early childhood. One reason that the present findings are important is that they suggest that the inattentive and combined subtypes are different enough in terms of academic achievement to be distinguished in spite of the considerable lack of stability of the subtypes over time. In fact, children who met criteria for the inattentive subtype at wave 1 (and who are therefore the inattentive subtype group in the present analyses) were very likely to continue to meet criteria for ADHD, albeit for different subtypes, as they progressed through school (Lahey et al. 2005). These findings suggest the predictive validity of the symptoms overall, but variability in diagnostic groups over time suggests that the prediction is not subtype-specific.

The present analyses suggest that the predictive validity for problems with academic achievement may be higher for children who met symptom criteria for inattentive subtype in wave 1. These findings are further supported by the dimensional analysis, which found that inattention better predicted reading scores at lower levels of hyperactivity/impulsivity, highlighting the strength of the relationship between early inattention and reading problems. The analyses exploring symptom scores dimensionally build on the diagnostic analyses by demonstrating the prediction of either inattention symptoms (for math) or the interaction between inattention and hyperactivity/impulsivity symptoms (for reading). These findings further underscore the importance of inattention in predicting academic achievement over time, and highlight the fact that inattention may only predict reading problems among children with lower levels of hyperactivity/impulsivity. Thus, the present data contribute to our understanding of the predictive validity of symptoms of ADHD in terms of children's academic achievement over time, even if they raise new questions regarding the reasons for these differences in academic achievement.

To begin to address the question of whether children in the predominantly inattentive subtype may be better characterized as learning disabled, we examined the number of children within that subtype who demonstrated a 15-point discrepancy between their achievement and intelligence scores. While acknowledging the serious limitations of this approach to defining LD, we wished to shed additional light on the differences between the subtypes. It was interesting that only 3 of the 14 children in the inattentive subtype group had a 15-point discrepancy at baseline. Although these numbers are too small to support strong conclusions, the small proportion of children in the inattentive subtype group with a 15-point discrepancy at baseline was not consistent with the idea that many children in this group have learning problems serious enough to be considered learning disabled. That does not rule out the possibility that they have less serious academic learning deficits that might still be large enough to cause the development of secondary inattentiveness. Furthermore, as the children were quite young in wave 1, there might have been a floor effect for the academic achievement measure, and this might have further affected the number of children who demonstrated a 15-point discrepancy.
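
The discrepancy screen described above can be sketched as follows. The sketch assumes that both scores are on the same standard-score metric, that a discrepancy means achievement falling at least 15 points below intelligence, and that the threshold is inclusive. None of these details is stated in the text. The function names and the example scores are hypothetical and are not study data.

```python
def has_discrepancy(intelligence, achievement, threshold=15):
    """Return True when achievement falls at least `threshold` points
    below the intelligence score (direction and inclusiveness are assumed)."""
    return intelligence - achievement >= threshold


def discrepancy_rate(children, threshold=15):
    """children: list of (intelligence, achievement) standard-score pairs.
    Returns (number flagged, group size, proportion flagged)."""
    flagged = sum(has_discrepancy(iq, ach, threshold) for iq, ach in children)
    return flagged, len(children), flagged / len(children)


# Hypothetical standard scores for four children (illustrative only).
children = [(100, 84), (95, 92), (110, 95), (90, 88)]
flagged, n, rate = discrepancy_rate(children)
print(f"{flagged} of {n} children flagged ({rate:.0%})")
```

Applied to the counts reported above, 3 of 14 children corresponds to a proportion of about 21%.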

On a related issue, the pattern of academic performance relative to intelligence in the groups remained relatively stable over time in terms of rank order. Children in the inattentive group consistently performed the lowest (with the exception of wave 1 Dictation scores), and there was little crossover in ordering of the other groups’ performance over time. It is also of interest that the majority of Group × Time interactions were not significant, further suggesting that patterns of performance remained stable over time. These patterns suggest stability in performance relative to intelligence. In fact, the only significant interactions emerged for Dictation scores when intelligence was removed as a covariate in the model. This is an important issue, as the data were analyzed in terms of children’s performance relative to their performance on intelligence tests, not in absolute terms. In other words, children in the inattentive group performed worse than expected given their intelligence scores, while children in the hyperactive group performed somewhat better than expected given their intelligence.

One possible explanation for the fact that children in the inattentive group did not demonstrate a decrease in scores relative to intelligence over time is that these children were more likely to receive special education services. In fact, at wave 2 and at each subsequent wave at least 50% of inattentive subtype children received special education services, compared to less than 20% of hyperactive–impulsive subtype children and 20–40% (depending on wave) of combined subtype children. These high rates of participation in special education services for inattentive children could explain the maintenance rather than worsening of the performance gap between inattentive and other subtypes relative to intelligence.

The number of children who met modified criteria for the predominantly inattentive subtype in wave 1 of the present study was small (N = 14). Although large effect sizes for group differences were found in the present study, it would be important to attempt to replicate the present findings in larger samples of young predominantly inattentive children.

The present analyses examined basic reading, dictation, and mathematics skills, and thus focused on a relatively narrow set of academic achievement variables. For example, we did not include an analysis of more complex skills such as reading comprehension or mathematics reasoning. As the children in the present sample grow older, they will be called upon to perform increasingly complex academic skills, including integration of knowledge. It would be important to compare the groups on such advanced skills in future assessments in order to more completely document the ways in which academic impairment is related to early ADHD. It should also be noted that we examined the performance of children who met modified criteria for ADHD on standardized individually-administered measures of academic achievement. As such, it is an analysis of children’s underachievement on tests of basic skill development, rather than an assessment of how well they function academically in the classroom. Thus, these findings should not be construed as showing that children in the combined and hyperactive–impulsive groups do not have problems in meeting the academic demands of the classroom. An important next step in this area is to explore the relationship between different ADHD subtypes and classroom academic impairment, including work completion, academic productivity and accuracy, and work-related skills such as organization and planning. This broader exploration of academic achievement and success in school is critical in order to complete the picture of academic functioning and achievement of young children with ADHD as they move through school.

Because the present sample of children who met modified criteria for ADHD was a clinic-referred group of children, it is possible that these findings will not generalize beyond treatment-seeking samples. It is possible, in fact, that the children in the present study are a particularly impaired sample of children, as most parents do not seek services for ADHD children until well into the elementary school years. Although it is essential to understand the nature of ADHD in clinic samples, it is also important to replicate the present findings using more representative samples. On a related issue, although the effects of treatment were not explored in the present study, treatment (either psychosocial or medication) was included as a covariate in the analyses and did not emerge as a significant predictor. Given evidence suggesting that traditional ADHD treatments (both psychosocial and medication treatments) demonstrate effects on academic behavior (such as time on task), but not on academic achievement, the findings in the present analyses are not surprising. The potential effects of ADHD treatment on academic achievement are important to explore, and the present findings further suggest the importance of combining ADHD treatment with targeted academic interventions for those children who show skills deficits. Because the measures of treatment in the present sample only included parent report, it was not possible to verify these reports or to explore treatment adherence. An important next step would be to determine the role of treatment in the relationship between inattention and academic achievement.

Given the longitudinal nature of the study, it is impressive that children with high rates of parent-reported internalizing symptoms in wave 1 had consistently lower reading and mathematics scores over an 8-year period to ages 11–13, even controlling for ADHD, intelligence, and other confounds. Interestingly, symptoms of oppositional defiant disorder and conduct disorder were not found to be independently associated with lower reading and mathematics achievement. Unlike internalizing symptoms, early conduct problems do not appear to contribute to academic underachievement. It will be very important to explore these relationships further in future studies, as it appears that both meeting modified criteria for the inattentive subtype of ADHD and early internalizing symptoms are robust predictors of future academic underachievement. There are no current studies that address the role of internalizing symptoms in ADHD children’s academic functioning over time; the present results suggest that a pattern of multiple associations among inattention, internalizing symptoms, and academic achievement may emerge as early as children’s first school experiences. It is possible that internalizing symptoms endorsed by parents are a manifestation of children’s depression or anxiety regarding poor performance and behavioral difficulties in school, particularly given that the relationship between internalizing symptoms and academic achievement emerged early (in wave 1) and remained stable over 8 years. Alternatively, children’s academic difficulties may be secondary to their comorbid problems early on. For example, children who have early problems with inattention and who also have difficulties with depression or anxiety may have particular difficulties in attending to and participating fully in the learning environment at school, thereby experiencing early deficits that persist over time. Future studies are needed to elucidate the relationships among inattention, early internalizing symptoms, conduct problems, and academic underachievement.