Attention-Deficit/Hyperactivity Disorder (ADHD) is a highly prevalent neurodevelopmental disorder characterized by developmentally inappropriate levels of inattention and/or hyperactivity/impulsivity that cause significant functional impairment in multiple settings (American Psychiatric Association 2013; Polanczyk et al. 2007; Visser et al. 2014; Walkup et al. 2014). Further, compared with their typically-developing peers, children with ADHD have been shown to perform poorly on an array of neurocognitive tests (Ware et al. 2012; Willcutt et al. 2005). Among the many cognitive deficits that have been linked to ADHD, considerable attention has focused on working memory (WM) impairments (Kasper et al. 2012; Martinussen et al. 2005). WM is a temporary storage system in which an individual can maintain, update, and/or manipulate information over brief periods in order to guide ongoing behavior and cognitive activities (Baddeley and Hitch 1974). Substantial research has demonstrated that, as a group, children with ADHD have poorer WM than their typically-developing peers (Bedard et al. 2004; de Jong et al. 2009; Gau et al. 2009; Kasper et al. 2012; Kerns et al. 2001; Martinussen et al. 2005). Some data have suggested that ADHD probands manifest deficits in both auditory-verbal and visual-spatial modalities (Fair et al. 2012; Gau and Chiang 2013; McInnes et al. 2003; Nikolas and Nigg 2013), yet other studies have indicated differentially greater weakness in visual-spatial relative to auditory-verbal WM (Martinussen et al. 2005; Simone et al. 2016).

Despite group-level differences in WM between children with and without ADHD, the disorder is characterized by considerable phenotypic and neurocognitive heterogeneity (Nigg et al. 2005). Castellanos and Tannock (2002) posited that WM, among other neurocognitive processes, might represent a distinct endophenotype of ADHD, which may help to parse the vast heterogeneity of the disorder. A recent empirical study investigated the role of different executive functions as potential endophenotypes for ADHD by comparing typically-developing children to unaffected siblings of youth with ADHD and found that WM weaknesses were evident in some, but far from all, of the unaffected siblings (Nikolas and Nigg 2015). These findings suggest that WM weaknesses could represent a potential endophenotype for a subset of children with ADHD; however, other children with the disorder may exhibit different neurocognitive deficits (or potentially none).

While WM deficits are clearly evident in some children with ADHD, the etiological role of WM in ADHD has remained elusive. Barkley (1997) has suggested that WM deficits in ADHD are largely secondary to a core deficit in inhibitory control. Others have suggested that WM weaknesses are characteristic of only a subgroup of children with the disorder (Castellanos and Tannock 2002; Nikolas and Nigg 2015). In contrast, Rapport and colleagues (Alderson et al. 2010; Rapport et al. 2001) have hypothesized that impaired WM is the core underlying neurocognitive deficit leading to the dysregulated behavior typical of children with ADHD (i.e., deficits in attention and behavioral inhibition). While cross-sectional studies supporting their model have identified a possible mediating role of WM vis-à-vis ADHD and response inhibition (Alderson et al. 2010), and ADHD and activity level (Rapport et al. 2009), a more definitive demonstration of mediation would require a longitudinal design (Selig and Preacher 2009). Rapport and colleagues have also suggested that WM impairments are evident in as many as 81–98% of children with ADHD (Kasper et al. 2012; Rapport et al. 2013), yet the methodological approach employed also resulted in 50% of typically-developing children falling below the “average” score. Taken together, there is evidence to suggest that many children with ADHD present with WM weaknesses. However, due to inconsistencies in the literature, it remains unclear how many children with the disorder actually have deficient WM, and the extent to which this specific neurocognitive weakness contributes to difficulties in daily functioning.

In addition to poor WM, relative to their typically-developing peers, children and adolescents with ADHD present with significantly higher rates of academic underachievement (DeShazo et al. 2002; Hinshaw 1992), as well as school drop-out (Loe and Feldman 2007; Trampush et al. 2009). It has been estimated that 20–50% of children with ADHD meet criteria for a learning disability (LD; Pastor and Reuben 2008; Pliszka 2000). Yet, many children with ADHD have been shown to have poor academic functioning, even in the absence of a frank LD. As WM has also been linked to poor academic achievement in children (regardless of ADHD diagnosis; Alloway and Alloway 2010), some investigators (Rogers et al. 2011; Sjowall and Thorell 2014) have begun to examine whether WM ability mediates the relation between ADHD symptoms and academic outcomes in school-aged children. Specifically, Rogers et al. (2011) found that in adolescents with ADHD, auditory-verbal and visual-spatial WM partially mediated the relation between inattentive symptoms and performance on tests of reading, but not mathematics, achievement. Similarly, Sjowall and Thorell (2014) found that WM (collapsed across auditory-verbal and visual-spatial tasks) partially mediated the relation between ADHD symptoms and teacher ratings of children’s math and language skills. Given the heterogeneity of WM impairment in samples of children with ADHD, it is still unclear whether ADHD and WM ability uniquely contribute to poor academic outcomes. Further, there have been inconsistent findings regarding the relations between ADHD, WM, and mathematics outcomes, which could be due to the different academic outcome assessments used (i.e., tests versus teacher ratings). Thus, it is important to examine within a single sample of children whether ADHD symptoms and WM ability both, or differentially, contribute to objective tests and subjective ratings of academic achievement.

To date, two studies (Alloway et al. 2010; Holmes et al. 2014) have compared children with ADHD to non-ADHD children with low WM and found that the groups did not appear to differ on tests of academic achievement. Alloway et al. (2010) divided their sample based on teacher ratings of WM (irrespective of ADHD diagnosis) and found that the low WM group performed substantially worse on all academic achievement measures relative to those with average WM, but that the groups did not differ on teacher ratings of classroom functioning. In contrast, Holmes et al. (2014) found that teachers rated children with ADHD as having significantly more hyperactivity and impulsivity than their non-ADHD low WM peers. Based on these findings, it would appear that WM is contributing to poorer academic achievement in children (regardless of ADHD status), whereas the contribution of WM to behavioral dysfunction in children with or without ADHD is unclear. Further, it remains uncertain whether both ADHD symptom domains (inattentive and hyperactive/impulsive), as well as WM ability, significantly contribute to poorer academic and behavioral functioning in school-aged children.

The present study examined whether WM ability (auditory-verbal and visual-spatial), inattentive symptoms, and/or hyperactive/impulsive symptoms significantly contribute to academic, behavioral, and global functioning among 8-year-old children. Children completed tests of academic achievement; teachers rated the children on academic and behavioral functioning; and clinicians judged overall global functioning. As findings regarding distinct associations of modality-specific WM processes and academic abilities are mixed (Brady 1991; Jorm 1983; McLean and Hitch 1999; Schuchardt et al. 2008; Swanson and Sachse-Lee 2001), we made no specific hypotheses regarding differential relations between WM modalities and academic achievement. Therefore, irrespective of modality, we hypothesized that:

  1) WM ability, but not ADHD symptom severity, would be significantly associated with all measures of academic functioning (objective and subjective).

  2) Inattentive and hyperactive/impulsive symptom severity, but not WM ability, would significantly predict teacher ratings of behavioral functioning and clinician ratings of global functioning.

If these hypotheses are supported, it would suggest a double dissociation whereby WM ability would be linked to learning and academic problems rather than behavioral functioning in school-aged children; and inattentive and hyperactive/impulsive symptoms, but not WM ability, would be more closely associated with poorer behavioral functioning.

Method

Participants

The participants in the current study were part of a larger longitudinal investigation (n = 216) in which preschoolers were initially recruited at 3–4 years old via screenings at local preschools and direct referrals from preschools and community mental health providers. All were rated by parents and teachers using the ADHD-Rating Scale IV (ADHD-RS; DuPaul et al. 1998) and categorized as “hyperactive/inattentive” (i.e., “at-risk” for developing ADHD) or “typically-developing.” Those rated as having at least six symptoms (i.e., rated as occurring often or very often) of inattention and/or hyperactivity/impulsivity by a combination of parent and teacher reports were considered hyperactive/inattentive; those rated as having fewer than three symptoms in both domains by parents and teachers were classified as typically-developing. Recruitment was set such that approximately twice as many children were entered into the hyperactive/inattentive group (n = 140) as into the typically-developing group (n = 76). At study entry, participants were required to be English-speaking and attending preschool or daycare. Exclusionary criteria were: Full Scale IQ < 80 as assessed by the Wechsler Preschool and Primary Scale of Intelligence–Third Edition (WPPSI-III; Wechsler 2006), systemic medication use (including for ADHD), and presence of a neurological, post-traumatic stress, and/or pervasive developmental disorder. Additional recruitment and selection details for the original sample can be found in Rajendran et al. (2013).

Of the 216 preschoolers initially recruited for the longitudinal study, 160 returned for their 8-year-old evaluation (Mean age = 8.56, SD = 0.31, range = 7.91–9.33): 53 were typically-developing at preschool and did not have ADHD at 8-years-old; 11 were typically-developing at preschool and did have ADHD at 8-years-old; 21 were at-risk for ADHD at preschool and did not have ADHD at 8-years-old; 75 were at-risk for ADHD at preschool and did have ADHD at 8-years-old. The children who returned for this follow-up evaluation did not differ from those lost to follow-up on any key demographic variables assessed at the initial evaluation, which included age, socioeconomic status (SES), WPPSI-III IQ scores, and parent- and teacher-ratings of ADHD. The 8-year-old evaluation was used in this study largely for practical reasons, as it was the first time in our longitudinal study in which both auditory-verbal and visual-spatial WM were assessed. No additional exclusionary criteria were used at the 8-year-old evaluation.

At 8-years-old the sample was predominantly male (75.6%) and largely middle class, but included youth from a range of socioeconomic backgrounds (see Table 1). The children were of varied racial and ethnic backgrounds: White/Caucasian (58.8%), Other/Mixed Race (18.1%), Asian/Pacific Islander (12.5%), and Black/African-American (10.6%); 70.0% were non-Hispanic. ADHD symptom severity and diagnoses were determined using the Kiddie Schedule for Affective Disorders and Schizophrenia – Present and Lifetime Version (K-SADS-PL; Kaufman et al. 1996), which was administered to a parent or caregiver. Of the children assessed at 8-years-old, 53.8% met criteria for a DSM-5 (American Psychiatric Association 2013) ADHD diagnosis (Inattentive Presentation = 15.6%, Hyperactive/Impulsive Presentation = 5.0%, Combined Presentation = 29.4%, Not Otherwise Specified = 3.8%). Several children in the sample met criteria for internalizing (20.6%) and externalizing disorders (8.8%). In accordance with DSM-5 (American Psychiatric Association 2013), 4.4% (n = 7) of our sample met diagnostic criteria for a specific learning disorder (LD; i.e., having a score falling 1.5 standard deviations or more below the population mean on any of the academic achievement measures). Of these children, all but one were deemed at-risk for developing ADHD at the baseline evaluation and met diagnostic criteria for ADHD at the 8-year-old evaluation.

Table 1 Descriptive characteristics of the sample of children at 8-years-old

Materials

Diagnostic Measures

Kiddie Schedule for Affective Disorders and Schizophrenia – Present and Lifetime Version (K-SADS-PL; Kaufman et al. 1996)

Parents/caregivers were administered the K-SADS-PL by well-trained psychology graduate students or postdoctoral fellows who were supervised by a licensed psychologist to determine diagnoses and ADHD symptom severity. Each symptom was recoded from the original K-SADS-PL scale (1 = not present, 2 = subthreshold, and 3 = threshold) to a 0–2 scale. Thus, for each symptom domain (inattention and hyperactivity/impulsivity), total severity scores could range from 0 to 18. These severity scores for each ADHD symptom domain were used in the final analyses.
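The recoding and summing described above can be sketched as follows. This is a minimal illustration rather than the scoring code used in the study: the function names are hypothetical, and the count of nine symptoms per domain is an assumption inferred from the stated 0–18 range.

```python
# Hypothetical helper names; nine symptoms per domain is inferred
# from the 0-18 severity range reported in the text.
SYMPTOMS_PER_DOMAIN = 9


def recode_item(raw):
    """Map a K-SADS-PL rating (1 = not present, 2 = subthreshold,
    3 = threshold) onto the 0-2 scale."""
    if raw not in (1, 2, 3):
        raise ValueError(f"unexpected K-SADS-PL rating: {raw!r}")
    return raw - 1


def domain_severity(raw_items):
    """Sum the recoded ratings for one symptom domain (range 0-18)."""
    if len(raw_items) != SYMPTOMS_PER_DOMAIN:
        raise ValueError("expected nine symptom ratings per domain")
    return sum(recode_item(raw) for raw in raw_items)
```

Under these assumptions, a child rated "threshold" on every inattention item would receive the maximum severity score of 18 for that domain.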

Working Memory Measures

Auditory-verbal WM was assessed using the Working Memory Index (WMI) of the WISC-IV Integrated (Kaplan et al. 2004). This index comprises the Digit Span and Letter-Number Sequencing subtests. The Digit Span subtest is separated into two parts: Forward and Backward. For the Forward condition, children listen to and repeat back a series of numbers, whereas in the Backward condition children report them in the reverse order. For both conditions, the task consists of two trials at each span length; span length increases until the child fails both trials within a set or finishes the last sequence of the task. For Letter-Number Sequencing, children listen to a series of numbers and letters and are instructed to first recite the numbers in sequential order and then the letters in alphabetical order. The Letter-Number Sequencing task consists of three trials per span length, with span length increasing until the child fails all three trials within a set or completes the final sequence of the task. The WMI at 8-years-old has been shown to have strong internal consistency (r = 0.91) and test-retest reliability (r = 0.84; Kaplan et al. 2004).

Visual-spatial WM was assessed using the Spatial Span subtest from the WISC-IV Integrated that contains two conditions, Spatial Span Forward and Spatial Span Backward. In Spatial Span Forward, children watch the examiner tap a series of blocks and then tap the blocks in the same order as was presented. In Spatial Span Backward, children tap them in the reverse order to what was presented. Similar to the Digit Span subtest, for both Spatial Span conditions each span length contains two trials and ends when an individual fails both trials of a set or finishes the last sequence of the task. As the Spatial Span subtest does not calculate a standardized score for the total collapsed performance on the forward and backward conditions, we averaged the individual scaled scores for these two conditions to arrive at a combined scaled score of visual-spatial WM.

Within our sample, the WMI and averaged Spatial Span scaled scores demonstrated a moderate, positive correlation (r = 0.576, p < 0.001), suggesting that the measures overlap to some degree but remain relatively distinct from each other. Among those who did and did not meet criteria for ADHD, 34.5% and 13.7% fell below the 25th percentile on the WMI (χ² = 9.07, p = 0.003) and 24.4% and 5.5% fell below the 25th percentile on the Spatial Span tests (χ² = 10.91, p = 0.001), respectively. Thus, more children with ADHD had WM difficulties relative to controls, but even with this liberal cut score, the majority of children with ADHD had normatively intact WM ability.
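For readers who wish to check group comparisons of this kind, the sketch below computes an uncorrected Pearson chi-square for a 2 × 2 table and the corresponding p value for one degree of freedom. It is an illustration only: the function names are hypothetical, the cell counts in the example are invented (the text reports percentages, not counts), and whether a continuity correction was applied in the study is not stated.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one case")
    return n * (a * d - b * c) ** 2 / denom


def p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts (NOT the study's data): rows are the two groups,
# columns are below versus at-or-above the cut score.
example_chi2 = pearson_chi2_2x2(30, 70, 10, 90)

# p values recomputed from the test statistics reported in the text.
p_wmi = p_value_df1(9.07)
p_spatial_span = p_value_df1(10.91)
```

With one degree of freedom, the reported statistics of 9.07 and 10.91 round to the p values given in the text (0.003 and 0.001).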

Academic Achievement Measures

Wechsler Individual Achievement Tests – Second Edition (WIAT-II; Wechsler 2001)

Children were administered selected subtests from the WIAT-II (referenced below) to yield an objective assessment of academic achievement. Each subtest is administered separately with its own instructions, which include reversal and discontinue rules. For each subtest, raw scores are calculated and then transformed into individual standard scores (M = 100, SD = 15).
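As a brief worked illustration of this score metric, the sketch below expresses a standard score in standard deviation units and applies the learning disorder criterion described in the Participants section (a score 1.5 standard deviations or more below the population mean on any achievement measure, i.e., a standard score of 77.5 or lower). The function names are hypothetical, and the conversion from raw to standard scores, which relies on the test's normative tables, is not reproduced here.

```python
POPULATION_MEAN = 100.0
POPULATION_SD = 15.0


def z_score(standard_score):
    """Express a standard score (M = 100, SD = 15) in SD units."""
    return (standard_score - POPULATION_MEAN) / POPULATION_SD


def meets_ld_criterion(standard_scores, cutoff_sd=1.5):
    """True if any achievement score falls cutoff_sd or more SDs below the mean."""
    return any(z_score(score) <= -cutoff_sd for score in standard_scores)
```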

Word Reading

The Word Reading subtest requires individuals to read a series of American English words. The task begins with simpler words and progresses in word complexity and is discontinued when the participant is unable to accurately read six consecutive words or the final word of the test is read.

Pseudoword Decoding

This subtest requires individuals to read a series of nonsense words phonetically. The task begins with simpler nonsense words and progresses in complexity. The task is discontinued when the participant is unable to accurately read six consecutive nonsense words or the final nonsense word of the test is read.

Spelling

The Spelling subtest requires individuals to listen to sentences read aloud by an examiner, and to write a specified target word from the sentence. The task begins with simpler words and progresses in word complexity. The task is discontinued when the participant is unable to accurately spell six consecutive words or the final word of the test is administered.

Reading Comprehension

This subtest requires individuals to read a series of short sentences and passages and then answer questions about what they previously read. Participants begin the task based on their current grade level (or most recent grade level completed) and the task is discontinued when the participant reaches the final sentence or passage within their grade section.

Numerical Operations

The Numerical Operations subtest requires individuals to complete various mathematical problems (e.g., addition, subtraction, percentages, fractions, etc.) that increase in complexity as the task progresses. The task is discontinued when the participant completes six consecutive problems incorrectly or when the final problem is reached.

Table 2 shows correlations among the WM, academic achievement, and ADHD symptom severity scores. As shown in Table 2, the WM measures, academic achievement measures, preschool IQ, and ADHD symptom domains are all moderately correlated, suggesting that while there is some overlap among these variables, they are relatively distinct from each other.

Table 2 Pearson bivariate correlations for WM, preschool IQ, academic achievement measures, and symptom severity scores

School/Classroom Functioning Measure

National Institute for Children’s Health Quality (NICHQ) Vanderbilt Assessment Scale – Teacher Version (Wolraich et al. 2003)

The NICHQ was used to assess children’s overall school/classroom functioning. Teachers completed these rating scales, which probed for the student’s performance in mathematics, reading, and written expression. Teachers also rated the students on their behavioral functioning in the classroom: 1) relationship with peers, 2) following directions, 3) disrupting the classroom, 4) assignment completion, and 5) organizational skills. For each of the above academic and behavioral dimensions, teachers rated students on a 5-point Likert scale (1 = excellent, 2 = above average, 3 = average, 4 = somewhat of a problem, and 5 = problematic). For our analyses, we used the sum of each of these scales (i.e., three items of academic functioning and five items of behavioral functioning) as our outcome measure. For our sample, coefficient alphas for the teacher-reported academic functioning and behavioral functioning scales were 0.81 and 0.88, respectively.
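The scale totals and coefficient alpha described above can be sketched as follows. This is a minimal illustration with hypothetical function names, not the authors' analysis code; it uses the standard formula for Cronbach's alpha with sample variances.

```python
from statistics import variance


def scale_total(item_ratings, n_items):
    """Sum the 1-5 ratings for one scale (three academic or five behavioral items)."""
    if len(item_ratings) != n_items:
        raise ValueError(f"expected {n_items} item ratings")
    if any(rating not in (1, 2, 3, 4, 5) for rating in item_ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(item_ratings)


def cronbach_alpha(rows):
    """Coefficient alpha; `rows` holds one list of item ratings per child."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Because 1 denotes "excellent" and 5 denotes "problematic", higher totals on either scale indicate poorer functioning.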

Global Functioning Measure

Following a comprehensive evaluation, which included the K-SADS-PL interview with parents and rating scale data from parents and teachers, each child’s case was presented to a group of clinicians who independently rated the child on the Children’s Global Assessment Scale (CGAS; Schaffer et al. 1983) based on the child’s lowest level of functioning over the previous evaluation year. Scores on this scale range from 1 to 100, with scores below 60 typically representing impaired functioning (Schaffer et al. 1983). The median score across clinicians was calculated for each child, and this score was used in the final analyses. Across the 160 participants assessed at age 8, the number of clinician-raters varied from 4 to 13. Reliability among raters was calculated separately for each number of raters (except 4 and 13, where there was only one case each) using intra-class correlations (ICC). Reliability was excellent, with ICC values ranging from 0.938 to 0.976.
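The aggregation of clinician ratings can be illustrated with a short sketch. The function name is hypothetical, and the ratings used in any example call would be invented rather than drawn from the study.

```python
from statistics import median

IMPAIRMENT_CUTOFF = 60  # scores below 60 typically represent impaired functioning


def cgas_summary(ratings):
    """Return (median CGAS score, impaired flag) for one child's clinician ratings."""
    if not ratings:
        raise ValueError("at least one clinician rating is required")
    if any(not 1 <= rating <= 100 for rating in ratings):
        raise ValueError("CGAS ratings range from 1 to 100")
    score = median(ratings)
    return score, score < IMPAIRMENT_CUTOFF
```

The median is less sensitive than the mean to a single outlying rater, which matters when the number of raters per case varies.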

Procedure

A member of the research team tested child participants individually while a different evaluator interviewed the child’s parent/caregiver using the K-SADS-PL. Both evaluators were blind to the child’s prior diagnostic status. For those children who were prescribed stimulant medication (n = 30, 18.8%), parents were instructed to withhold medication on the day of the evaluation. The full evaluation lasted approximately 2–3 h, during which children completed the academic tests, WM tasks, as well as other neuropsychological measures. Children were given a small prize at the end of the session for participating in the study. Parents received compensation for their time and expenses associated with study participation. This study was approved by the institutional review board (IRB) of the affiliated institution. Following a description of the study and their rights as participants, parents/caregivers signed informed consent, and children gave verbal assent.

Statistical Analyses

Multiple linear regressions were conducted to assess whether WM ability and ADHD symptom severity (inattentive and hyperactive/impulsive symptoms) significantly contributed to each outcome variable (i.e., academic achievement tests, teacher-rated academic functioning, teacher-rated behavioral functioning, and clinician-rated global functioning). As SES has been shown to be a strong predictor of academic outcomes (Sirin 2005), as well as health and social-emotional functioning in children (Bradley and Corwyn 2002), it was entered into the first step of each model to control for its effects on the dependent variables.

For the second step, WM ability (either auditory-verbal or visual-spatial), and K-SADS inattentive and hyperactive/impulsive symptom severity were added into the model to determine their individual associations with the outcome variables. For the final step, interaction variables between centered WM × inattentive symptoms and centered WM × hyperactive/impulsive symptoms were added into the model.

The first set of regression analyses was conducted with auditory-verbal WM (AVWM) ability, and then a second set of analyses using the same statistical procedures was conducted with visual-spatial WM (VSWM) ability.
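The three-step structure of these models can be sketched in code. The following is a minimal, standard-library illustration under stated assumptions, not the authors' analysis script: all function names are hypothetical, both WM and symptom scores are mean-centered before the product terms are formed (the text specifies centering only for WM; model R² is unaffected by this choice), and only the block-level change in R² at each step is returned, which may not correspond to how the per-predictor variance figures in the Results were derived.

```python
from statistics import mean


def center(values):
    """Subtract the mean from every value."""
    m = mean(values)
    return [v - m for v in values]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def r_squared(predictors, outcome):
    """R^2 of an OLS fit with intercept (fitted on mean-centered data)."""
    cols = [center(p) for p in predictors]
    y = center(outcome)
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * c[i] for b, c in zip(beta, cols))
             for i, yi in enumerate(y)]
    return 1.0 - sum(e * e for e in resid) / sum(v * v for v in y)


def hierarchical_r2(ses, wm, inatt, hyper, outcome):
    """Step 1: SES; step 2: WM and both symptom domains; step 3: interactions."""
    wm_c, inatt_c, hyper_c = center(wm), center(inatt), center(hyper)
    step1 = [ses]
    step2 = step1 + [wm, inatt, hyper]
    step3 = step2 + [
        [w * i for w, i in zip(wm_c, inatt_c)],  # WM x inattention
        [w * h for w, h in zip(wm_c, hyper_c)],  # WM x hyperactivity/impulsivity
    ]
    r2_1, r2_2, r2_3 = (r_squared(s, outcome) for s in (step1, step2, step3))
    return {"step1_r2": r2_1,
            "step2_delta_r2": r2_2 - r2_1,
            "step3_delta_r2": r2_3 - r2_2}
```

In this sketch the function would be called once per outcome variable with auditory-verbal WM as the `wm` argument and then again with visual-spatial WM, mirroring the two sets of analyses.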

Results

Auditory-Verbal WM (See Table 3)

Academic Achievement Tests

Word Reading

SES and AVWM significantly contributed to Word Reading, accounting for 12.5% and 23.9% of the variance, respectively. Inattentive and hyperactive/impulsive symptom severity, as well as the two interaction terms, were not significantly related to Word Reading scores.

Table 3 Multiple linear regressions with auditory-verbal WM – final model

Reading Comprehension

SES and AVWM were significantly associated with Reading Comprehension and accounted for 17.5% and 16.3% of the variance, respectively. Neither ADHD symptom domain nor their interactions with AVWM significantly predicted Reading Comprehension scores.

Pseudoword Decoding

Again, SES and AVWM significantly contributed to Pseudoword Decoding accounting for 9.8% and 22.3% of the variance, respectively. None of the other predictor variables were significantly related to Pseudoword Decoding scores.

Numerical Operations

SES and AVWM were significantly related to Numerical Operations scores and accounted for 11.7% and 26.9% of the variance, respectively. Inattentive and hyperactive/impulsive symptom severity, as well as their interactions with AVWM, were not significantly related to Numerical Operations.

Spelling

Similar to the other WIAT-II subtests, SES and AVWM significantly contributed to Spelling scores, accounting for 8.8% and 24.4% of the variance, respectively. Neither ADHD symptom domain nor their interactions with AVWM significantly predicted Spelling scores.

School Functioning as Rated by Teachers

As shown in Table 3, after accounting for SES, AVWM and inattentive symptom severity significantly contributed to classroom academic functioning accounting for 20.4% and 6.1% of the variance, respectively. None of the other predictor variables were significantly associated with teacher ratings of academic functioning. In contrast, after accounting for SES, inattentive symptom severity and hyperactive/impulsive symptom severity were significantly associated with classroom behavioral functioning, in which inattentive symptom severity accounted for 28.5% of the variance and hyperactive/impulsive symptom severity accounted for an additional 2.9% of the variance. Neither AVWM nor any of the other predictor variables were associated with teacher ratings of behavioral functioning.

Global Functioning as Rated by Clinicians

Similar to teacher ratings of behavioral functioning, both inattentive and hyperactive/impulsive symptom severity significantly contributed to clinician ratings of children’s overall global functioning, in which inattentive symptom severity contributed 64.4% of the variance and hyperactive/impulsive symptom severity contributed an additional 2.6% of the variance.

Visual-Spatial WM (See Table 4)

Academic Achievement Tests

Word Reading

SES and VSWM significantly contributed to Word Reading, accounting for 12.3% and 10.0% of the variance, respectively. Inattentive and hyperactive/impulsive symptom severity, as well as the two interaction terms, were not significantly related to Word Reading scores.

Table 4 Multiple linear regressions with visual-spatial WM – final model

Reading Comprehension

SES and VSWM significantly contributed to Reading Comprehension, accounting for 17.3% and 8.0% of the variance, respectively. In addition, the VSWM × hyperactive/impulsive symptom severity interaction was significantly associated with Reading Comprehension and accounted for an additional 3.7% of the variance. The nature of this interaction was such that children with higher hyperactive/impulsive symptom severity and lower VSWM had differentially poorer reading comprehension scores than both children with low levels of hyperactive/impulsive symptoms (irrespective of VSWM) and children with high levels of symptoms but stronger VSWM (see Fig. 1). None of the other predictor variables were significantly related to Reading Comprehension scores.

Fig. 1 Interaction of hyperactivity/impulsivity symptom severity and visual-spatial working memory on reading comprehension scores. Error bars represent standard deviation (SD)

Pseudoword Decoding

SES and VSWM significantly contributed to Pseudoword Decoding, accounting for 9.8% and 7.0% of the variance, respectively. Inattentive and hyperactive/impulsive symptom severity and the two interaction terms were not significantly related to Pseudoword Decoding scores.

Numerical Operations

SES and VSWM significantly contributed to Numerical Operations, accounting for 12.6% and 19.4% of the variance, respectively. Inattentive and hyperactive/impulsive symptom severity, as well as the two interaction terms, were not significantly related to Numerical Operations.

Spelling

SES and VSWM significantly contributed to Spelling, accounting for 8.6% and 11.5% of the variance, respectively. Inattentive and hyperactive/impulsive symptom severity, as well as the two interaction terms, were not significantly related to Spelling.

School Functioning as Rated by Teachers

As displayed in Table 4, SES, VSWM, and inattentive symptom severity significantly contributed to classroom academic functioning. After accounting for SES, VSWM and inattentive symptom severity contributed 6.5% and 15.2% of the variance, respectively. Hyperactive/impulsive symptom severity and the two interaction terms were not significantly associated with teacher ratings of academic functioning.

Both inattentive and hyperactive/impulsive symptom severity significantly contributed to classroom behavioral functioning, in which inattentive symptom severity accounted for 29.1% of the variance and hyperactive/impulsive symptom severity accounted for an additional 2.9% of the variance. SES, VSWM, and the two interaction terms were not significantly associated with teacher ratings of behavioral functioning.

Global Functioning as Rated by Clinicians

Both inattentive and hyperactive/impulsive symptom severity significantly contributed to clinician ratings of children’s overall global functioning, in which inattentive symptom severity accounted for 64.6% of the variance and hyperactive/impulsive symptom severity accounted for an additional 2.5% of the variance. SES, VSWM, and the two interaction terms were not significantly related to clinician ratings of children’s global functioning.

Discussion

To the best of our knowledge, this was the first systematic examination of differential relations of both WM modalities, as well as inattentive and hyperactive/impulsive symptom severity, with academic, behavioral, and global functioning in children. As hypothesized, our data indicated that, regardless of which WM modality was assessed (auditory-verbal or visual-spatial), WM ability was significantly associated with all tests of academic achievement, but not with measures of behavior problems or overall global impairment. In contrast, inattentive and hyperactive/impulsive symptoms were associated with measures of behavior problems and global functioning, but not with academic achievement. These findings indicate that compromised WM ability is specifically related to poor academic achievement in children and that the presence of inattentive and hyperactive/impulsive symptoms per se has little to no relation to academic skills. Moreover, while SES was also shown to significantly predict academic achievement, the amount of additional variance accounted for by WM ability across many tests of academic achievement was nearly double what SES accounted for alone.

Interestingly, teacher-ratings of school-based academic performance yielded a somewhat different pattern of predictors. Across both modalities, WM and inattentive symptoms (but not hyperactive/impulsive symptoms) significantly contributed to teachers’ ratings of academic functioning (math, reading, and written expression). This discrepancy between objective test measures and teacher ratings of academic performance may be accounted for by either a difference between skills and performance in children with ADHD, or by negative biases affecting teacher ratings. For the first scenario, several studies (Barkley and Fischer 2011; Barkley and Murphy 2010) have shown that children’s performance on tests in an individual setting is often not strongly predictive of their performance in real-world environments (e.g., in school); behavioral dysregulation may thus prevent children from communicating their knowledge in the classroom. Not surprisingly, inattentive symptom severity is more closely linked to classroom performance than is hyperactivity/impulsivity, and it is more closely associated with classroom performance than with test performance. Alternatively, teacher-ratings might be affected by halo effects (Abikoff et al. 1993). Specifically, behavior management issues may elicit a negative bias from classroom instructors, which, in turn, influences their ratings of students’ academic functioning.

As hypothesized, we also found that both inattentive and hyperactive/impulsive symptoms, but not WM ability, significantly predicted both teachers’ and clinicians’ ratings of impairment and overall functioning. Specifically, these findings indicate that ADHD symptom severity was significantly associated with teachers rating children as exhibiting more problematic classroom behaviors (e.g., relationships with peers, difficulties organizing tasks and completing assignments, disrupting the classroom environment), and with clinicians rating children as having poorer overall global functioning. If WM were a core deficit in children with ADHD (Rapport et al. 2001), we would expect that WM ability would have also significantly contributed to teachers’ and clinicians’ ratings of these maladaptive behaviors. To the contrary, WM ability did not independently contribute to any measure of behavioral functioning. Thus, WM weaknesses do not appear to be contributing to or acting as a driver of ADHD-like behaviors. Rather, our findings indicate that WM ability, but not ADHD symptoms, is specifically related to academic functioning. This is consistent with findings from Alloway et al. (2010), who found that children with poor WM (irrespective of ADHD) were more likely to perform worse on academic measures as compared to their peers with average WM.

While not of primary interest for this study, it is notable that among the children with ADHD in our sample, fewer than half had compromised WM ability even when based on a liberal cut-off criterion (i.e., 25th percentile). These findings are consistent with other reports (Nigg et al. 2005; Nikolas and Nigg 2015) suggesting that only a minority of children with ADHD present with WM difficulties. This again suggests that WM is not a core underlying deficit of ADHD and instead points to notable cognitive heterogeneity of the disorder (Castellanos and Tannock 2002).

It is notable within our data that virtually no differences were observed on academic, behavioral, and global functioning measures whether the analyses used children’s auditory-verbal or visual-spatial WM ability. Prior studies have reported closer associations between reading skills and auditory-verbal as compared to visual-spatial WM (Brady 1991; Jorm 1983; Schuchardt et al. 2008), and less consistently between math ability and visual-spatial WM (McLean and Hitch 1999; Schuchardt et al. 2008). Yet our data might suggest that academic achievement is more closely related to the central executive component of WM than to the modality-specific slave systems, although this speculation was not directly tested in our study.

Given the current findings linking WM to academic performance, but not to ADHD symptoms, it is not surprising that WM training, which does improve WM in children with ADHD, seems to have little or no effect on ADHD symptoms (Chacko et al. 2014; Rapport et al. 2013; van Dongen-Boomsma et al. 2014). We (Simone et al. 2016) previously suggested that cognitive heterogeneity in ADHD might account for the limited efficacy of WM training, and that greater benefits may be obtained if the treatment is limited specifically to those children who have ADHD and poor WM. Our present data suggest that, while WM training might improve the acquisition of academic skills in such children (given that WM significantly contributed to performance on academic achievement tests), it would likely have only limited effects on ADHD symptoms, as WM did not significantly contribute to school behavioral performance or global functioning. Nevertheless, further research is needed to clarify the extent to which children with ADHD with low or intact WM would benefit from WM training in improving their acquisition and application of academic skills, as well as in reducing the manifestation of ADHD symptoms and their impact in real-world settings.

The current study has several notable strengths. First, this is a well-characterized sample of children with and without ADHD who have been followed annually from preschool age through 8 years of age. Second, we used well-established diagnostic measures along with objective measures of auditory-verbal and visual-spatial WM to classify the children. Third, we had a diverse set of outcome measures, including objective tests, teacher reports, and clinician impressions. Finally, by utilizing a regression approach, we were able to assess the significant and unique contributions of our independent variables to each outcome measure.
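To illustrate what assessing a "unique contribution" in a regression framework can look like, the following is a minimal sketch of a hierarchical regression that compares the variance explained before and after a predictor is added. It is not the analysis code used in this study: the data are simulated, and the variable names (`ses`, `wm`, `inattention`, `achievement`) and effect sizes are hypothetical choices made only for the example.

```python
import numpy as np


def r_squared(X, y):
    """R^2 from an ordinary least-squares fit of y on X (intercept added)."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def incremental_r_squared(base, added, y):
    """Return (R^2 of base model, R^2 of full model, change in R^2)."""
    r2_base = r_squared(base, y)
    r2_full = r_squared(np.column_stack([base, added]), y)
    return r2_base, r2_full, r2_full - r2_base


# Simulated data only; the effect sizes below are arbitrary assumptions.
rng = np.random.default_rng(0)
n = 500
ses = rng.normal(size=n)
wm = 0.3 * ses + rng.normal(size=n)          # WM correlates modestly with SES
inattention = rng.normal(size=n)             # unrelated to achievement here
achievement = 0.3 * ses + 0.5 * wm + rng.normal(size=n)

# Step 1: SES alone. Step 2: SES plus the predictor of interest.
r2_ses, r2_ses_wm, delta_wm = incremental_r_squared(ses, wm, achievement)
_, _, delta_inatt = incremental_r_squared(ses, inattention, achievement)

print(f"R^2 (SES only):            {r2_ses:.3f}")
print(f"R^2 (SES + WM):            {r2_ses_wm:.3f}")
print(f"Delta R^2 for WM:          {delta_wm:.3f}")
print(f"Delta R^2 for inattention: {delta_inatt:.3f}")
```

In this simulated setup, the change in R² when a predictor enters after SES is the share of outcome variance it explains beyond SES. A full analysis would also test whether that change is statistically significant, which this sketch omits.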

Nevertheless, there were some limitations to the current study that must be considered. First, our sample covered a narrow age range (only 8-year-old children). While this likely reduced variability in findings, caution is warranted when generalizing these results to older or younger children. Second, the scales used to assess teacher judgments of academic functioning, and to a lesser extent classroom behavioral functioning, comprised only three and five items, respectively. It is possible that there were too few items to make a valid estimate of each construct we intended to assess. Third, because the sample was originally recruited with strict exclusionary criteria for preschool Full Scale IQ, the number of children with truly impaired WM (i.e., ≥ 2 standard deviations below the mean) was likely limited, which may limit generalization of findings to some clinical settings. Fourth, we did not collect information regarding the children’s actual in-school academic performance (e.g., report cards), and therefore it remains open whether teacher-ratings of academic functioning in the classroom are reliable estimates of their actual school academic performance. Finally, it is important to note that moderate correlations were observed among the WM measures, academic achievement tests, and ADHD symptom domains. While this suggests there is some overlap among these variables, they are also relatively distinct from each other. Nevertheless, while WM was observed to be a significant and unique contributor to academic test achievement, it remains possible that shared aspects of WM and the ADHD symptom domains could be partially responsible for this as well.

Overall, our findings indicate that WM ability is specifically associated with academic achievement across a wide array of skills in children with and without ADHD and not with the presence or severity of ADHD symptoms. Further, severity of ADHD symptoms is unrelated to academic achievement, although symptoms of inattention may have an impact on classroom performance.