Clinically significant psychiatric disorders frequently co-occur (Angold et al. 1999); these high rates of comorbidity raise concerns about the distinctness of specific disorders (Krueger and Markon 2006). As a result, dimensional models of psychopathology have been examined to understand underlying factors that contribute to diverse phenotypes (e.g., Krueger 1999; Krueger et al. 1998; Krueger and Markon 2006; Slade and Watson 2006). Despite much work examining structural models, little is known about temporal stability of the latent dimensions and factors that influence patterns of stability.

There is extensive evidence for models of psychopathology that include two correlated latent factors reflecting internalizing and externalizing disorders in children, adolescents, and adults, and across cultures (Achenbach and Rescorla 2000; Krueger and Markon 2006; Lahey et al. 2008; Vollebergh et al. 2001). Although studies have focused on several developmental periods, these studies have predominantly used cross-sectional designs, with some exceptions. Krueger et al. (1998) found that both the internalizing and externalizing dimensions of psychopathology demonstrated strong stability (βs = 0.69 and 0.86, respectively) from age 18 to 21 in the Dunedin cohort, yielding support for homotypic continuity of emotional and behavioral problems. Conversely, there was no evidence of heterotypic continuity. Likewise, Vollebergh et al. (2001) examined the stability of anxious-misery, fear, and externalizing latent factors over the course of a one-year follow-up in a sample of individuals from 18 to 64 years. These data found strong stability for all factors with stability of externalizing (β = 0.96) being stronger than for anxious-misery and fear latent dimensions (βs = 0.85 and 0.89, respectively).

Although most research on dimensional models of psychopathology has focused on internalizing and externalizing dimensions, recent work has found support for an alternative structure. Lahey et al. (2012) found that a bifactor model that included a general factor and specific internalizing and externalizing factors provided a better fit to the data in a large sample of adults. Examining lifetime psychopathology from adolescence through adulthood, Caspi et al. (2014) independently replicated this structure. There have been additional replications of the bifactor structure in cross-sectional studies (e.g., Hankin et al. 2017; Kim and Eaton 2015; Lahey et al. 2011, 2015; Murray et al. 2016; Pettersson et al. 2016; Stochl et al. 2015), typically in samples of older children, adolescents, and adults. Far less attention has been paid to the structure of psychopathology in young children. Olino et al. (2014) compared several models of psychopathology with data collected from semi-structured diagnostic interviews conducted with primary caregivers of three-year-old children; the authors found that the bifactor model provided the best fit, relative to a correlated two-factor model and a single factor model. Further, the general factor was positively associated with temperamental surgency and negative affect, and negatively associated with effortful control; the internalizing factor was negatively associated with surgency; and the externalizing factor was positively associated with surgency and negatively associated with effortful control, using parent reports of child temperament.

Although the bifactor model of psychopathology has been replicated in a number of studies, there are important questions that need to be addressed to advance this line of research. The present study extends this work in four key ways. First, we tested one-factor, two-factor (correlated internalizing-externalizing model), and bifactor models to identify the structure that fits the data at age six. We expected that the same bifactor model identified at age three would remain a well-fitting model at age six.

Second, we examined whether the latent factors demonstrate homotypic continuity. Lack of longitudinal stability would raise questions about model validity. There have been three evaluations of general psychopathology factor stability in samples of older youth and adults. In a cohort studied annually from ages 7 to 15, Murray et al. (2016) found that the general factor had high internal consistency (indexed by omega-hierarchical) over time, suggesting longitudinal coherence of the structure. However, autocorrelations over time were modest (β range = 0.10–0.33 across waves), indicating low temporal stability. Snyder et al. (2017) examined the stability of the general, specific internalizing, and specific externalizing factors across an 18-month period in adolescence (from approximately age 13.5 to 15 years). The authors reported strong homotypic continuity for all factors (general factor β = 0.86; internalizing factor β = 0.72; and externalizing β = 0.71). Lastly, Greene and Eaton (2017) reported on the three-year stability of the bifactor solution in a large epidemiologic survey of adults (age range = 18–90+). The authors reported strong homotypic continuity (general factor β = 0.67; distress factor β = 0.53; fear factor β = 0.87; and externalizing β = 0.87). However, these studies relied on full longitudinal models that empirically compromised assumptions about the orthogonality of the factors at the later timepoints (Koch et al. 2017). Hence, these previous studies may report biased results based on model misspecifications. This may be partially mitigated by the modest associations between factors observed in these studies. Nonetheless, estimation of associations using recently developed methods (Koch et al. 2017) is needed. Moreover, stability of latent dimensions of psychopathology has not yet been examined during early childhood. As rapid development in young children’s cognition, language, inhibitory control, and social relationships is typical (Egger and Angold 2006), stability of psychopathology may be more modest than in individuals from more advanced developmental periods.

Third, we examined heterotypic continuity to determine whether dimensions at age 3 are associated with different dimensions at age 6. Snyder et al. (2017) and Greene and Eaton (2017) found no evidence of heterotypic continuity. However, these studies examined different developmental periods. Heterotypic continuity might be more common in early childhood when rapid changes in development are present. If heterotypic continuity exists, it is important to determine whether the general factor becomes more differentiated with age or whether the specific factors converge to increase the magnitude of the general factor.

Finally, although there has been some interest in understanding determinants and correlates of latent factors (Hankin et al. 2017; Lahey et al. 2012; Olino et al. 2014), limited attention has been given to factors that influence the stability of latent dimensions of psychopathology over time. Previous studies (Greene and Eaton 2017; Snyder et al. 2017) have examined the influence of sex and age on homotypic stability of latent psychopathology dimensions in adolescents and adults. Neither study found that sex moderated the stability of any dimension. Greene and Eaton (2017) found no evidence for age moderating homotypic continuity. However, Snyder et al. (2017) found that stability of internalizing problems increased with age. Although these studies examined age and sex as moderators of homotypic continuity, construct domains highly relevant to psychopathology have not been examined. In addition, moderators of heterotypic continuity have not been examined.

We focused on well-established vulnerability factors for and/or correlates of youth psychopathology as potential moderators of both homotypic and heterotypic continuity of the latent psychopathology factors: child sex, child temperament, parental psychopathology, and parenting. There are established gender differences in levels and rates of internalizing (Hankin et al. 1998) and externalizing (Negriff and Susman 2011) problems. Similarly, there is greater persistence of internalizing pathology in women (Essau et al. 2010) and externalizing in men (Hicks et al. 2007). Temperament reflects individual differences in emotional reactivity and regulation (Rothbart et al. 2001) and is well-documented as a risk factor for psychopathology in youth (e.g., Caspi et al. 1996; Dougherty et al. 2010) and an influence on the course of disorders (e.g., Bufferd et al. 2018; Chassin et al. 2004). Parent-to-child transmission of internalizing and externalizing psychopathology is also well-established (e.g., Hicks et al. 2004; Klein et al. 2005) and influences persistence of child psychopathology (e.g., Weissman et al. 2016). Finally, maladaptive parenting behavior is associated with youth psychopathology (e.g., McLeod et al. 2007) and course of psychopathology (e.g., Nanni et al. 2012; Silk et al. 2009).

The present study examines the stability of structural models of psychopathology across early childhood. This is a critical period of development when prevention or intervention before formal schooling may ameliorate emergence of problem behaviors (Anticich et al. 2013; Chronis-Tuscano et al. 2015). As an earlier onset of many disorders is associated with poorer course and outcomes compared to later onset (e.g., Nagin and Tremblay 2001; Weissman et al. 1999), it is critical to identify factors that influence the persistence of early-onset psychopathology. We anticipate the general structure of psychopathology will be consistent throughout early childhood and that latent factors will be at least moderately stable. However, we hypothesize that stability will be weaker than in later developmental periods. In addition, we explore heterotypic continuity between latent factors, and child and parental moderators of homotypic and heterotypic continuity in latent dimensions of psychopathology.

Methods

Participants

Participants were 541 three-year-old children and their families, recruited from commercial mailing lists for a study of temperament and risk for psychopathology (Olino et al. 2010). Primary caregivers were required to speak English, and children with significant medical disorders or developmental disabilities were excluded. Informed consent was obtained prior to participation. The study was approved by the institutional review board at Stony Brook University.

Children’s mean age was 3.56 years (SD = 0.27); 247 (53.9%) were male and 398 (86.9%) were White/non-Hispanic. Children’s receptive language ability, assessed by the Peabody Picture Vocabulary Test (PPVT; Dunn and Dunn 1997), was in the average range (M = 102.9, SD = 13.9). Mean ages of mothers and fathers were 36.2 years (SD = 4.5) and 38.5 years (SD = 5.4), respectively. Most parents were married or cohabiting (96.0%) when they entered the study; approximately half of the parents (56.7% of the mothers and 46.7% of the fathers) graduated from college. The sample was representative of the surrounding county based on census data (Olino et al. 2010). Participants were invited to participate in a follow-up assessment approximately 3 years later. Families with follow-up data (n = 466; 86.1%) had children with a mean age of 6.09 years (SD = 0.44), of whom 254 (54.5%) were male. Participants who declined the follow-up assessment did not significantly differ on child sex or race, any of the measures of youth temperament, parental psychopathology, and parenting behavior, or symptom counts at age 3, with the exception of panic symptoms: participants were more likely to drop out if parents reported higher levels of panic when children were age 3; t (539) = 3.18, p < 0.01. Families were compensated for their participation at both assessments. Research staff conducting diagnostic and temperament assessments at age 3 were unaware of other variables, and diagnostic interviewers at age 6 were unaware of age 3 data.

Measures

Child Psychiatric Disorders (Ages 3 and 6)

The Preschool Age Psychiatric Assessment (PAPA; Egger et al. 1999) is a reliable, developmentally sensitive interview to assess DSM-IV-TR disorders. The PAPA uses a structured format and an interviewer-based approach, applying a priori guidelines for rating symptoms and associated impairment using a detailed glossary. A 3-month primary period is used to enhance recall. Diagnoses are derived using algorithms. Although the PAPA was designed for 2–5-year-olds, it has been used in children as old as 8 years (Luby et al. 2009). We used the PAPA at both ages 3 and 6 to maintain comparability across assessments. Interviews were conducted with the primary caregiver by telephone at age 3 (97.6% mothers) and in person at age 6 (92.1% mothers); parent-report diagnostic interviews have been found to yield equivalent results when administered by telephone and in person (Lyneham and Rapee 2005). Details about both assessments are available elsewhere (Bufferd et al. 2011, 2012).

Following Olino et al. (2014), we used symptom scales for depression, GAD, phobias (the sum of social phobia, specific phobia, and agoraphobia symptoms), separation anxiety, panic, inattention, hyperactivity, impulsivity, and oppositional defiant disorder (ODD) to parallel the domains usually examined in quantitative classification studies. For each dimension, symptoms were counted present if they exceeded a clinically significant threshold. Symptoms (e.g., concentration problems) common to diagnostic criteria for multiple disorders were included for all relevant dimensions.

Inter-rater reliability of all symptom scales was acceptable at both assessments (i.e., intra-class correlations [ICCs] > 0.78, n = 21 at age 3 and n = 35 at age 6), with the exception of phobic symptoms at age 6. In the random selection of cases to be reviewed, there was very limited variability in this dimension for both raters, precluding quantitative analysis of this dimension.

Observed Child Temperament (Age 3)

Temperament was assessed in 541 children during a 2-h laboratory visit which included a structured observation consisting of 12 episodes from the Laboratory Temperament Assessment Battery (Lab-TAB; Goldsmith et al. 1995). Principal components analysis of the coded variables was conducted to reduce the number of temperament variables (Dougherty et al. 2011). This analysis yielded five temperament scales: Sociability/Assertiveness (α = 0.93), Dysphoria (α = 0.80), Fear/Inhibition (α = 0.71), Exuberance (α = 0.88), and Disinhibition (α = 0.70). Interrater reliability was adequate for all scales, with ICCs as follows: Sociability/Assertiveness (0.82), Dysphoria (0.88), Fear/Inhibition (0.82), Exuberance (0.92), and Disinhibition (0.83). Additional details about the assessment and coding are available in the online supplement.

Parental Psychopathology

Children’s parents were interviewed using the Structured Clinical Interview for DSM-IV, Non-patient version (SCID-NP; First et al. 1996). Interviews were conducted by telephone, which yields similar results as face-to-face interviews (Rohde et al. 1997), by two Masters-level raters with no knowledge of the temperament ratings. SCIDs were obtained from 535 (99.8%) mothers and 443 (82.6%) fathers. When parents were unavailable, family history interviews were conducted with the co-parent. Diagnoses based on family history data were obtained for an additional one (0.2%) mother and 83 (15.5%) fathers. Based on audiotapes of 30 SCID interviews, kappas for inter-rater reliability of lifetime diagnoses were 0.93 for depressive disorders; 0.91 for anxiety disorders; and 1.00 for substance abuse/dependence. Of the children, 219 (40.9%) had at least one parent with a lifetime depressive disorder (35.3% MDD; 14.6% dysthymic disorder); 32.3% of mothers and 17.3% of fathers had a lifetime depressive disorder; 237 children (44.2%) had a parent with a lifetime anxiety disorder; 34.0% of mothers and 19.0% of fathers had a lifetime anxiety disorder; 264 children (49.3%) had a parent with a lifetime substance use disorder; 22.0% of mothers and 36.9% of fathers had a lifetime substance use disorder.

Maternal Self-Reported Parenting

Parenting style at age three was assessed via the 37-item Parenting Styles and Dimensions Questionnaire (PSDQ; Robinson et al. 1995), a widely used parent-report measure that assesses three styles of parenting: authoritative (warm/supportive, but with structure/limits; α = 0.82), authoritarian (unsupportive, controlling, punitive; α = 0.75), and permissive (warm/supportive, but lacking structure/limits; α = 0.74).

Data Analysis

As expected in a community sample, symptom scores were positively skewed. Hence, all symptom scores were log transformed to better approximate normal distributions. Furthermore, all models were estimated using robust full information maximum likelihood in Mplus, version 8.0 (Muthén and Muthén 1998–2017). Models were evaluated on several indices of goodness of fit, as well as whether theoretical predictions, as indicated by specific paths within the model, were supported. Based on recent discussion of the challenges of applying CFA in temperament and personality and clinical science research (e.g., Marsh et al. 2004), we followed Hopwood and Donnellan’s (2010) recommendations and relied on RMSEA < 0.10 and CFI > 0.90 for acceptable model fit, while acknowledging that these are liberal criteria. For completeness, we also report chi-square tests, but do not interpret them. We also include the Bayesian Information Criterion (BIC) as an additional index of model fit, with lower values indicating better fit.
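As an illustration of how these criteria operate, the following minimal Python sketch recomputes RMSEA from a chi-square value and applies the two cutoffs. It is not the Mplus computation: it assumes the standard point-estimate formula, assumes N = 541 (the full sample), and uses the chi-square and CFI values reported for the longitudinal model in the Results. Values from robust estimation averaged over imputed datasets can differ slightly in the third decimal. The function names are hypothetical.

```python
from math import sqrt


def rmsea(chi_square, df, n):
    """Standard RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    return sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


def acceptable_fit(rmsea_value, cfi_value, rmsea_max=0.10, cfi_min=0.90):
    """Liberal criteria used in the text: RMSEA < 0.10 and CFI > 0.90."""
    return rmsea_value < rmsea_max and cfi_value > cfi_min


N = 541  # assumption: full age 3 sample, with age 6 data imputed

# Longitudinal model before and after adding three residual correlations
initial = rmsea(455.56, 109, N)
revised = rmsea(327.62, 106, N)
print(round(initial, 3), round(revised, 3))

# Applying the cutoffs to the reported RMSEA and CFI values
print(acceptable_fit(0.076, 0.856))  # initial model: CFI below 0.90
print(acceptable_fit(0.062, 0.91))   # revised model: both criteria met
```

Under these assumptions the recomputed RMSEA values land close to the reported 0.076 and 0.062, and only the revised model satisfies both criteria.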

To maximize our data for the evaluation of models at age 6, we used Markov Chain Monte Carlo estimation to impute 100 datasets with the SEQUENTIAL option in Mplus (Asparouhov and Muthén 2010; Raghunathan et al. 2001). We used age 3 and (when available) age 6 data to impute missing age 6 symptom dimension scores. All analyses relied on the imputed data. This approach kept the data comparable between the models estimated with the age 6 data only and the full longitudinal models. All fit information is reported as the mean index across the 100 imputed datasets.

First, we examined one-factor, two-factor (correlated internalizing-externalizing), and bifactor models for the age 6 assessment. Consistent with Olino et al. (2014), residual correlations between the depression and ODD dimension scores were freely estimated in the age 6 models. Second, we estimated a longitudinal model that included covariance paths between latent factors at each of the two assessment waves. We present tests of longitudinal measurement invariance (Widaman et al. 2010) in the online supplement. Third, we estimated a model that included homotypic and heterotypic continuity paths following the recommendations from Koch et al. (2017). Briefly, this method provides unbiased prediction estimates of orthogonal latent factors. This requires two steps, in which the prediction to the general factor is estimated in one analysis and prediction to the specific factors is estimated in a separate step. Finally, we estimated models examining whether child sex and observed temperament, parental psychopathology, and parenting moderated homotypic and/or heterotypic continuity paths. These models were estimated using principles from Koch et al. (2017) that relied on separate analyses for the prediction to the general and specific factors. Tests of moderation included interactions between latent variables and continuous observed variables. In order to specify the interaction terms in the analysis, we used the TYPE = RANDOM analysis option. Interaction effects between latent variables and categorical observed variables (e.g., child sex, specific forms of parental psychopathology) were estimated using multiple group models.
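The moderation tests with continuous moderators can be summarized in a single regression equation. The expression below is a sketch in illustrative notation (the symbols are assumptions introduced here, not taken from Koch et al. 2017 or from the Mplus syntax), and it shows one age 3 predictor at a time without the residualization step:

```latex
\eta^{(6)}_{k} \;=\; \beta_{0} \;+\; \beta_{1}\,\eta^{(3)}_{j} \;+\; \beta_{2}\,M \;+\; \beta_{3}\,\bigl(\eta^{(3)}_{j} \times M\bigr) \;+\; \zeta_{k},
\qquad j,k \in \{\text{general},\ \text{internalizing},\ \text{externalizing}\}
```

Here \eta^{(3)}_{j} and \eta^{(6)}_{k} denote latent factors at ages 3 and 6, M is an observed continuous moderator (e.g., a temperament scale), and \zeta_{k} is a residual. The path is homotypic when j = k and heterotypic when j ≠ k; a non-zero \beta_{3} would indicate that M moderates the continuity path.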

Results

Table 1 displays the bivariate correlations between age three and age six symptom counts. Correlations between dimensions at age three are displayed above the diagonal and at age six below the diagonal. To examine the similarity between patterns of correlations, we estimated ICCs for the 36 pairs of correlations between the age three and six data. This analysis revealed substantial similarity (ICC = 0.91). For both assessments, there was consistency in significant associations among internalizing symptoms and among externalizing symptoms. There were also significant associations between depression and GAD symptoms with all externalizing symptoms; however, associations between separation anxiety, panic symptoms, and phobia symptoms with externalizing symptoms were more modest.
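The comparison of correlation patterns can be sketched as follows. Nine symptom scales yield 9 × 8 / 2 = 36 unique pairs, and the age 3 and age 6 correlations for those pairs are treated as two "raters" of the same 36 targets. The text does not state which ICC form was used; this minimal Python sketch assumes a two-way, single-measure, absolute-agreement ICC, and the short lists of correlations are made-up placeholders, not values from Table 1.

```python
from itertools import combinations


def icc_agreement(x, y):
    """Two-way, single-measure, absolute-agreement ICC for two paired lists."""
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / k for a, b in zip(x, y)]
    col_means = [sum(x) / n, sum(y) / n]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in list(x) + list(y))
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


scales = ["depression", "GAD", "phobias", "separation anxiety", "panic",
          "inattention", "hyperactivity", "impulsivity", "ODD"]
pairs = list(combinations(scales, 2))
print(len(pairs))  # 36 unique pairs of symptom scales

# Hypothetical correlations for three pairs at each age (placeholders only)
age3_r = [0.1, 0.3, 0.5]
age6_r = [0.2, 0.4, 0.6]
print(icc_agreement(age3_r, age6_r))
```

An absolute-agreement form penalizes a uniform shift in the size of the correlations between ages, whereas a consistency form would not; the choice matters when the question is whether the correlations are similar in magnitude and not only in rank order.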

Table 1 Associations among symptom counts for disorder dimensions

Age 6 Models

One-Factor Model

The one-factor model specified that all disorders were due to a single underlying liability factor. All symptom dimensions demonstrated significant factor loadings (p < 0.05), with standardized loadings ranging from 0.24 (phobias) to 0.77 (GAD). However, this model provided a very poor fit to the data (Table 2).

Table 2 Fit information for age 6 one-, two-, and bifactor models

Two-Factor Model

The two-factor model specified that depression, GAD, separation anxiety, panic, and phobic disorder scores loaded on an internalizing factor, and inattention, hyperactivity, impulsivity, and oppositional defiant disorder scores loaded on an externalizing factor. Factor loadings for all disorders on their respective factors were significant (p < 0.01). Standardized factor loadings ranged from 0.29 (phobias) to 0.90 (depression) for the internalizing factor and from 0.51 (ODD) to 0.73 (hyperactivity) for the externalizing factor. The correlation between the internalizing and externalizing factors was moderate (r = 0.41, p < 0.001). Model fit was poor (Table 2), but better than for the one-factor model.

Bifactor Model

In this model, depression, GAD, separation anxiety, panic, inattention, hyperactivity, impulsivity, and ODD scores were indicators of the age 6 general factor; depression, GAD, separation anxiety, panic, and phobia scores were indicators of the age 6 internalizing factor; and inattention, hyperactivity, impulsivity, and ODD scores were indicators of the age 6 externalizing factor. Consistent with the age 3 model (Olino et al. 2014), phobic symptoms did not load on the general factor at age 6 in preliminary analyses. Thus, we removed this loading in all models. Consistent with the conventional application of bifactor models, all latent factors were specified to be orthogonal. This model provided a good fit to the data (Table 2). Beyond global model fit, all indicators had significant factor loadings on the general factor, except for panic symptoms (Table 3, left panel). All internalizing indicators had significant factor loadings on the internalizing factor and all externalizing indicators had significant factor loadings on the externalizing factor. The bifactor model had the best fit of the age 6 models.

Table 3 Standardized factor loadings for the bifactor model at age six (left panel) and in the longitudinal homotypic model with age 3 and age 6 symptoms (right panel)

Longitudinal Model

We then added the final age 3 model from Olino et al. (2014), with the exception that we made the internalizing and externalizing specific factors orthogonal at age 3, as in the age 6 model, and included covariance paths between the general, internalizing, and externalizing factors across time. The model was an adequate fit to the data as indicated by the RMSEA = 0.076, but inadequate according to other indices, χ2 (109) = 455.56, p < 0.001 and CFI = 0.856. All indicators had significant factor loadings on the expected factors at both age 3 and age 6, except for panic symptoms on the common factor at each time point. Modification indices suggested the model would be improved by adding residual correlations across time for separation anxiety, inattention, and ODD symptoms. After adding these three parameters, the model was a good fit to the data, χ2 (106) = 327.62, p < 0.001; CFI = 0.91; and RMSEA = 0.062. All indicators had significant factor loadings on the expected factors at both age 3 and age 6 (all ps < 0.05; factor loadings shown in Table 3, right panel), with the exception that panic failed to significantly load on the general factor at age 3 and age 6 (both ps > 0.10). Longitudinal correlations for the general, internalizing, and externalizing factors were all significant (rs = 0.54, 0.84, and 0.50, respectively, all ps < 0.001). The only significant heterotypic correlation was found between the common factor at age 3 and the externalizing factor at age 6 (r = 0.16, p < 0.05). The structural parameters of the model (e.g., factor loadings and covariance paths) remained virtually unchanged (ICC = 0.99 for agreement of parameters across models) with and without the post-hoc covariance parameters included. This suggests that these additional post-hoc paths do not substantively influence our interpretation of the major structural elements of the model.

Longitudinal Model Incorporating Heterotypic Paths

Following the methods from Koch et al. (2017), we estimated individual models for prediction of the age 6 general factor and the age 6 specific factors from the age 3 general and specific factors. This method is necessary to preserve the orthogonality of the dependent latent variables. Overall, these models were a good fit to the data (χ2 (106) = 341.07, p < 0.001; CFI = 0.90; and RMSEA = 0.064 for the model predicting the age 6 common factor and χ2 (106) = 323.18, p < 0.001; CFI = 0.90; and RMSEA = 0.061 for the model predicting the age 6 specific factors). In these models, we found significant homotypic continuity for all three latent factors and heterotypic continuity between the general factor at age 3 and the specific factors at age 6. The age 3 general factor was positively associated with age 6 externalizing problems. A schematic presentation of the major longitudinal associations is presented in Fig. 1.

Fig. 1 Schematic figure of longitudinal homotypic and heterotypic paths. ***p < 0.001. Observed indicators are not displayed in the figure. Standardized regression coefficients are displayed in the figure. Dashed lines indicate non-significant paths in the model. Per model constraints for bifactor models, within each assessment, latent factors have covariances constrained to zero. The analysis relied on estimation methods from Koch et al. (2017). The figure simplifies the presentation; standardized regression coefficients are based on residualized estimates of age 3 latent factors.

Moderators of Homotypic and Heterotypic Continuity

Finally, we examined whether child sex or temperament, maternal reports of parenting, or parental history of psychopathology moderated homotypic and heterotypic continuity paths. Consistent with the previous models, we estimated interaction influences for the common and specific factors in separate models. Across all models, a total of 108 interaction effects were estimated. Only two reached conventional levels of significance. As these did not exceed the number of interactions expected due to chance, we do not pursue them further. Full results are in the online supplement.
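The chance benchmark behind this decision can be made explicit. The minimal Python sketch below assumes a conventional alpha of 0.05 and, as a simplification, treats the 108 tests as independent (they are not strictly independent, since moderators and factors overlap across models); the variable names are hypothetical.

```python
from math import comb

n_tests = 108   # interaction effects estimated
alpha = 0.05    # assumption: conventional two-sided significance level
observed = 2    # interactions that reached significance

# Number of significant results expected by chance alone
expected_by_chance = n_tests * alpha
print(expected_by_chance)


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Probability of at least `observed` significant tests if every null is true
p_at_least_observed = 1 - sum(
    binom_pmf(k, n_tests, alpha) for k in range(observed)
)
print(round(p_at_least_observed, 3))
```

With roughly five significant effects expected under the null, observing two is well within what chance alone would produce, which is why the two effects are not interpreted.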

Discussion

Studies of the structure of psychopathology have provided important information about the nature of comorbidity. Higher order and bifactor models have both consistently explained covariation in psychopathological symptoms across development. However, only a small number of studies (Greene and Eaton 2017; Krueger et al. 1998; Murray et al. 2016; Snyder et al. 2017) have examined consistency in structural models over time. Further, no previous work has examined whether the stability of latent factors is moderated by established risk factors for psychopathology, beyond sex and age. In addition, multiple investigations of the structure of psychopathology have examined late childhood, adolescence, and adulthood; however, there are few investigations of the structure of psychopathology in preschool-aged children. Given that rates of DSM-IV disorders in young children are relatively low (e.g., Bufferd et al. 2011, 2012), examination of the structure of symptoms can help to clarify the nature of these difficulties during this important and understudied developmental period. Here, we examined the stability of a bifactor model of psychopathology in early childhood, and tested whether child sex, temperament, parental psychopathology, and maternal parenting behaviors moderated homotypic and heterotypic paths. Overall, we found evidence for the stability of the latent factors, but no moderators of stability.

The same bifactor structure of psychopathology symptoms identified at age three (Olino et al. 2014) was also a good fit to the data for the same children at age six. Thus, a similar structure across longitudinal assessments supports the validity of the model. However, our measurement invariance analyses (presented in the online supplementary material) showed that the magnitude of factor loadings changed across this developmental window. The interpretation of the factors is not isomorphic across time and our interpretations are not about pure stability of the same latent constructs, but longitudinal associations between constructs with the same manifest indicators. In addition, our general factor has strong loadings to distress (i.e., MDD and GAD) indicators and our internalizing factor has stronger loadings to fear (i.e., phobia, panic). Thus, the interpretation of these factors may be different from some other studies.

The general, internalizing, and externalizing latent factors all had strong stability across the 3-year interval, with standardized regression coefficients reflecting moderate-large effects. In their examination of the stability of the internalizing-externalizing factor model, Krueger et al. (1998) found homotypic continuity for both internalizing and externalizing factors from age 18 to 21 in the Dunedin cohort with effect magnitudes in the same range. Krueger et al. (1998) found that the correlation between internalizing and externalizing was similar across assessment waves, providing indirect evidence that the common variance between these factors was consistent over time. Snyder et al. (2017) and Greene and Eaton (2017) found moderate-strong stability of each latent dimension. Murray et al. (2016) failed to find stability of the general factor across ages 7–15. However, these studies relied on analytic methods that could have introduced bias into the stability of the latent factors. Our results provide stronger tests of longitudinal associations, finding moderate stability for the common and externalizing factors and strong stability for the internalizing factor. As described below, we found little evidence for specific constructs that differentially influence the stability of these domains.

We further examined heterotypic continuity of the latent dimensions of psychopathology. In our models, we found that the general factor at age three was positively associated with the externalizing factor at age six. This pathway suggests that the common psychopathology dimension in the preschool period differentiates into more specific externalizing problems later in development. Given this finding, it is crucial to identify factors that lead to these differentiated responses. For example, Kessel et al. (2016) found that for youth with heightened responses to errors, early irritability predicted later internalizing problems, whereas for youth with attenuated responses to errors, early irritability predicted later externalizing problems. Previous studies (e.g., Snyder et al. 2017; Greene and Eaton 2017) failed to identify heterotypic associations in their models, but did so in models that were misspecified. Thus, our contrasting findings could be due to differences in model estimation, or to developmental differences in the samples.

Our finding of moderate-strong homotypic continuity from an early age is encouraging regarding the likely success of early screening and prevention/early interventions, as it encourages downward extension of existing assessment approaches and treatments. In addition, preschool psychopathology is often regarded as reflecting developmentally normative behaviors (Wakschlag and Danis 2004) or transient responses to the immediate environment (Bufferd et al. 2016, 2018). Our findings, in contrast, are consistent with evidence that preschool psychopathology is often persistent, highlighting the importance of early detection and intervention.

We examined multiple child and parent factors that could influence the stability of longitudinal relationships between latent dimensions of psychopathology. However, given the many tests conducted, the number of significant findings did not exceed chance. The high degree of stability in our study may have limited our ability to identify moderators of internalizing stability, as there was modest variability to predict. Future studies could examine other potential moderators, such as genetic profile scores (Nikolova et al. 2011) and stressful life events (Vrshek-Schallhorn et al. 2013). In addition, our sample was not very diverse in terms of race/ethnicity and socioeconomic status, so these findings might differ in samples with greater demographic heterogeneity. It is critical to identify substantive moderators of the stability of latent psychopathology dimensions as this sample progresses into older developmental periods, because these data may yield important insights for intervention.

Although bifactor models are gaining popularity in investigations of psychopathology structure, there are limitations on estimation (e.g., restricting the covariance between the common and specific factors to zero) that may be too restrictive in accurately describing the data (Bonifay et al. 2018; Reise 2012; Widiger and Oltmanns 2018). Furthermore, there are concerns about the interpretation of the general factor. For example, bifactor models may represent a methodological artifact. In previous studies of bifactor modeling, participants were interviewed directly (Caspi et al. 2014; Greene and Eaton 2017; Kim and Eaton 2015; Lahey et al. 2012) or completed self-report measures of symptoms (Murray et al. 2016). Both of these contexts require self-appraisals of the presence of psychopathology. In contrast, our study relied on parent reports of their children’s psychopathology. Importantly, our results support a similar model. Thus, we cannot rule out the possibility that parents may be relying on stable perceptions of their children, rather than providing veridical reports of the children’s behavior, which could overestimate the stability of psychopathology. However, other studies (e.g., Hankin et al. 2017; Snyder et al. 2017) have relied on a combination of child self-reports and parent reports of psychopathology. Thus, this is evidence against the bifactor model solutions being driven solely by monomethod artifacts. Finally, recent genetics work finds heritability for the general factor (Waldman et al. 2016) and associations between several single nucleotide polymorphisms and the general factor (Neumann et al. 2016), which provides additional construct validity of the general psychopathology factor.

Some investigators have speculated that neuroticism is a large component of the general factor (Lahey 2009; Lahey et al. 2018) and meta-analytic evidence is consistent with this interpretation (Kotov et al. 2010). Thus, examinations of the development of neuroticism, more generally, may provide valuable insights for understanding the development of the general psychopathology factor (e.g., Ebstein 2006; Viken et al. 1994). However, some studies (e.g., Griffith et al. 2010) find near perfect associations between neuroticism and internalizing, raising questions about the role of neuroticism in the general or internalizing dimension. Furthermore, consistent with the Hierarchical Taxonomy of Psychopathology (Kotov et al. 2017), the presence of a general psychopathology factor indicates the need for greater attention to transdiagnostic screening, interventions, and etiological factors. Screening may benefit from focusing on general psychopathology for identifying those in need of clinical attention. Transdiagnostic interventions have demonstrated promise for addressing a broad array of symptomatology (Farchione et al. 2012). Research on etiology may find stronger effects on the general factor than solely focusing on specific factors. However, this is speculative as further work is needed in this area.

This study benefited from longitudinal assessments of psychopathology in a large sample of youth using a semi-structured diagnostic interview and sophisticated modeling methods. However, the study should be evaluated in light of several limitations. First, levels of symptoms were modest in this unselected community sample. Thus, tests of these models in samples with higher levels of psychopathology are needed. Second, we relied on DSM disorder symptom counts. It is possible that other indicators may better discern true latent dimensions. Third, some disorders (e.g., social phobia) had few indicators; therefore, we collapsed multiple phobias into a single dimension. Fourth, our models included several cross-time residual correlations to achieve a good model fit. Thus, this specific structure may fail to replicate in different samples. However, analyses were repeated without these residual correlations and the quantitative and substantive results were virtually identical. This indicates that these residual correlations did not bias the conclusions. Moreover, as noted elsewhere (Marsh et al. 2004), there has been growing concern that fit criteria have been overly stringent and lead to rejecting well-fitting and theoretically sound models. This has been particularly concerning for models in the area of personality and psychopathology. Finally, our sample was representative of the geographical region from which it came. However, this region is predominantly Caucasian and middle/working class. Inasmuch as these demographic factors influence levels of symptoms and/or our moderators, our results may not generalize to other samples.

In sum, the present study revealed a similar structure of psychopathology in youth assessed twice during early childhood. We observed moderate-high homotypic continuity of the general, internalizing, and externalizing dimensions of psychopathology and found that the general factor was positively associated with externalizing problems and negatively associated with internalizing problems later in development.