Internalizing problems (IPs) are a prevalent form of maladjustment in childhood and adolescence that can cause great distress for children and their families, are moderately to highly stable over time, and often presage the emergence of anxiety and mood disorders (Broeren et al. 2013; Hopkins et al. 2013). Empirical attention toward the etiology and trajectories of IPs has pointed toward the complex interplay of dispositional and environmental factors. Consistent with multiple-domain models of developmental psychopathology (Cicchetti and Curtis 2007) and models of individual-environment interaction (Ellis et al. 2011; Hankin and Abela 2005), we have proposed a multilevel model of temperamental, neurobiological, psychological, familial and social-contextual contributors to the development of IPs (Mills et al. 2012). In this report, we consider how early life experiences and dispositional characteristics jointly shape preschoolers’ risk for the development of IPs into middle childhood.

Multilevel Models of Developmental Psychopathology

A fundamental goal of research on the development of psychopathology is to understand the interactive contributions of etiological factors across multiple levels in the emergence and trajectories of problems over time (Cicchetti 2008). Recently, attention has been given to the ways in which environmental factors affect the propensity of children with particular temperamental or neurobiological characteristics to manifest problems. At issue is whether, or under what circumstances, contextual and dispositional factors exert their joint influence in ways that are consistent with diathesis-stress (Hankin and Abela 2005) or differential susceptibility (Ellis et al. 2011) models of development. The former model suggests that children with dispositional vulnerabilities are likely to manifest more difficulties if they are raised in adverse contexts, but will develop similarly to children without vulnerabilities if they are raised in typical or advantaged contexts. The latter model recasts vulnerabilities as susceptibilities, or openness to external influence for better or for worse. Consistent with the diathesis-stress model, children with susceptibilities are expected to manifest more problems if they are raised in adversity. A unique prediction of the differential susceptibility model, however, is that those same children will have fewer problems than average, or more positive characteristics, if they are raised in advantaged contexts. To date, there is mixed evidence for whether the diathesis-stress or differential-susceptibility model more accurately characterizes children’s development of IPs (Hastings et al. 2014). There is likely to be some truth in both models, but which holds true could depend on the specific factors examined.

Our multilevel model of the development of IPs extends from bioecological and biopsychosocial perspectives (Hastings et al. 2011a; Mills et al. 2012). IPs are posited as arising from dispositional factors, such as children’s behavioral inhibition and high physiological reactivity, acting in concert with environmental factors, such as exposure to familial and socioeconomic disadvantages (Zahn-Waxler et al. 2000). IPs are expected to be more severe when more of these factors are present, when they are not offset by protective factors, and when they occur in particular combinations. To date, most examinations of multilevel models of IPs have focused on a single child characteristic and a single parent or contextual factor (e.g., Lewis-Morrarty et al. 2012), or have examined multiple factors within only the dispositional domain (e.g., Hastings et al. 2011b) or the environmental domain (e.g., Rijlaarsdam et al. 2013). In addition, most studies of multilevel factors have been cross-sectional (Hopkins et al. 2013; Reising et al. 2013), and several have examined processes of mediation, rather than moderation. We extend these efforts by considering multiple dispositional factors – behavioral inhibition, adrenocortical regulation, gender – as moderators of multiple environmental factors – parenting behavior, maternal negative emotionality, family socioeconomic resources – in relation to the development of IPs over the transition from preschool- to school-age.

Behavioral Inhibition

Children with inhibited temperaments are more likely to have IPs (Mills et al. 2012). Behaviorally inhibited children are wary of novelty, reacting with heightened physiological arousal and shying away from or avoiding engagement. Behavioral inhibition was the most robust predictor of stable high social anxiety symptoms in a recent study of multiple child emotional, behavioral and cognitive characteristics (Broeren et al. 2013). However, most inhibited young children do not develop serious IPs (Degnan and Fox 2007), suggesting that other dispositional or environmental factors might contribute to their later adjustment.

Adrenocortical Regulation

The hypothalamic-pituitary-adrenal (HPA) axis is one of the primary stress response and regulation systems. Stressful events can provoke an acute increase in HPA activity which is detectable in elevated levels of salivary cortisol 20 to 25 min after the event (Gunnar and Adam 2012). Although evoking an acute HPA reaction can be difficult in young children (Gunnar et al. 2009), the ecologically valid challenge of meeting adult strangers in a laboratory or field context has been found to elicit an ‘arrival effect’: preschool-aged children manifest the stress-characteristic cortisol response about 20 min after meeting unfamiliar adults (Fernald and Gunnar 2009; Hastings et al. 2011a).

Children with more IPs tend to have higher basal levels of salivary cortisol (Ruttle et al. 2011; Tyrka et al. 2012) and greater adrenocortical reactivity to stressors (Hastings et al. 2009), suggesting that they are primed to respond to novel challenges as threats. Further, longitudinal studies have shown that elevated cortisol levels precede the emergence or exacerbation of IPs (Guerry and Hastings 2011).

More inhibited children have greater HPA reactivity (Talge et al. 2008), but whether cortisol levels account for, or add to, links between inhibition and IPs is unclear. We found that more inhibited preschool-aged girls had higher IPs when they maintained elevated cortisol levels for longer after meeting adult strangers, indicative of less effective adrenocortical regulation (Hastings et al. 2011a). Conversely, Essex et al. (2010) found that young children’s behavioral inhibition and cortisol levels were independently predictive of their chronic shyness across childhood. Further consideration of how adrenocortical regulation works in conjunction with other factors to influence the development of IPs is needed.

Gender

Girls are at greater risk than boys for trajectories of stable or worsening IPs (Zahn-Waxler et al. 2000). This could stem from social-contextual and neurobiological risk factors that differ between the genders, or from girls and boys differing in their reactions to shared risk factors (Zahn-Waxler et al. 2008). Weak adrenocortical regulation (Hastings et al. 2011b) and aversive environmental experiences (Rudolph and Flynn 2007) have been more strongly associated with IPs in girls. Conversely, Coplan et al. (2007) have argued that, as shyness and fearfulness are less gender-typical and socially accepted in boys, behavioral inhibition might be particularly maladaptive for boys in school and peer settings.

Parenting Behavior

Meta-analyses have provided robust evidence that children have fewer IPs when parents are warm, sensitive and supportive, and more IPs when parents are critical, rejecting, domineering and punitive (McLeod et al. 2007a, b). Positive and negative aspects of parenting behavior have been found to make opposite, but independent and additive, contributions to the prediction of IPs (Hopkins et al. 2013; Mills et al. 2012), and parenting acts as a proximal mechanism through which other environmental factors are associated with IPs (Mills et al. 2012; Reising et al. 2013; Rijlaarsdam et al. 2013). In addition, negative parenting predicts IPs more strongly in children who are highly inhibited or more physiologically reactive (Hastings et al. 2014; Lewis-Morrarty et al. 2012).

Maternal Negative Emotionality

Mothers who report more emotional distress, including anxious, depressed, neurotic, angry and generally negative affect, have children with more IPs (Coplan et al. 2008; Mills et al. 2012). Both direct links between maternal negative emotionality and IPs, and indirect links via parenting behavior, have been documented (Essex et al. 2010; Hopkins et al. 2013; Rijlaarsdam et al. 2013). There has been less consideration of whether particular dispositional characteristics are involved in the link between experiencing maternal negative emotionality and the subsequent development of IPs.

Socioeconomic Resources

Children have more IPs when they are raised in homes with fewer resources, as indexed by lower family income, less parental education, and lower parental occupational prestige (Hopkins et al. 2013; Mills et al. 2012). This has been shown to be mediated by the tendency for more socioeconomically disadvantaged parents to engage in less positive and more negative parenting behavior (Reising et al. 2013; Rijlaarsdam et al. 2013). As with maternal negative emotionality, however, there have been fewer examinations of moderating influences on the effects of socioeconomic status (SES).

Contributions of this Investigation

We examined the additive and interactive contributions of these diverse factors to children’s development of IPs by combining parallel data from three independent studies in order to produce a larger group of participants that afforded greater power to detect multilevel effects (Curran 2009). This technique has been effective for revealing smaller effects that could not be identified within separate, constituent samples (Feder et al. 2004). Our simultaneous examination of multiple related risk factors (e.g., behavioral inhibition and HPA activity; maternal negative emotionality and parenting behavior) allowed us to determine whether their relations with IPs were unique or overlapping. We used regions of significance analyses (Hayes and Matthes 2009) to examine whether the diathesis-stress or differential susceptibility model applied to each of several interactive effects of distinct dispositional and environmental factors. The prospective longitudinal design was essential for informing models of the development of IPs; the concurrent versus predictive associations between risk factors and problems can be markedly different, and even reversed (Hastings et al. 2011b), such that cross-sectional studies might not reveal critical developmental processes. Finally, we assessed school-age children’s IPs as seen by both mothers and teachers. If different factors predict children’s IPs across contexts, it might indicate how settings and relationships affect children’s expressions of their strengths and vulnerabilities.

Objectives and Hypotheses

We conducted this multimethod, multilevel, prospective longitudinal study to investigate how dispositional and environmental risk factors jointly contributed to children’s development of IPs from preschool-age to middle childhood. Preschoolers’ behavioral inhibition and IPs, and maternal and familial environmental factors, were measured through mother reports (Mills et al. 2012). Preschoolers’ cortisol reactivity was assessed from saliva samples collected after meeting adult strangers, an age-appropriate mild social stressor (Fernald and Gunnar 2009). Mothers and teachers reported on children’s IPs. We expected to find that each of the risk factors would be associated with the development of IPs, and that children with the identified dispositional factors would manifest the strongest links between the environmental factors and IPs. We did not make a priori predictions about which moderation effects might conform to the diathesis-stress or differential susceptibility model. We predicted that the independent and interactive risk factors would predict the development of IPs more strongly for girls than for boys.

Method

Participants

Time 1

Families were recruited into three independent studies with overlapping physiological and behavioral measures, conducted in two metropolitan areas in Canada. Extensive information on the three samples is available in prior publications (Hastings et al. 2011a; Mills et al. 2012); summary information is provided here. Sample 1 included the children of adults who had themselves been recruited into a study of emotional and behavioral problems as children, sample 2 over-represented children with elevated IPs and inhibition through targeted recruitment, and sample 3 was a randomly selected representative community sample. The three constituent studies contributed 130, 133 and 236 families, respectively, to the integrated sample. There were 246 girls and 253 boys ranging from 2.0 to 6.1 years (M = 3.95 years; SD = 0.82). According to mother-reports on the Child Behavior Checklist (CBCL; Achenbach and Rescorla 2001), there were 46 children (9.2 %) in the clinical range (T ≥ 65) and 73 children (14.6 %) in the borderline-clinical range (60 ≤ T ≤ 64) on the broad-band internalizing problems (IP) scale. Mothers were 20 to 56 years old (M = 32.95, SD = 5.21). The families were 85 % two-parent and 88 % Caucasian. Most mothers had college (36.2 %) or advanced (35.4 %) degrees, but 28.4 % had not progressed beyond high school. The highest occupational prestige of each family was coded using the Standard International Occupational Prestige Scale (SIOPS; Hakim 1998); the average prestige was 49.58 (SD = 12.82), representing mid-SES status, with ratings ranging from 16 (e.g., domestic laborers) to 78 (e.g., lawyers, physicians).
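The T-score cutoffs used above can be expressed as a small classification rule. The following is a minimal sketch, not the scoring software used in the study; the function name `cbcl_range` and the label "below borderline" are hypothetical, and only the cutoffs (T ≥ 65; 60 ≤ T ≤ 64) and the counts (46 and 73 of 499) come from the text.

```python
def cbcl_range(t_score):
    """Classify a CBCL internalizing T-score using the cutoffs stated in the text."""
    if t_score >= 65:
        return "clinical"
    if t_score >= 60:
        return "borderline-clinical"
    # Label for scores under 60 is a hypothetical placeholder, not a term from the text.
    return "below borderline"


# Percentages implied by the reported counts in the integrated Time 1 sample (N = 499).
n_total = 130 + 133 + 236
clinical_pct = round(100 * 46 / n_total, 1)
borderline_pct = round(100 * 73 / n_total, 1)
```

Under these assumptions the reported percentages (9.2 % and 14.6 %) follow directly from the counts and the pooled sample size.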

Time 2

Families were contacted approximately 4.5 years later, and 375 (75.2 %) agreed to participate in a follow-up assessment. Across the three constituent studies, there were respectively 104, 101, and 170 families for whom mother- and/or teacher-reported IP at Time 2 were available. These families included 185 girls and 190 boys, ranging from 6.4 to 11.7 years (M = 8.34, SD = 0.84). According to mother-report on the CBCL or teacher-report on the teacher-report form (TRF; Achenbach and Rescorla 2001), there were 81 children (21.6 %) in the clinical range and 63 children (16.8 %) in the borderline-clinical range for IP at Time 2. Families who completed the Time 2 assessment tended to have higher SES (a composite of parent education, income and occupational status; see below) than families who did not continue participation, t(495) = 1.74, p < 0.10, but the two groups did not differ significantly on any other Time 1 variables of interest, all |t| < 1.0, p ≥ 0.45.

Procedures and Measures

The three constituent studies were conducted in accord with the principles of the Canadian Tri-Council policy statement on ethical conduct, and were reviewed and approved by the Institutional Review Boards at their respective universities. At the first meeting of each family with the researchers, the mother provided informed consent and assent was obtained from the child prior to beginning data collection.

All values reported herein pertain to the 375 families included in the current analyses. Details on the collection and evaluation of Time 1 measures have been reported previously (Hastings et al. 2011a; Mills et al. 2012). Samples differed in the specific instruments used to measure behavioral inhibition, parenting behavior, and maternal negative emotionality. To facilitate integrative data analysis across the three samples, we developed appropriate measures of these constructs and established measurement invariance across samples (Curran 2009). Multiple-group confirmatory factor analysis was used to ensure unidimensionality of constructed scales across samples (see Mills et al. 2012 for extensive detail on these procedures, including tests of model equivalence across samples).

Two of the three studies had additional, independent questionnaires and behavioral procedures that could be examined to assess the convergent validity of the inhibited temperament and maternal parenting scores. Information on these tests of validity is included below.

Child Internalizing Problems (IPs)

In each sample, mothers completed the Child Behavior Checklist (CBCL) to report on children’s IPs. At Time 1, in two samples, the ASEBA preschool (1.5–5 years) version of the CBCL was used; in the third, the earlier CBCL 2–3 and CBCL 4–18 versions were used, depending on the age of the child. At Time 2, mothers completed the CBCL during a home visit for two samples, and during a lab visit for the third sample. Two mothers failed to complete the CBCL. Teacher reports of children’s IPs on the Teacher Report Form (TRF) were solicited by mail. Teacher reports were obtained for 289 children (77.1 %), including 140 girls and 149 boys. At Times 1 and 2, the psychometrics were good for all versions of the CBCL and TRF; all α ≥ 0.70 for IP across the three samples. The age- and gender-normed T-scores were used in analyses.

Salivary Cortisol

Families were either visited in their homes or they came to a university laboratory; both venues involved interacting with unfamiliar adult examiners. The times of visits extended across the day (between 9 AM and 7 PM). Each study’s protocol involved collection of a saliva sample approximately 20 min after meeting the unfamiliar adults (“arrival” sample). Following other studies with temporal variation in saliva collection (e.g., Hastings et al. 2009), analyses controlled for arrival time by including time of day (measured in hours since midnight) and interval between arrival and the first saliva sample as predictors. To collect saliva, children chewed cotton dental rolls that were then placed into Salivettes (Sarstedt, Inc.), which were immediately placed into a cooler, then frozen after testing was completed. Samples were thawed and centrifuged to express saliva at the time of cortisol assay. All samples were assayed using a high sensitivity enzyme immunoassay kit (High Sensitivity Salivary Cortisol Catalog No. 1-0102/1-0112; Salimetrics, State College, PA). Usable cortisol data could be assayed from 295 arrival samples. Raw cortisol data (μg/dL) were log-transformed to correct for leptokurtosis and positive skew. Log-transformed data were used in analyses, but untransformed data are reported for ease of interpretation.
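The effect of the log transformation on a positively skewed distribution can be illustrated with a short sketch. The cortisol values below are invented for illustration and are not study data; the base of the logarithm (base 10 here) and the helper name `skewness` are assumptions, since the text does not specify them.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean of cubed z-scores."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Hypothetical arrival cortisol values in ug/dL (illustrative only).
raw_cortisol = [0.05, 0.07, 0.08, 0.10, 0.12, 0.15, 0.20, 0.45, 0.90]

# Assumption: base-10 logarithm; a natural log would reduce skew in the same way.
log_cortisol = [math.log10(v) for v in raw_cortisol]

raw_skew = skewness(raw_cortisol)
log_skew = skewness(log_cortisol)
```

With these made-up values the transformed scores remain positively skewed, but markedly less so than the raw scores, which is the property the transformation is meant to achieve.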

Child Inhibited Temperament

Inhibition was measured from parent-reports on the Emotionality Activity Sociability (EAS) Temperament Survey for children (Buss and Plomin 1984) for sample 1, and the Children’s Behavior Questionnaire (Rothbart et al. 2001) for samples 2 and 3. All items were originally scaled or arithmetically transformed to range from 1 (not at all) to 7 (very true/typical). Six parallel items from the two scales were used to create the measure of inhibition (e.g., prefers to watch than join; at ease with almost anyone [reversed]; acts shy around new people), α = 0.83. Validity check. In sample 1, fathers and examiners also completed the EAS (see Karp et al. 2004 for more information). The six-item measure of mother-reported inhibition correlated significantly with the full-scale shyness scores of fathers and examiners at rs = 0.44 and 0.41, respectively, both p < 0.001. In sample 2, children were observed in the unfamiliar peer paradigm in a laboratory playroom, and teachers reported on anxious and isolated behaviors at preschool using the Social Competence and Behavior Evaluation form (see Hastings et al. 2008, 2014 for more information). The six-item measure of mother-reported inhibition correlated significantly with both observed reticence and teacher-reported inhibition, rs = 0.25 and 0.24, respectively, p < 0.05.

Maternal Parenting

The three studies used different measures of parenting with substantively overlapping items. Eight closely-worded items were identified from the Parenting Scale (Arnold et al. 1993) and Parenting Dimensions Inventory (Power 1993) for sample 1, the Child Rearing Practices Report (Block 1981) and Responses to Children’s Emotions (Hastings and De 2008) for sample 2, and the Parenting Styles and Dimensions Questionnaire (Robinson et al. 2001) for sample 3. All items were originally scaled or arithmetically transformed to range from 1 (not at all) to 7 (very frequent/descriptive). There were five items that assessed punitive and critical parenting (using physical punishment; yelling or shouting; scolding; criticizing; feeling ashamed/disappointed with child), and three that assessed affectionate and democratic parenting (warmth; comforting; reasoning). The three positive items were reverse-scored, and this set of 8 items had acceptable internal consistency, α = 0.66. The mean of the items was computed to create an index that ranged from positive parenting (low) to negative parenting (high). Validity check. In sample 1, examiners completed the Home Observation for Measurement of the Environment (HOME; Caldwell and Bradley 1984) at the end of a 2–3 h visit to the home of each family (see Stack et al. 2012 for more information). Mother-reported negative parenting was correlated negatively, r = −0.24, p < 0.01, with the total HOME score, which at the high end reflects a more stimulating and supportive home environment. In sample 2, mothers and preschoolers were observed for two guided teaching tasks (jigsaw puzzle and origami), in which maternal sensitivity, warmth, and praising were coded, and one compliance task (clean-up), in which critical parenting was coded (see Hastings et al. 2008; McShane and Hastings 2009 for more information). The eight-item mother-reported negative parenting score was correlated negatively with an aggregated sensitivity/warmth/praising index, r = −0.37, p < 0.01, and positively with the criticism score, r = 0.25, p < 0.05.
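The scoring of the negative parenting index described above (reverse-scoring the three positive items on the 1 to 7 scale, then averaging all eight items) can be sketched as follows. This is an illustrative sketch rather than the authors' scoring code: the function names, the item ordering, and the example responses are hypothetical, and the internal consistency function uses the standard Cronbach's alpha formula.

```python
import statistics


def reverse_score(value, low=1, high=7):
    """Reverse a response on a low..high scale (1 becomes 7, 7 becomes 1)."""
    return low + high - value


def negative_parenting_index(responses, positive_items):
    """Mean of all items after reverse-scoring the positive ones."""
    scored = [
        reverse_score(r) if i in positive_items else r
        for i, r in enumerate(responses)
    ]
    return statistics.fmean(scored)


def cronbach_alpha(item_columns):
    """Cronbach's alpha from a list of per-item response lists."""
    k = len(item_columns)
    totals = [sum(row) for row in zip(*item_columns)]
    item_variance = sum(statistics.variance(col) for col in item_columns)
    return k / (k - 1) * (1 - item_variance / statistics.variance(totals))


# Hypothetical mother: five negative items, then three positive items (positions 5-7).
example = [2, 3, 2, 1, 2, 6, 7, 6]
index = negative_parenting_index(example, positive_items={5, 6, 7})
```

In this made-up case the mother reports little punitive behavior and frequent warmth, so the index falls near the positive (low) end of the scale.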

Maternal Negative Emotionality

Maternal negative emotionality was measured from eight items on the EAS Temperament Survey for adults (Buss and Plomin 1984) for samples 1 and 3, and the Positive and Negative Affect Scales (Watson et al. 1988) for sample 2. All items were originally scaled to range from 1 (not characteristic/not at all) to 5 (very characteristic/extremely). Items describing mothers’ experience and expression of negative emotions were identified (e.g., many things annoy me; I get emotionally upset easily) and averaged to create the measure of maternal negative emotionality, α = 0.77.

Family Socioeconomic Status (SES)

Mothers reported on their and their partners’ education and employment, and total family income before taxes. Mother education and father education were re-scaled from 1 (did not complete high school) to 5 (attained graduate or professional degree). Family income was re-scaled from 1 ($0-$10,000) to 7 ($75,000+). The highest occupational prestige of parent employment also was used. A principal components analysis supported a single-factor solution for these four measures, eigenvalue = 2.23, 56 % of variance accounted for. Scores were z-transformed and then averaged to index family SES.
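The final step, z-transforming the four indicators and averaging them, might look like the sketch below. The family values are invented, the function names are hypothetical, and the use of the sample standard deviation is an assumption; the principal components analysis is not reproduced here.

```python
import statistics


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def ses_composite(indicators):
    """Average the z-scored indicators within each family."""
    standardized = [z_scores(column) for column in indicators.values()]
    return [statistics.fmean(row) for row in zip(*standardized)]


# Hypothetical data for four families (illustrative only).
indicators = {
    "mother_education": [1, 3, 4, 5],     # 1-5 scale
    "father_education": [2, 3, 3, 5],     # 1-5 scale
    "family_income": [1, 4, 5, 7],        # 1-7 scale
    "occupational_prestige": [16, 45, 52, 78],
}
ses = ses_composite(indicators)
```

Because each indicator is standardized before averaging, measures on different scales (1 to 5, 1 to 7, and prestige ratings) contribute equally to the composite.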

Analyses

Structural equation modeling (SEM) was used to predict IPs at Time 2. Separate models were examined for the prediction of mother- and teacher-reported IPs, in order to examine context effects, and because the significant but low correlation between reporters (see Table 1) did not justify constructing a latent score. These models are analogous to multiple regression models, with IPs predicted by a set of main effect and interaction terms. Two models were examined, the first with behavioral inhibition as the dispositional moderator, and the second with arrival cortisol as the moderator (controlling inhibition). All analyses were performed using Mplus version 6.0 (Muthén and Muthén 2011). The SEM framework offered several benefits for the current analyses, including (a) relaxed assumptions for missing data, (b) tests for differences between samples, and (c) statistical adjustment for violation of distributional assumptions.

Table 1 Descriptive statistics and correlations between variables

The maximum likelihood estimation procedure implemented within SEM accommodates data missing at random (Allison 2003; Enders and Bandalos 2001), a less stringent assumption than missing completely at random needed for multiple regression (Little and Rubin 2002). Missingness of data ranged from 0.5 % (n = 373) to 22.9 % (n = 289). The SEM approach maintained the sample size of 375 and minimized bias in parameter estimates due to missingness (Muthén et al. 1987). All models were constructed to first account for covariates, then main effects of risk factors, then two-way interactions, and finally three-way interactions. The models predicting mother-reported IPs at Time 2 included mother-reported IPs at Time 1, such that the development of problems over time was predicted. Because mother-reported problems at Time 1 were not significantly associated with teacher-reported problems at Time 2 (see Table 1), the earlier score was not included in the teacher models. Effects that reached traditional significance (p < 0.05) were interpreted. Three-way interactions that approached this level (0.05 < p < 0.10) were examined with caution, to ascertain if there was an interpretable two-way moderation effect for either gender. Regions of significance (ROS) analyses were used to examine these interactions (Preacher et al. 2006). Paralleling other recent developmental analyses (e.g., Kochanska et al. 2011), we examined whether the projected values of IPs for the dispositional moderator variable differed significantly at 2 SD above and below the central value of the environmental predictor variable. Support for the differential susceptibility model would require the slopes of the moderator to have diverged significantly both above and below the central tendency of the predictor; support for the diathesis-stress model would require that the slopes diverged significantly only at the “risk” end of the predictor (e.g., +2 SD for negative parenting, or −2 SD for SES).
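The comparison at ±2 SD and the decision rule for distinguishing the two models can be sketched in code. This is a simplified illustration under stated assumptions, not the authors' Mplus procedure: it assumes a linear interaction model with a standardized predictor, a moderator contrasted at −1 and +1 SD, and a known sampling covariance for the moderator and interaction coefficients. All coefficient values and function names are hypothetical.

```python
import math


def group_difference_z(b_mod, b_int, var_mod, var_int, cov_mod_int, x,
                       z_low=-1.0, z_high=1.0):
    """z statistic for the gap in projected outcome between high and low
    moderator groups at predictor value x."""
    gap = z_high - z_low
    difference = (b_mod + b_int * x) * gap
    se = abs(gap) * math.sqrt(var_mod + 2.0 * x * cov_mod_int + x ** 2 * var_int)
    return difference / se


def two_tailed_p(z):
    """Two-tailed p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def classify(p_risk_end, p_advantaged_end, alpha=0.05):
    """Apply the decision rule described in the text."""
    if p_risk_end < alpha and p_advantaged_end < alpha:
        return "differential susceptibility"
    if p_risk_end < alpha:
        return "diathesis-stress"
    return "neither"


# Hypothetical estimates; here +2 SD is the risk end (e.g., negative parenting).
coefficients = dict(b_mod=0.10, b_int=0.09, var_mod=0.0025, var_int=0.0016,
                    cov_mod_int=0.0)
z_risk = group_difference_z(x=2.0, **coefficients)
z_advantaged = group_difference_z(x=-2.0, **coefficients)
pattern = classify(two_tailed_p(z_risk), two_tailed_p(z_advantaged))
```

In this invented example the groups diverge only at the risk end of the predictor, which is the pattern the text treats as support for the diathesis-stress model. For a predictor such as SES, where the risk end is −2 SD, the two arguments to `classify` would be swapped.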

Via multiple group analysis, the SEM approach facilitated tests for differences between path coefficients (i.e., regression coefficients) across the three samples; the fit of a model with estimated paths for each sample is compared to the fit of a model that places constraints on those paths across samples (Meredith 1993). If the models do not differ significantly, then relationships between Time 1 predictor variables and Time 2 IPs do not vary greatly across samples and may be constrained to be equal without loss of generalizability. Alternatively, if the models differ significantly then the less constrained model (i.e., the model with estimated paths for each sample) is a more appropriate summary of the data (see Bollen 1989).
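The nested-model comparison described above rests on a chi-square difference test. The sketch below computes the upper-tail probability of a chi-square statistic using only the standard library and applies it to two Δχ² values reported later in the Results. It shows the unscaled difference test; with robust maximum likelihood estimation a scaling correction is normally applied to the difference, and the text does not say which form was reported, so this should be read as an approximation. The function name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate, computed as the
    regularized upper incomplete gamma function Q(df/2, x/2)."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    log_prefactor = a * math.log(half_x) - half_x - math.lgamma(a)
    if half_x < a + 1.0:
        # Series expansion for the lower incomplete gamma function.
        n, term = a, 1.0 / a
        total = term
        for _ in range(1000):
            n += 1.0
            term *= half_x / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction (modified Lentz method) for the upper tail.
    tiny = 1e-300
    b = half_x + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_prefactor)


# Values reported in the Results: constrained vs. unconstrained models.
p_mother_inhibition = chi2_sf(72.2, 62)   # reported as p = 0.18
p_teacher_cortisol = chi2_sf(96.29, 48)   # reported as p < 0.01
```

A non-significant difference, as in the first case, is what licenses constraining the paths to be equal across samples; a significant difference, as in the second, favors the less constrained model.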

Finally, a simple extension of SEM allows observed variables to deviate from multivariate normality (e.g., gender). Violations of this assumption tend to inflate estimates of the χ² fit statistic and underestimate standard errors for parameter estimates (Satorra 1990). Robust maximum likelihood was chosen as the estimation procedure (for details, see Satorra 1992), which accounts for biases due to non-normality and high model complexity (Satorra and Bentler 1994), but maintains flexibility for analyzing missing data and multi-group analysis.

Prior to analysis, all observed variables were grand-mean centered to preserve differences in means between the samples and aid with interpretation of interaction effects (Aiken and West 1991, pp. 31–34). Interaction variables were formed via products of corresponding main effect variables. Within the analysis, each independent variable (for both main and interaction effects) was indicated by a single latent variable with variance constrained at 1. The indicators were estimated for these variables, which approximately equal the observed standard deviation of the respective variable. The dependent variable was also indicated by a single latent variable, but its loading was constrained to be equal to the observed standard deviation of IPs for its given sample. The separation of the standard deviations (factor loadings) from the correlations (path coefficients) allowed us to explicitly test whether the relationship between the independent variables and the IPs differed across samples. All independent variables had estimated direct effects (i.e., regression coefficients) on the dependent variable, and all predictor variables were allowed to covary. The latent variable variance of the dependent variable was estimated, and one minus that estimate represents a pseudo-R² value for variance in IPs accounted for.
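Grand-mean centering and the construction of product terms can be illustrated with a small sketch. The scores are invented and the variable and function names are hypothetical. The point of the example is that centering on the pooled mean, rather than on each sample's own mean, leaves between-sample mean differences intact.

```python
import statistics

# Hypothetical scores for two families in each of three samples (illustrative only).
inhibition = {"sample1": [3.0, 4.0], "sample2": [5.0, 6.0], "sample3": [2.0, 4.0]}
parenting = {"sample1": [2.0, 3.0], "sample2": [2.0, 1.0], "sample3": [4.0, 6.0]}


def grand_mean_center(by_sample):
    """Subtract the mean of the pooled data from every score."""
    pooled = [v for values in by_sample.values() for v in values]
    grand_mean = statistics.fmean(pooled)
    return {name: [v - grand_mean for v in values]
            for name, values in by_sample.items()}


inhibition_c = grand_mean_center(inhibition)
parenting_c = grand_mean_center(parenting)

# Interaction term: product of the two centered main-effect variables.
interaction = {
    name: [i * p for i, p in zip(inhibition_c[name], parenting_c[name])]
    for name in inhibition_c
}

# Sample means after centering still differ, reflecting real sample differences.
sample_means = {name: statistics.fmean(v) for name, v in inhibition_c.items()}
```

Here sample 2, which stands in for a sample selected for elevated inhibition, retains an above-zero mean after centering, whereas centering within each sample would have set all three means to zero.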

Results

Descriptive Statistics

Descriptive statistics and zero-order correlations are reported in Table 1. Mothers reported more IPs at Time 2 for boys than for girls, for children who were more behaviorally inhibited and had more IPs at Time 1, when mothers had reported more negative parenting and negative emotionality, and when families had lower SES. Lower SES at Time 1 also was correlated with more teacher-reported IPs at Time 2. Arrival cortisol was not significantly correlated with IPs.

Predictions on Internalizing Problems at Time 2

Preliminary Analyses

Previous examinations showed that contemporaneous relations among the variables at Time 1 were largely consistent across the three constituent samples (Mills et al. 2012). Parallel examinations of the data were conducted to determine if the relations between the Time 1 predictor variables and the Time 2 IPs scores were consistent across the samples. This was done by comparing models that allowed free estimation of indicator variables and path coefficients as a baseline model to a model that constrained path coefficients and indicator variables across samples. Considering the χ², RMSEA and CFI fit statistics collectively, the unconstrained models did not fit the data significantly better than the constrained models for three analyses: prediction of mother-reported IPs from behavioral inhibition and the other factors and control variables (Δχ² = 72.2, df = 62, p = 0.18; CFI = 0.99; RMSEA = 0.04; 90 % CI for RMSEA = [0.00, 0.07]); prediction of teacher-reported IPs from behavioral inhibition and the other factors and control variables (Δχ² = 60.40, df = 42, p = 0.03; CFI = 0.98; RMSEA = 0.06; 90 % CI for RMSEA = [0.02, 0.09]); and prediction of mother-reported IPs from arrival cortisol and the other factors and control variables (Δχ² = 85.42, df = 65, p = 0.05; CFI = 0.99; RMSEA = 0.05; 90 % CI for RMSEA = [0.01, 0.08]) (see Footnote 1). These preliminary analyses indicated that variability did not differ significantly across samples, and that it was reasonable to examine the significant effects within these models.

This was not the case in the remaining model. In the prediction of teacher-reported IPs from arrival cortisol, placing constraints on indicators and path coefficients across samples led to significantly decreased fit (Δχ² = 96.29, df = 48, p < 0.01). This result indicated that effects in the unconstrained model were not robust across samples and may not be generalizable. In addition, when constraints were applied, there were no novel significant effects beyond those observed in the prior models. Thus, this model was not examined further.

Behavioral Inhibition and Mother-Reported Internalizing Problems

The model predicting mother-reported IPs at Time 2 with behavioral inhibition as the moderating dispositional variable is presented in Table 2. Significant main effects were detected for IPs at Time 1 (stability), gender (more problems for boys), negative parenting (positive), and maternal negative emotionality (positive). Inhibition significantly moderated the prediction of IPs from negative emotionality (Fig. 1) and SES, and the latter effect was further moderated by gender (Fig. 2). Two additional three-way interactions were observed: Gender × Inhibition × Negative Parenting (Fig. 3), and Gender × Inhibition × Time 1 IPs (Fig. 4). Simple slopes analysis was used to probe these interactions at ±1 SD of inhibition (Aiken and West 1991).

Table 2 Models predicting mother-reported and teacher-reported internalizing problems at Time 2 with behavioral inhibition as the moderator of environmental factors

Fig. 1 Behavioral inhibition moderated the prediction of Time 2 mother-reported internalizing problems from maternal negative emotionality

Fig. 2 Behavioral inhibition and gender moderated the prediction of Time 2 mother-reported internalizing problems from socio-economic status

Fig. 3 Behavioral inhibition and gender moderated the prediction of Time 2 mother-reported internalizing problems from negative parenting

Fig. 4 Behavioral inhibition and gender moderated the stability of mother-reported internalizing problems from Time 1 to Time 2

Considering Fig. 1, it was only when inhibition was higher that maternal negative emotionality predicted more IPs (b = 0.23, p < 0.01); the slope was non-significant at low levels of inhibition (b = 0.05, ns). The regions of significance (ROS) analysis showed that children with low versus high inhibition would have significantly different IPs at 2 SD above the mean for negative emotionality (z = −2.32, p < 0.05), but not at 2 SD below (z = 1.29, p = 0.20). This effect was more consistent with the diathesis-stress model than the differential susceptibility model.
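To make the simple slopes logic concrete, the sketch below treats the Fig. 1 effect as a plain two-way interaction, in which the slope of maternal negative emotionality at a given level of inhibition equals the predictor term plus the interaction term multiplied by inhibition (in SD units from its mean). This is a simplification that ignores the other terms in the full model, and the function and variable names are ours. The two terms are backed out from the slopes reported above at ±1 SD of inhibition.

```python
def simple_slope(b_pred, b_inter, z):
    """Slope of the predictor when the moderator sits z SDs from its mean."""
    return b_pred + b_inter * z

def recover_terms(slope_high, slope_low, z=1.0):
    """Back out the predictor and interaction terms from slopes probed at +z and -z SD."""
    b_pred = (slope_high + slope_low) / 2.0          # slope at mean inhibition
    b_inter = (slope_high - slope_low) / (2.0 * z)   # change in slope per SD of inhibition
    return b_pred, b_inter

# Slopes reported for Fig. 1: b = 0.23 at +1 SD and b = 0.05 at -1 SD of inhibition.
b_pred, b_inter = recover_terms(0.23, 0.05)

for z in (-1.0, 0.0, 1.0):
    slope = simple_slope(b_pred, b_inter, z)
    print(f"inhibition at {z:+.0f} SD: slope of negative emotionality = {slope:.2f}")
```

Under these assumptions, the slope at mean inhibition is the midpoint of the two reported slopes, and each SD of inhibition shifts the slope by half their difference.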

Considering Fig. 2, it was only for more highly inhibited girls that family SES was negatively predictive of IPs (b = −0.34, p < 0.01); the slopes were non-significant for less inhibited girls (b = 0.09, ns), and both more (b = −0.10, ns) and less (b = 0.01, ns) inhibited boys. The ROS analysis showed that girls with low versus high inhibition had significantly different IPs at 2 SD above (z = 3.35, p < 0.01) and 2 SD below (z = −3.81, p < 0.001) the mean for SES. Neither test was significant for boys (both p > 0.15). For girls only, this effect was more consistent with the differential susceptibility model than the diathesis-stress model.
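The interpretive rule applied to the ROS results for Figs. 1 and 2 can be sketched as a small decision function. This is our simplified reading of that logic, not a formal procedure from the analyses: a group difference that is significant only at the adverse end of the environmental variable is labeled diathesis-stress, whereas significant differences at both ends that reverse direction are labeled differential susceptibility. The function names, the two-tailed α of 0.05, and the "unclassified" fallback are assumptions for illustration.

```python
import math

def two_sided_p(z):
    """Two-tailed p value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def classify(z_adverse, z_favorable, alpha=0.05):
    """Label an interaction from ROS tests at the adverse and favorable ends."""
    sig_adverse = two_sided_p(z_adverse) < alpha
    sig_favorable = two_sided_p(z_favorable) < alpha
    crossover = z_adverse * z_favorable < 0  # group difference reverses direction
    if sig_adverse and sig_favorable and crossover:
        return "differential susceptibility"
    if sig_adverse and not sig_favorable:
        return "diathesis-stress"
    return "unclassified"

# Fig. 1: the adverse end is +2 SD of maternal negative emotionality.
fig1 = classify(z_adverse=-2.32, z_favorable=1.29)
# Fig. 2 (girls): the adverse end is -2 SD of SES.
fig2_girls = classify(z_adverse=-3.81, z_favorable=3.35)

print("Fig. 1:", fig1)
print("Fig. 2 (girls):", fig2_girls)
```

Applied to the reported z values, the rule reproduces the interpretations given in the text for both figures.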

Considering Fig. 3, although experiencing more negative parenting was associated with having more IPs at Time 2 for all children, the slopes were significant only for more inhibited boys (b = 0.22, p < 0.01) and less inhibited girls (b = 0.19, p < 0.05). The comparative slopes for less inhibited boys and more inhibited girls were b = 0.13 and 0.11, respectively. However, ROS analyses showed that less versus more inhibited girls and boys did not differ in projected IPs either at 2 SD above or 2 SD below the mean of negative parenting.

Finally, considering Fig. 4, IPs were stable for all children, but more so for highly inhibited girls (b = 0.38, p < 0.01) than for less inhibited girls (b = 0.23, p < 0.01). The stability of IPs was comparable for more (b = 0.29, p < 0.01) and less (b = 0.32, p < 0.01) inhibited boys.

Behavioral Inhibition and Teacher-Reported Internalizing Problems

The model predicting teacher-reported IPs at Time 2 with behavioral inhibition as the moderating dispositional variable is presented in Table 2. Significant main effects were detected for inhibition (t = 2.34, p < 0.05) and SES (t = −2.10, p < 0.05). There was also one significant three-way interaction: Gender × Inhibition × Negative Parenting (t = −1.96, p = 0.05). Although the strongest association between negative parenting and teacher-reported IPs was for more inhibited girls (b = 0.09, p = 0.30), neither this slope nor the others (all b ≤ 0.06) were significant. The ROS analysis showed that less inhibited girls had fewer teacher-reported IPs than more inhibited girls when negative parenting was 2 SD above the mean (z = −1.66, p < 0.10); no other tests were significant (all p > 0.10). This effect offered marginal support for the diathesis-stress model, for girls only; however, it should be interpreted with caution.

Table 3 Model predicting mother-reported internalizing problems at Time 2 with arrival cortisol moderating environmental factors

Arrival Cortisol and Mother-Reported Internalizing Problems

The model predicting mother-reported IPs at Time 2 with arrival cortisol as the moderating variable is presented in Table 3. Lower arrival cortisol tended to predict more IPs, and cortisol was involved in two interactions: Gender × Arrival Cortisol × Negative Parenting, and Gender × Arrival Cortisol × Time 1 IPs (see Figs. 5 and 6, respectively).

Fig. 5 Cortisol levels and gender moderated the prediction of Time 2 mother-reported internalizing problems from negative parenting

Fig. 6 Cortisol levels and gender moderated the stability of mother-reported internalizing problems from Time 1 to Time 2

Examining Fig. 5, the prediction of IPs from negative parenting was significant only for boys with higher arrival cortisol (b = 0.25, p < 0.01) and girls with lower arrival cortisol (b = 0.20, p < 0.01). The comparative slopes for boys with lower and girls with higher arrival cortisol were b = 0.13 and 0.04, respectively. The ROS analysis showed that girls with lower arrival cortisol had more mother-reported IPs than girls with higher arrival cortisol when negative parenting was 2 SD above the mean (z = 1.64, p = 0.10); no other tests were significant (all p > 0.15). This effect offered marginal support for the diathesis-stress model, for girls only.

Considering Fig. 6, girls with higher arrival cortisol had the most stable IPs (b = 0.38, p < 0.01), and girls with lower arrival cortisol the least (b = 0.26, p < 0.01); the pattern was opposite but less extreme for boys with lower (b = 0.35, p < 0.01) versus higher (b = 0.27, p < 0.01) cortisol.

Discussion

This examination of multilevel bioecological models of the development of IPs showed that several dispositional and environmental factors contributed to the stability or exacerbation of IPs over the transition from preschool- to school-age. Although it required relying upon maternal reports for most measures, aggregating across independent samples provided sufficient power to detect moderation effects that have eluded some prior efforts (e.g., Bayer et al. 2010), and the replication of these effects across the samples should increase confidence that they are robust (Meredith 1993). High inhibition and low SES predicted more problems across the home and school contexts, suggesting this dispositional vulnerability and this environmental stressor had broad impacts on development. Notably, inhibition and HPA reactivity moderated most links between earlier environmental factors and later mother-reported IPs. The diathesis-stress model was consistent with several of the interaction effects, while clear support for the differential susceptibility model was limited to one effect. The role of gender in these models was prominent; only one disposition-environment interaction was not further moderated by gender. In addition, both inhibition and HPA reactivity influenced the stability of girls’ IPs over time in ways that supported the Zahn-Waxler et al. (2000, 2008) model of gender and developmental psychopathology, but dispositionally vulnerable boys seemed more prone to the adverse effects of negative parenting (Coplan et al. 2007). Thus, support was found for multiple perspectives on IPs, reflecting the principle of equifinality (Cicchetti 2008). By revealing how multiple factors across multiple domains jointly contribute to the development of IPs, these findings may help to refine efforts to identify which children are most in need of early intervention efforts, and what aspects of child and family functioning such efforts should target.

Multilevel Predictors of Internalizing Problems

Inhibition, Gender and Development

As have others (Hastings et al. 2014; Kochanska et al. 2011), we found blended support for differential susceptibility and diathesis-stress models. Diathesis-stress most clearly characterized the observation of elevated IPs in children who had been highly inhibited preschoolers with more emotionally negative mothers (Fig. 1). In the absence of maternal negative emotionality, more inhibited children had low levels of IPs, comparable to those of less inhibited children regardless of maternal emotionality. Some studies within early childhood have found that the diathesis-stress model characterizes how inhibited or shy temperament moderates the link between maternal negative affect and child anxiety or IPs (Coplan et al. 2008), but few longitudinal studies into middle-childhood have replicated that pattern (Bayer et al. 2010). Several mechanisms could account for our finding that highly inhibited children were vulnerable to mothers’ negative emotionality. Emotionally negative mothers might model maladaptive patterns of emotional coping, they could undermine inhibited children’s attachment security, or they may generate a rejecting or conflicted home context (Hopkins et al. 2013; Mills et al. 2012). Lacking both an internal source of regulation in the face of social novelty, and an external model and source of effective emotion regulation, inhibited and unsupported young children would not be prepared to adapt to the contexts of middle childhood. Unable to meet the increased demands for autonomy and social competence, they would be prone to showing more of the anxiety and depression symptoms comprising IPs.

Support for the differential susceptibility model was seen in the moderating influence of girls’ behavioral inhibition on the link between SES and IPs (Fig. 2). Highly inhibited girls had more IPs when families had fewer resources, but fewer problems when families were more advantaged. IPs are most prevalent in children who live with the chronic and pervasive stresses that are typical of lower socioeconomic contexts (Reising et al. 2013). That inhibited young girls were most prone to these adverse effects was not surprising, as both girls and wary children are particularly likely to manifest their reactions to stress through anxiety and depression (Zahn-Waxler et al. 2008). However, inhibited girls also appeared uniquely likely to benefit from the protective aspects of higher SES. The reticence of inhibited girls to engage with unfamiliar social contexts might lead them to spend more time at home. If that home is provided by well-educated parents with well-paying, high-prestige jobs, they would likely be engaging with a safe and enriched setting. This could provide emotional security and cognitive stimulation, guiding the girls away from worsening IPs over time. From the evolutionary perspective of the differential-susceptibility model (Ellis et al. 2011), therefore, behavioral inhibition might have remained a prevalent trait because it made girls more receptive to the advantages of a privileged home, despite leaving them vulnerable to distress if they were raised in relative hardship.

Mothers who reported the most negative parenting practices also reported the most IPs in their children almost 5 years later, particularly for highly inhibited boys and less inhibited girls (Fig. 3). Curiously, neither the diathesis-stress nor the differential susceptibility model was clearly reflected in this interaction. In the absence of negative parenting, more inhibited boys had fewer IPs, comparable to those of less inhibited boys, but inhibited boys did not have significantly more IPs at high levels of negative parenting. Considered in light of the diathesis-stress pattern of the interaction between inhibition and maternal negative emotionality, this finding could suggest that inhibited boys are sensitive to mothers’ parenting behaviors, and in particular, benefit from more positive parenting. Being shy and fearful is less gender-typical for boys than girls, and parents react more negatively to boys’ displays of emotional vulnerability (Cassano et al. 2007). Inhibited boys who do not experience such negative, rejecting reactions at home, but rather, feel accepted and supported despite their reticence, may be protected from developing more serious difficulties.

Adding Context

The finding that negative parenting predicted mother-reported IPs for less inhibited girls reflected a positive outcome of the absence of risk factors that was consistent with our multilevel bioecological model, and was almost the mirror-image of a diathesis-stress effect. The fewest IPs were seen in less inhibited girls with mothers who reported more positive parenting. The regions of significance analysis did not offer robust support for this divergence from more inhibited girls, but there was a complementary effect in the prediction of teacher-reported IPs. More inhibited girls tended to have more IPs at school than did less inhibited girls when mothers reported more negative parenting practices, whereas the teacher-reported IPs of more and less inhibited girls did not differ when mothers reported more positive parenting.

Considered together, these effects might reflect the importance of both the context for girls’ expression of their vulnerabilities, and the perspective of the informant for understanding that expression (Hastings et al. 2014). Mothers likely base their evaluations of their daughters’ problems on what they see when they are together, which inhibited girls may experience as safe contexts due to the presence of their mothers. Most mothers have limited experience with children from other families to learn what is “normal,” and their gender-normed expectations might cause them to interpret the IPs of inhibited girls as “typically feminine” (Zahn-Waxler et al. 2000, 2008). However, teachers see children’s adjustment being displayed at school, away from parental accompaniment, and have a larger basis of comparison to distinguish average from elevated levels of anxiety and depression. Thus, positive-parenting mothers may particularly note the lack of IPs in uninhibited daughters while negative-parenting mothers discount the presence of IPs in inhibited girls as typically feminine. Teachers may be more sensitive to the presence of IPs in highly inhibited girls who have experienced negative parenting – that is, again, in children who are lacking both internal and external sources of effective regulation. It is important to note that this interpretation is highly speculative. Both of these interaction effects for girls involved a mix of significant, marginal and non-significant tests, such that the findings should be interpreted with caution pending replication.

Cortisol, Gender and Development

Because our measure of inhibition was based exclusively on maternal reports, caution should be exercised when interpreting it as a dispositional measure. Thus, it was important for us to examine the extent to which multilevel patterns of risk for IPs were evident when an objective measure of physiological regulation was considered, and an intriguing pattern emerged from looking at children’s adrenocortical reactivity to meeting adult strangers. Counter to predictions, preschoolers with lower cortisol levels tended to have more IPs in middle childhood. In a recent study of 8- to 11-year-old boys, Tyrka et al. (2012) showed that weaker cortisol responses to a home visit challenge were associated with having more internalizing symptoms 2 years later. Mounting an appropriate HPA response to a developmentally meaningful challenge is essential for adaptive coping. Blunted HPA reactivity is thought to reflect allostatic load and has been linked to internalizing difficulties in children, youth and adults (Calhoun et al. 2012; Gunnar and Adam 2012; Hastings et al. 2011b).

Closely paralleling our finding with inhibition, negative parenting predicted mother-reported IPs most strongly in boys with higher cortisol levels and girls with lower cortisol levels (Fig. 5). This was partially driven by a protective effect. Boys with elevated cortisol had few problems when they experienced more positive parenting, but with more negative parenting from mothers, their IPs were only as high as those of boys and girls with weaker HPA responses. When mothers had reported highly negative parenting, girls with lower cortisol levels had more IPs than girls with stronger HPA responses. If muted HPA reactivity to normal challenges reflects a dispositional vulnerability, then this could be seen as support for a diathesis-stress model for girls. Alternatively, if both very low reactivity and very high reactivity reflect different patterns of atypical HPA responses, then examining the patterns across girls and boys could be seen as consistent with a goodness-of-fit model (Wachs and Gandour 1983). Heightened HPA reactivity to social challenges may be adaptive for boys with supportive mothers, whereas muted reactivity may be maladaptive for girls with punitive and critical mothers.

Gender and the Stability of Internalizing Problems

Two effects provided evidence of multilevel processes within the dispositional domain that are consistent with hypotheses about gender differences in the development of psychopathology. Considerable theory and research point to a stronger propensity for girls to follow trajectories toward IPs (Zahn-Waxler et al. 2000, 2008). We did not find that girls had more IPs than boys, which might have been attributable to the characteristics of the samples in this study, or to the fact that problems were measured prior to adolescence when the rates of internalizing disorders increase more markedly in girls. Our previous examinations (Hastings et al. 2011a; Mills et al. 2012) showed that preschool-aged boys and girls did not differ in the dispositional and environmental factors we assessed. Yet, boys and girls differed in the chances of their IPs being maintained or exacerbated over time, depending on the presence of multilevel influences. IPs were most stable for highly inhibited girls (Fig. 4), and for girls with lower cortisol after meeting adult strangers (Fig. 6). Intriguingly, the latter effect replicates a recent observation in a longitudinal study of adolescent girls (Calhoun et al. 2012). These findings are consistent with models of gender as a canalizing factor, by which girls and boys manifest divergent patterns of maladjustment despite similar etiological risk factors (Zahn-Waxler et al. 2008). It is intriguing, though, that elevated inhibition and blunted HPA activity were not significantly associated in the preschoolers. This again points to the principle of equifinality, with temperament and HPA activity being distinct aspects of self-regulation that initiate two converging paths toward stable IPs in girls.

Limitations

The findings of this study should be evaluated in the context of its limitations. Without a true baseline measure of salivary cortisol having been obtained, the extent to which the “arrival cortisol” sample reflected reactive versus basal cortisol levels cannot be assessed; our findings will need to be replicated in studies that include pre- and post-challenge cortisol assays. Mothers were the sole source of information for all Time 1 behavioral data. Although convergent validity for mother-reported measures of behavioral inhibition and negative parenting was demonstrated in two samples, it will be important to replicate these findings with measures derived from independent sources. Similarly, although having teacher-reported IPs improved the situation at Time 2, it will be important for future work to include children’s self-reported internalizing difficulties. Although there were significant unique predictors in the model predicting teacher-reported IPs, the overall model was not strong and these associations need to be replicated. This study considered many dispositional and environmental factors that contribute to internalizing trajectories, but not an exhaustive set; as children develop, there are likely to be other influences that could moderate or mediate the current findings, such as peer experiences. Differences in the procedures and measures across the three constituent samples may have produced measurement error that was higher than would have been produced had identical measures been used, which would lead to underestimating the actual relations. However, as the predictive models did not differ significantly across the samples, there should be confidence in the effects that were successfully identified and the findings that were interpreted.

Conclusions

This investigation provided mixed evidence for both diathesis-stress and differential-susceptibility effects in the development of IPs, as well as distinct profiles of developmental risk for girls and boys. The findings highlight the diversity of complex multilevel effects that shape the development of IPs. Clearly there is a greater likelihood of such problems when more dispositional and environmental risk factors are present, and when they occur in combination. More attention needs to be given to identifying: (1) which dispositional factors are vulnerabilities, or susceptibilities, or resiliencies and advantages, (2) which environmental factors and which contexts set the stage for these dispositional factors to express their developmental potential, (3) how the combined effects of dispositional and environmental factors are evidenced in adaptive functioning, as well as problems, and (4) when in the course of maturation these effects become evident. Furthering such efforts will increase our understanding of the numerous paths leading children to manifest internalizing difficulties, and the numerous ways in which those paths can be redirected toward greater well-being. The fact that multiple dispositional and environmental factors lead toward children’s development of internalizing difficulties suggests that there also are multiple potential points of intervention that could be leveraged. Social support and income assistance programs, courses in effective parenting, and coping skills and self-regulation training for children may all be effective mechanisms of change to foster healthy development and to help lead children away from depression and anxiety problems and toward greater socio-emotional competence and overall well-being.