Introduction

A major theme in studies of development is the possibility that there may be lasting effects of early experiences on psychological outcomes (O’Connor 2006). The question of whether or not the individual returns to normal development following negative life events or restricted periods of family adversity has challenged researchers for years (Campbell 1995; Strohschein 2005). The answer is of central theoretical as well as practical significance for early intervention and prevention efforts (Dawson et al. 2000).

This paper explores change in both undercontrolled behavior problems (the core categories of externalizing items except aggressive or destructive behaviors) and internalizing problems across early childhood, identifying predictive factors accounting for initial problem status at age 18 months, time to time change within the developmental period (18 months to 2.5 years and 2.5 years to 4.5 years) and long term change over the entire developmental period (18 months to 4.5 years). While a number of studies have examined problem behaviors among children from three years of age onwards, there is less knowledge about problem behaviors among younger children and possible persisting effects of early experiences from this age onwards on multiple outcomes, with few relevant datasets available. A consequence is that developmental pathways from 1–2 years of age onwards are under-explored (Colder et al. 2002; Leve et al. 2005; Miner and Clarke-Stewart 2008; Lavigne et al. 1998; Egger and Angold 2006). Knowledge about such pathways is, however, of vital importance in order to develop effective prevention and early intervention measures.

Early Onset and Predictability of Problem Behaviors

Between 10% and 20% of preschoolers are typically found to have problem behaviors of sufficient severity to affect their daily lives (Sonuga-Barke et al. 1997; Verhulst and Van der Ende 1995; Achenbach and Rescorla 2000). Approximately one third of these (4–7% of the child population) appear to have serious problems (Richman et al. 1982; Smart et al. 1996). Results from longitudinal studies indicate that serious problems in childhood have high predictability. Between 40% and 60% of the children with high problem levels at 3–4 years of age continue to have problems at 10 years of age (Prior et al. 1992; Kooy and Verhulst 1992). Data on age-related changes, however, suggest that many challenging or difficult child behaviors (as defined by adult caregivers) are age-appropriate, reflecting developmental change or age-related conflict and frustration (Campbell 1995; Mathiesen and Sanson 2000). This makes it difficult to identify which problem behaviors present at 1–2 years of age remain problematic into the preschool period and to identify factors associated with stability and change from this age onwards.

The Nature of Early Childhood Problem Behaviors

Two broad-band categories of childhood problem behaviors have emerged from research across varying assessment instruments, samples and analytic procedures: externalizing problems (aggression, non-compliance, hyperactivity, and concentration problems); and internalizing problems (depression, anxiety, fearfulness and social withdrawal) (Achenbach et al. 1991). Previous analyses of the current sample of Norwegian children measuring internalizing and undercontrolled problems (the undercontrolled problems measure captures oppositional, irritable, inattentive and overactive behaviors, but does not include symptoms of aggression) showed that these two dimensions could be differentiated at 18 and 30 months (Mathiesen and Sanson 2000). Comorbidity between externalizing and internalizing problem behaviors is, however, high throughout childhood and adolescence, and several review articles have demonstrated the importance of comorbidity for understanding the etiology and course of behavior problems (Angold et al. 1999; Rutter 1997). Much less is known about comorbidity during early childhood. In a large-scale epidemiological diagnostic investigation of preschoolers aged 2 through 5 years, Lavigne and colleagues report that a quarter of children with at least one psychiatric disorder had comorbid disorders, defined as ‘a disruptive disorder comorbid with an emotional disorder or other disorders’ (Lavigne et al. 1996). The risk of having comorbid disorders was found to increase with each additional year from age 2 onwards.

A number of studies have indicated that, normatively, externalizing problems tend to decrease from the age of 2 years onward, while internalizing problems tend to increase (Gilliom and Shaw 2004; Achenbach et al. 1991). There is evidence that changes in one domain are associated with changes in the other (Keiley et al. 2003), and that early externalizing problems may elicit later internalizing problems. One explanation is that externalizing behaviors often lead to problematic social interactions. Gilliom and Shaw (2004) showed that high initial levels of externalizing problems were linked to an increase in internalizing problems over time in a sample of disadvantaged boys. We examined whether these relations held between internalizing and undercontrolled problems with the current population-based sample of boys and girls.

Besides the fact that the two problem dimensions follow different developmental pathways, the importance of differentiating between them is further indicated by research findings that the development of externalizing and internalizing problems might be related to causal factors that are unique to each problem behavior domain as well as to shared causal factors (Leve et al. 2005; Sanson et al. 2004; Gilliom and Shaw 2004). In fact, another possibility that does not seem to have been considered in the literature is antagonistic causal factors that increase one type of problem behavior and decrease the other. Antagonistic causal factors would contribute to distinctness in the sense of contributing a negative component to the overall correlation.

Risk Factors for Early Externalizing and Internalizing Problem Behaviors

We view psychological development as a result of a complex, dynamic, and transactional connection between the organism and its environment in line with basic assumptions underlying life span developmental models (O’Connor 2006). O’Connor emphasizes that the challenge for studies of the effects of early experiences on psychological development within such models is both to identify the kinds of environmental risk experiences that are associated with individual differences in adjustment across the life span and to account for the existence of individual differences in response to adverse environments. In addition, studies have to consider multiple outcomes to evaluate whether some children who do not develop particular types of problem behaviors are nevertheless vulnerable on other indices of adjustment. He argues that research hypotheses need to be tailored more to individual differences and the certainty of outcome diversity. These perspectives are underlined in the current study.

Given existing evidence on the prevalence, stability and consequences of problem behaviors in childhood, prevention and early intervention are clearly desirable. These require a good understanding of etiology, including the contribution of both environmental factors and intrinsic child factors. Findings have been quite consistent in indicating that individual temperament characteristics such as high levels of emotionality from the preschool years have clear prospective relationships to both externalizing and internalizing problem behaviors (Sanson et al. 2004; Najman et al. 2000). However, temperament has less frequently been studied as a predictor of change in problem behavior (Miner and Clarke-Stewart 2008; Owens and Shaw 2003). Maternal depression, parental discord, family stress and social isolation are external risk factors substantially related to the development of both internalizing and externalizing problem behaviors (Campbell 1995; Cummings and Davies 1994; Leve et al. 2005). Among these, maternal depression has been found to be a strong predictor of both problem dimensions as early as the second year of children’s lives (Campbell 1995; Mathiesen and Sanson 2000; Owens and Shaw 2003). Depressed mothers are less emotionally available for their children and their style of parenting is characterized by more criticism than is typical (Gilliom and Shaw 2004; Rutter 1990). McCarty and McMahon (2003) stress the importance of focusing also on mediating effects. They refer to Davies et al. (1999) who studied older children and found that marital quality mediated the effects of maternal depressive symptoms on child externalizing problem behaviors, whereas maternal depressive symptoms mediated the effects of marital quality on child internalizing problem behaviors. Environmental risk factors like family stress and social isolation are connected to both internalizing and externalizing problem behaviors at all ages (McCarty and McMahon 2003).

Despite having some etiological factors in common, including direct linkages, it is likely that temperament factors are more consistently and differentially related to internalizing and externalizing problem behaviors than environmental risk factors (Sanson et al. 2004; Gilliom and Shaw 2004). High temperamental shyness, fearfulness, emotionality, and inhibition to novelty consistently predict internalizing problems (Prior et al. 2000; Schwartz et al. 1999; Janson and Mathiesen in press). Externalizing problem behaviors, on the other hand, are related to high scores on temperamental emotionality and activity and low fearfulness (Lahey et al. 1999; Schwartz et al. 1999; Mathiesen and Sanson 2000). However, little is yet known about the factors which contribute to the development of internalizing and externalizing problem behaviors in very early childhood, and the degree to which the predictors are general (agonistic or antagonistic) or specific to each dimension from the early ages. The present study examined the role of child temperament characteristics, along with a set of family-related environmental risk factors, in predicting problem behaviors from 18 months to 4.5 years.

Predictors of Change in Problem Status

While there is significant predictability in problem behavior from early childhood to school age, there is also substantial change. Temperament and family environment are also found to relate to changes in problem behavior. Most studies, however, have used temperament or family factors assessed only at the initial time of measurement (Caspi and Silva 1995; Gilliom and Shaw 2004). This does not allow examination of whether time to time change in a predictor is associated with time to time change in the outcome. Of special interest is the question of whether maternal depression, parental discord and family stress in early life have continuing negative consequences for the child, even after the exposure is terminated, or if it is only when such adversities are currently present that they have an impact. Existing research is inconclusive. Focusing on hard-to-manage preschool children, Campbell and colleagues (Campbell 1995; Campbell et al. 1991; Campbell et al. 1994) found that problems were more likely to persist in the context of ongoing and concurrent family adversity (family stress and maternal depression). Others, like Strohschein (2005), using data from the Canadian National Longitudinal Survey of Children and Youth (NLSCY), argue that the experience of parental divorce followed by maternal depression in early life has long-standing repercussions for the child, even if the maternal depression is temporary. A central goal of the current study was to use latent growth modeling to shed light on the contribution of both initial risk and time to time change in risk, and to do so separately for internalizing and undercontrolled problems.

Whereas previous studies have only examined change in environmental risk factors, here we also examined the impact of changes in child temperament. Although temperament is relatively predictable over time, individual differences in change usually are significant (Sanson et al. 1996; Prior et al. 2000)—hence it is also useful to investigate the impact of changes in temperament over time.

The Current Study

The main goal is to focus on two related questions with substantial practical impact, namely whether there may be lasting effects of early experiences on psychological outcomes over and above changes in risk factors, and whether changing levels of risk factors predict changes in problem scores. Growth modeling was used to test how initial problem behavior (intercept) and changes in internalizing and undercontrolled problems across early childhood (slope) are affected by initial value and time to time changes in risk factors. The following specific issues were investigated:

  1. Growth and change in early childhood problem behaviors. We expected that, overall, there would be substantial individual differences in initial level (intercept variance) and change (slope variance) in both undercontrolled problems and internalizing problems from 18 months (t1) to 4.5 years (t3), and modest but significant positive relations across constructs, between the two initial level factors and between the two slope factors. We also anticipated modest decreases and increases in population (mean) levels of undercontrolled and internalizing problems respectively over this period. In line with Patterson and Capaldi’s failure model, we expected that a higher initial level of undercontrolled problem behaviors would predict a higher slope of internalizing problem behaviors but not the reverse.

  2. Risk factors for early childhood problem behaviors and persisting effects of early experiences on undercontrolled and internalizing problems at 4.5 years. We expected that risk factors present at 18 months would predict both initial levels and long term change in internalizing and undercontrolled problems over time (slope). Considering intrinsic child characteristics, temperamental emotionality was expected to predict both undercontrolled and internalizing problems, activity would predict undercontrolled problems only and shyness and low sociability would predict internalizing problems only. Overall, maternal symptoms of depression and anxiety were expected to predict both undercontrolled and internalizing problems, lack of partner support would predict undercontrolled problems only and family stress would predict internalizing problems only.

  3. The impact of changes in risk factors over the period from 18 months to 4.5 years. We expected that time to time change in risk factors would predict time to time change in undercontrolled and internalizing problems over and above effects of initial risk on initial problems and slope, during each of the sub-periods, 18 months (t1) to 2.5 years (t2) and 2.5 years to 4.5 years (t3).

Method

Sample and Procedure

Routinely, more than 95% of all Norwegian families with children attend a public health program eight to twelve times during the first four years of the child’s life. All families from 19 different geographic health care areas in eastern Norway that visited a child health clinic in 1994 for the scheduled 18-month (t1) vaccination visit were invited to complete a questionnaire. The families who answered at t1 received a similar questionnaire when the children were 2.5 years (t2) and 4.5 years old (t3). Of the 1081 eligible families, 939 (87%) participated at t1 (921 mothers and 18 fathers), 804 families at t2 (20 fathers), and 760 families at t3 (23 fathers). Only maternal questionnaires were used in the analyses, since the few fathers who filled in the questionnaires ordinarily did so on only one occasion and the correlation between mothers’ and fathers’ reports of child behaviors is ordinarily low. More than 95% of the families were ethnic Norwegians. Background information on the mothers who failed to respond at t1 was available at the child health clinic. Non-respondents did not differ significantly from respondents with respect to maternal age, education, employment status, number of children and marital status. In relation to sample attrition over time, there were only small and non-significant differences between the mothers who filled in questionnaires at all waves and the mothers who only participated in the first wave with respect to age, education, number of children, financial status, social support, chronic stress, negative life events, maternal mental health symptoms, and the child’s problem behavior scores. However, a somewhat larger proportion of the mothers who only participated at t1 were single (14% compared to 8% among mothers who remained in the sample). This is not likely to bias the sample in a substantial way because the difference is modest and statistical modeling was carried out using missing data estimation techniques that include cases with partial data to increase power and decrease potential attrition bias (Schafer and Graham 2002).

The 19 health care areas differed considerably and were overall representative of the diversity of social environments in Norway. The sample was almost evenly divided on gender, with 49% boys. The ages of the mothers ranged from 19 to 46 years at t1, with a mean of 30 years (SD = 4.7). At t1, 9% of the mothers were single. This increased to 10% at t2 and 13% at t3. In terms of education, 8% of the mothers had nine years of schooling or less, while 18% had a college or university education of four years or more. Maternal employment was evenly distributed into three categories: At t3, 33% of the mothers worked full-time outside the home (32% at both t1 and t2), 34% had part-time work (31% at t1 and 30% at t2), and 33% had no paid work (37% at t1 and 38% at t2). The index child was the only child in 22% of the families at t3. Of the background factors, only maternal education correlated with problem behavior and was included in the analysis.

Measures

The measures used are described in more detail in Mathiesen et al. (1999), and therefore only a brief description is included here. All English-language questionnaires were translated into Norwegian, using back-translation to check for accuracy.

Problem Behaviors

The Behavior Checklist (BCL) (Richman and Graham 1971) measures problems related to the child’s behavior and adjustment to family life. The scale consists of 19 questions covering 12 behavioral categories: eating, sleeping, soiling, dependency and attention seeking, relationships with siblings and peers, activity, concentration, control problems, temper, mood, worries and fears. The BCL includes no categories measuring aggression or destructive behaviors. Some of the behavioral categories are measured by one question, some by two and some by three questions. We added a question about sadness, since the BCL does not cover this important symptom of internalizing behavior. Each of the behavioral categories is rated 0, 1, or 2 where ‘0’ signifies no difficulties, ‘1’ indicates moderate difficulties, and ‘2’ substantial difficulties. The scores on the 12 behavioral categories are summed to produce a total BCL score.

Undercontrolled and Internalizing Problems

Factor analysis was used to identify dimensions of internalizing and undercontrolled problems from the BCL at t1, t2 and t3, respectively. Since our interest was specifically in internalizing and undercontrolled problem behaviors, the three categories thought to describe regulation problems were excluded from the analyses, and the category measuring relations to siblings and friends was also removed since such problems could be due to either internalizing or undercontrolled problems. A 2-factor (Varimax) analysis gave the best fit at all three waves. The items difficult to manage, irritable, temper tantrums, too active and poor concentration loaded on factor 1 which was labeled undercontrolled problems. Our measure of undercontrolled problems captures typical items like oppositional, irritable, inattentive and overactive behaviors, but does not include symptoms of aggression. The categories worried, sad and fearful loaded on factor 2 which was labeled internalizing problems. Factor scales were created by averaging the ratings on each item for each of these dimensions; the alpha coefficients for the undercontrolled problems factor scale with 5 items were 0.46 at t1, 0.50 at t2, and 0.47 at t3 with a mean corrected item-total correlation of 0.26 (varying between 0.17 and 0.41). The corresponding alphas for the internalizing problems scale with 3 items were 0.42, 0.48 and 0.49 with a mean corrected item-total correlation of 0.29 (varying between 0.22 and 0.43). Although the alphas are low, this is expected because of the small number of items in each scale. The average inter-item correlations are comparable to levels reported elsewhere for this type of scale. In addition, growth models allow partitioning error variance from true score variance in intercepts and slopes so this mitigates some concerns about bias due to measurement error.
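The dependence of alpha on scale length can be illustrated with the standardized form of Cronbach's alpha, which is a function only of the number of items and the mean inter-item correlation. The following is a minimal sketch; the function name and the inter-item correlation of 0.15 are hypothetical illustrations, not values reported for this sample.

```python
def standardized_alpha(k: int, mean_r: float) -> float:
    """Standardized Cronbach's alpha for k items whose
    mean inter-item correlation is mean_r."""
    if k < 2:
        raise ValueError("alpha requires at least two items")
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical mean inter-item correlation (an assumption for illustration).
r = 0.15
short_scale = standardized_alpha(5, r)   # a 5-item scale
long_scale = standardized_alpha(20, r)   # a 20-item scale with the same r
```

With the same hypothetical inter-item correlation, the 5-item scale yields an alpha below 0.50 while the 20-item scale yields an alpha above 0.75, which shows how a short scale can have a low alpha without unusually weak item intercorrelations.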

To make the scores more amenable to growth curve analysis under standard assumptions, the scores were root transformed (square root for undercontrolled and 2/3rds root for internalizing) and based on visual inspection, outliers on the high end of the scale remaining after transformation were trimmed back to less extreme values. The percentage of the sample that was trimmed for undercontrolled problems ranged between 1 and 3% at t1 to t3. The percentage of the sample that was trimmed for internalizing problems was just under 1% at each time point.
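The two preprocessing steps described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the reading of the "2/3rds root" as raising scores to the power 2/3 is an assumption, and the cap value is arbitrary, since the cut-offs in the study were chosen by visual inspection and are not reported.

```python
def root_transform(scores, power):
    """Raise each non-negative score to the given power
    (0.5 gives the square root)."""
    return [s ** power for s in scores]


def trim_high(scores, cap):
    """Pull values above the cap back to the cap; leave others unchanged."""
    return [min(s, cap) for s in scores]


# Hypothetical scale scores (means of items rated 0, 1 or 2).
undercontrolled = [0.0, 0.25, 1.0, 1.96]
transformed = root_transform(undercontrolled, 0.5)
trimmed = trim_high(transformed, 1.2)  # cap of 1.2 is an arbitrary example
```

For the internalizing scale, the same sketch would be called with `power=2/3` under the stated assumption about the transformation.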

Temperament

Temperament was assessed by the EAS Temperament Survey for Children: Parental Ratings (Buss and Plomin 1984), which contains four dimensions: (1) Emotionality—the tendency to become aroused easily and intensely (often named Negative Emotionality); (2) Activity—preferred levels of activity and speed of action; (3) Sociability—the tendency to prefer the presence of others to being alone; and (4) Shyness—the tendency to be inhibited and awkward in new social situations. The 20-item version of the EAS for children aged 1–9 years was used with five items rated on a 5-point scale assessing each of the four dimensions. An examination of the factor structure, reliability and stability of the EAS with this data set showed that all four temperament scales had alpha coefficients ranging from 0.48 to 0.71 at t1, 0.54 to 0.73 at t2, and 0.60 to 0.79 at t3 (Mathiesen and Tambs 1999).

Physical Health Problems

The child’s somatic health status was measured from mothers’ answers to questions on 32 types of handicaps, diseases and/or illness symptoms during the last year. The handicaps, diseases and symptoms were included either because of high prevalence or potential severity. The 32 questions were classified into 5 indicators measuring, respectively: handicaps, infections, stomach problems, allergic diseases, and other symptoms or diseases. The scores on the 5 indicators were summed to provide a composite index of physical health problems, on a 0–5 scale.

Maternal Mental Health

Symptoms of anxiety and depression were measured by the 25-item version of the Hopkins Symptom Check List (HSCL-25) (Hesbacher et al. 1980; Winokur et al. 1984). The reliability and validity of the HSCL have been well established (Deane et al. 1992; Tambs and Moum 1993). Two items, “thoughts of ending your life” and “loss of sexual interest or pleasure”, were excluded from the Norwegian questionnaire because some mothers who participated in a pilot project perceived them as offensive. The alpha coefficient was 0.90 at all three waves.

Family Stress

Mothers were asked to indicate whether they had experienced enduring problems during the last 12 months in the following areas: housing, employment, their partner’s health, and their relationship with their partner, each scored 0 (no problem) or 1 (problem). The sum of the scores on these four stress areas formed the composite measure of family stress, with a range from 0 to 4.

Child-related Stress

Mothers indicated whether they had experienced problems in three child-related problem areas with any of their children during the last year, namely problems with finding childcare arrangements, children’s illnesses, and child rearing, again scored as 0 or 1. The sum of these three ratings was used to form a composite index of child related stress, with a range from 0 to 3.

Social Support from Partner

A ‘social support from partner’ index was formed by taking the mean of the scores on 4 questions (each scored on a Likert-scale from 1 to 5) measuring, respectively: 1) closeness and contact, 2) respect and responsibility, 3) feeling of belonging, and 4) practical help (Dalgard et al. 1995; Mathiesen et al. 1999). The Cronbach alpha coefficients for the ‘social support from partner’ index were 0.59 at t1, 0.76 at t2 and 0.66 at t3.

External Social Support

Corresponding to the index measuring ‘social support from partner’, the questionnaire tapped the same four qualities (closeness and contact, respect and responsibility, feeling of belonging, and practical help) to describe the mothers’ relationships to other family members, friends, and neighbors, respectively. The two scales measuring social support from family and friends were each computed by summing the mean value of the four belonging questions (each question rated on a five-point Likert-type scale). The Cronbach alphas for all 18 items at t1, t2 and t3 respectively were 0.84, 0.87 and 0.86. A ‘social support index’ was based on a principal components analysis of the scores on the 18 relevant questions (4 pertaining to family, 4 pertaining to friends and 10 to neighbors). The unrotated first factor explained 40–42% of the variance at all waves of measurement and scores on this factor were used in the analyses.

Analytic Strategy

Latent growth curve (LGC) analyses were used to explore the trajectories for undercontrolled and internalizing problems from 18 months to 4.5 years, and the capacity of risk factors to predict initial levels (intercept), time to time change between each assessment point and change over the entire developmental period (slope). The intercept in all instances was defined as t1.
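In standard notation, a linear growth model of this kind can be sketched as below. The symbols are assumptions introduced for illustration: y_it is the problem score of child i at wave t, η_0i and η_1i are the child's latent intercept and slope, and λ_t are the time scores. Fixing λ_1 = 0 corresponds to defining the intercept at t1; the remaining time scores are not reported in this section and are left unspecified.

```latex
\begin{align}
y_{it} &= \eta_{0i} + \lambda_t\,\eta_{1i} + \varepsilon_{it}, \qquad \lambda_1 = 0,\\
\eta_{0i} &= \alpha_0 + \zeta_{0i},\\
\eta_{1i} &= \alpha_1 + \zeta_{1i}.
\end{align}
```

Here α_0 and α_1 are the population mean intercept and slope, and the variances of ζ_0i and ζ_1i capture individual differences in initial level and in change.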

Instead of using the typical approach to time varying influences, using each time varying influence as a concurrent predictor of the outcome at each assessment point, we re-parameterized our time varying predictors as initial status (t1), change from 18 months to 2.5 years (t2 − t1) and change from 2.5 years to 4.5 years (t3 − t2). This re-parameterization was necessitated by our interest in the effects of early risk on long term change in the outcome (slope) versus ongoing risk on time to time change in the outcome. We used the t1 predictor to predict initial status and slope of the outcome, the t2 − t1 change in risk to predict the t2 outcome, and the t3 − t2 change in risk to predict the t3 outcome. If the t1 risk factor effects on the slope predominated, it would suggest that early risk affects long term outcome and that subsequent time to time change in risk is irrelevant. If the effects of time to time change in risk factors predominated, it would suggest that current circumstances are more important than exposure to early risk. And of course, both types of effects could be important. Because early and ongoing risk can be correlated, it is important to include both simultaneously in the model to clarify how the risk factor actually operates. There is no difference in the information contained in the typical approach and our approach, but our approach gave us a more direct interpretation of the effect of change in the predictor from one assessment to the next on change in the outcome during the same time, net of other influences in the model.
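The re-parameterization of a time varying predictor can be sketched in a few lines. The function names and example values are hypothetical; the point of the sketch is only that the three wave-specific scores and the triple (initial status, first change, second change) carry the same information, since each can be recovered from the other.

```python
def reparameterize(x1, x2, x3):
    """Convert wave scores (t1, t2, t3) into initial status
    and the two successive change scores."""
    return x1, x2 - x1, x3 - x2


def recover(initial, change_12, change_23):
    """Rebuild the original wave scores from initial status and changes."""
    x2 = initial + change_12
    return initial, x2, x2 + change_23


# Hypothetical family stress scores (range 0 to 4) for one mother.
waves = (2, 3, 1)
initial, d12, d23 = reparameterize(*waves)
```

In this example the mother starts at 2, rises by 1 between t1 and t2, and falls by 2 between t2 and t3; applying `recover` to these three values returns the original scores.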

Models were estimated using a robust full information maximum likelihood (FIML) estimator in Mplus (Muthén and Muthén 2004), which corrects for non-normality and allows the inclusion of participants with partial data on the dependent variables across time. Missing data were handled with model-based likelihood procedures under the assumption of ignorable missingness (i.e., missing at random, MAR, which means missing at random after controlling for all predictors included in the model), the current recommended standard (Schafer and Graham 2002).

Results

Descriptive Statistics

Table 1 presents the means and standard deviations for the transformed, trimmed undercontrolled and internalizing problem indices, and shows that overall levels were initially lower for internalizing than undercontrolled problems. The means for undercontrolled problems decreased steadily over time while those for internalizing problems increased steadily.

Table 1 Means, Standard Deviations, Sample Size and Correlations for Internalizing (Int) and Undercontrolled Problems (Ucon) at t1–t3

Table 1 also shows Pearson correlations for the internalizing and undercontrolled problems over the three time points. The correlation of undercontrolled problems was moderate from 18 months to 4.5 years, with r’s ranging from 0.36 to 0.49. There were somewhat lower correlations between internalizing problems across the three waves, ranging from 0.30 to 0.39; these might be attenuated by the limited reliability of the three-item scale. In terms of comorbidity across domains, Table 1 shows that there were only weak (but generally significant) correlations ranging from 0.14 to 0.18 over time across the two behavior problem domains.

Correlations among the predictors (not shown to conserve space) were all in the low to moderate range, from 0.05 to 0.43 at t1; 0.04 to 0.49 at t2; and 0.05 to 0.41 at t3. At all three times, the highest inter-correlations were between maternal symptoms of anxiety and depression and family stress.

LGC Analyses

Relations Between Internalizing and Undercontrolled Problems

We started out testing a parallel linear growth process model since we expected both undercontrolled and internalizing problems to develop in a linear manner and we expected growth to be correlated. Fitting the standard linear parallel process growth model using the transformed, trimmed measures with robust ML estimation resulted in a non-significant chi-square of 12.40 with 11 degrees of freedom (p = 0.33, TLI of 0.997, RMSEA of 0.011, 90% CI for RMSEA of 0.000, 0.037), indicating that the model fits well.

The mean slope for undercontrolled problems was significant and negative, indicating a downward trend in the population. The mean slope for internalizing problems was significant and positive indicating an upward trend in the population. The slope variances were both significant indicating individual differences in linear growth. The only significant correlation was between the two intercept factors, 0.35. The correlation between the slope factors was about equal in magnitude but non-significant. The rest of the correlations were non-significant, negative and very small. The growth factors accounted for 45 to 55% of the variance of the observed undercontrolled problems measures and 38 to 46% of the variance of the observed internalizing problems measures.

To look at possible cross-construct effects on growth, the model was converted to a structural model by specifying paths from each intercept factor to both slopes. Intercept factors were allowed to freely correlate as were the residual slope factors. In addition, the time specific influences at t1 and t2 for one construct were allowed to predict the other construct at t2 and t3 respectively. These effects for both growth factors and time specific influences exhaust the possible prospective cross-construct influence. None of the prospective cross-construct effects were significant and the concurrent correlations between intercepts and slopes were as in the previous model.

In summary, both internalizing and undercontrolled problems are well described by a standard linear growth model. At the population level, mean undercontrolled problems decreased linearly and mean internalizing problems increased linearly over this period. Both constructs showed significant individual differences in linear slopes. The only significant cross-construct relation was the concurrent relation between the intercepts (r = 0.35). The initial level of undercontrolled problems had 12% common variance with the initial level of internalizing problems. Although these two constructs were initially related, high initial levels of undercontrolled problems were not linked to increases in internalizing problems later on (or vice versa).
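The 12% common variance follows from squaring the intercept correlation. As a quick check of that arithmetic:

```latex
r^{2} = (0.35)^{2} = 0.1225 \approx 12\%
```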

Predictions of Internalizing and Undercontrolled Problem Trajectories

The 10 family and child predictors were added to the parallel linear growth process model, and the results are shown in Table 2. The family and child factors from t1 were tested as time invariant predictors of intercepts and slopes. The change scores from t1 to t2 and from t2 to t3 were included as time varying predictors in order to explore possible effects of changes in predictor variables on internalizing and undercontrolled problems. Before estimating the final overall multi-predictor, multi-outcome model, we evaluated the separate effect of each predictor variable; the vast majority of these effects were significant. In the final step we started with all the predictors simultaneously and removed the least significant effects step by step from each of the equations until all effects in an equation were significant at p < 0.05. The initial multi-predictor parallel linear growth process model with all possible predictor effects fit the data adequately (χ²(89) = 98.88, p = 0.22, TLI = 0.982, RMSEA = 0.011, 90% CI for RMSEA of 0.000, 0.021), as did the final multi-predictor model that was trimmed of non-significant effects (χ²(175) = 172.32, p = 0.54, TLI = 1.003, RMSEA = 0.000, 90% CI for RMSEA of 0.000, 0.014). For comparison, the model with all predictor effects forced to zero fit poorly (χ²(207) = 867.40, p < 0.001, TLI = 0.475, RMSEA = 0.057, 90% CI for RMSEA of 0.054, 0.062).

Table 2 shows the effects of the predictors in the uni-predictor models, and Fig. 1 shows the effects of the predictors in the final multi-predictor model, including predictor effects from t1 and changes in predictors from t1 to t2 and t2 to t3 on internalizing and undercontrolled problem behaviors at t1, t2 and t3, respectively. Table 2 indicates approximate p values for uni-predictor effects using shades of gray. To make comparisons to significance levels in the multi-predictor results easier, Table 2 also indicates approximate p values using asterisks for the predictors from the final multi-predictor model. The magnitudes of the actual multi-predictor effects, however, are shown only in Fig. 1. No variable became significant in the multi-predictor model that was not also significant by itself.
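The p values reported for the model chi-squares can be checked from the statistics and their degrees of freedom. The sketch below is a plain-Python illustration and not the software used in the study. It assumes each reported statistic is referred to a central chi-square distribution, and the function names are ours.

```python
import math


def regularized_gamma_p(a, x):
    """Lower regularized incomplete gamma function P(a, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    denom = a
    for _ in range(10_000):
        denom += 1.0
        term *= x / denom
        total += term
        if term < total * 1e-15:  # remaining terms no longer change the sum
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def chi_square_p_value(chi2, df):
    """Upper-tail probability of a chi-square statistic with df degrees of freedom."""
    return max(0.0, 1.0 - regularized_gamma_p(df / 2.0, chi2 / 2.0))


# (chi-square, degrees of freedom) pairs as reported in the text
reported_fits = {
    "growth model without predictors": (12.40, 11),
    "all predictor effects": (98.88, 89),
    "trimmed final model": (172.32, 175),
    "predictor effects fixed to zero": (867.40, 207),
}

for label, (chi2, df) in reported_fits.items():
    print(f"{label}: chi2({df}) = {chi2:.2f}, p = {chi_square_p_value(chi2, df):.3f}")
```

A non-significant p value here means the model-implied covariance structure cannot be distinguished from the observed one, which is why the first three models are read as fitting adequately and the last one as fitting poorly.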

Fig. 1 Final multi-predictor model with significant time invariant predictor effects on intercept and slope for internalizing and undercontrolled problem behavior, and with significant time varying predictor effects on internalizing and undercontrolled problems at t2 and t3 (see Table 2 for level of significance). Note that error terms are omitted for clarity.

Table 2 Univariate Effects from t1 Predictors and Changes in Predictors from t1 to t2 and t2 to t3, Respectively, on Internalizing and Undercontrolled Problems

Specific results of the final analyses for each outcome are described below. Bear in mind that significant effects of t1 predictors on slope are over and above any significant effects of change in predictors from t1 to t2 or t2 to t3 on change in the outcomes from t1 to t2 or t2 to t3, and vice versa.

Predictors of Internalizing Trajectory

Significant time invariant predictors (t1) of the initial status of internalizing problems were high emotionality and shyness, low sociability and high maternal depressive symptoms. The slope of internalizing problems was predicted only by family stress at t1, and this effect was positive (see Fig. 1).

The Impact of Changes in Predictors on Changes in Internalizing Problems

Changes in several predictors were related to changes in internalizing problems both at t2 (2.5 years) and t3 (4.5 years). As discussed previously, in the context of a growth model, these effects of change in the predictor on the outcome at t2 or t3 may be interpreted as effects on changes in the outcome from t1 to t2 or from t2 to t3 respectively. Of specific importance was that increased levels of family stress, emotionality and shyness, respectively, from t1 to t2 and t2 to t3 predicted increases in internalizing from t1 to t2 and t2 to t3. Bear in mind that these effects are above and beyond the linear changes in internalizing across t1, t2 and t3 captured by the indirect effect of t1 family stress through the latent slope. Decreased levels of sociability from t1 to t2 predicted increased levels of internalizing problems from t1 to t2 and increased maternal depressive symptoms from t2 to t3 predicted increased levels of internalizing problems from t2 to t3 (see Fig. 1).

Predictors of Undercontrolled Trajectories

The significant time invariant predictors of the intercept of undercontrolled problems were low levels of social support and maternal education, and high levels of maternal depressive symptoms and child activity and emotionality. Partner support and child emotionality at t1 predicted the slope of undercontrolled problems negatively and positively respectively (see Fig. 1).

The Impact of Changes in Predictors on Changes in Undercontrolled Problems

Addressing time varying predictors, decreases in activity and emotionality from t1 to t2 predicted decreases in undercontrolled problems from t1 to t2. Decreases in activity and emotionality from t1 to t2 and t2 to t3 predicted decreases in undercontrolled problems from t2 to t3. Increases in partner support from t1 to t2 and from t2 to t3 predicted decreases in undercontrolled problems from t2 to t3 and decreases in shyness from t2 to t3 predicted decreases in undercontrolled problems from t2 to t3 as well (see Fig. 1).

When the parallel growth model was run with one predictor at a time, many more predictors had significant effects (see Table 2 and note the lack of asterisks in light gray, dark gray or black shaded cells). The disappearance of these effects in the final model indicates that some effects are either confounded, or mediated, by other effects. In particular, despite having strong effects when considered singly, child health problems, gender, child rearing stress and maternal age did not have any significant effects in the multi-predictor model. In addition, maternal education and social support had many significant effects when considered singly but in the multi-predictor models, these variables had only a single significant effect each. Variables that clearly played major roles included activity, negative emotionality and shyness for the child, depressive symptoms for the mother, and family stress and partner support for the family environment.

In the final multi-predictor model, the correlation between the intercepts of undercontrolled and internalizing problem behaviors was substantially smaller than in the model without predictors, but still significant (r = 0.15, not shown in Fig. 1, compared with r = 0.35). Evidently, the predictors account for a substantial amount of the initial correlation or co-morbidity between these two constructs. The correlation between the slopes in the model with no predictors was similar in magnitude to the correlation between intercepts but was not significant (r = 0.35, z = 1.60, p = 0.11). It is worth pointing out that in the final multi-predictor model, this same correlation was reduced to −0.02 (not shown in Fig. 1).
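The p value reported for the slope correlation can be recovered from its z statistic. The following minimal sketch assumes a two-sided test against the standard normal distribution, which the text does not state explicitly; the function name is ours.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p value for a test statistic referred to the standard normal."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|))
    return math.erfc(abs(z) / math.sqrt(2.0))


# z statistic reported for the slope correlation in the model without predictors
print(f"z = 1.60, two-sided p = {two_sided_p_from_z(1.60):.2f}")
```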

Although a detailed analysis of mediation, direct and indirect effects and shared sources of correlation is beyond the scope of this work, one variable in particular stands out: negative emotionality. It had pervasive effects in the same direction on both internalizing and undercontrolled problems, and so this predictor undoubtedly contributes heavily to the correlation between the two outcomes. Partner support, maternal depressive symptoms, family stress and shyness made more minor contributions to shared variance because most of their effects were on one outcome or the other but not both. The only variable that clearly contributed to the distinctness of the two outcomes was child activity, which was negatively related to internalizing problems and positively related to undercontrolled problems. Child sociability was also negatively related to internalizing and positively related to undercontrolled problems, but it had fewer effects in the multi-predictor model and those effects were limited to internalizing problems. Overall, the predictors accounted for more of the intercept and slope variance in undercontrolled problems (r² = 0.43 and 0.20, respectively) than in internalizing problems (r² = 0.30 and 0.07, respectively).

Discussion

The Nature and Stability of Early Childhood Problem Behaviors

Both undercontrolled and internalizing problems were moderately correlated from 18 months to 4.5 years, with undercontrolled problem scores at t1 predicting 15% of the variance of t3 scores and internalizing problem scores at t1 predicting 10% of the variance of t3 scores. This indicates that some problem behaviors present in the second year of life are likely to persist, representing early manifestations of more serious problems. The predictive validity of undercontrolled and internalizing problems with a very early onset is rather remarkable in view of the developmental changes taking place during early childhood, and the almost normative nature of some challenging behaviors in this age range. In her review of research in the field, Campbell (1995) notes that follow-up studies of pre-school children identified as having behavior problems at ages 3 or 4 generally report a high probability (around 0.50) that children will continue to show difficulties throughout the elementary school years.

Although problem behaviors are correlated across time, this does not imply that there are no changes. In this sample the mean levels of undercontrolled and internalizing problems decreased and increased, respectively, with increasing age. Internalizing and undercontrolled problems were initially moderately and significantly related (r = 0.35). The slopes of the developmental trajectories were also moderately correlated (r = 0.35); however, the correlation was not significant (p = 0.11). The lack of significance for the slope correlation, despite its being of the same magnitude as the intercept correlation, could reflect limited statistical power. Even though the sample is large, missing data, the use of robust estimation procedures to compensate for non-normality, the modest reliability of the child outcomes, the lack of aggression items in the undercontrolled construct and the availability of only three repeated assessments would all tend to lower power. Several studies have found high comorbidity levels between externalizing and internalizing problem behaviors, indicating that between a third and a half of the children with one disorder exhibit the other as well (Rutter 1997; Prior et al. 2000). These studies, however, have generally been of older children than those in this study.

Our results suggest that much of the comorbidity we found between our two problem dimensions among children at these young ages was accounted for by a set of common determinants that affect both types of behavior (specifically child emotionality, family stress, maternal depressive symptoms, and partner support). We did not find that undercontrolled problem behavior leads to internalizing problems, as predicted by Patterson and Capaldi's (1990) failure model.

The current study has focused on two related questions with substantial practical implications, namely whether there may be lasting effects of early experiences on psychological outcomes over and above changes in risk factors, and whether changing levels of risk factors predict changes in problem scores. If most of a child’s psychological development already is determined in the second year of life, then we should target our preventive intervention effort mainly to families with children in their first year of life. If, however, the potential for positive changes is large at any point in the preschool years, early intervention might not be that crucial. The current study found both persistent and time varying effects of child temperament and family risk factors on the two outcomes. Child temperament and risk factors already present at 18 months affected the levels of symptoms three years later independent of changes in the predictors.

The Persisting Impact of Early Risk on Undercontrolled and Internalizing Problems at 4.5 Years

Three risk factors present at 18 months predicted long term changes in problem behavior over and above time varying effects of risk: lower levels of partner support and higher levels of child emotionality predicted a higher slope for undercontrolled problems, and higher levels of family stress predicted a higher slope for internalizing problems. Although the environmental factors we assessed are relatively distal, our findings emphasize the developmental susceptibility of young children to conditions of early family life, which are likely to be mediated through parents' warmth, responsivity and style of parenting (Sanson et al. 2002). The findings point to the critical importance of early preventive efforts.

The Impact of Time Varying Risk on Time to Time Change

As expected, changing levels of risk factors over time predicted changes in problem scores at t2 and t3 over and above the effects of initial risk on long term change. More specifically, increases in emotionality and activity at t2 and t3 predicted higher levels of undercontrolled problems at both t2 and t3. Decreases in partner support at both t2 and t3 predicted increases in the level of undercontrolled problems at t3. Increases in emotionality, shyness and family stress at t2 and t3 predicted increases in internalizing problems at both t2 and t3. Other effects were significant but not as consistent across time: decreases in child sociability from t1 to t2 predicted increases in internalizing problems at t2, increases in maternal depressive symptoms from t2 to t3 predicted increases in internalizing problems at t3, and increases in child shyness from t2 to t3 predicted increases in undercontrolled problems at t3. These results suggest that child outcomes might improve substantially from initial level if maternal depressive symptoms, partner support and family stress improve at t2 or t3, or if there are changes in child temperament. The findings point to the critical importance of ongoing preventive efforts.

Although it is evident that risks can vary over time, few studies have investigated their impact on child outcomes (Essex et al. 2003). To our knowledge, no other study has used latent growth modeling to examine the effects of changes in family environment and temperament on time to time changes in the two problem dimensions over and above early effects of the same predictors on long term developmental change.

However, the relationships between temperamental risk factors and problem scores identified here correspond well with the relationships Prior et al. (1992) found in their study of stable and transient behavior problems among children followed from 2 to 6 years in the large ATP sample. Our results extend those results since the children were even younger at the start of the current study and we were able to demonstrate selective impacts of risk factors on undercontrolled and internalizing behavior problems, as well as the impact of changes in those risk factors over time.

Although we did not find gender differences in our study, our findings with respect to partner support correspond with results reported from analyses of the sample of boys in the WSTW-study (Essex et al. 2003) which showed a cumulative effect of exposure to first maternal depressive symptoms and then to marital conflict. The results suggest that programs that keep fathers involved and supportive of mothers will help lower population levels of internalizing and undercontrolled problems for young children.

Like all studies, the current work has important limitations. Although growth curve methods eliminate reliability concerns about the slope and intercept, reliability is still an issue in predicting time to time change and also with the predictors. Modest reliability in the outcomes would tend to weaken the prediction of time to time change and would favor prediction of the intercept and slope. Modest reliability in the predictors would tend to weaken all predictor effects. Differential reliability among multiple, competing predictors would tend to favor the most reliable predictors. In addition, our index of undercontrolled problems does not include aggressive or destructive behaviors. This makes some of our results less comparable to results from studies that have included such symptoms in their index of externalizing behaviors.

Mothers were the informants about problem behaviors as well as the predictor variables at all three periods of time. High stability may be caused by stability in the mother’s style of responding as well as by stability in the children’s behavior, although stable reporting styles would not bias estimates of change. The relations between mothers’ ratings of problem behavior and independent observations of the children’s behaviors have, however, been examined in several studies (Schmitz et al. 2001; Campbell et al. 1994). Findings generally have supported the validity of maternal reports of concrete child behaviors (Bates and Bayles 1984). In a review of studies addressing this matter, Rothbart and Bates (1998, 2006) argue that parents provide a useful perspective on their children, drawn from their observation of a wide range of child behaviors. Further, the theoretically meaningful specific patterns of results found here for the different outcome measures suggest that the measures are valid. Still, future corroboration of these findings with non-maternal ratings of child behavior would be useful. A separate but related issue is that of father influence on child outcomes, either in addition to or instead of maternal influence, and the possibility that paternal influence moderates the effects of maternal influence. These are issues that we cannot address in this study.

Another issue is the possible confounding between assessment of behavior problems and temperament dimensions. In one of the first studies particularly addressing this question, Sanson et al. (1990) found temperament and behavior-problem questionnaires had many similar items, and their data did not fully support the ability of parent-report scales to distinguish adequately between the two. Bates (2001) has noted that the content overlap may be theoretically meaningful, and that the extent of overlap may not, in fact, account for a large portion of the linkage. In support of these assertions, Lengua et al. (1998) (and later: Lemery et al. 2002) found that the deletion of items that appeared to be confounded did not substantially weaken relationships between temperament scales and behavioral problems.

In summary, this study has documented the significance of early appearing internalizing and undercontrolled problem behaviors, the need to consider such behaviors separately, and the need for a dynamic approach to analyses of risk factors. The fact that some environmental and intrinsic child factors have selective impact on the two domains of behavior problems has relevance for how models of risk are developed and for the planning of prevention. It appears important to consider pathways to internalizing and undercontrolled problems separately even at very young ages. Child temperament and risk factors already present at 18 months affected the levels of symptoms three years later independent of changes in the predictors. The findings point to the critical importance of early preventive efforts. Many risks are, however, not static over time, and the current study has shown that changes in risk status over time affect outcomes. Our results suggest that children's behavior can improve substantially from initial level if the family environment improves. Significant effects for time varying risk factors point to the need for prevention efforts to be ongoing. From a methodological point of view, this suggests that it is necessary to measure risk status at multiple points in time; from a theoretical perspective, the findings reinforce the dynamic nature of the interplay between risks and child adjustment.