Adolescents who experience high levels of internalizing symptoms, defined as depressive and anxiety symptoms, can experience significant problems in functioning, as well as increased risk of depression and other mental health disorders (Saluja et al. 2004; Ingoldsby et al. 2006; Rueter et al. 1999; Wesselhoeft et al. 2013). Evidence shows that several interventions to prevent youth mental health problems, including internalizing symptoms and depression, are efficacious (Neumark-Sztainer et al. 2010; Merry et al. 2011; Stice et al. 2009). These interventions have targeted different factors that protect youth from mental health problems, such as improving cognitions, coping and interpersonal skills, and parenting and family factors. Yet, substantial heterogeneity exists in the effects of prevention interventions on depression and internalizing symptoms, with certain groups of youth benefiting more than others (see Horowitz and Garber 2006; Sandler et al. 2014; Stice et al. 2009). This paper explores variation in response to Familias Unidas, a family-based preventive intervention for Hispanic youth, by analyzing trajectories of internalizing symptoms for different youth subgroups.

Several studies have established that the Familias Unidas intervention is efficacious in preventing and reducing adolescent substance abuse and sexual risk behaviors (see for example Pantin et al. 2009; Prado et al. 2007, 2012). While not designed to prevent internalizing symptoms, it has also been found to reduce internalizing symptoms in trials of high-risk Hispanic youth with high levels of externalizing symptoms, such as conduct and aggression problems (Perrino et al. 2014; Perrino et al. 2016). This may be in part because the intervention strengthens parenting behaviors and family functioning, which are common factors that broadly promote behavioral health (NRC/IOM 2009). These unexpected intervention findings on internalizing symptoms are noteworthy because youth who exhibit behavioral problems are also at elevated risk for developing internalizing symptoms and disorders, and those showing co-occurring problems evidence significantly poorer behavioral and developmental outcomes than those with either of these types of problems alone (Wolffe and Ollendick 2006).

Importantly, the risk of elevated internalizing symptoms and disorders increases across adolescence (see Garber 2006; Kessler et al. 2005; Zahn-Waxler et al. 2000). For example, 12-month prevalence estimates for major depressive disorder rise from 4.5 % at age 13 to 8.6 % at age 15 and 10 % at age 17 (Avenevoli et al. 2015). While only a subset of adolescents show elevated internalizing symptoms and mood disorders such as depression (Kessler et al. 2012; Zahn-Waxler et al. 2000), the increased risk during adolescence supports the importance of examining trajectories of internalizing symptoms across time and of identifying interventions that can reduce this risk.

Meta-analyses of prevention interventions for adolescents have found that youth with higher risks often benefit more than those with lower risks (see Horowitz and Garber 2006; Sandler et al. 2014; Stice et al. 2009). For instance, preventive interventions that target depression or depressive symptoms have demonstrated greater benefits for youth exhibiting higher levels of these symptoms (see Horowitz and Garber 2006; Stice et al. 2009). Trudeau et al. (2012) found that the effects of another family-focused preventive intervention on youth internalizing symptoms were stronger for those using substances earlier in life. Greater family risk has also been found to affect response to preventive interventions, with youth having poorer relationships or worse communication with their parents benefiting more from family-focused preventive interventions (Perrino et al. 2014; Tein et al. 2004). Recent Familias Unidas analyses across several trials indicate that this intervention was more efficacious in reducing internalizing symptoms for youth with poorer levels of parent–adolescent communication and that improvements in parent–adolescent communication mediated the intervention’s effects on internalizing symptoms for youth who started with poorer communication (Perrino et al. 2014).

Examining intervention response based on other markers of youth risk can help to better target interventions to those at greatest need. Youth with externalizing problems are at elevated risk for developing internalizing problems (Wolffe and Ollendick 2006), as are youth whose families experience acculturative stressors and exposure to stress (see Garber 2006; Hovey and King 1996). Thus, analyzing whether these risk markers affect response to interventions can help identify the youth for whom an intervention is most beneficial. Because these risks rarely operate independently of one another, examining how patterns of risk factors jointly influence intervention response is an important objective.

Methodological advances can extend our knowledge about differential benefits of interventions. Synthesis studies, sometimes referred to as integrative data analysis (IDA; Curran and Hussong 2009), combine individual participant data across multiple trials. IDA can present numerous challenges in the harmonization of disparate measures, heterogeneity in the follow-up schedules, and differences in sample characteristics but also provides an opportunity to examine response to interventions across a significantly larger sample, representing a wider range of participant risk and protective factors. This pooling of data also leads to greater statistical power and opportunities for more complex statistical models (Brown et al. 2013). IDA has been used to combine data across cancer intervention trials (Adams et al. 2015), alcohol intervention trials (Mun et al. 2014), family therapy intervention trials (Greenbaum et al. 2015), and trials investigating associations between cannabis and depression (Horwood et al. 2012).

Growth mixture modeling (GMM) is an exploratory, person-centered approach to analyzing longitudinal data. GMM identifies groups of individuals that are formed around multiple response trajectories and allows for different relationships between covariates, such as intervention condition, and outcomes, such as internalizing trajectories, within these groups. As a result, GMM has been described as a combination of mixed effects modeling and cluster analysis, allowing for identification of unobserved heterogeneity in participant trajectories and the prediction of latent class membership (Muthén et al. 2008). In the context of preventive interventions, GMM provides an opportunity to identify whether the intervention works differently for these unique classes of participants, thus addressing questions of moderation of intervention effects.

This paper aims to enhance our understanding of the Familias Unidas intervention and its impact on adolescent internalizing symptoms by identifying and characterizing variation in intervention response using GMM across a combined sample of individual participant data spanning four separate trials of the Familias Unidas intervention. The combination of GMM in the presence of pooled individual participant data across trials is novel. The pooling of these data allows us to capitalize on the wider range of baseline levels of risk presented in the four distinct samples. We sought to (1) identify distinct classes of youth representing unique trajectories of internalizing symptoms; (2) determine whether class membership moderates intervention effects; and (3) use demographics, baseline risk factors, parental acculturative stress, and proximal change in parent–adolescent communication, a key modifiable risk factor, to predict and characterize class membership. Given previous findings that this intervention has unanticipated, beneficial effects on youth internalizing symptoms, identifying which adolescents benefit most from this intervention can allow us to direct the intervention to youth who are most likely to benefit from it. Indeed, examining factors that influence variation in prevention intervention response can improve the effects of future prevention efforts by allowing better matching of interventions to youth needs (Stice et al. 2009; Sandler et al. 2011). Among the unique aspects of these analyses is the potential to identify sets of factors that act in concert rather than separately to shape differential intervention response. This work can also represent a substantive contribution to the IDA literature through the application of these methods to assessment of preventive interventions on adolescent depression.

Methods

Population and Intervention

This study combined data across four trials of the Familias Unidas intervention, a family-based intervention for Hispanic families aimed at preventing adolescent substance abuse, sexual risk, and externalizing behaviors (Pantin et al. 2009; Prado et al. 2007, 2012; Estrada 2015). The pooled sample included 881 Hispanic adolescents with varying baseline risk levels based upon externalizing behavior problems. All four trials tested the efficacy of the Familias Unidas prevention intervention. There was some heterogeneity across the four trials in intervention duration, type of control condition, and inclusion criteria (Table 1). All studies were approved by the University of Miami’s Institutional Review Board. Informed consent (parents) and assent (adolescents) were obtained from all participants. In all trials, intervention fidelity was monitored using independent ratings of facilitator behaviors as shown in videotaped group and family sessions. Mean intervention adherence ratings for each trial were in the “considerable/good” range, specifically from 3.72 to 4.98 on a scale from “0 = not at all/very poor” to “6 = extensive/excellent” (see Estrada 2015; Pantin et al. 2009; Prado et al. 2007; Prado et al. 2012). Table 1 contains characteristics of each of the four intervention trials, their control groups, and participants, including age, gender, and baseline internalizing distributions. Appendix 1 (available online) has additional information about each of the four trials.

Table 1 Trial descriptions

Measures

The measures for this synthesis study were common to all four trials and were all reported by parents. Assessment time-points for each trial varied. The two universal trials (trials 1 and 2) had measures at baseline, 6 months, 12 months, and 24 months. The universal trial for 8th graders (trial 1) also assessed at 36 months. Trial 3 (targeted: referred) measured at baseline, 6 months, 18 months, and 30 months, and trial 4 (targeted: adjudicated) measured at baseline, 6 months, and 12 months.

Socio-Demographic Characteristics

Because girls have been shown to be at greater risk for depression and depressive symptoms, and the prevalence of depressive symptoms increases with adolescent age (see Garber 2006; Saluja et al. 2004), baseline age and gender were included as control variables in all analyses.

Adolescent Internalizing Symptoms

Adolescent internalizing symptoms were assessed with the Anxiety-Withdrawal Subscale of the Revised Behavior Problem Checklist (RBPC), an 11-item measure with items such as “Depressed; always sad” and response choices ranging from “0 = no problem” to “2 = severe problem” (α = 0.90). Sum scores can range from 0 to 22 (Quay and Peterson 1993). Internal consistency, test–retest reliability, and construct validity for the RBPC have been established, including discrimination between clinic-referred and community samples of youth (Quay and Peterson 1993). Reported norms for this scale indicate that mean (SD) scores for community youth are 4.47 (4.07) for females and 3.85 (3.66) for males, while those for clinical youth are 11.12 (4.77) for females and 9.71 (4.58) for males (Quay and Peterson 1993). The raw scores were square-root transformed for analysis to account for positive skew (baseline mean = 5.09, baseline skew = 1.29, baseline kurtosis = 4.02) but are reported here in their original metric.

Parent–Adolescent Communication

Parent–adolescent communication was assessed using the Parent–Adolescent Communication Scale (Barnes and Olson 1985), a 20-item scale that measures the quality of parent–adolescent communication (α = 0.82). Each item is rated on a 5-point Likert scale from “1 = strongly disagree” to “5 = strongly agree.” Item examples include the following: “I find it easy to discuss problems with my child” and “I openly show affection to my child.” Possible scores ranged from 20 to 100 with higher scores indicating better parent–adolescent communication.

Parent Acculturative Stress

Parent acculturative stress was measured using the Hispanic Stress Inventory (HSI; Cervantes et al. 1991). The HSI is a 73-item scale that asks about the extent to which individuals have experienced stressors associated with occupational/economic, parental, family/cultural, marital, and immigration issues. Response choices range from “0 = no problem” to “5 = extremely worried/tense.” The total stress score, a sum of all items, was used for these analyses (α = 0.78). Higher values indicate more stress.

Adolescent Externalizing Symptoms

Adolescent externalizing symptoms were measured using four subscales of the Revised Behavior Problem Checklist (Quay and Peterson 1993): attention problems (16 items; α = 0.95), motor excess (5 items; α = 0.84), socialized aggression (17 items; α = 0.93), and conduct disorder (22 items; α = 0.96). Sample items include “Hyperactive; always on the go” and “Fights.” Response choices range from “0 = no problem” to “2 = severe problem,” with higher scores indicating higher levels of externalizing symptoms. Consistent with previous work (Perrino et al. 2014), these four subscales served as indicators of a latent variable that was used as an index of externalizing symptoms. Standardized loadings were 0.77, 0.67, 0.59, and 0.87 for attention problems, motor excess, socialized aggression, and conduct disorder, respectively, and model fit was good (CFI = 1.000, SRMR = 0.014).

Analysis Plan

Latent growth modeling is a common approach to modeling longitudinal data, which enables estimation of an average change trajectory and a variance around that trajectory to account for individual differences. GMM is an extension of latent growth modeling. Instead of regarding individual differences in the trajectories as continuous, GMM identifies discrete classes, or subgroups of individuals, based on commonalities in their trajectories and then estimates the probability of class membership for each individual. In randomized clinical trials, we are interested in whether these classes moderate intervention effects. We used GMM to identify distinct clusters of symptom trajectories and to interpret intervention versus control differences in these growth trajectories as causally informed by a potential outcome framework.
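
As a rough illustration of the kind of model described here, a linear growth mixture model with a class-specific intervention effect on the slope can be written as follows. The notation is ours and is an assumption rather than the specification used in the trials: \( y_{it} \) is the internalizing score of participant \( i \) at time score \( a_t \), \( c_i \) is the latent class, \( Z_i \) is the intervention indicator, and \( x_i \) is a vector of predictors of class membership. Within-class covariate effects on the intercept and slope are omitted for brevity.

\[ y_{it} = \eta_{0i} + \eta_{1i}\, a_t + \varepsilon_{it} \]

\[ \eta_{0i} \mid (c_i = k) = \alpha_{0k} + \zeta_{0i}, \qquad \eta_{1i} \mid (c_i = k) = \alpha_{1k} + \gamma_k Z_i + \zeta_{1i} \]

\[ P(c_i = k \mid x_i) = \frac{\exp\left(\delta_{0k} + \delta_k^{\top} x_i\right)}{\sum_{j=1}^{K} \exp\left(\delta_{0j} + \delta_j^{\top} x_i\right)} \]

In this sketch, \( \gamma_k \) is the intervention effect on the slope in class \( k \), so moderation by class corresponds to the \( \gamma_k \) differing across classes. The intercept equation contains no \( Z_i \) term, which reflects the expectation of baseline equivalence under randomization.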

The potential outcome framework assumes that each person recruited into a trial has two potential outcomes: the trajectory they would follow if assigned to intervention and the trajectory they would follow if assigned to control. Muthén and Brown (2009) suggested that people may fall into different discrete categories based on common potential outcomes. Those categories may reflect similar or differential intervention response. For example, one set of participants may show no differences regardless of whether they are assigned to intervention or control, while another set may show symptom reduction under intervention but not under control. Muthén and Brown (2009) also suggested that GMM could be used to identify discrete groups of individuals who had common potential outcomes based on the trajectories they followed after assignment to condition. This requires that group membership be independent of condition assignment so that the discrete groups defined by trajectory are not influenced by whether the participants are assigned to intervention or control. This independence preserves balance across important baseline characteristics; without it, imbalance on those characteristics could confound intervention condition and outcome within a particular group.

Building on this work, Jo et al. (2009) proposed a two-step modeling method designed to reduce potential confounding when using GMM to identify groups of participants with common potential outcomes. The first step of this approach is to use GMM to identify the optimal number of distinct latent trajectory classes among control participants. Because this estimation is done using only control cases, the resulting classes are independent of condition assignment. The second step uses both intervention and control cases. This model retains the optimal number of classes and matches each control case’s probability of class membership to its posterior probability from step 1, while estimating latent growth curves for the intervention group that have the same baseline structure (i.e., same proportion of class membership in intervention versus control and the same latent baseline mean and variance for intervention as for control).

This approach identifies distinct groups of participants where condition assignment is not confounded with baseline characteristics only if several assumptions are met. First, intervention and control subjects must be randomly assigned for the study as a whole, consistent with the study design for all four of the Familias Unidas trials. Second, detection of potential outcome class cannot be influenced by whether someone is assigned to intervention or control (the intervention ignorability assumption). If ignorability is not met, then participants in one potential outcome class could be mistakenly assigned to a different outcome class, leading to within-class imbalance on pretest characteristics that could confound intervention effects.

Finally, intervention effects within trajectory class are assumed to be similar across different levels of the covariates (an additive treatment assignment effect; Jo et al. 2009). If these assumptions are satisfied, the model can be identified, and intervention effects that are conditional on these classifications can be interpreted as how the trajectories of individuals in the control condition might shift if exposed to intervention (Jo et al. 2009).

Our GMM consisted of a latent intercept and latent trajectory (slope) indicated by the repeated measures of parent-reported adolescent internalizing symptoms. We selected a linear, rather than quadratic, model based on examination of the observed individual and average internalizing scores across time for the four trials. Age, gender, and baseline parent–adolescent communication were included as predictors of class assignment and within-class covariates of the intercept and slope. Dummy variables for trial were also included as predictors of class assignment to control for between-trial heterogeneity (Curran and Hussong 2009; Hussong et al. 2013). The step 1 analysis estimated this model for the control cases only and saved the probabilities of all class assignments for each control participant to a new dataset. To retain the same fit for each control case when the intervention cases were added, we used a procedure based on pseudo-class draws, then relied on standard multiple imputation methods to conduct hypothesis tests (Wang et al. 2005; Jo et al. 2009). We created five datasets of pseudo-class draws, where each control case was assigned to a latent class based on a random draw using each control case’s posterior probabilities from step 1.
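
The pseudo-class draw procedure can be illustrated with a minimal sketch. This is not the authors' code (the analyses were run in Mplus); the function name `pseudo_class_draws`, the seed, and the example posterior probabilities are all hypothetical. It assumes that each control case has a vector of step 1 posterior class probabilities summing to 1, and that each dataset assigns one class per case by a random draw from that vector.

```python
import numpy as np


def pseudo_class_draws(posteriors, n_draws=5, seed=0):
    """Draw one latent class per participant for each pseudo-class dataset.

    posteriors: array of shape (n_participants, n_classes); each row holds one
        participant's posterior class probabilities and must sum to 1.
    Returns an integer array of shape (n_draws, n_participants) with 0-based
    class labels.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    if not np.allclose(posteriors.sum(axis=1), 1.0):
        raise ValueError("each row of posteriors must sum to 1")
    rng = np.random.default_rng(seed)
    cumulative = np.cumsum(posteriors, axis=1)
    cumulative[:, -1] = 1.0  # guard against floating-point shortfall
    n_participants = posteriors.shape[0]
    uniforms = rng.random((n_draws, n_participants, 1))
    # The drawn class is the number of cumulative thresholds the uniform
    # variate has reached (inverse-CDF sampling of a categorical variable).
    return (uniforms >= cumulative[None, :, :]).sum(axis=2)


if __name__ == "__main__":
    # Hypothetical posterior probabilities for three control cases, 3 classes.
    example = np.array([[0.90, 0.08, 0.02],
                        [0.10, 0.70, 0.20],
                        [0.05, 0.15, 0.80]])
    print(pseudo_class_draws(example, n_draws=5, seed=42))
```

Each row of the returned array would correspond to one of the five datasets in which class membership is treated as known for control cases. Cases with near-certain posteriors receive the same class in every dataset, while ambiguous cases vary across datasets, which is how the procedure carries classification uncertainty forward.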

In step 2, we took each of the five datasets and estimated a linear latent growth model for all participants, treating class membership as known for control cases and unknown for intervention cases. Within each class, we constrained the distribution of the latent intercept to be equal across intervention condition, as we would expect baseline equivalence in a randomized trial, and we estimated a main effect of intervention on the latent slope (see Appendix 2 online for sample Mplus code). This step 2 analysis was run on all five datasets resulting from the pseudo-class draws. We then used standard multiple imputation methods (Schafer 1997) to combine the within and between dataset variances and obtain a summary estimate of the impact of intervention on slope within each latent trajectory class. Results from the analysis using the first pseudo-class draw are reported, as the findings were very similar across all five analyses.
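
For readers unfamiliar with how estimates are combined across imputed datasets, the sketch below applies the usual combining rules (often called Rubin's rules), which we assume is what "standard multiple imputation methods" refers to here. The function name `pool_rubin` and the five estimates and standard errors in the usage example are hypothetical and are not results from these trials.

```python
import math


def pool_rubin(estimates, std_errors):
    """Combine per-dataset estimates and standard errors with Rubin's rules.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists with at least two datasets")
    pooled = sum(estimates) / m
    # Within-dataset variance: average of the squared standard errors.
    within = sum(se ** 2 for se in std_errors) / m
    # Between-dataset variance: sample variance of the point estimates.
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical intervention-on-slope estimates from five pseudo-class datasets.
    slopes = [-0.041, -0.038, -0.043, -0.040, -0.039]
    ses = [0.019, 0.020, 0.018, 0.019, 0.020]
    estimate, se = pool_rubin(slopes, ses)
    print(f"pooled estimate = {estimate:.3f}, pooled SE = {se:.3f}")
```

The between-dataset term is what distinguishes the pooled standard error from a simple average: when class assignments for ambiguous cases change the estimate from one draw to the next, the pooled standard error grows accordingly.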

The four trials had unique follow-up schedules (described above). We addressed these differences through a missing data approach in which latent growth curves were modeled using all possible time points. If a given trial did not have data at a particular time point due to study design, those data were considered missing at that time point. This approach enabled use of all available data through the 24-month follow-up. Data became sparse across the four trials after 24 months; thus, analyses were limited to the 24-month follow-up.
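
The missing-by-design approach can be pictured as aligning every participant on a common grid of assessment months and leaving unscheduled months empty. The sketch below uses the trial schedules given in the Measures section; the function names and the example scores are hypothetical, and treating the union of scheduled months through 24 as the grid is our assumption about what "all possible time points" means.

```python
# Assessment months per trial, taken from the Measures section.
SCHEDULES = {
    1: [0, 6, 12, 24, 36],  # universal, 8th graders
    2: [0, 6, 12, 24],      # universal
    3: [0, 6, 18, 30],      # targeted: referred
    4: [0, 6, 12],          # targeted: adjudicated
}
MAX_MONTH = 24  # analyses were limited to the 24-month follow-up


def analysis_grid(schedules, max_month):
    """Union of scheduled months across trials, truncated at max_month."""
    return sorted({m for months in schedules.values() for m in months
                   if m <= max_month})


def to_wide_row(trial, scores, schedules, grid):
    """Align one participant's scores to the grid.

    scores: dict mapping month -> observed score. Months that the trial did
    not schedule, or that were scheduled but not observed, become None.
    """
    planned = set(schedules[trial])
    return [scores.get(month) if month in planned else None for month in grid]


if __name__ == "__main__":
    grid = analysis_grid(SCHEDULES, MAX_MONTH)
    print(grid)
    # Hypothetical participant from trial 3 with scores at 0, 6, 18, and 30 months.
    print(to_wide_row(3, {0: 5.0, 6: 4.0, 18: 3.0, 30: 2.0}, SCHEDULES, grid))
```

Under this layout, a trial 3 participant contributes data at 0, 6, and 18 months and has empty cells at 12 and 24 months, and the 30-month score falls outside the analysis window.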

Results

Trial-specific sample demographic and internalizing characteristics are provided in Table 1. Across the four trials, 43 % of the participants were female, and the mean age at baseline was 14.20 (SD = 1.20). The mean baseline internalizing symptoms was 5.10 (SD = 5.14), which is close to previously reported means for community samples (female mean = 4.47 (SD = 4.07); male mean = 3.85 (SD = 3.66)). Reported means for clinical samples of youth have been higher, specifically 11.12 (SD = 4.77) for females and 9.71 (SD = 4.58) for males (Quay and Peterson 1993). To assess balance across condition at baseline, we tested for equivalence across intervention condition in the distributions of age, gender, baseline parent–adolescent communication, and internalizing symptoms. There were no baseline differences on these variables across condition in any of the individual trials or across condition in the overall sample.

Our first analysis was a single latent growth curve model using data from all four trials. This analysis serves as a comparison for the growth mixture models. We found decreasing internalizing symptoms over time for both intervention and control conditions, with no significant overall intervention effect on the trajectory of adolescent internalizing symptoms, consistent with previous work on these trials (Perrino et al. 2014).

We then employed the two-step growth mixture model described above (Jo et al. 2009). Using just the control cases, we examined 2-class (LL = −2229.40, sample size adjusted BIC [SSA-BIC] = 4534.00, entropy = 0.70), 3-class (LL = −2207.95, SSA-BIC = 4518.19, entropy = 0.93), and 4-class (LL = −2163.24, SSA-BIC = 4455.82, entropy = 0.75) models. Although the 4-class model had modestly better fit statistics (a higher log-likelihood and a lower SSA-BIC), its smallest group represented only 3.5 % of cases, and its entropy was poorer than that of the 3-class model. Thus, the 3-class model was retained, and five datasets with known class membership for the control cases were created using the pseudo-class draws described above. Treating class membership as observed for the control cases and missing for the intervention cases, we re-ran the 3-class growth mixture model with all participants on each of the five datasets, including a separate regression of the slope on intervention condition within each class (LL = −4075.291, entropy = 0.95; see Fig. 1).
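
To make the fit comparison concrete, the sketch below states the usual formula for the sample size adjusted BIC, −2LL + k·ln((n + 2)/24), and backs out the penalty term implied by the values reported for the control-only models. The number of free parameters k and the control sample size n are not reported in this section, so the function `ssa_bic` takes them as inputs rather than assuming values; the function names are ours.

```python
import math


def ssa_bic(log_likelihood, n_params, n_obs):
    """Sample size adjusted BIC: -2*LL + k * ln((n + 2) / 24)."""
    return -2.0 * log_likelihood + n_params * math.log((n_obs + 2) / 24.0)


def implied_penalty(log_likelihood, reported_ssa_bic):
    """Penalty term implied by a reported LL and SSA-BIC pair."""
    return reported_ssa_bic - (-2.0 * log_likelihood)


# Values reported for the control-only models: classes -> (LL, SSA-BIC, entropy).
REPORTED = {
    2: (-2229.40, 4534.00, 0.70),
    3: (-2207.95, 4518.19, 0.93),
    4: (-2163.24, 4455.82, 0.75),
}

if __name__ == "__main__":
    for n_classes, (ll, ssabic, entropy) in REPORTED.items():
        penalty = implied_penalty(ll, ssabic)
        print(f"{n_classes}-class: -2LL = {-2 * ll:.2f}, "
              f"penalty = {penalty:.2f}, entropy = {entropy:.2f}")
```

The implied penalties (about 75.20, 102.29, and 129.34) rise by a nearly constant amount with each added class, as would be expected if every class adds the same number of free parameters. The comparison also shows why the choice was a judgment call: SSA-BIC alone favors four classes, whereas entropy and the size of the smallest class favor three.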

Fig. 1 Growth mixture model. The dotted line indicates separate estimates of the effect of intervention condition on slope for each of the three latent classes

Before proceeding with interpretation of intervention effects within each class, we assessed balance across intervention condition by comparing within-class means on key covariates. Though there were no differences across condition in the overall sample, this balance may not necessarily be maintained once participants are separated into the latent classes. We used multiple imputation to summarize estimates across five pseudo-class draws, using a single pseudo-class draw based on the class probabilities from each of the five step 2 analyses. Specifically, we estimated within-class mean differences across condition on baseline measures of parent–adolescent communication, parent acculturative stress, conduct disorder, socialized aggression, attention problems, and motor excess. Though none of these comparisons were statistically significant at α = 0.05, conduct disorder (\( \overline{b} \) = 4.37, p = 0.11), attention problems (\( \overline{b} \) = 2.86, p = 0.11), and motor excess (\( \overline{b} \) = 1.04, p = 0.09) showed trends toward significance in one of the classes. To adjust for this potential imbalance, we re-ran the GMM including a latent variable for externalizing as a predictor of the latent slope, with separate regression coefficients in each of the three classes. This resulted in a latent slope that was adjusted for possible differences across intervention condition in externalizing symptoms. This model exhibited better fit with a log-likelihood value closer to 0 (LL = −3883.69) and improved entropy (entropy = 0.96).

The adjusted trajectories, accounting for externalizing symptoms, are in Fig. 2. We identified the classes with names of “low internalizing symptoms,” “moderate internalizing symptoms,” and “high internalizing symptoms” based on the mean baseline level of internalizing symptoms, and what has been reported as means for community and clinical samples of youth (Quay and Peterson 1993). Before interpreting the treatment effects, we examined how participants within trials were distributed across the three classes, pooling proportions across the five pseudo-class draws. Sixty-eight percent of participants in the low internalizing class were from the universal trials. Ninety-three percent of participants in the moderate internalizing class were from the targeted (referred and adjudicated) trials, and 59 % of participants in the high internalizing class were from the targeted trials.

Fig. 2 Intervention (solid lines) and control (dashed lines) trajectories for each of the three latent classes when externalizing is in the model. Vertical dark lines indicate the interquartile range (IQR) for baseline (IQR = 1–8) and 24-month (IQR = 0–4) observed value. Statistically significant differences in intervention effect exist only for the high class

Individuals in the low internalizing class (60 % of the sample) had the lowest average baseline value of internalizing symptoms (mean = 2.14), and both intervention and control cases showed significant decreases in internalizing symptoms over time (mean intervention slope = −0.03, SE = 0.005; mean control slope = −0.03, SE = 0.003). Individuals in the moderate internalizing class (27 % of the sample) had a moderate average level of baseline internalizing symptoms (mean = 4.69), with both intervention and control participants significantly decreasing internalizing symptoms over time (mean intervention slope = −0.04, SE = 0.007; mean control slope = −0.03, SE = 0.007). Individuals in the high internalizing class (13 % of the sample) had the highest average level of baseline internalizing symptoms (mean = 7.01), with intervention participants experiencing no change over time (mean intervention slope = 0.002, SE = 0.018) and control participants significantly increasing internalizing symptoms over time (mean control slope = 0.04, SE = 0.01). The intervention effect was not statistically significant in either the low or the moderate internalizing symptom class, suggesting no difference in the trajectories between the two conditions. However, the intervention effect on the internalizing trajectory was statistically significant in the high internalizing class using multiple imputations across the five pseudo-class-filled datasets (\( \overline{b} \) = −0.04, SE = 0.02, p = 0.03).

To better understand likelihood of class membership, we examined how gender, age, and baseline parent–adolescent communication influenced the probability of class membership in the high internalizing class. Gender was a significant predictor of class membership, with girls being significantly more likely to be in the high internalizing class relative to the moderate internalizing class (OR = 4.13, 95 % CI 2.47–6.93) but not the low internalizing class. Older age also predicted higher odds of being in the high internalizing class relative to the moderate class (OR = 3.53, 95 % CI 2.78–4.47) and the low class (OR = 1.58, 95 % CI 1.32–1.89). Parent–adolescent communication also differentially predicted membership, such that poorer baseline parent–adolescent communication was associated with a higher probability of membership in the high internalizing class versus the low class (OR = 1.07, 95 % CI 1.05–1.10).
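
The odds ratios above come from the multinomial logistic part of the mixture model, so each one is the exponential of a log-odds coefficient. As a reading aid, the sketch below recovers the approximate coefficient and standard error from a reported odds ratio and its 95 % confidence interval. It assumes a Wald-type interval that is symmetric on the log scale, which is our assumption about how the intervals were formed; the function name is hypothetical.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.959964):
    """Recover the log-odds coefficient, its SE, and a Wald z statistic.

    Assumes the interval was computed as exp(beta +/- z * SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se, beta / se


if __name__ == "__main__":
    # Reported contrast: girls, high versus moderate internalizing class.
    beta, se, wald = log_odds_summary(4.13, 2.47, 6.93)
    print(f"beta = {beta:.3f}, SE = {se:.3f}, Wald z = {wald:.2f}")
```

For the gender contrast, this gives a coefficient of roughly 1.42 on the log-odds scale with a standard error of roughly 0.26. For continuous predictors such as age and communication, the odds ratio refers to a one-unit change in the predictor, so an odds ratio near 1 on a 20-to-100 scale can still correspond to a sizeable difference across the observed range.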

To further characterize the classes, we also estimated the class means for parent acculturative stress (a family risk factor not used in the original analysis) using estimates based on posterior probability-based imputations, which preserve uncertainty of class assignment. These estimates are obtained through the auxiliary function in Mplus and have no impact on the original latent class analysis. Parents of adolescents in the high internalizing class reported the highest level of parent acculturative stress (M = 8.28, SD = 0.40). By contrast, adolescents in the low internalizing class had the lowest parent acculturative stress (M = 5.68, SD = 0.17). All three pairwise comparisons of mean levels of parent acculturative stress were statistically significant (high vs moderate χ2 = 7.00, p < 0.01; high vs low χ2 = 35.41, p < 0.001; moderate vs low χ2 = 15.93, p < 0.001). We also examined the mean latent externalizing factor scores and found that the high internalizing class had significantly higher latent factor means on externalizing compared to the moderate (b = 4.558, p = 0.006) and low (b = 13.482, p < 0.001) internalizing classes.

Because change in parent–adolescent communication was an important mediator in previous work, we calculated a difference score on parent–adolescent communication by subtracting the baseline level from the level at 6 months (post-intervention) so that positive values indicated improvements in communication. As before, we used multiple imputation to summarize estimates across five pseudo-class draws to compare within-class differences on change in parent–adolescent communication across intervention condition. In the low internalizing class, neither the intervention nor control cases showed significant change in communication (intervention mean = 1.20, 95 % CI −0.08–2.47; control mean = 0.05, 95 % CI −0.93–1.03), and the means were not significantly different from one another. In the moderate internalizing class, the intervention cases showed significant improvements in communication (mean = 6.44, 95 % CI 4.13–8.75) while the control cases showed no significant change (mean = 0.77, 95 % CI −1.04–2.58). The difference between these means was statistically significant (\( \overline{b} \) = 5.67, p < 0.001). In the high internalizing class, only the intervention cases demonstrated improvements in parent–adolescent communication (intervention mean = 2.95, 95 % CI 0.34–5.55; control mean = −0.21, 95 % CI −2.11–1.68), and this difference showed a trend toward significance (\( \overline{b} \) = 2.78, p = 0.08).

Discussion

The goal of this exploratory study was to use novel analytic methods to examine differential intervention response among youth with different patterns of risk across four trials of the Familias Unidas prevention intervention. In previous studies, Familias Unidas was found to unexpectedly reduce internalizing symptoms (Perrino et al. 2014; Perrino et al. 2016). Thus, specifying which adolescents benefit most from this intervention can help to direct this intervention to youth most likely to gain from it. The present synthesis analyses identified three unique trajectory classes distinguishable by initial baseline levels of internalizing symptoms: “low,” “moderate,” and “high internalizing symptoms” classes, and there was evidence of differential intervention response depending on class membership.

These groups cover a relatively comprehensive spectrum of risk levels. Youth with low and moderate levels of internalizing symptoms at baseline comprised the largest groups, and both intervention and control participants in these classes showed reductions in internalizing symptoms across time. Approximately 60 % of participants were classified as “low internalizing” youth, exhibiting very low levels of internalizing symptoms compared to the other classes and to prior community samples of youth at baseline (see Quay and Peterson 1993). The “moderate internalizing” symptom youth represented 27 % of the sample and showed higher symptoms than the low internalizing group, but similar levels to previous community samples (see Quay and Peterson 1993). This is consistent with previous work showing that most adolescents do not experience serious, clinically elevated depressive or internalizing symptoms and disorders (see Kessler et al. 2005; Zahn-Waxler et al. 2000). In both of these classes, intervention and control participants’ internalizing symptoms decreased across time, with no significant difference in the average trajectory of symptoms between Familias Unidas and control cases. These youth may represent a group with minimal intervention need, as their initial symptoms are low and seem to diminish on their own with time (see Horowitz and Garber 2006).

The Familias Unidas intervention was most beneficial for the youth with “high internalizing” symptom levels, who represented 13 % of the sample and had internalizing symptoms that fell between levels reported for previous community and clinical samples (see Quay and Peterson 1993). In this class, adolescents assigned to Familias Unidas exhibited no significant change in internalizing symptoms across time, maintaining slightly elevated levels of internalizing relative to a community sample. By contrast, the trajectory of symptoms for control participants in this class increased substantially, resulting in an average internalizing score at 24 months consistent with norms for clinical youth (Quay and Peterson 1993). This is indicative of a preventive effect of the intervention, in which youth exposed to the intervention appear to be protected from the increases in symptoms they would otherwise be at risk of experiencing.

Additional analyses of this high internalizing symptom class better characterize this group on concurrent risk factors and help explain the differential intervention response. These youth showed multiple risk factors and higher risk profiles, including the highest levels of externalizing symptoms at baseline. This is important because conduct problems often increase risk for also developing internalizing symptoms and disorders (Wolffe and Ollendick 2006). Thus, an intervention like Familias Unidas that can reduce the risk of internalizing problems among youth who also show high externalizing symptoms is consequential (Perrino et al. 2016).

Parents of adolescents in this high internalizing symptom class were also highest on acculturative stress, a measure of occupational, economic, parental, cultural, marital, and immigration stressors that captures stressful experiences Hispanic families may face in the USA. Previous research has documented that exposure to stress, including acculturative stressors, is related to youth internalizing symptoms (see Garber 2006; Hovey and King 1996). High scores on acculturative stress among the high internalizing group suggest that these families may be experiencing additional contextual risk factors that compound youth’s risk for internalizing symptoms. The Familias Unidas intervention directly addresses some of these stressors, such as parenting, family, and cultural stressors, through its parent support groups and other intervention facets (see Pantin et al. 2009). Future research should examine changes in parent acculturative stress and the potential impact of family interventions like Familias Unidas.

Finally, within-class means for the “high internalizing” class suggest that there were significant improvements in parent–adolescent communication from baseline to post-intervention for Familias Unidas participants, but not control participants. This is consistent with the Familias Unidas intervention’s hypothesized mechanisms of action and supports previous findings on family communication as a mediator of intervention effects on internalizing symptoms (Perrino et al. 2014). Positive parenting and family relationships are important in preventing internalizing and depressive symptoms (see Biglan et al. 2012; Restifo and Bögels 2009). Interestingly, recent analyses of youth with high levels of externalizing symptoms exposed to the Familias Unidas intervention found that this intervention influenced internalizing symptoms through cascading effects, starting with improved parent–adolescent communication, which subsequently reduced youth externalizing symptoms, ultimately reducing internalizing symptoms (Perrino et al. 2016).

While the current study provides useful information about this intervention’s differential impact on internalizing symptoms, it has several limitations. First, because reducing internalizing symptoms was not a primary target of the intervention, the measurement of internalizing symptoms was limited to a parent report measure of youth symptoms, the Revised Behavior Problem Checklist or RBPC (Quay and Peterson 1993). Although parent reports are not a direct measure of youth internalizing symptoms and they can differ from youth self-reports, previous research has found that parent reports and youth self-reports about internalizing symptoms using the RBPC are strongly correlated (Thomas et al. 1990). Second, there was variation in intervention duration, control conditions, and sample characteristics across the four trials. We took a fixed effects approach to handling this variation by controlling for trial in our analyses, which adjusts for between-trial variability. This approach has implications for the generalizability of our findings, limiting inferences to these specific samples (Curran and Hussong 2009). Treating the control conditions, which differed across trials, as equivalent has the potential to complicate interpretation of the counterfactual. For example, some control conditions were more active controls than others and may have provided benefits like simple support. Thus, the comparison to an intervention in these trials is a test of the intervention components beyond support. In this paper, our understanding of the effects due to general exposure to the intervention condition should not be affected by ignoring this heterogeneity in control condition components. Finally, GMM is an exploratory method of examining data. One limitation to this approach is the inability to confirm that randomization across treatment conditions is maintained within the latent classes.
We were able to maintain the uncertainty of class assignment through the use of pseudo-class draws based on class probabilities so that the step 2 analyses maintained the probabilistic nature of class assignment for control cases. However, these imputation methods using pseudo-class draws in IDA are an area of active research, and advances in the coming years may improve on the approach we took in these analyses. Another limitation of this approach when combined with IDA is the possibility of imbalance across condition for a specific trial within a particular class. Our approach achieved excellent balance on condition across the trials, but there is some imbalance across condition when we examine individual trials within class, particularly when a trial has very low representation in a class. Though we cannot be sure about the implications of this for causal inference, the problem is likely limited by the fact that we have excellent balance on a range of baseline covariates. This is an area for future research, and additional studies are needed to confirm the findings presented here.
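The idea behind pseudo-class draws can be sketched as follows, under the assumption that each draw samples one class per participant from that participant's posterior class probabilities. The function name, the seed, and the probabilities shown are hypothetical illustrations, not the software or values used in these analyses.

```python
import random


def pseudo_class_draws(posteriors, n_draws=5, seed=0):
    """Sample class membership repeatedly from posterior class probabilities.

    posteriors: one list of class probabilities per participant
    Returns n_draws lists, each holding one sampled class index per participant.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    class_ids = list(range(len(posteriors[0])))
    return [
        [rng.choices(class_ids, weights=p, k=1)[0] for p in posteriors]
        for _ in range(n_draws)
    ]


# Hypothetical posterior probabilities for the low, moderate, and high classes.
example_posteriors = [
    [0.90, 0.08, 0.02],  # very likely "low"
    [0.30, 0.60, 0.10],  # uncertain, most likely "moderate"
    [0.05, 0.15, 0.80],  # likely "high"
]
draws = pseudo_class_draws(example_posteriors)
for draw in draws:
    print(draw)
```

A participant with ambiguous probabilities can land in different classes across draws, which is how the uncertainty of class assignment is retained rather than being fixed by assigning each case to its single most likely class.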

The strengths of this work come from the examination of a large, diverse sample of adolescents resulting from the synthesis of individual participant data across four Familias Unidas intervention trials. Follow-up assessments spanning 24 months permitted the use of advanced statistical modeling to uncover distinct trajectories of internalizing symptoms. The findings provide evidence of differential intervention response based on trajectory class and provide insights about the possible reasons for this heterogeneity. We were able to see that adolescents with low and moderate baseline risk tended to experience decreased internalizing symptoms over time regardless of intervention exposure. On the other hand, the class with the highest risk profile on internalizing and externalizing symptoms, poor parent–adolescent communication, and high acculturative stress had the best response to the Familias Unidas intervention with respect to internalizing symptoms. Specifically, these high-risk adolescents who were exposed to the Familias Unidas intervention experienced no change in internalizing symptoms over time, while those in the control condition showed an increase in internalizing symptoms over time. This suggests a preventive effect of the intervention and supports previous research showing that those with greater risk show better prevention intervention response (see Horowitz and Garber 2006; Sandler et al. 2014; Stice et al. 2009). Considering that Familias Unidas is efficacious in reducing drug use and sexual risk behaviors (see Pantin et al. 2009; Prado et al. 2007, 2012), but was not designed to reduce youth internalizing symptoms, its positive effects on internalizing symptoms among high-risk youth are an unexpected intervention bonus. This finding supports the value of this family-focused intervention, especially given the problems in functioning that internalizing symptoms can create (NRC/IOM 2009).