Introduction

There is extensive support for the effectiveness of evidence-based parenting and family interventions in successfully preventing and treating/managing symptoms related to mental health disorders in children (Chamberlain et al. 2008; Eyberg et al. 2008; Greenberg et al. 2001; Kazdin and Weisz 1998; Sandler et al. 2011). To better understand how or through what mechanisms these interventions achieve positive outcomes, a large number of studies have been conducted to test one or more putative mediator variables (Fagan and Benedini 2016; Forehand et al. 2014; Kazdin and Nock 2003; Maric et al. 2012; Weersing and Weisz 2002). This substantial body of literature provides a unique opportunity to review, from a methodological perspective, in what ways and how well family-based intervention studies have tested statistical mediation.

Statistical mediation affords a unique look into why interventions achieve or alternatively fail to achieve results, and thus is a critical complement to efficacy and effectiveness studies that aim to build a body of evidence for a given intervention or class of interventions. Drawing on the general notion that a mediator is an intervening variable which explains how or why two other variables are related (Baron and Kenny 1986; Judd and Kenny 1981), statistical mediation is well-suited for examining how an intervention and a specific outcome are connected. Mediation models assume that temporal ordering of the independent variable, mediator, and dependent variable is known and is rooted in theory (MacKinnon 2008; Preacher and Hayes 2008). Statistical mediation analysis can test how these variables are conceptually linked. Furthermore, statistical mediation tests the conceptual framework and theoretical underpinnings of how interventions effect change (Chen 2005; Fairchild and MacKinnon 2014; Kazdin 2007; Kazdin and Nock 2003). Testing underlying principles that guide an intervention might advance theoretical understanding of risk and protective factors related to a specific outcome. Statistical mediation can also evaluate which intervention elements are effective and contribute to positive outcomes, as demonstrated by several substantive studies (e.g., Hanisch et al. 2014; Koning et al. 2011; Punamäki et al. 2013). Identifying these elements might contribute to revising or developing more efficacious interventions and focusing implementation efforts on those components that are most responsible for beneficial outcomes (Fairchild and MacKinnon 2014; Kazdin 2007; Kazdin and Nock 2003).

During the last three decades, the field has seen advances in methodologies for statistical mediation (Fairchild and MacKinnon 2014; MacKinnon 2008; Preacher 2015). Researchers need to know how well these advances are being incorporated into family-based intervention studies, and how to proceed in future studies. Consumers of the intervention studies benefit from having a better understanding of mediation and its role in intervention outcomes. With these goals in mind, we conducted a review of the family-based intervention literature to elucidate how statistical mediation was implemented, with an emphasis on methodological quality.

In this review, we identified a large set of family-based intervention studies that examined a variety of putative mediators and focused on outcomes which included externalizing behaviors, internalizing behaviors, substance use, high-risk sexual activity, and academic achievement. We reviewed the selected approaches to assessing mediation, the specific tests of statistical significance (if employed), and the adequacy in the reporting of parameter estimates. In line with the goals of this review, the article is organized as follows: (1) description of the procedures and inclusion criteria used for identification of studies included in the review; (2) foundational description of the approaches to mediation and the specific tests of mediation; (3) concise summary of the interventions, outcomes, and putative mediators; (4) detailed assessment of the methods and quality of statistical mediation in the reviewed studies; and (5) conclusions and recommendations.

Identification of Studies for Review

Studies were eligible for the review if they were quantitatively based outcome studies that: (a) evaluated outcomes related to externalizing behaviors, internalizing behaviors, substance use, high-risk sexual activity, and academic achievement in children or adolescents, (b) tested parenting and family-based interventions, (c) used a randomized controlled trial (RCT) design, and (d) assessed one or more putative mediators. Statistical mediation studies were excluded if quasi-experimental or observational designs were used, if the target sample consisted of adults (e.g., parents) rather than children or adolescents, if the outcome was not one of the five outcomes identified above (e.g., obesity, diabetes), or if the study combined or compared medication with parenting/family intervention. The decision to exclude nonrandomized intervention designs was based on the low prevalence of these studies in the literature. That is, we did not feel there was a large enough foundation of quasi-experimental work in the area from which we could draw a reasonable characterization of study quality.

A literature search of studies published in English in peer-reviewed journals between January 1981 and December 2015 was completed using the PsycINFO database. The search term configuration was: (“statistical mediation” or “process analysis”) and (“parenting intervention” or “family intervention” or “family-based intervention”). This search generated abstract records for 8822 articles. An additional 235 abstract records were added from citations found in prior reviews (Forehand et al. 2014; Kazdin 2007; Maric et al. 2015; Maric et al. 2012) and in the reference sections of identified mediation studies, yielding a total of 9048 records. The PRISMA flow diagram shown in Fig. 1 summarizes the reduction of this record pool down to 123 articles based on the exclusion reasons listed, and then how this set was further reduced. The screening process yielded a total of 73 statistical mediation studies which met inclusion criteria for the review.

Fig. 1 PRISMA flow diagram for statistical mediation review

Testing Mediator Effects: Approaches and Statistical Tests

The single mediation model is displayed in Fig. 2 and defined by Eqs. 1–3 below:

$$\hat{Y} = b_{01} + cX \tag{1}$$
$$\hat{Y} = b_{02} + c^{\prime} X + bM \tag{2}$$
$$\hat{M} = b_{03} + aX \tag{3}$$

Fig. 2 Single mediation model

The top panel of Fig. 2 depicts a causal process (in theory) between the independent variable (X) and the dependent variable (Y). The c path quantifies the total effect of X on Y and does not include any other variables. The bottom panel of the figure represents the components that are believed to comprise the total effect, with the a path quantifying the effect of X on the mediator, the b path quantifying the effect of the mediator on Y controlling for X, and the c′ path quantifying the direct effect of X on Y, controlling for the mediator. Implicit in the examination of mediation effects is the requirement of correct temporal ordering of the variables in the model. Other assumptions include those associated with basic OLS regression analyses, as well as no interaction between the independent and mediating variables.
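
As a minimal sketch of how the paths in Eqs. 1–3 can be estimated, the following Python fragment fits the three regressions by ordinary least squares on simulated data. The data-generating values (0.5, 0.4, 0.1), the sample size, the seed, and the helper names `simple_slope` and `two_predictor_slopes` are assumptions made for illustration; none of them comes from a reviewed study.

```python
import random
from statistics import fmean


def simple_slope(x, y):
    """OLS slope from regressing y on x with an intercept."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def two_predictor_slopes(x, m, y):
    """OLS slopes (c_prime, b) from regressing y on x and m with an intercept.

    Solves the 2x2 normal equations on mean-centered variables.
    """
    mx, mm, my = fmean(x), fmean(m), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    det = sxx * smm - sxm ** 2
    c_prime = (smm * sxy - sxm * smy) / det
    b = (sxx * smy - sxm * sxy) / det
    return c_prime, b


# Simulated data (hypothetical values, for illustration only).
rng = random.Random(0)
n = 500
x = [i % 2 for i in range(n)]                      # 0 = control, 1 = intervention
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]       # mediator
y = [0.4 * mi + 0.1 * xi + rng.gauss(0, 1) for xi, mi in zip(x, m)]  # outcome

c = simple_slope(x, y)                    # Eq. 1: total effect
c_prime, b = two_predictor_slopes(x, m, y)  # Eq. 2: direct effect and b path
a = simple_slope(x, m)                    # Eq. 3: a path

print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}, c' = {c_prime:.3f}")
```

In applied work these fits would normally come from a regression or SEM package, which also supplies the standard errors needed by the significance tests described below.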

Mediation is tested by assessing the relationships between variables in two or more of the equations above. Three general approaches to mediation are discussed in this review: causal steps, difference in coefficients, and product of coefficients. Though other frameworks for mediation have been posited (e.g., Kraemer et al. 2008), these were not represented in the reviewed studies and thus are not considered further. One of the main distinctions of the Kraemer et al. approach for examination of mediation includes the requirement that an interaction between the intervention and the mediator be modeled. Readers interested in learning more about the “Kraemer Method” are referred to Kraemer (2008) and Kraemer et al. (2008).

Causal Steps

The causal steps approach to mediation tests the relationship between each variable in the mediation model depicted in Fig. 2 (Baron and Kenny 1986; Judd and Kenny 1981). The approach describes four conditions that are necessary for mediation: (a) a significant relationship between the independent variable and the dependent variable, (b) a significant relationship between the independent variable and the mediator, (c) a significant relationship between the mediator and the dependent variable, and (d) a reduction in the magnitude of the total effect when M is controlled. If there is no significant direct effect of the independent variable on the dependent variable when controlling for the mediator, the process is termed complete mediation, such that the entire influence of X on Y is conveyed through M. When there remains an effect of X on Y when controlling for M, the process is termed partial mediation (Baron and Kenny 1986).

Though the causal steps approach has been a commonly used method for testing mediation hypotheses, there are several limitations to this approach (Fritz and MacKinnon 2007; Fairchild and McQuillin 2010; MacKinnon et al. 2002; Preacher and Hayes 2008). First, the requirement of an overall significant effect of X on Y might be fallible and overly restrictive, particularly when dealing with multiple mediators or distal processes (MacKinnon et al. 2007; MacKinnon and Fairchild 2009; Shrout and Bolger 2002). Second, the causal steps approach does not easily extend to multiple mediator models making it difficult to test theoretically complex relationships, and simulation studies have found that it is underpowered and typically requires large sample sizes to reach adequate power (Fritz and MacKinnon 2007; MacKinnon et al. 2002). Finally, the causal steps approach provides neither a parameter estimate nor standard error of the mediation effect, and thus cannot be used to construct confidence intervals of the indirect effect (MacKinnon et al. 2002, 2007).

The joint test of significance is a variation of the causal steps approach. In this test, the direct effect is ignored and the significance of the relationship between the independent variable and the mediator and the relationship between the mediator and the dependent variable are used to assess mediation. Mediation is present if both these relationships are significant. Though the joint test of significance does not provide an estimate of the mediated effect, simulation studies have shown that the test has acceptable performance with respect to Type I error rates, power, and sample size requirements, in contrast to the causal steps approach (Fritz and MacKinnon 2007; MacKinnon et al. 2002).
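
The contrast between the two decision rules can be sketched in a few lines of Python. The estimates and standard errors below are hypothetical, the function names are ours, and the coefficient tests use a large-sample normal approximation rather than the t tests a regression program would report.

```python
from statistics import NormalDist


def is_significant(estimate, std_error, alpha=0.05):
    """Two-sided z test of a coefficient (large-sample normal approximation)."""
    z = estimate / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha


def causal_steps(c, se_c, a, se_a, b, se_b, c_prime):
    """Causal steps rule: c, a, and b significant, and |c'| smaller than |c|."""
    return (
        is_significant(c, se_c)
        and is_significant(a, se_a)
        and is_significant(b, se_b)
        and abs(c_prime) < abs(c)
    )


def joint_significance(a, se_a, b, se_b):
    """Joint test rule: mediation is inferred when both a and b are significant."""
    return is_significant(a, se_a) and is_significant(b, se_b)


# Hypothetical estimates: strong a and b paths, but a nonsignificant total effect.
est = dict(c=0.15, se_c=0.10, a=0.40, se_a=0.10, b=0.35, se_b=0.09, c_prime=0.01)

print(causal_steps(**est))                                               # False
print(joint_significance(est["a"], est["se_a"], est["b"], est["se_b"]))  # True
```

With these illustrative numbers the causal steps rule fails at its first condition because the total effect is not significant, whereas the joint test infers mediation from the a and b paths alone.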

Difference in Coefficients

The difference in coefficients approach compares the relationship between the independent variable and the dependent variable before and after controlling for the mediator. The estimate of the mediated effect is the difference between the total effect of the independent variable on the dependent variable (path c) and the direct effect of the independent variable on the dependent variable accounting for the mediator (path c′). The estimate of the mediated effect is divided by its normal-theory standard error to test for statistical significance (MacKinnon and Dwyer 1993). Like the causal steps approach, the difference in coefficients approach does not easily extend to more complex models involving multiple mediators, categorical variables, or models that assess both mediation and moderation (MacKinnon et al. 2007).
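
As an added illustration (not part of the reviewed studies), the link between this estimate and the a and b paths can be shown with the symbols of Eqs. 1–3. Substituting Eq. 3 for the mediator in Eq. 2 gives

$$\hat{Y} = b_{02} + c^{\prime} X + b\,(b_{03} + aX) = (b_{02} + b\,b_{03}) + (c^{\prime} + ab)\,X$$

and comparing the coefficient on X with Eq. 1 yields

$$c = c^{\prime} + ab \quad\Longrightarrow\quad c - c^{\prime} = ab$$

The equality holds for linear models estimated by OLS on the same sample, because the residual from Eq. 3 is uncorrelated with X in that sample. It should not be assumed to hold for other model types, such as logistic regression.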

Product of Coefficients

The product of coefficients approach provides a parameter estimate of the mediated effect by taking the product of the a path, quantifying the impact of the independent variable on the mediator, and the b path, quantifying the impact of the mediator on the dependent variable (controlling for the independent variable). Historically, testing the statistical significance of this estimate has involved dividing it by the corresponding normal-theory standard error as derived by the multivariate delta method (Sobel 1982). However, more recent literature has suggested asymmetric confidence limits for statistical significance testing of the estimate afford a more accurate and powerful approach to testing statistical mediation. The product of coefficients approach easily extends to complex models involving multiple mediators and multiple outcomes (Fairchild and McQuillin 2010; MacKinnon et al. 2002). Simulation studies have assessed the power of different tests used with the product of coefficients approach (Fritz and MacKinnon 2007; MacKinnon et al. 2002). In general, this approach yields more accurate Type I error rates and greater power than the causal steps approach, and typically requires smaller sample sizes to detect effects with adequate power (Fritz and MacKinnon 2007; MacKinnon et al. 2002). We expand further on the varying ways to test statistical significance of the product of coefficients approach below.

Sobel Test

The Sobel test, based on normal theory, is the most commonly used method to test statistical significance of the mediated effect when it is parameterized as the product of coefficients, ab (Sobel 1982). Mediation is tested by dividing the ab estimate by its normal-theory standard error and comparing this value to a normal distribution to test for significance. There are several alternative formulas (differentiated by whether or not they invoke higher-order derivatives) that can be used to calculate the normal-theory standard error; however, in this review these formulas are all categorized as the Sobel test because the variants have been shown to be minimally different. In simulation studies, the Sobel test has been shown to perform better than the causal steps approach with regard to Type I error rates, power, and sample size requirements (Fritz and MacKinnon 2007; MacKinnon et al. 2002). The Sobel test assumes that the mediated effect has a normal sampling distribution, which is not necessarily problematic with large sample sizes as the test is asymptotically efficient. However, more advanced tests are available that relax this assumption and are more appropriate when sample sizes are smaller.
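
A minimal sketch of the first-order version of the test follows. The path estimates and standard errors are hypothetical, and the function name `sobel_test` is ours; the standard error formula is the first-order delta method expression, sqrt(a²·se_b² + b²·se_a²).

```python
import math
from statistics import NormalDist


def sobel_test(a, se_a, b, se_b):
    """First-order (delta method) Sobel test of the mediated effect ab.

    Returns the estimate, its normal-theory standard error, the z statistic,
    and the two-sided p value from the standard normal distribution.
    """
    ab = a * b
    se_ab = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    z = ab / se_ab
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return ab, se_ab, z, p_value


# Hypothetical path estimates and standard errors.
ab, se_ab, z, p_value = sobel_test(a=0.40, se_a=0.10, b=0.35, se_b=0.09)
print(f"ab = {ab:.3f}, SE = {se_ab:.4f}, z = {z:.2f}, p = {p_value:.4f}")
```

The symmetric interval implied by this test is ab ± 1.96 × SE, which is the feature that the asymmetric methods described next are designed to relax.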

Asymmetric Confidence Limits

The product of two normally distributed random variables (such as the two regression coefficients that define the mediated effect in the product of coefficients approach) is not itself normally distributed. Rather, its sampling distribution is asymmetric and kurtotic (Bollen and Stine 1990; MacKinnon et al. 2004; Hayes 2009). Critical values from a normal sampling distribution therefore do not yield the correct p value for significance tests of ab. Methods that create asymmetric confidence limits to test the significance of the mediated effect are more powerful with smaller sample sizes and more accurately reflect the shape of the ab sampling distribution (Fritz and MacKinnon 2007; MacKinnon et al. 2002, 2004). Two methods of obtaining asymmetric confidence limits are discussed here: the distribution of the product method and a resampling method called bootstrapping.

The distribution of the product method uses a mathematical approach to derive the sampling distribution of the product of two normally distributed random variables (Meeker et al. 1981). Meeker et al. (1981) created tables of the adjusted critical values for the distribution of the product of two random variables. The values derived by Meeker et al. can be used to create asymmetric confidence limits for ab and to test significance of the mediated effect. This method is better suited to single mediator models and less so for more complex mediation models. Tofighi and MacKinnon (2011) have provided an R package that makes the method accessible.
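
The tabled critical values are not reproduced here. As a rough stand-in, the asymmetry of the limits can be illustrated by simulation: draw a and b from independent normal distributions centered on their estimates and take percentiles of the simulated products. Note that this Monte Carlo sketch is a different technique from the analytic distribution of the product method, and that the estimates, standard errors, number of draws, seed, and function name are all hypothetical.

```python
import random


def monte_carlo_limits(a, se_a, b, se_b, alpha=0.05, draws=20000, seed=1):
    """Approximate confidence limits for ab from simulated products.

    Assumes a and b are independent and normally distributed around their
    estimates; the limits are percentiles of the simulated a*b values.
    """
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(draws)
    )
    lower = products[int((alpha / 2) * draws)]
    upper = products[int((1 - alpha / 2) * draws) - 1]
    return lower, upper


a, se_a, b, se_b = 0.40, 0.10, 0.35, 0.09   # hypothetical estimates
ab = a * b
lower, upper = monte_carlo_limits(a, se_a, b, se_b)

print(f"ab = {ab:.3f}, limits = ({lower:.3f}, {upper:.3f})")
# Distances from the estimate to each limit differ, unlike a normal-theory interval.
print(f"below: {ab - lower:.3f}, above: {upper - ab:.3f}")
```

When both paths are positive, the upper limit sits farther from the estimate than the lower limit, which is the asymmetry that normal-theory intervals cannot represent.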

Bootstrapping is a resampling method that creates confidence limits based on the empirical sampling distribution of the original data (Efron and Tibshirani 1993). The bootstrap method was identified as the method of choice for testing statistical mediation in simulation studies comparing the distribution of the product method and the bootstrap method (MacKinnon et al. 2004; Williams and MacKinnon 2008). Three methods of bootstrapping are defined here: the nonparametric percentile bootstrap, the bias-corrected bootstrap, and the accelerated bias-corrected bootstrap.

The percentile bootstrap method uses a random sample (with replacement) from the original sample of data to calculate an indirect effect estimate. This process is repeated a large number of times to create an empirical distribution of the indirect effect estimates, which is then used to determine confidence limits based on corresponding percentiles in the distribution (Efron and Tibshirani 1993). The bias-corrected bootstrap method is similar to the percentile bootstrap method but corrects for bias in the indirect effect estimate when the median bootstrap estimate does not equal the value of the mediated effect in the original sample (Efron and Tibshirani 1993). If there is a difference between the mediated effect obtained from the original sample and the median estimate obtained from the bootstrap distribution, then a correction factor is applied to correct the bias. In simulation studies, the bias-corrected bootstrap method has more power and requires a smaller sample size to detect mediated effects (Fritz and MacKinnon 2007; MacKinnon et al. 2004; Williams and MacKinnon 2008). The accelerated bias-corrected bootstrap method corrects for bias in the indirect effect estimate and for skew in the sampling distribution created by the bootstrap procedure (Efron and Tibshirani 1993). Confidence limits produced by the accelerated bias-corrected bootstrap method are thought to be more accurate than those produced by the other bootstrap methods (DiCiccio and Efron 1996; Efron and Tibshirani 1993), but might have elevated Type I error in some cases.

Illustrative Example

To illustrate how the aforementioned methodological considerations play out in scientific practice, one study is provided here as an example. Fang and Schinke (2014) tested two mediators, mother–daughter relationship and youth self-efficacy, in a randomized controlled study of a family-based substance-use preventive intervention. The outcome variables of interest were alcohol use, marijuana use, and substance-use intention. Mediator and outcome variables were measured at baseline, one-year follow-up, and two-year follow-up, creating a full longitudinal design tested in a path analytic framework. The study used the product of coefficients approach and bias-corrected bootstrap significance test with a sample size of 108. The results section reported overall model fit, estimates of the mediated effects, and the effect sizes. The results indicated that mother–daughter relationship and youth self-efficacy were significant mediators for all three outcomes.

Interventions, Outcome Variables, and Putative Mediators in the Review

Interventions

The interventions in this review reflect parenting and family interventions. These interventions have repeatedly shown efficacy and effectiveness in preventing and treating youth mental health disorders (both externalizing and internalizing), substance use, and high-risk sexual activity, and in increasing academic achievement (Brody et al. 2006; Brody et al. 2010; Chamberlain et al. 2008; Ennett et al. 2001; Eyberg et al. 2008; Greenberg et al. 2001; Henderson et al. 2009; Kazdin and Weisz 1998; MVPP 2012; Pantin et al. 2009; Prado et al. 2007; Prinz 2012; Sandler et al. 2011; Schoenfelder et al. 2015; Zhou et al. 2008). The underlying theories of this class of interventions support positive change for these heterogeneous outcomes.

Outcome Variables

The outcome variables in the review were externalizing behaviors, internalizing behaviors, substance use, high-risk sexual activity, and academic achievement. Externalizing behaviors included child or adolescent conduct problems, noncompliance, and delinquency. Internalizing symptoms included child or adolescent anxiety and depression. Substance use included alcohol, illicit drug, and tobacco use. High-risk sexual activity was defined as engaging in unprotected sex and risk associated with the number of sexual partners in a 12-month period. Academic achievement was defined using the grade point average (GPA).

Putative Mediators

Across the five outcomes, parenting-related variables were the most common putative mediators tested in the studies. The putative mediators related to parenting included positive parenting, negative parenting, parent–child relationship, parent monitoring/supervision, parent mental health, and parent confidence. Several youth variables were also assessed as mediators, and these included deviant peer association, youth coping skills, early childhood disruptive behavior, academic achievement, school engagement, self-esteem, social skills, and youth mental health. A detailed breakdown of which mediators were tested in each study is found in the Appendix.

Description and Quality of Statistical Mediation in the Reviewed Studies

The studies included in this review were assessed for quality of the mediation analysis conducted in the papers based on the following criteria: sample size, temporal precedence, approach to statistical mediation, specific significance test of mediated effect used (most applicable to the product of coefficients approach), report of parameter estimates, and mediator model (i.e., whether a single mediator or multiple mediators were assessed; for multiple mediators, whether they were assessed individually or together). Each of these topics is outlined below.

Sample Size

Simulation studies have assessed various approaches and tests of mediation to determine sample size requirements to detect mediation effects (Cheong 2011; Fritz and MacKinnon 2007; MacKinnon et al. 2002). Consistently, findings indicate that the causal steps approach has low power and requires a large sample size to detect mediation, particularly when the direct effect is zero (Fritz and MacKinnon 2007; MacKinnon et al. 2002). To detect complete mediation with the causal steps approach, a sample size of over 20,000 is required when the effect sizes of the indirect paths, a and b, are small (Fritz and MacKinnon 2007). In contrast, the product of coefficients approach requires considerably smaller samples to detect significant mediation effects, regardless of the test used for significance testing. The Sobel test requires a sample size of 600 when effect sizes for the indirect paths are small (Fritz and MacKinnon 2007). Under the same conditions, the asymmetric confidence limits method requires a sample size of 539 based on the product of two normally distributed random variables, a sample size of 558 based on the percentile bootstrap, and a sample size of 462 based on the bias-corrected bootstrap (Fritz and MacKinnon 2007). For both the causal steps approach and the product of coefficients approach, the sample size requirement is lower for partial mediation as well as when the effect sizes for the indirect paths are larger (Fritz and MacKinnon 2007; MacKinnon et al. 2002). When the effect sizes of indirect paths a and b are both large, the causal steps approach requires a sample size of 92 for complete mediation, and the Sobel test requires a sample size of 42 (Fritz and MacKinnon 2007). Under the same conditions, the asymmetric confidence limits require a sample size of 35 based on the product of two normally distributed random variables, a sample size of 36 based on the percentile bootstrap, and a sample size of 34 based on the bias-corrected bootstrap (Fritz and MacKinnon 2007).

The median sample size for all of the studies in the review was 238 (see Table 1). For the subset of studies that used the causal steps approach, the median sample size was 238. Studies using the product of coefficients approach had a median sample size of 244. Studies that used both approaches had a median sample size of 151. Studies that used the Sobel test had a median sample size of 557, and studies that used asymmetric confidence limits (i.e., percentile bootstrap, bias-corrected bootstrap, and accelerated bias-corrected bootstrap) had a median sample size of 183.

Table 1 Quality of mediation analyses in studies of family-based interventions (N = 73)

Temporal Precedence Design

With respect to temporal precedence in the context of intervention studies, it is preferable in mediation designs to have at least three measurement points in time so that changes in a putative mediator can be evaluated after the intervention is instituted, but before the outcome occurs (Kraemer et al. 2002; MacKinnon et al. 2007; Maric et al. 2012; Preacher and Hayes 2008). This ideal scenario is called a full longitudinal design. A half longitudinal design might be invoked by utilizing ANCOVA or difference score models when only two time points of data are available (MacKinnon 2008). With respect to temporal precedence design for the 73 studies, 36 used a full longitudinal design within a regression framework, 16 used a half longitudinal design within a regression framework, and the remaining 21 used growth curve modeling. Some of the studies that used growth curve modeling explored full longitudinal processes, while others evaluated parallel relationships among change in latent growth parameters associated with M and Y (von Soest and Hagtvet 2011).

Mediation Approach

The quality of statistical mediation in this review was evaluated based on several facets: mediation approach, methodological considerations, specific statistical tests of the mediated effect, and reporting of mediation estimates. Table 1 provides information about each of these considerations for each of the 73 reviewed studies.

The product of coefficients approach to mediation was the sole approach used in 45 of the 73 studies (62%). The causal steps approach was the sole approach used in 17 of the 73 studies (23%). The remaining 11 studies (15%) used both approaches by first establishing that the conditions required for the causal steps approach were met and then adopting the product of coefficients approach to statistically test the mediated effect(s). Given that a significant relationship need not exist between the independent variable and the dependent variable for indirect effects to occur (MacKinnon and Fairchild 2009; Preacher and Hayes 2008), it would have been preferable for the product of coefficients approach to be used on its own in studies where the two approaches were combined.

Across publication years, use of the causal steps approach declined: it was more common in the earlier studies, whereas later studies (i.e., from 2004 to 2015) showed greater use of the product of coefficients approach. This change likely reflects improvements in understanding of the methods as well as recognition that the causal steps approach typically demands larger sample sizes to achieve adequate power, which is often challenging for intervention studies (Eddy and Chamberlain 2000; Maric et al. 2012).

Significance Tests for Mediated Effects

Studies taking a product of coefficients approach used various significance tests for the mediated effect; 26 studies used the Sobel test and 26 used asymmetric confidence limits. For those which used the asymmetric confidence limits, 14 used a bootstrap method, while 12 used the distribution of product method. For the bootstrap-method studies, the bias-corrected bootstrap method was most commonly used (11 studies), with only two studies using the percentile bootstrap method and one study the accelerated bias-corrected method.

A number of studies used a variant of the causal steps approach, employing the joint test of significance first followed by the Sobel test to provide a parameter estimate and formal test of significance for the mediated effect. The combined use of the causal steps approach and product of coefficients approach is redundant. In these situations, if a parameter estimate is sought, it is recommended that the product of coefficients be estimated and that asymmetric confidence limits be used to test the significance of the estimate.

Reporting of Analytic Estimates

Reporting, or failing to report, analytic estimates was evaluated for each study in the review (see Table 1). Specifically, overall model fit of the mediation model (when structural equation modeling, SEM, was used), mediation point/parameter estimates, and effect size measures for mediation were identified. The model fit provides information on the consistency of the model with the data and is an important estimate to consider before interpreting mediation results in an SEM framework (Gunzler et al. 2013). Parameter estimates of the mediated effect are generated when tests of the product of coefficients approach or difference in coefficients approach are used and should be reported along with estimates of the effect size of the mediated effect (Robey 2004). These estimates are important for comparing findings across replication studies (Robey 2004).

When SEM was used, overall mediation model fit was reported 95% of the time. Parameter estimates were reported in 75% of the studies for which an estimate would have been generated. Effect size was reported in only 60% of studies, which included path effect sizes and/or overall mediation effect sizes. Effect sizes are necessary to understand the magnitude of the mediated effect and to permit comparisons across studies (MacKinnon 2008).

Single and Multiple Mediator Models

Single mediator models assess the intervention effect on an outcome through one mediator (MacKinnon 2008). In family-based interventions, more than one mediator might account for changes in the outcome and might need to be tested at the same time (Fairchild and McQuillin 2010; Maric et al. 2012). Multiple mediator models assess the intervention effect on an outcome through two or more mediators either at the same time using a parallel model or testing of the relationships between mediators in a serial model (MacKinnon 2008). Single mediator models were used in 73% of the studies in this review, including studies where multiple mediators were tested individually.

Note Regarding Psychometric Properties of Mediator Measures

This review did not examine the psychometric properties of the measures that were employed to represent the putative mediators. The reliability, validity, and utility of mediator measures are clearly relevant to the quality of a statistical mediation analysis (as well as other analytic frameworks) but beyond the scope of this review. By and large, the studies taken as a whole provided substantial evidence for the psychometric adequacy of the outcome and mediator measures, and further often made use of multiple methods and sources of assessment.

Quality Summary for the Most Frequently Tested Mediators

One might ask how the methodological quality of statistical mediation as observed in this body of studies bears, if at all, on the interpretation of evidence in support of each putative mediator. One might have assumed that mediation studies conducted with better methodology would have produced fewer significant findings for any given mediator. This review was not undertaken in an attempt to explain, or explain away, mediation results as a function of methodological quality. To this point, the more commonly tested putative mediators reflected both significant and nonsignificant results whether with preferred or nonpreferred methods. This issue is further complicated by the all too frequent failure to control for Type I errors when several mediators and outcome variables were tested in the same study.

That said and going beyond the general appraisal of methodological quality reported in the previous section, one might ask how specific substantively defined mediators fared with respect to the quality of statistical mediation employed. Quality of statistical mediation is summarized here for the most frequently tested mediators (i.e., tested in 10 or more studies) collapsing across outcomes. Readers interested in a more detailed examination of all of the specific mediators and outcome variables can make use of the Appendix and Table 1.

The most commonly evaluated mediator was positive parenting, which was tested in 30 studies. Approximately two-thirds of these studies (21 of 30) more adequately represented temporal precedence by using either a full longitudinal design or growth curve modeling, as opposed to a half longitudinal design. Most of the studies (23 of 30) used the product of coefficients approach, which is preferred over the causal steps approach. For the studies where a significance test was appropriate, 12 used asymmetric confidence limits (preferred), while the other 11 studies used either the Sobel test or the joint test of significance.

Negative parenting was tested in 13 studies. More than half of these studies (8 of 13) used either a full longitudinal design or growth curve modeling, as opposed to a half longitudinal design. Most of the studies (10 of 13) used the preferred product of coefficients approach, instead of the causal steps approach. For the studies where a significance test was appropriate, half (5 of 10) used the preferred asymmetric confidence limits test, rather than either the Sobel test or the joint test of significance.

Parent–child relationship was tested as a mediator in 11 studies. The majority of these studies (10 of 11) used either a full longitudinal design or growth curve modeling, as opposed to a half longitudinal design. Most of the studies (9 of 11) used the product of coefficients approach instead of the causal steps approach. In the 9 studies where a significance test was appropriate, 4 used the preferred asymmetric confidence limits test, while the other 5 used either the Sobel test or the joint test of significance.

Monitoring/supervision was tested in 10 studies. With respect to temporal precedence, 6 of these studies used either a full longitudinal design or growth curve modeling, while the other 4 studies used a half longitudinal design. Most of the studies (7 of 10) used the product of coefficients approach instead of the causal steps approach. For significance testing, 4 studies used asymmetric confidence limits, while 3 studies used either the Sobel test or the joint test of significance.

Conclusion and Recommendations

This review examined 73 outcome studies, all of which tested putative mediators intended to link parenting and family-based interventions to a variety of child and youth outcomes. To our knowledge, no published review has appraised the methodological quality of statistical mediation employed in this large collection of studies. This review supports several conclusions about overall quality of statistical mediation in the 73 studies.

Taken as a whole, the studies used designs that adequately addressed temporal precedence: 78% used either a full longitudinal design within a regression framework, or used growth curve modeling, while the remaining 22% used a half longitudinal design within a regression framework. All of the studies appropriately refrained from resorting to a cross-sectional design.

With respect to mediation approach, the picture was somewhat mixed: 62% of the studies used product of coefficients as the sole approach, 23% used causal steps as the sole approach, and the remaining 15% combined the two approaches by first establishing that the conditions for the causal steps approach were met and then applying the product of coefficients approach to test the mediated effect(s). Despite the popularity of the causal steps approach in past decades, there are compelling reasons to move away from it. First, it is possible for mediation to occur without all four causal steps conditions being met. Second, the causal steps approach necessitates greater sample sizes, which can be problematic in intervention studies, particularly in situations where there is complete (rather than partial) mediation. Third, the causal steps approach does not provide a parameter estimate of the mediation effect and cannot be used to construct confidence intervals for the indirect effect. Finally, the causal steps approach does not easily extend to multiple mediator models, which makes it more difficult to test theoretically complex relationships. By contrast, the product of coefficients approach is preferred and appears to be the predominant choice in the more recent of the reviewed studies.
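The third point above, that the product of coefficients approach yields a parameter estimate of the mediated effect, can be illustrated with a minimal sketch. It assumes a single mediator, a two-arm trial, continuous mediator and outcome, and ordinary least squares estimation. The function names, effect sizes, and simulated data are hypothetical and are not drawn from any reviewed study.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def mediation_paths(x, m, y):
    """OLS path estimates for a single mediator model.

    a       : effect of X on M
    b       : effect of M on Y, adjusting for X
    c_prime : direct effect of X on Y, adjusting for M
    c       : total effect of X on Y
    ab      : product of coefficients estimate of the mediated effect
    """
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / det
    c_prime = (sxy * smm - smy * sxm) / det
    c = sxy / sxx
    return {"a": a, "b": b, "c_prime": c_prime, "c": c, "ab": a * b}


def simulate_trial(n=300, a=1.0, b=0.8, c_prime=0.2, seed=1):
    """Hypothetical two-arm trial with one continuous mediator and outcome."""
    rng = random.Random(seed)
    x = [i % 2 for i in range(n)]  # balanced assignment: 0 = control, 1 = intervention
    m = [a * xi + rng.gauss(0, 1) for xi in x]
    y = [c_prime * xi + b * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
    return x, m, y


x, m, y = simulate_trial()
estimates = mediation_paths(x, m, y)
print(estimates)
```

In this linear, least-squares setting the total effect decomposes exactly as c = c' + ab, so the estimate of ab is a direct numerical summary of how much of the intervention effect passes through the mediator. The causal steps approach reports only a sequence of significance tests and provides no such quantity.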

In the reviewed studies, methods of statistical significance testing did not always reflect optimal choices. This discussion applies only to studies using the product of coefficients approach to mediation. Some investigators chose the Sobel test even though the study's sample size was not sufficiently large; the Sobel test often requires an unduly large sample size. Researchers would be wise to capitalize on more modern approaches to statistical inference within the mediation framework (i.e., asymmetric confidence limits via bootstrapping or Markov Chain Monte Carlo methods), as these methods afford more flexibility with small sample sizes and complex functions.
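As an illustration of asymmetric confidence limits, the following minimal sketch computes a percentile bootstrap interval for the mediated effect ab. The percentile bootstrap is only one of several resampling variants. The function names, default settings, and simulated data are hypothetical, and the sketch assumes a single mediator with OLS estimation.

```python
import random


def cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (n - 1)


def indirect_effect(x, m, y):
    """Product of coefficients estimate ab from two OLS regressions."""
    sxx, smm = cov(x, x), cov(m, m)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    a = sxm / sxx                                        # M regressed on X
    b = (smy * sxx - sxy * sxm) / (sxx * smm - sxm ** 2)  # Y on M, adjusting for X
    return a * b


def percentile_bootstrap_ci(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence limits for ab.

    Cases (participants) are resampled with replacement, ab is re-estimated
    in each resample, and the empirical alpha/2 and 1 - alpha/2 quantiles of
    the resampled estimates serve as (approximate) confidence limits. The
    limits need not be symmetric around the point estimate.
    """
    rng = random.Random(seed)
    n = len(x)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(indirect_effect([x[i] for i in idx],
                                     [m[i] for i in idx],
                                     [y[i] for i in idx]))
    draws.sort()
    lower = draws[int((alpha / 2) * n_boot)]
    upper = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def simulate_trial(n=200, a=1.0, b=0.8, c_prime=0.2, seed=1):
    """Hypothetical two-arm trial with one continuous mediator and outcome."""
    rng = random.Random(seed)
    x = [i % 2 for i in range(n)]  # balanced assignment: 0 = control, 1 = intervention
    m = [a * xi + rng.gauss(0, 1) for xi in x]
    y = [c_prime * xi + b * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
    return x, m, y


x, m, y = simulate_trial()
ab = indirect_effect(x, m, y)
lower, upper = percentile_bootstrap_ci(x, m, y)
print(f"ab = {ab:.3f}, 95% percentile bootstrap limits = ({lower:.3f}, {upper:.3f})")
```

The Sobel test, by contrast, treats ab as normally distributed and builds a symmetric interval from a single standard error. Because the product of two estimated coefficients is generally not normally distributed, the resampling approach lets the data determine the shape of the interval.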

Future mediation studies can improve through better documentation and reporting of details. When SEM was used, for example, overall model fit was consistently provided, but documentation quality was uneven with respect to the reporting of parameter estimates and effect sizes for the mediated effects. Though the development of effect size measures for mediation is still in progress, these estimates provide crucial insight into the practical significance of effects and should be reported routinely. Parameter estimates and effect sizes are essential for understanding methods and findings, and for comparing and summarizing results across studies.

Additionally, the field would benefit from integrating newer causal inference-based approaches for conducting statistical mediation (e.g., Pearl 2011; VanderWeele 2015). Though causal inference methodology was previously difficult for substantive researchers to apply, a growing body of research and technical guidance has made these methods more accessible.

The family-based interventions literature has seen increasing use of statistical mediation over time. Investigators are more often identifying mechanisms of change in prevention and treatment studies. Typically, single mediators were assessed even though theory suggests that multiple mediators might influence positive youth outcomes. Overall, studies of family-based interventions are improving their use of statistical mediation by relying more often on full longitudinal designs and more advanced methodologies. Nonetheless, more studies need to consider how complex mediator models might explain intervention effects on children and adolescents, as such models are likely to provide a more reasonable representation of reality.

The following recommendations are offered for future research evaluating statistical mediation in family-based intervention outcome studies:

  1. Investigators are encouraged to make use of full longitudinal designs (which can include growth curve modeling) whenever possible so that temporal precedence is attained.

  2. With respect to mediation approach, product of coefficients is preferred over causal steps.

  3. When statistical mediation involves significance testing, the use of asymmetric confidence limits is recommended.

  4. When more than one mediator is being tested, it is recommended that a parallel, multiple mediator model be employed rather than implementing a series of single models.

  5. Investigators need to control for Type I error when multiple tests of mediators are involved.

  6. Full documentation in published articles with respect to mediation-related parameter estimates is a necessary part of scientific transparency and reproducibility. When SEM is involved, for example, reporting should include at a minimum the overall model fit of the mediation model, the mediation point/interval estimates, and the effect size(s) and type of effect size.
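As one possible way to act on the recommendation concerning Type I error, the sketch below applies the Holm step-down adjustment to a set of p-values from separate mediator tests. The Holm procedure is a common general-purpose choice assumed here for illustration. The recommendations above do not prescribe a specific correction, and the function name and p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order.

    The smallest p-value is multiplied by m, the next smallest by m - 1,
    and so on; adjusted values are capped at 1 and forced to be
    non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical p-values from tests of four putative mediators in one study.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = holm_adjust(raw)
significant = [p <= 0.05 for p in adjusted]
print(adjusted, significant)
```

When inference rests on confidence limits rather than p-values, an analogous adjustment would widen the intervals (for example, by using a smaller alpha per mediator). That variant is not shown here.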