Introduction

Approximately 13% of youth aged 10 to 19 worldwide live with a diagnosed mental health disorder (United Nations Children’s Fund, 2021). There is evidence that rates of mental health concerns among youth have risen over the last decade; the percentage of youth in the USA who seriously considered suicide increased from 13.8% in 2009 to 18.8% in 2019 (Centers for Disease Control and Prevention, 2019). Globally, suicide accounts for 9.1% of deaths among young people (Wasserman et al., 2005). The sheer number of youth in need of mental health support is staggering. Unfortunately, between 47% and 54.7% of youth living with mental health disorders do not receive mental health treatment (Green et al., 2013; Islam et al., 2022). To address these unsettling trends, it is critical to amplify and streamline youth access to evidence-based mental health interventions.

Traditional outpatient mental health services are not sufficient for addressing the mental health needs of youth at scale; barriers such as cost, limited locations, and a shortage of providers limit the accessibility of traditional services (Health Resources and Services Administration, 2015; Wells et al., 2002). Calls for non-traditional, innovative, and scalable methods of providing mental health support to young people have been repeated throughout the last decade (Gruber et al., 2021; Kazdin, 2019; Kazdin & Rabbitt, 2013). It has been suggested that mental health clinicians and researchers “meet youth where they are at” rather than expecting youth or parents to overcome barriers to access (Benningfield, 2016; Hardy et al., 2020). Schools, where young people spend much of their daily lives, are ideal settings for accomplishing this goal. Schools hold promise to provide convenient and free access to mental health services, thus reducing common barriers. Particularly for youth from vulnerable, ethnically and economically minoritized groups, who are less likely to access needed services, school-based mental health support “democratises access to services” (Alegria et al., 2010; Fazel et al., 2014).

Schools are one of the most common settings where youth access needed mental health interventions (Duong et al., 2021; Mohamed et al., 2018). One meta-analysis of 43 studies of school-based mental health interventions (including both treatment and prevention programs; n = 49,941) found a Hedges g of 0.39 (Sanchez et al., 2018), and another meta-analysis of 63 studies of school-based mental health interventions (including both treatment and prevention programs; n = 15,211) found a Hedges g of 0.50 (Mychailyszyn et al., 2012), suggesting that school-based interventions can be effective at improving mental health outcomes. Meta-analyses focused solely on school-based prevention programs for depression and anxiety demonstrate smaller effects overall. One meta-analysis of 81 studies of school-based prevention programs (n = 31,794) found a Hedges g of 0.11 for depression and 0.13 for anxiety (Werner-Seidler et al., 2017). In an updated review including 118 studies (n = 45,924), effect sizes were slightly larger, with a Hedges g of 0.21 for depression and 0.18 for anxiety (Werner-Seidler et al., 2021). Another review of school-based prevention programs for depression and anxiety, spanning 137 studies (n = 56,620), found no significant effects, with one exception: mindfulness- and relaxation-based interventions significantly reduced anxiety symptoms in universal secondary school settings (Caldwell et al., 2019). These results are in line with research suggesting that the effects of prevention programs tend to be smaller than those of treatment programs (Sandler et al., 2014), yet small preventative effects can theoretically hold great public health impact when delivered at scale (Shamblen & Derzon, 2009).

In practice, the positive impact of school-based treatment and prevention programs may not be fully realized because of the complexity and length of existing, evidence-based intervention programs (Lyon, 2021). Traditional, manualized treatments for mental health are typically delivered once weekly for 8–12 weeks by a licensed provider in a clinical setting. School-based programs often mirror the clinical model by consisting of 8–12 sessions, though some are longer, with as many as 40 sessions (Werner-Seidler et al., 2021). Previous authors have highlighted the variability of school settings compared to typical clinical settings and the need to reduce burden on already overworked teachers and students (Sohn, 2022). This reflects a broader call from implementation scientists to redesign interventions to better suit the unique needs of real-life delivery contexts (Lyon & Bruns, 2019; Schleider, 2023). Multiple studies and reviews have found that the intensity and high burden of intervention delivery, including the time commitment required, is a barrier to implementation of school-based mental health programs (Fox et al., 2021; Moore et al., 2022). Further, the more sessions an intervention is expected to include, the more difficult it becomes to ensure adherence and fidelity.

As an emergent and promising solution, brief interventions deliberately deliver intervention content in a limited number of sessions (typically four or fewer; Schleider & Weisz, 2017). The concentrated focus and intentional brevity of shorter interventions may allow them to be more easily completed or more precisely targeted to a specific concern. Additionally, because time and space are limited in a shorter intervention, the content that is presented must be selected especially carefully, with a high standard for effectiveness. In essence, teaching one highly effective, evidence-based skill that a student will use may be superior to teaching ten skills with mixed effectiveness that a student will not use because they are overwhelmed.

One systematic review found that wise interventions, defined as brief interventions (typically four sessions or fewer) focused on teaching only one specific skill or strategy, had positive effects on mental health in 16 of 25 RCTs (Schleider et al., 2020c). In a meta-analysis of single-session interventions, which are “specific, structured programs that intentionally involve just one visit or encounter with a clinic, provider, or program” (Schleider et al., 2020b), the authors found a 58% likelihood that youth receiving a single-session intervention would have better outcomes than a control group. In some domains, these interventions had effects on mental health symptom reduction comparable to typical-length interventions; for example, the meta-analysis found that single-session interventions have a comparable effect size for reducing anxiety (Hedges g = 0.56; Schleider & Weisz, 2017) to full-length interventions (Hedges g = 0.61; Weisz et al., 2017). Support for less intensive interventions also comes from multiple meta-analyses showing that greater intervention time is associated with smaller effect sizes (Öst & Ollendick, 2017; Weisz et al., 2017). Brief interventions have been delivered in a variety of settings, including outpatient waitlist settings (Schleider et al., 2021) and digital formats (Dobias et al., 2021; Schleider et al., 2020a, 2022), and for numerous kinds of problems, including anxiety, depression, conduct problems, and substance use (McDanal et al., 2022; Schleider & Weisz, 2017).

The evidence that brief interventions can produce change warrants an examination of brief interventions in school settings. Previous meta-analyses of school-based mental health interventions included minimal discussion of brief interventions. One meta-analysis found that “low-dose” interventions, averaging 354.87 min, had an effect size (standardized mean gain = 0.32) equivalent to that of “high-dose” interventions, averaging 682.50 min (standardized mean gain = 0.32; Mychailyszyn et al., 2012). However, that meta-analysis did not describe the low-dose interventions in detail; further, the low-dose interventions, while briefer than the average intervention, are not considered “brief” by current standards (Schleider et al., 2020c). Therefore, it is not clear how often or in what manner brief interventions are offered in school settings, and no evidence synthesis exists to delineate the effectiveness of school-based brief interventions in reducing mental health symptoms or improving youth well-being. This systematic review and meta-analysis aims to strategically collect and synthesize findings from the relevant literature on the effects of school-based brief interventions on youth mental health problems and well-being, including both treatment and prevention programs. This review characterizes the state of the literature on brief school-based mental health interventions, identifying gaps in existing knowledge and key directions for future work in this area.

Methods

All study procedures, as well as the coding manual used for data extraction, were pre-registered with PROSPERO (CRD42021255079) and the Open Science Framework (https://osf.io/kf56w/). The present review adhered to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (Shamseer et al., 2015; see Table 1 in Appendix A).

Search Strategy

We conducted searches in multiple databases (MEDLINE with Full Text, APA PsycInfo, Embase, Web of Science, OpenDissertations, ProQuest Dissertations & Theses Global, ERIC, and PsyArXiv) to identify peer-reviewed or unpublished studies. The full syntax for our search, including all search terms, is included in Appendix B. Additionally, we reached out to researchers within the field of child and adolescent psychopathology to determine whether additional published or unpublished studies that did not appear in our original searches could be included. The first search was conducted on May 21, 2021. An updated search was conducted on December 18, 2023.

Inclusion and Exclusion Criteria

Inclusion criteria for articles were as follows: available in English, included a brief (≤ four sessions or ≤ 240 min of intervention time; Schleider et al., 2020c) psychosocial intervention, conducted within a Pre-K through 12th-grade school setting, included at least one treatment outcome evaluating mental health or well-being, designed as a randomized controlled trial, quasi-experimental study, or nonrandomized open trial, and published since 2000. Studies were excluded if the primary intervention target (per authors’ descriptions) was verbal communication skills (e.g., speech therapy) or academic outcomes (e.g., math tutoring). Studies were also excluded if interventions were administered in a post-secondary school setting (i.e., college/university setting). Intervention time and number of sessions were inclusive of components delivered in school and/or outside of school. Additionally, interventions were included whether they were designed to be preventative or designed as treatments.

Data Extraction

Following the initial identification of records, four independent members of the review team (KC, IA, JL, & MY) used Rayyan systematic review software to screen titles and abstracts and identify articles that appeared to meet inclusion criteria. Once eligible articles were identified, five independent members of the review team (KC, IA, SI, JL, & MY) accessed full texts to complete another round of screening to determine whether articles met inclusion criteria. Studies meeting inclusion criteria were coded according to the project codebook (see below) by seven independent members of the review team (KC, SI, JL, MY, AS, AR, & SH). Inter-rater reliability (IRR) was calculated on the full sample of coded studies, using Cohen’s κ for categorical data and the intra-class correlation coefficient for continuous data. When IRR fell below 0.8, coders met to discuss the discrepancy and re-code the variable until IRR was above 0.8. Regardless of IRR, all disagreements were resolved through discussion between coders.
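To illustrate these reliability checks, a minimal R sketch is shown below. This is not the authors' script; the irr package, the two-column rating layout, and the specific ICC model are assumptions for illustration only.

```r
# Minimal illustrative sketch (assumed data layout, not the authors' code).
# `ratings_categorical` and `ratings_continuous` are hypothetical data frames
# with one column per coder and one row per double-coded study.
library(irr)

# Cohen's kappa for a categorical code (e.g., intervention format)
kappa2(ratings_categorical)

# Intra-class correlation coefficient for a continuous code (e.g., minutes of
# intervention time); the two-way agreement model here is an assumption.
icc(ratings_continuous, model = "twoway", type = "agreement", unit = "single")
```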

Study-level codes included the country/continent where studies took place, the year when studies took place, whether the study was voluntary/opt-in (compared to studies where all students were expected to participate), whether participants were compensated, demographic characteristics of participants (e.g., age, grade levels, race/ethnicity, sex), school type (e.g., public, private), trial type (i.e., individually randomized, cluster randomized, not randomized), and whether any academic outcomes were measured (e.g., attendance, grades, discipline referrals). Academic outcomes were investigated to characterize the studies included in this review, but were not included in the meta-analysis.

Interventions were characterized as universal prevention, selective prevention, indicated prevention, or treatment (Gordon, 1983). Universal prevention interventions were defined as interventions designed to reach an entire population, without regard to individual risk factors. Selective prevention interventions were defined as programs designed to target subgroups of the general population that are at risk for a specific target problem. Indicated prevention interventions were defined as interventions designed to target individuals who are experiencing early signs of a target problem. Treatment interventions were defined as interventions designed to target those experiencing a clinically significant psychological problem. Interventions were also coded for type of provider (e.g., self-administered, teacher-delivered), format (i.e., digital, in-person, hybrid), intervention length (in minutes and sessions), and intervention completion rate.

Data Analysis

We used R statistical software to summarize study characteristics as means and percentages (R Core Team, 2022). If a control group was included in the trial, between-group effect sizes were computed using the appropriate Cohen’s d formula specified by the Campbell Collaboration’s online resource for computing effect sizes within systematic reviews (Wilson, 2001). Each effect size was calculated independently by two members of the study team to ensure accuracy. We estimated meta-analytic correlated-effects models using robust variance estimation with small-sample correction (Hedges et al., 2010) using the R package robumeta (v2.0; Fisher et al., 2017). Meta-analytic findings are presented such that greater effect sizes correspond to better outcomes for symptoms or well-being (i.e., a positive effect size indicates a beneficial outcome favoring the treatment group). Effect sizes were grouped by the time point at which they were collected post-intervention: less than or equal to one month, greater than one month and less than or equal to six months, greater than six months and less than or equal to one year, greater than one year and less than or equal to two years, and greater than two years. A forest plot was generated for each time point.
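As an illustration of the effect size computation and meta-analytic model described above, the following R sketch is offered under stated assumptions. It is not the authors' analysis script; column names such as m_tx, sd_tx, n_tx, study_id, g, and v are hypothetical placeholders, and the small-sample correction follows standard textbook formulas rather than the exact Campbell Collaboration calculator.

```r
# Minimal sketch: between-group standardized mean difference and a
# correlated-effects model with robust variance estimation (robumeta).
library(robumeta)

# Standardized mean difference with Hedges' small-sample correction
smd <- function(m_tx, m_ctl, sd_tx, sd_ctl, n_tx, n_ctl) {
  sd_pooled <- sqrt(((n_tx - 1) * sd_tx^2 + (n_ctl - 1) * sd_ctl^2) /
                      (n_tx + n_ctl - 2))
  d <- (m_tx - m_ctl) / sd_pooled          # Cohen's d
  j <- 1 - 3 / (4 * (n_tx + n_ctl) - 9)    # small-sample correction factor
  g <- j * d                               # Hedges' g
  v_d <- (n_tx + n_ctl) / (n_tx * n_ctl) + d^2 / (2 * (n_tx + n_ctl))
  data.frame(g = g, v = j^2 * v_d)         # g and its sampling variance
}

# `dat` is assumed to hold one row per effect size with columns g, v, study_id.
# Correlated-effects weights and the small-sample correction mirror
# Hedges et al. (2010) as implemented in robumeta.
fit <- robu(g ~ 1, data = dat, studynum = study_id,
            var.eff.size = v, modelweights = "CORR", small = TRUE)
print(fit)  # overall g, 95% CI, and I-squared heterogeneity estimate
```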

To determine if effects significantly differed by study or group characteristics, we additionally conducted moderator tests. Following previous protocols, for categorical moderators, we calculated the meta-analytic mean using only studies belonging to a specific subgroup. For continuous variables, we calculated coefficients of the variable when added to the model. For all variables, we conducted t tests to determine the significance of the moderator when added to the overall model (Ahuvia et al., 2022). The following variables were pre-registered as potential moderators: Year published, publication status, pre-registration, mean age, percentage of females, percentage of white participants, percentage of sexual minority participants, percentage of gender minority participants, percentage of students qualifying for reduced lunch, percentage of participants in special education, facilitator training, facilitator supervision, intervention length in minutes, intervention number of sessions, intervention completion rate, study completion rate, and school type. Two variables were added to moderation analyses after pre-registration: intervention type and intervention delivery format.
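The moderator tests described above can be sketched in the same framework; the snippet below is illustrative only (variable names such as intervention_type and intervention_minutes are assumed placeholders, not the study's actual coding labels).

```r
# Categorical moderator: meta-analytic mean estimated within one subgroup
# (e.g., indicated prevention programs only), per the subgroup approach above.
fit_indicated <- robu(g ~ 1,
                      data = subset(dat, intervention_type == "indicated"),
                      studynum = study_id, var.eff.size = v,
                      modelweights = "CORR", small = TRUE)

# Continuous moderator: coefficient and small-sample t test when the variable
# is added to the overall model.
fit_minutes <- robu(g ~ intervention_minutes, data = dat,
                    studynum = study_id, var.eff.size = v,
                    modelweights = "CORR", small = TRUE)
print(fit_minutes)  # slope for intervention_minutes with adjusted df and p value
```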

Risk of Bias and Study Quality

The Cochrane Consumers & Communication Review Group Study Quality Guide (Ryan et al., 2013) was used to assess the risk of bias and quality of included studies. In addition, to investigate the quality of included studies, we coded each study for blind assignment to study group, presence of treatment manual, presence of pre-intervention training for facilitators, participant attrition, and presence of pre-registration.

We additionally created a funnel plot of effect estimates against their standard errors and conducted an Egger’s test by regressing each normalized effect estimate (the estimate divided by its standard error) against its precision (the reciprocal of the standard error) and testing the significance of the intercept (Egger et al., 1997).
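A minimal sketch of the funnel plot and Egger's regression as described above is shown below (not the authors' code; dat$g and dat$v are assumed columns of effect estimates and their sampling variances).

```r
se <- sqrt(dat$v)

# Funnel plot: effect estimates against standard errors (smaller SE at the top)
plot(dat$g, se, ylim = rev(range(se)),
     xlab = "Effect size (Hedges' g)", ylab = "Standard error")

# Egger's test: regress the normalized effect (g / se) on precision (1 / se)
# and test whether the intercept differs from zero.
egger <- lm(I(dat$g / se) ~ I(1 / se))
summary(egger)$coefficients["(Intercept)", ]
```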

Results

Article Selection and Characteristics

A total of 6702 records were identified through database searching and five through other sources (e.g., manual searches; personal communications). After duplicates were removed, 3892 records remained. During the first round of screening, 3321 records were excluded. A total of 571 full-text articles were assessed for eligibility, of which K = 81 were ultimately selected for inclusion (see Fig. 1). Articles were published between the years of 2002 and 2023. The highest number of articles was published in 2021 (K = 13). The majority were peer-reviewed articles (K = 75) and six were dissertations. Full references for included articles are available in Appendix C.

Fig. 1

CONSORT Flowchart. Alt text: One box at the top of the chart says, “Records identified through database searching (K = 6702).” Another box at the top says, “Additional records identified through other sources (e.g., manual searches; personal communications) (K = 5).” These two boxes both point to a box below that says, “Records after duplicates removed (K = 3892).” An arrow from that box leads to a box below that says, “Records screened (K = 3892).” To the right, an arrow points to a box that says, “Records excluded (K = 3321).” Below, an arrow points to a box that says, “Full-text articles assessed for eligibility (K = 571).” To the right, an arrow points to a box that says, “Full-text articles excluded, with reasons (K = 490). 419 did not include a brief (≤ 4 sessions or 240 min of intervention time) psychosocial intervention. 28 were not conducted within a Pre-K through 12th-grade school setting. 23 did not include data for at least one treatment outcome evaluating mental health or well-being outcomes. 18 full texts could not be retrieved. 2 full texts were not available in English.” Below, an arrow points to a box that says, “Articles included in systematic review (K = 81); k = 75 unique studies.” Below, an arrow points to a box that says, “Studies included in meta-analysis (k = 55).” To the right, an arrow points to a box that says, “Studies excluded from meta-analysis (k = 20). 13 did not include control groups. 7 did not include sufficient data.”

Study Characteristics

The total number of unique studies was k = 75. One article included two separate studies (Vanderkruik, 2019), three articles investigated data from the same study of Preventure (Conrod et al., 2013; Mahu et al., 2015; O'Leary-Barrett et al., 2013; NCT00776685), three articles investigated data from a second study of Preventure (Conrod et al., 2010; Edalati et al., 2019; O'Leary-Barrett et al., 2010; NCT00344474), three articles investigated data from a third study of Preventure (Goossens et al., 2015; Lammers et al., 2015, 2017; NTR1920), and two articles investigated data from a fourth study of Preventure (Grummitt et al., 2022; Newton et al., 2022; ACTRN12612000026820). Tables 2 and 3 in Appendix D include information on each study’s main characteristics.

A total of 40,498 students were included across studies. The majority of studies were not pre-registered (k = 42; 56%). Of the 75 studies, 31 were cluster randomized, 33 were individually randomized, and 11 were not randomized. Of the randomized studies, the majority utilized blind assignment (k = 38). Of the cluster randomized controlled trials, the majority were clustered by school (k = 18), while the remaining were clustered by class (k = 12) or a combination of school and class (k = 1). In most studies, participation was voluntary and/or students were allowed to opt out. In 12 studies, every student in the setting was expected to participate. Participants were compensated in 13 studies.

Ages of participants ranged from four to 19 years (pre-kindergarten to 12th grade). The mean age across all studies was 13.63 (SD = 3.03). Studies reported either “gender” or “sex,” and no studies reported sex assigned at birth separately from gender identity, although five studies reported a third gender option other than girl/boy. The proportion of students choosing the third option ranged from 0.7 to 4.1%. The average proportion of females/girls across all studies was 53.7% (SD = 16.65). No studies reported students’ sexual orientation. A total of 41 out of 75 studies reported data on the race/ethnicity of participants. The average proportion of white students across studies that reported data on race was 31.27% (SD = 34.42).

The majority of studies were conducted in the USA (k = 30). The remaining studies were conducted in Africa, Asia, Australia/Oceania, Europe, North America (excluding the USA), and South America. The majority of studies (50.67%) did not specify the type of school in which the study was conducted (e.g., public, private, charter). Of the studies that specified school type, the majority were conducted in public schools (k = 22, 29.33%). The majority of studies (70.67%) did not specify whether the school was located in an urban, rural, or suburban region. Of the studies that specified geographical region, the majority were urban (k = 12, 16%). No studies reported the number/proportion of students in special education, and no studies evaluated interventions focused on special education populations. Only ten studies reported the number/proportion of students who qualified for reduced or free lunch, which ranged from 0 to 93.88%.

Sixty-two studies included control groups to compare to interventions. Of these 62 control conditions, 33 were waitlist/no treatment conditions, 20 were active control conditions (e.g., students completed a neutral writing activity; students learned study skills), and nine simulated “standard care” or treatment as usual (e.g., students attended the standard school drug education curriculum provided in the school).

A total of 324 effect sizes were calculated for 55 studies. Effect sizes could not be calculated for 20 studies (13 because they did not include control groups, seven due to insufficient available data). For each study with insufficient data, authors were contacted to request the additional information necessary for calculating effect sizes; however, not all of the authors who were contacted provided additional information.

Interventions

A total of 75 unique interventions were examined. Table 4 in Appendix E characterizes each intervention. The majority of interventions were classified as universal prevention efforts (n = 57; 76%). There were 16 interventions classified as indicated prevention efforts (21.33%) and two classified as selective prevention efforts (2.67%). None were classified as strictly treatments. The majority of interventions took place in-person (n = 55; 73.33%), while three interventions (4%) took place entirely digitally (i.e., web-based interventions completed outside of school during students’ own time or during virtual class), and 17 interventions (22.67%) took place through a combination of in-person and digital activities (i.e., web-based interventions completed at school during class time, or digital components completed partially outside of school and partially in school). Digital interventions were web-based with the exception of one telephone-based intervention (Quach et al., 2011) and one VR-based intervention (Shaw & Lubetzky, 2021).

The length of interventions ranged from 10 to 240 min, with an average of 127.8 min (SD = 75.9). The number of sessions ranged from one to 22, with an average of 3.59 (SD = 3.19). Twenty-three interventions (30.67%) were single-session interventions. Eighteen interventions were self-administered (24.66%), 14 were delivered by research staff (19.18%), 12 were delivered by therapists/clinicians or mental health professionals (16.44%), 13 were delivered by teachers or other school staff (17.81%), 13 were delivered by lay providers (17.81%), and three were delivered by a combination of multiple types of providers (2.74%). Intervention provider type was unclear for two interventions.

The interventions most commonly targeted anxiety problems (n = 31, 41.33%) and mood problems/depression (n = 29, 38.67%). Other intervention targets included well-being (n = 27, 36%), self-injurious thoughts or behaviors (n = 4, 5.33%), eating problems (n = 5, 6.67%), substance use (n = 5, 6.67%), conduct/behavioral problems (n = 4, 5.33%), hyperactivity/inattention (n = 2, 2.67%), and trauma symptoms (n = 3, 4%). Twenty-five interventions (33.33%) targeted general distress or combined problems (e.g., the Total Difficulties Score on the Strengths and Difficulties Questionnaire; the Children’s Anxiety and Depression Scale). Thirty-nine interventions (52%) were included in multiple categories (e.g., Preventure targeted substance use, depression, anxiety, and more).

Overall Effects

Less Than or Equal to One-Month Post-Intervention

A correlated-effects model with robust variance estimation tested the overall effect of interventions compared with control conditions across 136 effect sizes (k = 40) collected less than or equal to one-month post-intervention. Interventions were associated with significant improvements in mental health or well-being outcomes relative to controls, with an estimated small meta-analytic effect size of g = .18 (95% CI .06, .29, p = .004). The estimated effect heterogeneity statistics suggested significant between-study variance, as I² = 92.86% of total variation in these estimates was due to heterogeneity between studies. A forest plot is available in Fig. 2 in Appendix F.

Greater Than One-Month and Less Than or Equal to Six-Month Post-Intervention

A correlated-effects model with robust variance estimation tested the overall effect of interventions compared with control conditions across 112 effect sizes (k = 29) collected greater than one-month and less than or equal to six-month post-intervention. Interventions were associated with significant improvements in mental health or well-being outcomes relative to controls, with an estimated small meta-analytic effect size of g = .15 (95% CI .05, .26, p = .006). The estimated effect heterogeneity statistics suggested significant between-study variance, as I² = 96.15% of total variation in these estimates was due to heterogeneity between studies. A forest plot is available in Fig. 3 in Appendix F.

Greater Than Six Months and Less Than or Equal to One-Year Post-Intervention

A correlated-effects model with robust variance estimation tested the overall effect of interventions compared with control conditions across 33 effect sizes (k = 11) collected greater than six-month and less than or equal to one-year post-intervention. Interventions were associated with significant improvements in mental health or well-being outcomes relative to controls, with an estimated small meta-analytic effect size of g = .10 (95% CI .01, .19, p = .03). The estimated effect heterogeneity statistics suggested between-study variance, as I² = 72.79% of total variation in these estimates was due to heterogeneity between studies. A forest plot is available in Fig. 4 in Appendix F.

Greater Than One-Year and Less Than or Equal to Two-Year Post-Intervention

Only six studies collected outcomes greater than one-year and less than or equal to two-year post-intervention, suggesting meta-analytic results should be interpreted with caution. A correlated-effects model with robust variance estimation tested the overall effect of interventions compared with control conditions across 27 effect sizes (k = 6). Interventions were not associated with significant improvements in mental health or well-being outcomes relative to controls, with an estimated small meta-analytic effect size of g = .06 (95% CI −.03, .14, p = .14). The estimated effect heterogeneity statistics suggested some between-study variance, as I² = 52.18% of total variation in these estimates was due to heterogeneity between studies. A forest plot is available in Fig. 5 in Appendix F.

Greater Than Two-Year Post-Intervention

Only two studies collected outcomes greater than two-year post-intervention, suggesting meta-analytic results should be interpreted with caution. A correlated-effects model with robust variance estimation tested the overall effect of interventions compared with control conditions across 16 effect sizes (k = 2). Interventions were not associated with significant improvements in mental health or well-being outcomes relative to controls, with an estimated small meta-analytic effect size of g = .02 (95% CI −.19, .23, p = .47). The estimated effect heterogeneity statistics suggested some between-study variance, as I² = 31.84% of total variation in these estimates was due to heterogeneity between studies. A forest plot is available in Fig. 6 in Appendix F.

Moderation Analyses

Detailed moderation results are presented in Tables 5, 6, 7, 8 in Appendix G. As per our pre-registration, we conducted moderation analyses only when each subgroup included greater than or equal to three studies. As a result, some variables or variable levels could not be reliably analyzed as moderators, either due to a lack of available data or lack of variability. Among these were the percentage of sexual minority participants (no data), percentage of participants in special education (no data), and school type (insufficient data). For one-month post-intervention outcomes, percent gender minority was excluded due to lack of variability. For six-month post-intervention outcomes, publication status was excluded due to lack of variability. For one-year post-intervention outcomes, publication status, pre-registration, supervision, percent gender minority, training, and percent reduced lunch were excluded due to lack of variability. For two-year post-intervention outcomes, publication status, pre-registration, percent gender minority, training, supervision, and percent reduced lunch were excluded due to lack of variability. For all time points, intervention type was examined as a moderator, but selective prevention interventions were excluded as a variable level because not enough studies examined selective prevention interventions. Moderation analyses could not be conducted on outcomes greater than two-year post-intervention due to lack of variability.

Results suggested that among outcomes collected less than or equal to one-month post-intervention, indicated prevention programs had significantly higher effect sizes than universal programs, t(8.11) = −2.64, p = .03. Among outcomes collected greater than one-month and less than or equal to six-month post-intervention, more recent publications were associated with lower effect sizes, t(8.8) = −2.56, p = .03. Additionally, a higher percentage of white participants was associated with lower effect sizes, t(15.09) = −2.62, p = .02. No variables tested as moderators were statistically significant among outcomes collected greater than six months and less than or equal to one-year post-intervention. Among outcomes collected greater than one-year and less than or equal to two-year post-intervention, longer intervention length in minutes was associated with lower effect sizes, t(2.54) = −4.9, p = .02. No other variables tested as moderators were statistically significant at any time point.

Intervention Effectiveness on Anxiety Problems/Phobias

Thirty-one interventions targeted anxiety problems/phobias. Twenty-eight of these interventions had sufficient data to be included in the meta-analysis; their effect sizes ranged from d = −0.76 (Morrell, 2018) to d = 0.72 (Ginsburg et al., 2021; see Table 9 in Appendix H).

Interventions Evaluated in More Than One Trial

Several interventions were studied more than once, including the Shamiri Intervention, Preventure, CALM, Healthy Kids, and The Body Project.

Shamiri Intervention. Three studies investigated the Shamiri Intervention, which teaches youth about growth mindset, gratitude, and values. In one study, the intervention was structured as a digital, self-guided, single-session intervention (Osborn et al., 2020a). In the other two studies, it was structured as an in-person, group-based, four-session intervention delivered by trained lay providers (Osborn et al., 2020b, 2021). A fourth study investigated individual components of the digital, self-guided Shamiri program (growth mindset, gratitude, and values) as separate interventions. There was not a statistically significant between-group difference in anxiety at the two-week follow-up for the single-session version, with a small effect size of d = 0.24 (Osborn et al., 2020a). In the two studies investigating the four-session version, participants in the intervention group showed a statistically significant reduction in anxiety compared to a study skills activity at follow-up, with small to medium effect sizes ranging from d = 0.23 to d = 0.66 (Osborn et al., 2020b, 2021). In the component study, there were statistically significant improvements in anxiety among participants in the growth mindset and values interventions compared to participants who completed a study skills activity, with small effect sizes of d = 0.04 and d = 0.21, respectively (Venturo-Conerly et al., 2022).

Preventure. Two studies investigated the in-person, counselor-delivered Preventure intervention, which includes components of CBT and motivational enhancement therapy. In the study by Goossens and colleagues (2015), participants assigned to complete Preventure did not demonstrate statistically significant reductions in anxiety compared to a no-treatment control, with small effect sizes ranging from d = 0.02 to d = 0.05 at the two- to twelve-month follow-ups. However, in the study by O'Leary-Barrett and colleagues (2013), participants assigned to complete Preventure had statistically significant reductions in anxiety (as measured by the anxiety subscale of the Brief Symptom Inventory) compared to a no-treatment control group, with a small effect size of d = 0.15 at the two-year follow-up. Results from a Panic Attack Questionnaire indicated that although scores improved among the intervention group at the two-year follow-up, they did not improve relative to the control group, leading to a negative effect size of d = −0.08.

CALM. Two studies investigated the Child Anxiety Learning Modules (CALM), a CBT-based, nurse-administered, in-person intervention. One study showed statistically significant pre- to post-intervention within-subject reductions in anxiety; however, the study did not include a control group to examine between-subject differences (Muggeo et al., 2017). Another study found that students assigned to CALM had significantly greater reductions in anxiety at post-intervention and the three-month follow-up compared to a relaxation-only control condition, with small to large effect sizes ranging from d = 0.03 to d = 0.72 (Ginsburg et al., 2021).

Healthy Kids. Two studies investigated Healthy Kids, an intervention including one-on-one sessions with a health coach to build resilience among students. In one study, the intervention was delivered over six 30-min sessions; due to the pandemic, some sessions were in-person, while others were virtual. This study found that anxiety scores improved in the intervention group compared to the control group at post-intervention, with a small effect size of d = 0.49, but improvements were not statistically significant (Moran et al., 2023). In another study, the intervention was delivered completely in-person over six 15-min sessions. This study reported that anxiety scores significantly improved at post-intervention among students with elevated negative affectivity at baseline; however, the study did not include a control group to examine between-subject differences (Sabin et al., 2023).

The Body Project. One article included two separate investigations of The Body Project, a peer-delivered, in-person intervention focused on reducing thin-ideal internalization (Vanderkruik, 2019). Although there was a statistically significant pre- to post-intervention within-group reduction in anxiety in the first investigation, there was not a statistically significant effect of group assignment on change in anxiety in the second investigation, despite a medium effect size of d = 0.61.

Interventions Evaluated in One Trial

Several interventions were evaluated in only one trial each. Among these, results regarding efficacy were mixed.

Efficacious Interventions. One study investigated a School-Based Anxiety Prevention Program, an in-person intervention delivered by research staff to provide psychoeducation about anxiety to students. It reported statistically significant reductions in worry among participants in the intervention compared to participants in a no-treatment control immediately post-intervention and at the 3-month follow-up, with small effect sizes of d = 0.20 and d = 0.17, respectively (Ab Ghaffar et al., 2019). In a study of MoodGym, a web-based program including modules on cognitive behavioral therapy, students assigned to complete the intervention showed statistically significant reductions in anxiety compared to students in a waitlist condition immediately post-intervention and at a six-month follow-up, with small effect sizes of d = 0.15 and d = 0.25, respectively (Calear et al., 2009). In a self-administered, writing-based intervention in which students spent 150 min writing about thoughts and feelings related to middle school over the course of three weeks, participants showed statistically significant reductions in anxiety compared to a placebo writing activity at post-intervention, with a small effect size of d = 0.45 (Haraway, 2003).

Non-Efficacious Interventions. An in-person Pain Neuroscience Education intervention delivered by clinicians to provide psychoeducation about the neurophysiology of pain did not demonstrate statistically significant reductions in state or trait anxiety at post-intervention compared to a no-treatment group, although effect sizes were small to large (d = 0.18 to d = 0.61; Andias et al., 2018). One study examined the Take a Stand Against Bullying intervention, an in-person anti-bullying intervention delivered by research staff. It found that although there were within-group reductions in school violence anxiety in the intervention group, there was no evidence of statistically significant between-group differences in school violence anxiety at post-treatment compared to a no-treatment control. On two subscales, scores among the control group were better than those of the intervention group, leading to small negative effect sizes of approximately d = −0.03 (Bennett, 2008).

Across both self-paced and guided-paced delivery formats, a video-based slow diaphragmatic breathing curriculum showed no significant improvements in trait anxiety compared to a treatment-as-usual control at the one-week follow-up, with a small effect size of d = −0.06 (Bentley et al., 2022). One study investigated the Brief Intervention for School Clinicians (BRISC), an in-person intervention delivered by school mental health providers that focuses on problem-solving. Scores for anxiety decreased similarly across the intervention and treatment-as-usual control groups over six months, with no statistically significant difference between the groups and a small effect size, d = 0.14 (Bruns et al., 2023). A study examining a video-based yoga intervention found that although there were within-group reductions in anxiety in the intervention group, there was no evidence of statistically significant between-group differences in anxiety at post-intervention compared to a no-treatment control; scores among the control group were better than those of the intervention group at post-intervention, leading to medium negative effect sizes between d = −0.60 and d = −0.43 (Busch et al., 2023). The anti-bullying STAC (stealing the show, turning it over, accompanying others, and coaching compassion) intervention, delivered in-person by graduate students, resulted in reductions in anxiety scores among the intervention group; however, scores among the intervention group were higher than those of a waitlist control group at the 30-day follow-up, leading to a medium negative effect size of d = −0.45 (Midgett et al., 2017).

An in-person Brief Guided Mindfulness Meditation intervention demonstrated within-group reductions in anxiety immediately post-intervention; however, there was not a statistically significant between-group effect compared to participants in a placebo condition. Participants in the placebo condition in fact had lower anxiety scores, leading to a small negative effect size of d = −0.24 (Morrell, 2018). An in-person, self-administered Mandala Drawing intervention did not show significant between-group effects on anxiety at post-intervention compared to a placebo group of students reading short stories. Although scores decreased from pre- to post-intervention, participants in the intervention group had higher anxiety scores than those in the control group at post-intervention, leading to a large negative effect size of d = −0.76 (Morrell, 2018).

One study examined two similar in-person interventions delivered by mental health professionals focused on providing psychoeducation about sexual violence; one was a single 90-min session, while the other comprised two sessions lasting 180 min in total. Both interventions resulted in within-group reductions in anxiety at a six-month follow-up, but there were not statistically significant between-group effects for either intervention when compared to a waitlist condition, with small effect sizes ranging from d = 0.01 to d = 0.19 (Muck et al., 2021).

In a study of SPARX-R, a video-game-like intervention delivered within classrooms during school hours, participants showed a statistically significant within-group reduction in anxiety. However, there was not a statistically significant between-group difference in anxiety when compared to an attention-matched placebo condition, with small effect sizes ranging from d = 0.02 to d = 0.10 (Perry et al., 2017). One study examined the impact of in-person, therapist-led Qigong exercises. Although anxiety scores decreased at post-intervention among the Qigong group, they were worse than those of a control group of students who watched a relaxing documentary, leading to small negative effect sizes of d = −0.06 and d = −0.14 for state and trait anxiety, respectively (Rodrigues et al., 2021).

Growing Minds, a self-guided digital intervention focused on teaching growth mindset to students, did not show statistically significant reductions in scores on the avoidance subscale of the Social Phobia Inventory compared to an attention-matched control at the four-month follow-up, with a small effect size of d = 0.15 (Schleider et al., 2020a, 2020b, 2020c, 2020d). One study examined a virtual reality-based intervention that encouraged students to become physically active. Anxiety scores decreased from baseline to post-intervention but were not significantly different compared to an active control condition in which students participated in an in vivo exercise activity; the effect size was small, d = 0.01 (Shaw & Lubetzky, 2021).

Potentially Iatrogenic Effects

One study examined the Climate Schools intervention, a mental health course combining online and teacher-led components to teach students about cognitive behavioral principles. The study found that scores on the GAD-7 deteriorated among students at post-intervention and the 1.5-year follow-up; however, the authors reported that increases in anxiety similarly occurred in the control group, and there was no statistically significant main effect of condition when comparing Climate Schools to usual health classes, with small effect sizes ranging from d = −0.07 to d = 0.01 (Andrews et al., 2023). The Writing for Recovery intervention, delivered in-person by mental health professionals, aimed to help adolescents in war-torn areas of Gaza process trauma through expressive writing. A study showed anxiety scores increasing among the intervention group compared to a waitlist control group immediately post-intervention, with a small effect size of d = −0.20; however, the authors noted that the effect was not statistically significant (Lange-Nielsen et al., 2012).

Interventions Not Included in Meta-Analysis

Students who participated in online group-based Emotion drawing or Mandala drawing interventions showed no statistically significant within-subject reductions in anxiety from pre- to post-intervention; the study did not include a control group to examine between-subject differences (Malboeuf-Hurtubise et al., 2021). Creating Opportunities for Personal Empowerment (COPE), an in-person, research staff-delivered intervention focused on improving self-management among students with asthma, showed statistically significant within-subject reductions in anxiety at a six-week follow-up; however, the study did not include a control group to examine between-subject differences (McGovern et al., 2019).

Intervention Effectiveness on Mood Problems/Depression

Twenty-nine interventions targeted mood problems/depression. Twenty-three of these interventions had sufficient data to be included in the meta-analysis; their effect sizes ranged from d = −1.25 (Lange-Nielsen et al., 2012) to d = 0.75 (Moran et al., 2023; see Table 10 in Appendix H).

Interventions Evaluated in More Than One Trial

Several interventions were studied more than once, including the Shamiri Intervention, MoodGym, Preventure, Healthy Kids, and The Body Project.

Shamiri Intervention. Three studies investigated the Shamiri Intervention (Osborn et al., 2020a, 2020b, 2021). In all three, participants in the Shamiri Intervention showed statistically significant reductions in depression symptoms compared to participants in a study skills placebo activity, with small to medium effect sizes ranging from d = 0.18 to d = 0.53. In a study of each individual component delivered separately, there were within-group improvements in depression among all participants, but no statistically significant between-group effects when interventions were compared to a study skills activity. Effect sizes ranged from d = 0.09 to d = 0.27 (Venturo-Conerly et al., 2022).

MoodGym. Students in one study of MoodGym (implemented within classrooms under teacher supervision) did not show statistically significant reductions in depression compared to students in a waitlist condition immediately post-intervention, with small effect sizes ranging from d = 0.13 to d = 0.15 (Calear et al., 2009). Similarly, in a study that implemented a longer version of MoodGym entirely online, there were no statistically significant differences in depression outcomes between the intervention and a waitlist condition. Although outcomes improved somewhat over time among the intervention group, they did not improve relative to the control group, leading to a negative effect size of d = −0.10 (Lillevoll et al., 2014).

Preventure. Two studies investigated the Preventure intervention. One study did not find statistically significant reductions in depression compared to a no-treatment control group, with a small effect size of d = 0.02 at the two-month follow-up. At the six- and 12-month follow-ups, outcomes among the intervention group improved, but did not improve relative to the control group, leading to small negative effect sizes of d = −0.05 and d = −0.13, respectively (Goossens et al., 2015). In another study, participants in Preventure did show statistically significant reductions in depression compared to a no-treatment control group, with a small effect size of d = 0.11 at the two-year follow-up (O'Leary-Barrett et al., 2013).

Healthy Kids. In the longer version of Healthy Kids, Moran and colleagues (2023) found that although depression scores improved in the intervention group, improvements were not statistically significant when compared to the control group, despite a large effect size of d = 0.75. In the shorter version of Healthy Kids, the authors reported that depression scores significantly improved at post-intervention among students with elevated negative affectivity at baseline; however, the study did not include a control group to examine between-subject differences (Sabin et al., 2023).

The Body Project. In one investigation of The Body Project, there were statistically significant pre- to post-intervention reductions in depression and negative affect. In the second investigation, which included a waitlist control group, there was not a statistically significant effect of group assignment on depression at post-treatment, with a small effect size of d = 0.28, but there was a statistically significant effect of group assignment on negative affect, with a small effect size of d = 0.33 (Vanderkruik, 2019).

Interventions Evaluated in One Trial

Several interventions were evaluated in only one trial each. Among these, results regarding efficacy were mixed.

Efficacious Interventions. Participants who completed SPARX-R showed statistically significant reductions in depression compared to an attention-matched placebo condition with small effect sizes ranging from d = 0.16 to d = 0.25 (Perry et al., 2017). Growing Minds showed statistically significant reductions in depression compared to an attention-matched control at the four-month follow-up with a small effect size of d = 0.12 (Schleider et al., 2020a, 2020b, 2020c, 2020d).

Non-Efficacious Interventions. In their study of BRISC, Bruns and colleagues (2023) found that, similar to the results for anxiety, depression scores decreased for both the intervention and treatment-as-usual control groups over six months, with no statistically significant difference between the groups and a small effect size, d = 0.08. A study of Dove Confident Me, a partly digital and partly teacher-delivered intervention focused on promoting positive body image, found no significant differences in negative affect between the intervention and a treatment-as-usual control from post-intervention through the three-year follow-up, with small effect sizes ranging from d = −0.03 to d = 0.08 (Diedrichs et al., 2021). One study examined an online, self-guided growth mindset intervention and found that although girls in the intervention group experienced significant decreases in depression scores at post-intervention, there was not a statistically significant between-group difference in depression scores across both genders when compared to a condition in which students learned about athletic ability, with a small effect size of d = 0.15 (Heaman et al., 2023). A study of an in-person Incremental Theory of Personality intervention delivered by research staff reported that although the proportion of participants with clinically significant levels of depression increased in the intervention group, it increased significantly less than in the placebo condition at a 9-month follow-up, with a small effect size of d = 0.32 (Miu & Yeager, 2015).

In an in-person intervention, students were instructed to spend 10–15 min daily for five days writing a letter to someone to express their gratitude. The intervention did not result in statistically significant between-subject reductions in negative affect compared to a placebo journaling condition at post-intervention or at the one- and two-month follow-ups. Negative affect decreased among the intervention group, but scores were worse than those of the placebo group, leading to negative effect sizes ranging from d = −0.21 to d = −0.06 (Froh et al., 2009). Similarly, a Written Emotional Expression intervention delivered in-person by research staff did not result in statistically significant between-subject reductions in depression or negative affect compared to a placebo condition at a one-month follow-up, with small effect sizes of d = 0.20 and d = 0.08, respectively (Curry & Harrell, 2011).

Potentially Iatrogenic Effects

In their study of Climate Schools, Andrews and colleagues (2023) found that depression decreased among the intervention group up to 1-year post-intervention, but increased at 18 months. However, the authors reported that there was no statistically significant main effect of condition when comparing Climate Schools to usual health classes; effect sizes were small, ranging from d = −0.05 to d = 0.02. One study examined the SoMe Social Media Literacy Program, an in-person intervention delivered by researchers to improve positive body image and well-being. Participants in the intervention showed increased depression scores up to one-year post-intervention, with small effect sizes ranging from d = −0.04 to d = 0.11; however, scores similarly increased among the treatment-as-usual control, and the authors reported that between-group differences were not statistically significant (Gordon et al., 2021). The Writing for Recovery intervention showed depression scores increasing among the intervention group compared to a waitlist control group immediately post-intervention, with a large effect size of d = −1.25. The authors suggested that the increase in depression symptoms may have been due to a “temporary negative effect of the processing of traumatic memories” (Lange-Nielsen et al., 2012). The anti-bullying STAC intervention showed depression scores increasing among the intervention group compared to a waitlist control group at a 30-day follow-up, with a large effect size of d = −0.72; however, the authors reported that the time by group interaction was not statistically significant due to the control group having lower depression scores at baseline compared to the intervention group (Midgett et al., 2017). A study investigating a Sleep Education Program delivered in-person by teachers found that depression scores increased among the intervention group at post-intervention. However, depression decreased at the one-month follow-up, and there were no significant between-group differences, with small effect sizes ranging from d = −0.08 to d = −0.03 (van Rijn et al., 2020).

Interventions Not Included in Meta-Analysis

One study investigated the effects of “Energy Pod” and “Sleep Wing” devices placed in school-based health centers to improve sleep. It found that students who received time with either device showed within-subject improvements in mood, but improvements did not differ between devices; the study did not include a control group (Lynch et al., 2019). Students who participated in online group-based Emotion drawing or Mandala drawing interventions showed no significant within-subject reductions in depression from pre- to post-intervention; the study did not include a control group to examine between-subject differences (Malboeuf-Hurtubise et al., 2021). The COPE intervention showed no significant within-subject reductions in depression at a six-week follow-up; the study did not include a control group to examine between-subject differences (McGovern et al., 2019). Listen Protect Connect, an in-person intervention designed especially for trauma symptoms and delivered by school nurses, demonstrated statistically significant within-group reductions in depression at a 2-month follow-up; the study did not include a control group to examine between-subject differences (Ramirez et al., 2013).

Interventions Targeting Well-being

Twenty-seven interventions targeted well-being. Nineteen of these interventions had sufficient data to be included in the meta-analysis; their effect sizes ranged from d = −0.39 (O'Connor et al., 2022) to d = 3.16 (Vanderkruik, 2019; see Table 11 in Appendix H).

Interventions Evaluated in More Than One Trial

The Shamiri Intervention, Healthy Kids, and The Body Project were studied more than once.

Shamiri Intervention. Two studies that examined the Shamiri Intervention collected well-being outcomes. For the single-session version, there were no statistically significant between-group differences in mental well-being at the two-week follow-up, with a small effect size of d = 0.19 (Osborn et al., 2020a). For the four-session version, there were no statistically significant between-group differences in perceived social support or perceived control at post-intervention, with small effect sizes of d = −0.01 and d = 0.20, respectively (Osborn et al., 2020b). Intervention participants improved in perceived social support but less than the control group, which accounts for the negative effect size.

Healthy Kids. In the longer version of Healthy Kids, Moran and colleagues (2023) found that intervention-group scores on self-efficacy remained the same at post-intervention but were lower than those of the control group, resulting in a small negative effect size of d = −0.08. Scores on resilience improved in the intervention group compared to the control group at post-intervention, with a small effect size of d = 0.42, but the improvements were not statistically significant. In the shorter version of Healthy Kids, the authors reported that scores for self-efficacy significantly improved at post-intervention among all students; however, the study did not include a control group to examine between-subject differences (Sabin et al., 2023).

The Body Project. In one investigation of the Body Project, participants in the intervention showed statistically significant pre- to post-intervention increases in self-compassion as well as increases in self-esteem. The other investigation found the same pattern; in addition, participants in the Body Project had significantly greater improvements in self-compassion and self-esteem compared to a waitlist control group, with large effect sizes of d = 2.07 and d = 3.16, respectively (Vanderkruik, 2019).

Interventions Evaluated in One Trial

Several interventions were evaluated in only one trial each. Among these, results regarding efficacy were mixed.

Efficacious Interventions. The Enhanced Psychological Mindset Session for Adolescents intervention studied by Perkins and colleagues (2021) showed statistically significant improvements on measures of personality mindset, self-esteem, psychological flexibility, and self-compassion at the one-month and two-month follow-ups in the intervention group compared to a no-treatment control group, with effect sizes ranging from d = 0.05 to d = 1.65. One study examined the Better Learning Program (BLP2), a teacher-led CBT-based program, among students from the Gaza Strip who were exposed to a traumatic event. The authors found positive effects of the intervention on well-being, self-efficacy, hope, and self-regulation at post-intervention compared to students in a no-treatment control, with medium to large effect sizes ranging from d = 0.41 to d = 0.99 (Forsberg & Schultz, 2023). One intervention focused on sleep and was delivered by research staff over three sessions. It provided psychoeducation about sleep hygiene; some components were delivered in-person, while others were delivered over the phone. Compared to a control group, participants in the intervention had statistically significant improvements in quality of life at the six-month and twelve-month follow-ups, with small effect sizes ranging from d = 0.21 to d = 0.43 (Quach et al., 2011).

Non-Efficacious Interventions. In the study by Ab Ghaffar and colleagues (2019), a School-Based Anxiety Prevention Program did not result in statistically significant between-group increases in self-esteem, although scores were higher in the intervention group than in the no-treatment control group at the 3-month follow-up, with a small effect size of d = 0.17. One study investigated an in-person, lay provider intervention focused on training students in problem-solving. It found that participants in the intervention had improvements in well-being as measured by the Short Warwick-Edinburgh Mental Well-being Scale, but the improvements were not statistically significant when compared to a control group, with a small effect size of d = 0.15 (Michelson et al., 2020). In a study of a Brief Alcohol Intervention, which included education surrounding the risks of alcohol consumption, there was no evidence of between-group differences in well-being compared to a treatment-as-usual control at a 1-year follow-up, although scores were slightly higher in the intervention group, with a small effect size of d = 0.03 (Coulton et al., 2022). In their study of Dove Confident Me, Diedrichs and colleagues (2021) found that self-esteem was significantly higher among participants in the intervention at the two- and six-month follow-ups compared to the control, but there were no significant differences at the one- and three-year follow-ups, with small effect sizes ranging from d = −0.02 to d = 0.12.

In their study of an online, self-guided growth mindset intervention, Heman and colleagues (2023) found no evidence of statistically significant between-subject improvements in subjective happiness or life satisfaction at post-intervention, with small effect sizes of d = 0.25 and d = 0.18, respectively. One study examined a therapist-delivered, in-person Brief Contextual Intervention based on Acceptance and Commitment Therapy and Functional Analytic Psychotherapy. At post-intervention, participants in the intervention had higher scores in satisfaction with life and psychological flexibility compared to a no-treatment control, but the authors reported that the main effect of group was not statistically significant despite a medium effect size of d = 0.55 for life satisfaction and a small effect size of d = 0.09 for psychological flexibility (Macias et al., 2022).

Three studies investigated interventions that involved writing activities. A Written Emotional Expression intervention delivered in-person by research staff did not result in statistically significant between-subject improvements in positive affect compared to a placebo condition at a one-month follow-up, with a small effect size of d = 0.04 (Curry & Harrell, 2011). The letter-writing intervention studied by Froh and colleagues (2009) did not result in statistically significant between-subject increases in positive affect compared to a placebo journaling condition at post-intervention or at the one- and two-month follow-ups, with small effect sizes ranging from d = 0.01 to d = 0.15. Participants who spent 150 minutes writing about thoughts and feelings related to middle school over the course of three weeks did not show statistically significant between-subject improvements in sense of coherence or self-concept compared to a placebo writing activity at post-intervention, with small effect sizes of d = 0.26 and d = −0.28, respectively. Intervention participants improved in self-concept but less than the control group, which accounts for the negative effect size (Haraway, 2003).

Potentially Iatrogenic Effects

A study investigating a psychodramatic intervention delivered in-person by professional actors showed that scores on social-emotional competence were slightly lower among the intervention group compared to the control group at the two-week follow-up, with a small effect size of d = −0.01. However, the authors reported that this effect was not statistically significant (Agley et al., 2021). In their study of SoMe, Gordon and colleagues (2021) found that self-esteem decreased among the intervention group at post-intervention and the 1-year follow-up, with small effect sizes ranging from d = −0.17 to d = −0.003. However, the authors reported that between-group differences were not statistically significant when compared to a control group. One study examined an in-person, teacher-delivered, process-based CBT intervention. Scores on positive mental health and resilience deteriorated among the intervention group at post-intervention and follow-up, with small effect sizes ranging from d = −0.39 to d = −0.10. However, the authors reported that between-group differences were not statistically significant when compared to a no-treatment control (O'Connor et al., 2022).

Interventions Not Included in Meta-Analysis

One study investigated a videogame intervention, empowerED, where students made decisions to complete “mini-stories” and learned about restructuring negative thoughts. There were no significant differences in self-efficacy at post-test when compared to an active control condition where students reviewed a public website; effect sizes were not able to be calculated due to insufficient information (Fernandes et al., 2023). One study investigated an in-person Universal Mental Health Promotion Program, including yoga and mindfulness components, delivered by occupational therapy students. The authors found no evidence of significant improvements in emotional self-efficacy at post-intervention; the study did not include a control group to examine between-subject differences (Lin et al., 2022). One study examined BodyKind, an in-person, teacher-led program to improve positive body image. The authors reported that there was not sufficient power to detect within-subject differences on well-being outcomes; the study did not include a control group to examine between-subject differences (Mahon et al., 2023). In the study of Listen Protect Connect from Ramirez and colleagues (2013), participants in the intervention had significant within-group improvements in perceived social support at the two-month follow-up; the study did not include a control group to examine between-subject differences. One study investigated a digital, self-administered “adaptive theory of emotion” intervention. This online intervention provided psychoeducation and taught students emotion regulation strategies over two 45-min sessions. Compared to a placebo control condition where students learned about the brain, students in the intervention group had higher scores of emotional well-being at school at the two- to six-week follow-up; effect sizes were not able to be calculated due to insufficient information (Smith et al., 2017). In a study of an in-person mindfulness-based intervention, Winters (2022) reported statistically significant within-subject improvements in prosocial behaviors at post-intervention among first and fourth graders, but not third graders; effect sizes were not able to be calculated due to insufficient information.

Interventions Targeting Self Injurious Thoughts and Behavior

Four interventions targeted self-injurious thoughts and behavior. Two of these interventions had sufficient data to be included in the meta-analysis (see Table 12 in Appendix H).

Signs of Suicide was evaluated by two studies. Signs of Suicide is an in-person, teacher-delivered intervention that provides psychoeducation around suicide and encourages self-monitoring skills. Both studies found that participants in the intervention group had significantly fewer self-reported suicide attempts at a three-month follow-up compared to a waitlist control group, with large effect sizes of d = 1.06 and d = 1.07, respectively (Aseltine & DeMartino, 2004; Aseltine et al., 2007).

Preventure was evaluated by two studies. In one study, Preventure resulted in statistically significant reductions in suicidal ideation compared to a no-treatment control group, with a small effect size of d = 0.09 at the two-year follow-up (O'Leary-Barrett et al., 2013). In another study, students in Preventure had significant decreases in suicidal ideation up to three years post-intervention compared to a treatment-as-usual control, with small effect sizes ranging from d = 0.13 to d = 0.31 (Grummitt et al., 2022).

One study investigated a two-hour, in-person, research-staff-delivered, single-session intervention designed to prevent suicide by reducing hopelessness among students; however, the authors did not find evidence of statistically significant reductions in hopelessness when compared to a control group at post-intervention; effect sizes were not able to be calculated due to insufficient information (Portzky & van Heeringen, 2006).

One study examined a Peer Leadership Training intervention. This in-person, lay-provider-delivered intervention focused on improving students’ leadership skills and encouraging community service. At a one-week follow-up, there was a statistically significant within-subject reduction of suicidal ideation; the study did not include a control group to examine between-subject differences (Wulandari et al., 2019).

Interventions Targeting Eating/Body Image Problems

Five interventions targeted eating or body image problems; all five had sufficient data to be included in the meta-analysis (see Table 13 in Appendix H).

In their study of Dove Confident Me, Diedrichs and colleagues (2021) found no significant differences in dietary restraint between the intervention and a treatment-as-usual control from post-intervention through the three-year follow-up, with small effect sizes ranging from d = −0.04 to d = 0.04.

One article included two separate investigations of The Body Project. In the first investigation, there was a statistically significant pre- to post-intervention reduction in restrained eating. In the second investigation, which included a waitlist control group, there was a significant effect of group assignment on restrained eating at post-treatment, with a large effect size of d = 0.94 (Vanderkruik, 2019).

One article included two separate interventions designed to address thin-ideal internalization: a mindfulness-based intervention and a dissonance-based intervention. Both were delivered in-person by research staff. Neither showed statistically significant effects of group assignment on eating disorder outcomes at post-intervention, 1-month, or 6-month follow-ups compared to a no-treatment control group, with small effect sizes ranging from d = 0.002 to d = 0.21 (Atkinson & Wade, 2015).

In their study of the SoMe Social Media Literacy Program, Gordon and colleagues (2021) found that, among intervention-group participants, weight and shape concerns decreased at post-intervention but increased at the six-month and one-year follow-ups. Dietary restraint decreased at post-intervention and the six-month follow-up but increased at one year. Overall, there were no significant between-group differences when compared to a treatment-as-usual control, with small effect sizes ranging from d = −0.05 to d = 0.01.

Interventions Targeting Substance Use Problems

Five interventions targeted substance use problems; four had sufficient data to be included in the meta-analysis (see Table 14 in Appendix H).

Preventure, an in-person, counselor-delivered intervention, was studied multiple times; the results of these studies were mixed. Three articles investigated data from the same study of Preventure (Conrod et al., 2013; Mahu et al., 2015; NCT00776685). In their 2013 study, Conrod and colleagues found that participants in Preventure had statistically significant reductions in alcohol use compared to a treatment-as-usual control over the two-year follow-up period, with a large effect size of d = 0.69. Mahu and colleagues (2015) found that participants in Preventure had statistically significant reductions in marijuana use at the six-month follow-up compared to the treatment-as-usual control group (i.e., standard drug education), with a small effect size of d = 0.09. However, at the one-year and 1.5-year follow-ups, outcomes deteriorated and participants in the intervention had higher levels of marijuana use compared to the treatment-as-usual group, with small effect sizes of d = −0.11 and d = −0.06, respectively. Among participants who used marijuana, frequency of use was higher in the intervention group at the six-month follow-up than in the control group (d = −0.28), but there was a statistically significant reduction in frequency among the intervention group at the one-year and 1.5-year follow-ups, with small effect sizes of d = 0.33 and d = 0.24, respectively.

Three articles investigated data from another study of Preventure (Conrod et al., 2010; Edalati et al., 2019; O'Leary-Barrett et al., 2010; NCT00344474). In their 2010 study, Conrod and colleagues found that participants in the intervention had statistically significant reductions in rates of drug use compared to a treatment-as-usual control (i.e., standard drug education) over the two-year follow-up period, with small effect sizes ranging from d = 0.13 to d = 0.29. Edalati and colleagues (2019) found that participants in Preventure had statistically significant reductions in drinking frequency at a two-year follow-up compared to the control group, with a small effect size of d = 0.19. Participants in Preventure had non-statistically significant decreases in quantity of drinks (d = 0.11), frequency of binge drinking (d = 0.07), and scores on the Rutgers Alcohol Problem Index (d = 0.02). O'Leary-Barrett and colleagues (2010) found that participants in the intervention were significantly less likely to drink at the six-month follow-up compared to the control, with a small effect size of d = 0.12, but there was not a statistically significant difference for binge drinking (d = 0.07). Participants in the intervention had non-statistically significant reductions in alcohol use (Quantity by Frequency), with a small effect size of d = 0.15, and non-statistically significant reductions in scores on the Rutgers Alcohol Problem Index (d = 0.09).

Two articles investigated data from a third study of Preventure (Lammers et al., 2015, 2017; NTR1920). The 2015 study did not find evidence of between-group differences in alcohol use at the 12-month follow-up when Preventure was compared to a no-treatment control group; effect sizes ranged from d = 0.14 to d = 0.17. Results from the 2017 study by Lammers and colleagues similarly showed no statistically significant between-group differences; effect sizes were not able to be calculated due to insufficient information.

Lastly, one article included substance use outcomes from a fourth study of Preventure (Newton et al., 2022; ACTRN12612000026820). This study included data up to seven years post-intervention. Results indicated that substance use tended to increase over time in both the intervention and treatment-as-usual control groups, as expected given that the average age of students at the time of the trial was 13.4 years. At the seven-year follow-up, the intervention group had a significantly reduced likelihood of alcohol-related harms. Scores on other substance use outcomes were mixed, but the authors reported no other statistically significant differences. Effect sizes ranged from d = −0.59 to d = 0.29.

One study examined a Peer Educator Intervention, wherein peer educators were trained to lead groups that discussed drug abuse. Compared to a no-treatment control, participants in the intervention had significantly greater self-efficacy regarding drug abuse at post-intervention, with a large effect size of d = 1.01 (El Mokadem et al., 2021).

In their study of a Brief Alcohol Intervention, Coulton and colleagues (2022) found no evidence of between-group differences in alcohol use compared to a treatment-as-usual control at a 1-year follow-up, with small effect sizes ranging from d =  − 0.13 to d = 0.11. MAKINGtheLINK did not result in statistically significant reductions in alcohol or drug use compared to a waitlist control, with small effect sizes ranging from d = 0.09 to d = 0.15 at six-week to twelve-month follow-ups (Lubman et al., 2020). InCharge, an in-person intervention delivered by mental health practitioners that provides psychoeducation around drug use, did not result in between-subject reductions in alcohol use at a twelve-week follow-up when compared to a no-treatment control; effect sizes were not able to be calculated due to insufficient information (Mesman et al., 2021).

Interventions Targeting Oppositional/Conduct/Behavioral Problems

Four interventions targeted oppositional/conduct/behavioral problems; all four had sufficient data to be included in the meta-analysis (see Table 15 in Appendix H).

Preventure was investigated for effects on conduct problems by two studies. In the study by O'Leary-Barrett and colleagues (2013), participants in Preventure reported a statistically significant decrease in conduct problems at the two-year follow-up compared to participants in a no-treatment control group, with a small effect size of d = 0.16. However, in the study of Preventure by Goossens and colleagues (2015), participants in both the intervention and the no-treatment control groups experienced increases in delinquent behavior at a one-year follow-up. Delinquent behavior was slightly higher in the intervention group at the 1-year follow-up compared to the control group, with a small effect size of d = −0.04.

In Haraway’s study (2003) of a writing-based intervention, there were increases in anger in both the intervention and the placebo activity control group at post-intervention. However, participants in the intervention group had non-significantly lower scores at post-intervention compared to the placebo control group, with a small effect size of d = 0.36.

In the study of Growing Minds from Schleider and colleagues (2020b), there were increases in conduct problems in both the intervention and a placebo control group at the 4-month follow-up. The intervention group had non-significantly higher scores at the 4-month follow-up compared to the control group, with a small effect size of d = −0.14.

One study examined the in-person, teacher-led Social Thinking and Academic Readiness Training (START) program, specifically the Academic Readiness (AR) lesson. This program aimed to improve executive functioning and self-regulation among students exposed to adversity after a natural disaster. Compared to a no-treatment control, students in START had significantly greater reductions in externalizing behavior at post-intervention, with a medium effect size of d = 0.57 (Yamamoto et al., 2022).

Interventions Targeting Attention/Hyperactivity Problems

Two interventions targeted attention/hyperactivity problems; one (Preventure) had sufficient data to be included in the meta-analysis (see Table 16 in Appendix H). Preventure did not result in statistically significant reductions in hyperactivity when compared to a no-treatment control group; scores were non-significantly higher in the intervention group at the 6-month and 12-month follow-ups, with small effect sizes of d = −0.10 and d = −0.06, respectively (Goossens et al., 2015).

In a study by Malboeuf-Hurtubise and colleagues (2021), participants completed either an emotion-based drawing exercise or a mandala drawing exercise. Both interventions showed statistically significant within-subject reductions in hyperactivity from pre- to post-intervention, but there were no statistically significant within-subject reductions in inattention; the study did not include a control group to examine between-subject differences.

Interventions Targeting Trauma Symptoms

Three interventions targeted trauma symptoms; two (Writing for Recovery and BLP2) had sufficient data to be included in the meta-analysis (see Table 17 in Appendix H). The Writing for Recovery intervention studied by Lange-Nielsen and colleagues (2012) showed significant within-subject reductions in trauma symptoms but did not show significant between-subject differences when compared to a waitlist control group, with a small effect size of d = 0.11.

In their study of the Better Learning Program (BLP2), Forsberg and Schultz (2023) found that students in the intervention had significantly greater reductions in traumatic stress symptoms at post-intervention compared to participants in a no-treatment control, with a large effect size of d = 0.71.

Participants in an in-person, therapist-delivered Post-Disaster Trauma Treatment focused on expressing difficult emotions and coping with loss showed a statistically significant within-group decrease in trauma-related symptoms at a one-year follow-up; the study did not include a control group to examine between-subject differences (Chemtob et al., 2002).

Interventions Targeting General Distress or Combined Problems

Twenty-five interventions targeted general distress or combined problems. Twenty had sufficient data to be included in the meta-analysis; their effect sizes ranged from d = −0.18 (Andrews et al., 2023) to d = 1.14 (Macias et al., 2022; see Table 18 in Appendix H).

Efficacious Interventions

An in-person School-Based Anxiety Prevention Program reported statistically significant reductions in scores on the Revised Child Anxiety and Depression Scale among participants in the intervention compared to participants in a no treatment control immediately post-intervention and at the 3-month follow-up, with small effect sizes of d = 0.16 and d = 0.13, respectively (Ab Ghaffar et al., 2019). In their study of BRISC, Bruns and colleagues (2023) found that students in the BRISC condition had significantly greater improvements in the seriousness of their top problems at both the 2- and 6-month follow-ups compared to a treatment-as-usual control, with small effect sizes of d = 0.17 and d = 0.24, respectively. In their study of a video-based yoga intervention, Busch and colleagues (2023) found statistically significant within-group reductions among the intervention group in scores on the Preschool Pediatric Symptom Checklist, which includes items related to anxiety, attention, aggression, and more. Scores in the intervention group were lower at post-intervention compared to a no treatment control with a small effect size of d = 0.05. One study examined an online single-session intervention for problem-solving called Project Solve. Compared to an active control condition, students who completed Project Solve had significantly greater reductions in hopelessness immediately post-intervention, d = 0.23, and significantly greater reductions in internalizing symptoms at one-month and three-month follow-ups, with small effect sizes of d = 0.11 and d = 0.35, respectively (Fitzpatrick et al., 2023). One study investigated an in-person, three-session Rational Emotive Behavior Therapy intervention and found significantly reduced scores on a measure of depression and anxiety at a 6-month follow-up compared to a no-treatment condition with a medium effect size of d = 0.69 (Sælid & Nordahl, 2017).

In their study of a Brief Contextual Intervention, Macias and colleagues (2022) found that participants in the intervention had significantly greater decreases in distress relative to participants in the control, with a large effect size of d = 1.13. One study investigated an in-person, lay provider intervention focused on training students in problem-solving. It found that participants in the intervention had statistically significant between-group reductions in emotional problems as measured by the Youth Top Problems at the 6-week and 12-week follow-ups, both with a small effect size of d = 0.39. Students in the intervention also had statistically significant between-group reductions in scores on the difficulties subscale of the Strengths and Difficulties Questionnaire, with small effect sizes of d = 0.16 and d = 0.18 (Michelson et al., 2020). In their study of Healthy Kids, Moran and colleagues (2023) found that participants in the intervention had significantly greater reductions in emotion regulation difficulties at post-intervention compared to a no-treatment control condition, with a small effect size of d = 0.19. One study examined an Enhanced Psychological Mindset Session for Adolescents, a digital intervention delivered within classrooms during school hours. It found that participants in the intervention group showed statistically significant reductions in scores on a measure of depression and anxiety at the one- and two-month follow-ups compared to a no-treatment control group, with small effect sizes of d = 0.46 and d = 0.35, respectively (Perkins et al., 2021).

Two studies investigated interventions designed to improve sleep problems. One study examined an in-person Sleep Promotion Program; it provided psychoeducation about sleep hygiene and taught relaxation skills over five sessions. Compared to a no-treatment group, participants in the intervention had greater reductions in emotional distress as measured by the PedsQL Present Functioning Visual Analogue Scale at a 2-week and 6-week follow-up, with medium effect sizes of d = 0.57 and d = 0.56, respectively (John et al., 2016). In the second study, a sleep intervention was delivered by research staff over three sessions. It provided psychoeducation about sleep hygiene; some components were delivered in-person, while others were delivered over the phone. Compared to a control group, participants in the intervention had greater reductions in prosocial problems at the six-month and twelve-month follow-ups, with small effect sizes of d = 0.39 and d = 0.24, respectively (Quach et al., 2011).

Delivered alone, E-health4Uth did not result in statistically significant between-group reductions in difficulties scores on the Strengths and Difficulties Questionnaire or the Youth Top Problems questionnaire when compared to a no treatment control group, with small effect sizes of d = 0.03 and d = 0.04 at a 17-week follow-up, respectively. However, E-health4Uth plus a consultation session with a school nurse resulted in both within-group and between-group reductions in difficulties scores and Youth Top Problems, with small effect sizes of d = 0.13 and d = 0.13, respectively.

In their study of the START program, Yamamoto and colleagues (2022) found that, compared to a no-treatment control, students in START had significantly greater reductions in internalizing symptoms at post-intervention, with a large effect size of d = 0.87.

Non-efficacious Interventions

MAKINGtheLINK, an in-person intervention focused on increasing mental health literacy, did not result in statistically significant reductions in scores on a measure of depression and anxiety compared to a waitlist control, with small effect sizes ranging from d = 0.08 to d = 0.15 at six-week to twelve-month follow-ups (Lubman et al., 2020). One study examined Healthy Sleep, Healthy School Life, an in-person sleep education intervention, and found no statistically significant differences in parent-reported difficulties between participants in the intervention and a no-treatment control group at a one-month follow-up, with a small effect size of d = −0.03 (Chen et al., 2023). In their study of a virtual-reality-based intervention, Shaw and Lubetzky (2021) found that scores on psychological stress decreased from baseline to post-intervention among the intervention group but were not significantly different from the active control condition; the effect size was small, d = 0.05.

Potentially Iatrogenic Effects

In their study of Climate Schools, Andrews and colleagues (2023) found that internalizing symptoms increased among both the intervention and control groups between the six- and 18-month follow-ups, with small effect sizes ranging from d = −0.18 to d = −0.04; however, analyses revealed no statistically significant main effect of condition. In the study by Atkinson and Wade (2015), neither the mindfulness-based intervention nor the dissonance-based intervention showed statistically significant effects of group assignment on the Fear/Anxiety, Sadness, and Guilt subscales of the Positive and Negative Affect Schedule at post-intervention, 1-month, or 6-month follow-ups compared to a no-treatment control group, with small effect sizes ranging from d = −0.01 to d = 0.08. Among participants in the mindfulness-based intervention, scores deteriorated at the 6-month follow-up, leading to a small negative effect size of d = −0.01; however, the effect was not statistically significant.

Interventions Not Included in Meta-Analysis

One study investigating Counselors-CARE, an in-person motivational interviewing intervention delivered by research staff, reported significantly greater reductions in scores on the High School Questionnaire (a composite measure, including suicide risk behavior, depression, and drug involvement) among the intervention group compared to a “usual care” control at a 10-week follow-up, but the effect size was unable to be calculated due to insufficient data (Eggert et al., 2002).

One study examined a Teen Mental Health First Aid program. This in-person, therapist-delivered program taught students about mental health literacy over three 75-min sessions. The study did not find statistically significant reductions in students’ scores on the K6 (measuring psychological distress) at the three-month follow-up; the study did not include a control group to examine between-subject differences (Hart et al., 2019).

The study of the CALM intervention found significant pre-post within-subject improvements in global functioning as measured by the Children’s Global Assessment Scale; the study did not include a control group to examine between-subject differences (Muggeo et al., 2017).

An in-person, self-guided Expressive Writing Technique Intervention resulted in within-subject reductions in scores on a measure of depression and anxiety immediately post-treatment in one study; the study did not include a control group to examine between-subject differences (Mukhils et al., 2020).

Immune for Life, an in-person, teacher-delivered intervention that aimed to improve students’ coping skills, showed significantly greater improvements in coping behavior, as measured by the Young Adult Coping Orientation for Problem Experiences, and greater reductions in general problems, as measured by the Thai Mental Health Questionnaire, at the one-month follow-up when compared to a no-treatment control group; effect sizes were not able to be calculated due to insufficient information (Phuphaibul et al., 2005).

Academic Outcomes

Academic outcomes were investigated to further characterize the studies included in this review but were not included in the meta-analysis. Six studies (8%) reported academic outcomes in addition to mental health/well-being outcomes. In their study of BLP2, Forsberg and Schultz (2023) collected data on students’ grades in Math and Arabic. They found that students’ grades in both subjects improved after participating in the intervention, while national averages remained consistent. In their study of SPARX-R, Perry and colleagues (2017) gathered data on students’ final exam results. They found no significant differences in exam scores at post-treatment between students in the intervention and those in an attention-matched placebo condition. In their examination of the Shamiri intervention, Osborn and colleagues (2020a) collected students’ average grades during the school term before the intervention and the school term after the intervention. They found that students in the intervention had significantly greater improvements in academic outcomes compared to students in a study skills placebo activity. In their study of CALM, Muggeo and colleagues (2017) conducted Woodcock–Johnson Tests (Achievement and Cognitive Batteries) at pre- and post-intervention, but found no statistically significant improvements. In their study of a sleep-focused intervention, Quach and colleagues (2011) conducted Wechsler Individual Achievement Tests at a six-month follow-up and found no significant differences in scores between students in the intervention and those in the control group. Lastly, in a study examining a mindfulness-based intervention, Winters (2022) examined students’ scores on standardized tests (i.e., NWEA/MAP) during the school quarters before and after the intervention was delivered. Winters reported that reading scores improved among first graders and math scores improved among first and fourth graders.

Risk of Bias

Detailed results on the risk of bias of each study are presented in Table 19 in Appendix I. Thirty-seven studies were classified as high risk of bias (11 due to lack of randomization, 26 due to lack of blinded allocation concealment). Twelve studies were classified as having some concerns because of significant baseline imbalances across groups despite appropriate blinding. Twenty-six studies were classified as low risk of bias because they employed blind randomization and did not report evidence of significant baseline imbalances across groups.

Egger’s test did not indicate a statistically significant relationship between effect sizes and their standard errors for outcomes collected less than or equal to one-month post-intervention (p = 0.95), outcomes collected greater than one-month and less than or equal to six-month post-intervention (p = 0.26), outcomes collected greater than six-month and less than or equal to one-year post-intervention (p = 0.19), outcomes collected greater than one-year and less than or equal to two-year post-intervention (p = 0.21), or outcomes collected greater than two-year post-intervention (p = 0.68). Funnel plots illustrating these patterns are presented in Figs. 7, 8, 9, 10, and 11 in Appendix J.
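For readers unfamiliar with the procedure, Egger’s test evaluates funnel-plot asymmetry by regressing each study’s standardized effect on its precision and testing whether the intercept differs from zero. A sketch of the classic formulation (the exact specification fit in this review, for example a weighted or multilevel variant, may differ) is:

$$ \frac{\hat{\theta}_i}{SE_i} = \beta_0 + \beta_1 \frac{1}{SE_i} + \varepsilon_i, \qquad H_0\colon \beta_0 = 0 $$

where $\hat{\theta}_i$ is the observed effect size of study $i$ and $SE_i$ its standard error; the p values reported above are from tests of this intercept at each follow-up window, and non-significant results provide no evidence of small-study effects.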

Discussion

The high rates of mental health concerns among children, coupled with low rates of traditional service access, underline the need for innovative solutions to address children’s mental health (Green et al., 2013; Rosen et al., 2021). School-based mental health interventions may play a role in ameliorating the concerning state of children’s mental health because they are implemented in settings where children of all backgrounds are present. However, school resources are often already spread thin, suggesting that brief interventions may be particularly valuable. In this systematic review, we characterized the literature on brief school-based mental health interventions and analyzed their effectiveness.

Effects of Brief School-Based Mental Health Interventions

Overall meta-analytic results suggest a small positive effect of brief school-based interventions on mental health/well-being outcomes up to one-month (g = 0.18, p = 0.004), six-month (g = 0.15, p = 0.006), and one-year (g = 0.10, p = 0.03) post-intervention. The effects past one-year post-intervention were not statistically significant, suggesting that positive findings may be applicable only to the short term (less than or equal to one-year post-intervention). However, only six studies examined effects after one year and less than or equal to two years, and only two studies examined effects after two years, suggesting that meta-analytic results of outcomes past one year should be interpreted with caution and future studies should include longer follow-ups. Positive results up to one-year post-intervention may carry importance for public mental health efforts. The overall effect sizes are smaller than those in meta-analyses of longer school-based prevention and treatment programs (ranging from g = 0.39 to 0.50; Mychailyszyn et al., 2012; Sanchez et al., 2018) but are similar in size to those in meta-analyses of longer school-based prevention programs (ranging from g = 0.11 to 0.21; Werner-Seidler et al., 2017, 2021). Interventions in the current review were designated as universal, indicated, or selective prevention programs. No programs in the current review were designated as treatments. It is possible that the brevity of the interventions in the current review allows for alleviation of symptoms or prevention of symptom deterioration rather than full treatment of mental health disorders.
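As a point of reference when comparing these pooled Hedges’ g values with the Cohen’s d values reported for individual studies above, g is d multiplied by a small-sample correction factor (a standard approximation; the exact computation used in this review may differ slightly):

$$ g = J \cdot d, \qquad J = 1 - \frac{3}{4(n_1 + n_2 - 2) - 1} $$

For the sample sizes typical of school-based trials, $J$ is close to 1, so g and d are nearly interchangeable in magnitude.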

Moderator tests showed findings that were partially consistent and partially inconsistent with previous reviews. In previous meta-analyses of longer school-based interventions, predictors of effective interventions included being either selective interventions (provided to students at risk for mental health problems according to a teacher referral or mental health screening) or targeted interventions (provided to students identified as having mental health problems) rather than universal programs (Sanchez et al., 2018). The current review found consistent results for outcomes collected less than or equal to one-month post-intervention. Universal programs were associated with lower effect sizes compared to indicated programs (defined as interventions designed to target individuals who are experiencing early signs of a target problem). It is possible that by targeting students with elevated symptoms at baseline, indicated programs allow for greater improvements in outcomes. These findings suggest that schools may optimize the usefulness of brief interventions by offering them to students with early signs of distress. At the same time, there was no significant difference between universal and indicated programs on outcomes at time points greater than one-month post-intervention. Therefore, there is no evidence to suggest long-term differences between brief mental health interventions that are delivered as indicated or universal programs.

Consistent with previous reviews, gender (% female) was not a significant moderator, suggesting that the programs were equally effective across genders. However, among outcomes collected between one-month and six-month post-intervention, a higher percentage of white participants was associated with lower effect sizes, a finding that is inconsistent with previous reviews (Sanchez et al., 2018). This moderator was not significant at other time points. Among outcomes collected greater than one-year and less than or equal to two-year post-intervention, longer intervention length measured in minutes was associated with lower effect sizes. This finding is consistent with reviews showing that greater intervention time is associated with smaller effect sizes (Öst & Ollendick, 2017; Weisz et al., 2017). However, intervention length was not a significant moderator at other time points.
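Moderator tests of this kind are typically carried out with a mixed-effects meta-regression; a generic sketch (an assumption about the general form, not the exact model specified in this review) is:

$$ \hat{\theta}_i = \beta_0 + \beta_1 x_i + u_i + \varepsilon_i, \qquad u_i \sim N(0, \tau^2), \quad \varepsilon_i \sim N(0, v_i) $$

where $\hat{\theta}_i$ is the effect size from study $i$, $x_i$ is the moderator (e.g., percentage of female or white participants, or intervention length in minutes), $v_i$ is the known sampling variance, and $\tau^2$ captures residual between-study heterogeneity; a moderator is reported as significant when the test of $\beta_1 = 0$ is rejected.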

It is useful to examine the benefits of specific brief interventions in addition to overall average effects. Ten brief interventions are notable for their consistent, medium to large effects on student mental health/well-being outcomes. These include the Child Anxiety Learning Modules (CALM; Ginsburg et al., 2021; Muggeo et al., 2017), the Shamiri Intervention (Osborn et al., 2020b), Rational Emotive Behavior Therapy (Sælid & Nordahl, 2017), Signs of Suicide (Aseltine & DeMartino, 2004; Aseltine et al., 2007), a Peer Education Intervention (El Mokadem et al., 2021), the Social Thinking and Academic Readiness Training (START) program (Yamamoto et al., 2022), the Body Project (Vanderkruik, 2019), the Better Learning Program-2 (BLP2; Forsberg & Schultz, 2023), the Enhanced Psychological Mindset Session for Adolescents (Perkins et al., 2021), and the Sleep Promotion Program (John et al., 2016). Across these ten interventions, there do not appear to be consistent characteristics that the effective interventions had in common.

Limitations

A potential limitation of this study is that the search strategies may not have captured every eligible article, although multiple databases were used to conduct searches and the authors attempted to determine whether unpublished data were available. Although this review reports findings from studies that examined substance use outcomes, substance use was not included as a search term. As a result, several drug and alcohol prevention programs were likely missed in this review.

Implications for Future Research and Practice

Many studies provided insufficient information regarding study-level characteristics. For example, numerous studies did not report the race/ethnicity of their participants, the majority of studies did not specify the type of school in which the study was conducted (public, private, etc.), and the majority of studies did not specify the type of geographical region in which the school was located (urban, rural, etc.). No studies reported the percentage of participants who received accommodations or attended special education. In future studies of school-based interventions, fully characterizing the school and the participant sample using these metrics is critical. Additionally, there was large variance in the sample sizes of the included studies (ranging from 11 to 4,133). Studies with smaller sample sizes may have been underpowered to detect effects, particularly among studies of universal or prevention programs that may have expected smaller overall effects. Future studies should ensure sample sizes are large enough to detect positive effects, particularly when effects may be small yet still impactful at the population level.
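As a rough illustration of the sample sizes required (a standard two-group approximation that ignores the clustering of students within classrooms and schools, which would push the required numbers higher), the per-group sample size needed to detect a standardized mean difference d with two-sided α = 0.05 and 80% power is approximately:

$$ n \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^2}{d^2} = \frac{2\,(1.96 + 0.84)^2}{d^2} \approx \frac{15.7}{d^2} $$

Under this approximation, detecting an effect of d = 0.20 requires roughly 390 participants per group, and d = 0.15 roughly 700 per group, well above the smallest samples included in this review.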

One participant characteristic that warrants further reflection is the sexual orientation of students. In our pre-registration, we hoped to include sexual minority status as a moderator, but we found that no studies reported this information. Only five studies reported the gender minority status of students. From one perspective, it is disappointing to see that this information was not reported by studies; numerous calls for the reporting of diverse sexual and gender identities in research have highlighted the importance of this data (Cahill et al., 2014; Suen et al., 2020). From another perspective, researchers must consider whether any harm will come to participants if they collect information regarding their diverse sexual or gender identities given the school context and the wider socio-political climate. For example, in studies where parents can access youths’ data, it may be dangerous to ask youth about their diverse sexual or gender identities due to the possibility that parents will react negatively or even violently to their children’s identities (Grossman et al., 2005; Katz-Wise et al., 2016). In states that adopt anti-LGBTQ+ legislation, or in religious schools that are non-accepting of LGBTQ+ identities, positioning researchers or school personnel to gather data regarding students’ LGBTQ+ status could directly lead to harm if the data are not confidential, or indirectly lead to harm if students’ trust in schools/researchers is compromised. In future studies of brief school-based interventions, researchers should consider both the benefits and limitations of asking participants to report sexual or gender identities, keeping in mind the school’s geographical and cultural climate as well as the security of students’ data.

A potential area for future research is the investigation of academic outcomes of interventions, such as improved grades, increased school attendance, or reduced disciplinary concerns. Only six studies reported academic outcomes of interventions; two studies found that students in the intervention had significant improvements in their grades. Collecting information on academic improvements due to mental health interventions has benefits and limitations. Characterizing students’ success by their academic performance alone rather than their holistic well-being may contribute to a culture that places an inordinate amount of pressure on students and teachers and discriminates against students with differential academic abilities (Morford, 2021; Tiikkaja & Tindberg, 2021). It is crucial that students’ positive mental health is seen as a worthy goal within itself. At the same time, to improve the likelihood that mental health interventions are seen as worthy of funding and resource allocation by school boards and policy makers, it may be helpful to demonstrate whether brief interventions could help improve academic outcomes as an added benefit. In future studies of brief school-based interventions, researchers should consider these advantages and disadvantages.

Another potential area for future research involves investigating how school-based mental health interventions may promote student autonomy. In some studies, every student was required to complete the interventions. By requiring student participation, schools can ensure that programs are adhered to. At the same time, interventions that do not promote student autonomy may have lower engagement and acceptance rates among students (Ryan et al., 2016). Allowing students to choose which programs to use, when to use them, and to what degree may promote autonomy, which should be considered in balance with promoting program adherence.

Lastly, an important factor for consideration is the proportion of studies classified as being at high risk of bias. To improve confidence in results, future studies of brief school-based interventions should employ blind randomized controlled trials and recruit large enough samples to reduce the likelihood that groups will have significant baseline differences.

Conclusion

To address the crisis of children and adolescents’ mental health, innovative and scalable solutions could supplement the traditional system of mental health support. Brief, school-based mental health interventions may be one option for reaching populations in need without requiring extended time or resources. There is some evidence that brief, school-based mental health interventions could reduce mental health concerns or improve well-being among students, but more research is needed on how to optimize their real-world utility.