Introduction

In the USA, childhood trauma is relatively common and growing in prevalence among youth residing in urban environments (Fowler et al., 2009; Stein et al., 2003b). In a national sample of adolescents ages 13 to 17 years, 62% experienced at least one potentially traumatic event (McLaughlin et al., 2013). The prevalence of trauma reaches its peak in adolescence, yet most traumatized adolescents do not receive treatment. In fact, among school-aged youth, adolescents are consistently among the most underserved by mental health services (Green et al., 2013). Further, national data suggest that urban youth are at increased risk of subsequent psychiatric symptomatology associated with traumatic stress (Fowler et al., 2009; Stein et al., 2003b). Thus, adolescents in urban environments are at greatest risk of experiencing traumatic events and least likely to receive support or services for any subsequent behavioral and emotional problems.

Impacts of Trauma on Academic Functioning and Profiles of Student Behavior

Youth struggling with traumatic stress may develop a broad range of symptoms, and early trauma exposure may result in long term disruptions in functioning (Copeland et al., 2007; Perfect et al., 2016; Schilling et al., 2007). Estimates suggest that approximately 4% to 13% of youth who have experienced a traumatic event will exhibit clinical elevations on measures of post-traumatic stress (McLaughlin et al., 2013; Woodbridge et al., 2016). Beyond symptoms consistent with PTSD, studies documenting the negative sequelae associated with youth trauma exposure have reported consequences spanning internalizing (e.g., depression, anxiety) and externalizing (e.g., conduct problems, disruptive behavior) domains (Overstreet & Mathews, 2011). Importantly, among urban, low-income adolescents exposed to trauma, research suggests greater likelihood of symptoms manifesting as externalizing behavior problems rather than the typical internalizing expression of trauma observed among advantaged youth (Grant et al., 2004; Taylor et al., 2014). Externalizing symptoms associated with trauma exposure can cause significant impairment in multiple domains in which the adolescent must function.

Evidence linking psychiatric symptoms to academic impairment is particularly pronounced among youth who exhibit externalizing patterns of behavior. For instance, at school, externalizing behaviors may manifest as aggression toward peers, disruptive classroom behaviors, and academic disengagement and inattention (Goodman & West-Olatunji, 2010; Hinshaw, 1992). Researchers have outlined pathways by which externalizing behavior problems negatively affect learning and cognition (Busby et al., 2013), prosocial school behaviors, and, ultimately, educational outcomes (Wright et al., 2014). Thus, traumatic stress and its associated symptoms, especially externalizing symptoms, may greatly impact both mental health and academic functioning.

Externalizing and internalizing symptomatology in the context of trauma likely also have implications for treatment. Specifically, different elements or components of therapy may work better for certain existing clinical presentations or types of trauma. For example, in a study of a group-based treatment for traumatized youth, internalizing and externalizing symptoms moderated treatment response: students with internalizing symptoms benefited more from the intervention than those with externalizing symptoms (Herres et al., 2017). In addition, it appears that students with externalizing and internalizing behaviors benefit differentially from specific components of therapy. For example, students with internalizing behaviors showed the most benefit from sharing narratives, which may be particularly helpful in reducing shame or negative self-attributions for those students. However, to date, no randomized controlled trials of group-based trauma interventions have tested whether intervention effects on academic outcomes differ among subpopulations of students who report clinically significant externalizing or internalizing symptoms at baseline.

Traumatic stress in youth has been linked to impairments in school functioning by affecting students’ behaviors, cognitive functioning, and academic achievement (Delaney-Black et al., 2002; Feeny et al., 2004; Hardaway et al., 2012; Overstreet & Mathews, 2011; Perrin et al., 2000). Studies have documented significant decreases in cognitive abilities among children who have been traumatized, such as deficits in attention and long-term memory for verbal information, decreased IQ and abstract reasoning, and decreased reading ability (Delaney-Black et al., 2002). An important goal of adolescent mental health intervention is to reduce functional impairment, yet few studies assessing the efficacy of psychosocial (i.e., social, emotional, and mental health) interventions among youth report outcomes relevant to school functioning. The current study addresses this gap in the research literature by evaluating intervention effects for both emotional–behavioral and academic outcomes among traumatized adolescents.

CBITS: A School-Based Approach

School-based mental health treatment models provide the most practical settings to identify and treat traumatized adolescents who are often underserved by traditional mental health service settings (Kataoka et al., 2009). These school-based psychosocial treatment models may also prove to be highly valued and cost-efficient if they improve both psychiatric symptoms and academic achievement. The Cognitive Behavioral Intervention for Trauma in Schools (CBITS) program is one evidence-based psychosocial intervention designed specifically for treating adolescents ages 11 through 15 who are symptomatic after exposure to one or more traumatic events. School-based clinicians deliver the CBITS program in a small group format to reduce students’ post-traumatic stress and related trauma symptoms and to build coping skills so that students are better able to handle stress and trauma in the future (Stein et al., 2003a). CBITS is specifically designed to be delivered in the typical school environment by a trained therapist, and it allows flexibility to adapt to changing school contexts and schedules. The program includes 10 one-hour group sessions and one individual session for students, two group educational meetings for parents, and an orientation session for teachers.

The CBITS approach to treatment is grounded in theories of cognitive and behavioral therapy (Jaycox et al., 2018; Kataoka et al., 2003). Cognitive behavior therapy (CBT) is based on the premise that thoughts, emotions, and behaviors are interconnected and influence one another. CBITS includes CBT therapeutic components that focus on reducing students’ maladaptive thoughts and destigmatizing the effects of trauma; consequently, students can express and cope with fear and grief reactions. Through social problem-solving techniques, role-playing, and coaching activities, therapists help students to communicate their needs for support and find suitable ways to support their peers in the group. The intervention also provides tools to enhance students’ affect regulation, such as relaxation techniques and exposure exercises to decrease anxiety and discomfort.

CBT with Adolescents

The CBITS program employs CBT techniques explicitly designed to be used with adolescents in middle school. The cognitive development of adolescents may help facilitate the effectiveness of CBT approaches since brain development during adolescence supports the abstract reasoning and metacognitive skills which are vital to the implementation of CBT (Oetzel & Scherer, 2003; Ollendick et al., 2001; Sauter et al., 2009). Furthermore, emotional development, emotion recognition, and regulation skills can have a significant impact on the effectiveness of CBT and better developed skills may allow adolescents to more easily learn, apply, and adapt CBT strategies (Sauter et al., 2009).

CBT approaches for youth with anxiety disorders, including post-traumatic stress, have been found to be effective immediately after treatment and at follow-up (Rith-Najarian et al., 2019; Seligman & Ollendick, 2011). In their extensive review of CBT for adolescents, Rith-Najarian et al. (2019) found that “Compared with other modalities, cognitive behavioral therapy (CBT) is the treatment approach with the most well-established support for improving symptoms in youth with anxiety, trauma, and depression” (p. 226). According to multiple reviews and meta-analyses, the majority of CBT treatments showed moderate to large effects.

Efficacy of CBITS

Past studies document the efficacy of CBITS in treating trauma symptoms in youth, and preliminary evidence suggests CBITS may improve academic outcomes. Several studies have documented reductions in PTSD and depressive symptoms after CBITS intervention among diverse sets of adolescents (Kataoka et al., 1999, 2011; Stein et al., 2003a). In one study, Stein et al. (2003a) randomly assigned 126 sixth-grade students in an urban middle school to either the CBITS intervention or a wait-list comparison condition. Results indicated that students in the experimental condition reported fewer depression and PTSD symptoms and less psychosocial dysfunction. In another study (Kataoka et al., 2011), 122 middle school students from the same urban, public school district were assigned to CBITS or a delayed intervention comparison condition. At posttest, students participating in CBITS earned higher mean mathematics grades than the comparison group, and students in the intervention condition were more likely to have a passing (“C average”) grade in language arts. The authors recommended the use of additional standardized measures of academic performance (such as standardized achievement tests) in future studies to disentangle and specify CBITS effects on academic outcomes.

The goal of the present study is to build on existing research reporting the efficacy of CBITS in improving psychosocial and educational outcomes among a diverse population of urban middle school students. This research study examined both short-term (i.e., immediate post-intervention) and long-term (i.e., 1-year follow-up) student outcomes, including symptoms of post-traumatic stress and related psychological symptomatology (i.e., depression, anger, and anxiety), problem behaviors (e.g., withdrawal, aggression, impulsivity), coping skills, and academic performance. As recommended by Kataoka et al. (2011), the research team used a standardized measure of academic achievement to provide a robust indicator of academic outcomes among an urban middle school student sample. Further, because students may present with trauma symptoms across internalizing and externalizing domains, we conducted subgroup analyses to determine whether students who reported specific symptom types (internalizing or externalizing) at baseline showed differential psychosocial and academic short-term or long-term outcomes. We hypothesized, as shown in previous studies, that students who participated in the CBITS intervention would reduce their problematic emotional and behavioral symptoms at posttest and improve their performance on a standardized measure of academic achievement at 1-year follow-up more than adolescents in the comparison condition. Further, we conducted exploratory analyses based on clinical rationale to assess whether students with internalizing behavior problems may benefit more psychosocially from participating in CBITS (due to the group therapy context and building of interpersonal resiliency skills) than those with externalizing problems, who may benefit more academically (due to a reduction in behavior problems that impact academic engagement and learning).

Method

SRI International’s Institutional Review Board formally approved all procedures performed in this study involving human participants, and the research team complied with all approved procedures.

Participants

School Sample

Students who participated in the research sample were drawn from 12 middle schools within one large urban school district in northern California. During the study’s duration, the district’s middle schools (serving grades 6 through 8) had an average enrollment of 806 students (range = 410–1303 students) and served a diverse population: More than half (52%) were identified as Asian, 23% as Hispanic, 12% as African-American, 8% as White, and 5% as mixed race; 25% were English learners, 63% received free or reduced-price lunches, and 14% were identified for special education.

Each school had a School Social Worker (SSW) assigned to provide support to students. SSWs are masters-level mental health professionals who work to address barriers to student success and to enhance the social and emotional growth and academic outcomes of all students. SSWs bring a mental health perspective to school sites and implement a wide variety of interventions to address barriers to learning and promote the healthy development of all students. All SSWs volunteered to participate in the study and implement CBITS with eligible and consented students in their school.

Screening Sample

In the fall of each school year from 2011 to 2015, the research team coordinated with middle school administrators, SSWs, and teachers to disseminate consent forms (in English, Spanish, and Chinese) to all parents of sixth-grade students requesting their children’s participation in schoolwide screening to identify students who had experienced one or more traumatic events and resulting elevated traumatic stress (as reported in detail by Woodbridge et al., 2016). Demographics indicated that the racial/ethnic makeup of the students with consent to participate in the screening varied slightly from the district population. After adjusting for multiple comparisons across each pair of racial/ethnic groups, analyses indicated that African-American students were less likely to participate in the screening than White, Latino, or Asian students (Woodbridge et al., 2016). As illustrated in Fig. 1, the screening sample (n = 4076 students) represented 45% of grade 6 students across four school years (N = 9007) and 66% of all students who returned consent forms.

Fig. 1 CONSORT diagram

Study Sample

Eligible students for the CBITS intervention included those sixth-grade students who self-reported experiencing one or more trauma events and accompanying symptoms of traumatic stress at an elevated threshold. Of those students screened, 13.5% (n = 550) endorsed at least one event on a trauma exposure checklist and showed elevated levels on a trauma symptom checklist. Four students were deemed ineligible for the intervention due to an occurrence of sexual abuse and/or inability to participate productively in a group therapy context. The research team sought consent from parents of all eligible students to participate in the CBITS study, and SSWs sought assent from students to participate in the group intervention. More than half (53.6%, n = 296) provided both consent and assent; the final sample was randomized to the CBITS intervention (n = 152) or services as usual (n = 144) within each school (see Table 1). After randomization, 2 CBITS students and 1 control student declined to participate, and they were removed from the study sample.

Table 1 CBITS and comparison group baseline sample characteristics and assessment scores

Procedures

CBITS Training and Supervision

Prior to implementation of the CBITS intervention, all SSWs serving the 12 middle schools completed an online 8-h CBITS introductory training (available at cbitsprogram.org), participated in an on-site 2-day interactive CBITS training conducted at the school district by a certified CBITS trainer, and received curriculum kits including the CBITS manual and all session materials. SSWs also engaged in weekly 90-min clinical supervision sessions conducted by a licensed clinical psychiatrist throughout the duration of CBITS delivery (approximately 12 weeks). During these weekly supervision sessions, SSWs discussed issues that arose from group sessions to ensure that the intervention was standardized across therapy groups and that students remained engaged. A researcher–practitioner team, composed of two CBITS intervention developers, the clinical psychiatric supervisor, the principal and co-principal investigator, and two district mentor SSWs, also met weekly throughout the duration of the intervention to discuss and act upon any clinical supervision, CBITS implementation, and data collection issues that arose.

Intervention Components

The CBITS intervention group at each school consisted of six to nine students who met weekly with their SSW during one nonacademic class period. In each session, the SSW introduced a new set of cognitive behavioral therapy techniques to combat the emotional and behavioral symptoms of trauma through a mixture of didactic presentation, age-appropriate examples, and practice activities to solidify concepts during and between sessions. Therapeutic strategies included educating students about trauma and common symptoms of traumatic stress, training students in relaxation techniques to remedy anxiety and reduce negative thoughts, developing coping strategies to face a serious trauma, and practicing social problem-solving skills. Between the third and sixth weeks, participants met individually with the SSW to describe their trauma experience in more depth (via a “trauma narrative” exercise) and to discuss how to process it, such as verbally or through artwork, during the group sessions and with their parents/caregivers. SSWs also held one or two parent education sessions at approximately week 3 and week 7 to describe the purpose and content of the CBITS program, normalize the concept of trauma and traumatic stress, prepare the parent to hear the child’s trauma narrative, and discuss practical strategies that may encourage further parent and child communication about the trauma.

Fidelity to the Intervention

SSWs audiotaped each CBITS group session and uploaded the recording to a secured website. To monitor fidelity to the CBITS program, a random sample of 20% of the audiotapes was rated by trained and certified external CBITS clinicians to assess adherence to the CBITS sessions and the quality of each session. Adherence items were specific to each CBITS session, and ratings were based on a scale from 0 to 3 (0 = “Topic not covered at all,” 1 = “Only cursory reference to topic and quick review,” 2 = “Topic clearly covered, with or without cooperation of group members,” and 3 = “Topic thoroughly covered, integrated into larger context of therapy, and interactive”). The number of adherence items varied by session, with an average of 4.4 items per session and a range from 2 items (Session 10) to 7 items (Session 3). Ratings of quality focused on how the SSWs implemented the sessions (e.g., did the SSW convey empathy to the student, use a cognitive behavioral framework, motivate participants). Seven quality items were rated for each session, also on a scale from 0 to 3. The average adherence rating was 2.85 and the average quality rating was 2.89. Intraclass correlations (ICC) assessing interrater reliability were computed for 30% of the adherence and quality ratings, yielding an ICC of 0.90 for adherence and 0.92 for quality.

Comparison Condition

Within each school, students were randomized to a CBITS or comparison group (services as usual). After randomization, SSWs were provided the list of students in each group and directed to begin CBITS sessions. During the study period, students in the comparison group did not participate in any CBITS groups or treatment groups that used similar therapeutic approaches (i.e., CBT). SSWs were told to provide “typical” services to students in the comparison group, utilizing routine resources and processes for students suffering from exposure to trauma in their school. These services included any individual meetings, other small group approaches, and referral to outside agencies as needed. SSWs reported that most students in the comparison groups received a range of typical services (e.g., individual short-term goal-oriented supports, restorative justice groups, small group counseling, social skills groups, anger management groups), while some students did not participate in any formal school-based services.

Measures

Trauma Symptom Checklist–Child Version (TSCC; Briere, 1996)

The TSCC evaluates the impact of trauma as manifest in symptoms of post-traumatic stress disorder and related psychological symptomatology (i.e., depression, dissociation, anger, and anxiety). All students participating in the screening process and the final research sample completed the 44-item version of the self-report measure that excludes references to sexual abuse issues. The TSCC is suitable for children ages 8 to 16, is available in multiple languages, and is scored on a 4-point Likert scale (0 = never, 1 = sometimes, 2 = lots of times, 3 = almost all of the time). The TSCC was standardized on a large normative sample of racially and economically diverse children without histories of trauma; T scores are available for gender and age groups. Domains assessed include five clinical scales (post-traumatic stress symptoms, depression, dissociation, anger, and anxiety) and two validity scales (underresponse and hyperresponse). The clinical scales yield high internal consistency (α = 0.82 to 0.89; Briere, 1996; Sadowski & Friedrich, 2000); results also indicate strong concurrent and discriminant validity (Lanktree et al., 2008) with the Child Behavior Checklist (Achenbach, 1991a) and Youth Self-Report (Achenbach, 1991b). Internal consistency reliability ranges from 0.76 to 0.90 across three waves of data collection for our study sample.

Woodcock–Johnson III Normative Update Brief Battery (WJ III NU; McGrew et al., 2007)

To assess students’ academic achievement, trained research assistants administered four WJ III subtests. This norm-referenced test includes subscales on reading (Letter-Word Identification and Passage Comprehension) and mathematics (Applied Problems and Calculation). The battery is a nationally normed assessment tool, standardized on a sample of more than 8,700 children. Internal consistency coefficients range from 0.95 to 0.97. The technical manual reports evidence for content validity and sensitivity of the measure; items assess abilities that demonstrate growth and decline of achievement, with steep growth from ages 5 to 25 (McGrew et al., 2007).

Academic Engaged Time Observations (AET; Walker and Severson 1990)

Trained research assistants also conducted classroom observations to measure students’ engagement in academic tasks. The proportion of observed time that a student spent visibly and actively attending to and working on relevant academic material across two 15-min observations was calculated for each student at each data collection time period. Observations were made in language arts classrooms to standardize the subject matter of the learning environment. All trained observers demonstrated and sustained high reliabilities prior to and during data collection periods. The research team also conducted dual observations on 14% of the AET observations to monitor interrater reliability (ICC = 0.98) and retrained staff as warranted to minimize observer drift. All data collectors were masked to condition: they observed participants in both intervention and comparison groups but were unaware of group membership.
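As an illustration only, the AET score described above can be sketched as engaged time divided by observed time. The function name, the use of seconds, and the pooling of the two observations into a single proportion are assumptions made for this sketch; the source does not state whether the two observations were pooled or averaged.

```python
def academic_engaged_time(engaged_seconds, observed_seconds):
    """Proportion of observed time a student was academically engaged.

    Both arguments are sequences with one entry per observation.
    Pooling across observations is an assumption of this sketch.
    """
    if len(engaged_seconds) != len(observed_seconds):
        raise ValueError("need one engaged and one observed value per observation")
    total_observed = sum(observed_seconds)
    if total_observed <= 0:
        raise ValueError("total observed time must be positive")
    return sum(engaged_seconds) / total_observed


# Two hypothetical 15-min (900 s) observations for one student (not study data).
aet = academic_engaged_time([720, 630], [900, 900])
print(f"AET = {aet:.2f}")
```

With these made-up values the student was engaged for 1350 of 1800 observed seconds, giving a proportion of 0.75.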

Achenbach System of Empirically Based Assessment–Teacher’s Report Form and Youth Self-Report (TRF, YSR; Achenbach & Rescorla, 2001)

English Language Arts teachers completed the TRF for all participating students in their class. The TRF is a measure of teachers’ perceptions of the students’ academic performance and adaptive behavior, internalizing behavior (e.g., anxiety, depression, somatic complaints), and externalizing behavior (e.g., aggression, rule-breaking behavior). The TRF’s internalizing behavior subscale shows strong internal consistency (α = 0.90), as do the externalizing behavior subscale (α = 0.95) and the total problems score (α = 0.97).

All participating students completed the YSR, yielding individual internalizing and externalizing behavior domain scores and a total problems score. The YSR’s internalizing behavior subscale shows strong internal consistency (α = 0.90), as do the externalizing behavior subscale (α = 0.90) and the total problems score (α = 0.95) (Achenbach & Rescorla, 2001).

T-scores greater than 60 on the Achenbach measures are considered in the borderline to clinical ranges; subsequent analysis in this current study used these cutoff scores on the YSR to determine each of the internalizing and externalizing symptom domain subgroups of participating students.
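The subgroup rule above can be sketched as a simple threshold on baseline YSR T-scores. This is a minimal illustration, not the authors' code: the function and variable names are hypothetical, and the strict inequality follows the phrase "greater than 60" in the text.

```python
CLINICAL_CUTOFF = 60  # T-scores above 60 fall in the borderline-to-clinical range


def symptom_subgroups(internalizing_t, externalizing_t, cutoff=CLINICAL_CUTOFF):
    """Flag internalizer and externalizer status from baseline YSR T-scores."""
    return {
        "internalizer": internalizing_t > cutoff,
        "externalizer": externalizing_t > cutoff,
    }


# Hypothetical baseline (internalizing, externalizing) T-scores; not study data.
students = {"A": (65, 55), "B": (58, 71), "C": (60, 60), "D": (68, 66)}
flags = {sid: symptom_subgroups(*scores) for sid, scores in students.items()}
for sid, status in flags.items():
    print(sid, status)
```

Under this rule the two flags are set independently, so a student such as the hypothetical student D can belong to both subgroups, while a score of exactly 60 does not qualify.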

Service Assessment for Children and Adolescents (SACA; Stiffman et al., 2000)

All participating students answered brief questions on a modified SACA (Stiffman et al., 2000), which queried them about additional services they had received outside the CBITS intervention within the last 6 months, such as services provided by a community mental health center, school counselor, or residential treatment center. The SACA demonstrates sufficient psychometric properties; test–retest reliability for children ranges from 0.63 to 0.77, and high but variable correspondence was found between child reports of services and documented service records (Horwitz et al., 2001).

Analytic Plan

Missing Data

There were no missing data on student demographic characteristics or treatment status. Across the 16 outcome measures, the proportion of student records with missing data ranged from 2% to 10% at baseline, 4% to 14% at posttest, and 7% to 25% at 1-year follow-up, with Academic Engaged Time having the lowest proportion of missing data and the TRF having the highest. The missing data pattern analyses provided no evidence that individuals dropped out at a particular time point. Little’s MCAR test (Little, 1988) indicated that data were missing completely at random (MCAR). Furthermore, Chi-squared tests showed no gender or racial differences in missing data rates for any of the 16 measures at any time point. Thus, the HLM analyses used maximum likelihood estimation to account for missing data.

Intent-to-Treat Analysis (ITT)

The ITT estimate is the average effect of the treatment based on the initial treatment assignment, regardless of how many participants actually received the treatment. The ITT analyses present the impact of assignment to CBITS rather than the impact of CBITS on students who received the CBITS intervention. The ITT impact estimate is the expected effect of CBITS when it is implemented in the real world, with less than perfect implementation and dosage. Hierarchical linear modeling (HLM) was performed to account for students being nested in schools. A series of HLMs (Raudenbush & Bryk, 2002), one corresponding to each outcome variable at posttest and follow-up, was specified to estimate the ITT treatment effects. Two-level HLM models with students (level 1) clustered within schools (level 2) were used for this purpose. In all instances, variables entered at the student level included a dichotomous treatment indicator (comparison = 0, treatment = 1), all baseline measures, a race/ethnicity dummy series, and a dichotomous gender indicator. All student-level variables except for the treatment indicator were grand mean-centered. Finally, in all instances, a random level 1 intercept was specified to allow comparison group student means to vary across schools. The two-level HLM equations are as follows:

Level 1: Students

$$ \begin{aligned} Y_{ij} &= \pi_{0j} + \pi_{1}\left(\text{treatment}_{ij}\right) + \pi_{2}\left(\text{Student\_cov\_1}_{ij}\right) \\ &\quad + \pi_{3}\left(\text{Student\_cov\_2}_{ij}\right) + \cdots + \pi_{n}\left(\text{Student\_cov\_n}_{ij}\right) \\ &\quad + e_{ij}, \end{aligned} $$

where Yij is the posttest or follow-up outcome of student i in school j, π0j is the random adjusted mean outcome of school j, π1 is the fixed main effect of treatment, π2–πn are the fixed main effects of the student covariates, and eij is the level 1 random effect.

Level 2: School

$$ \pi_{0j} = \beta_{00} + r_{0j}, $$

where β00 is the fixed adjusted mean outcome across schools and r0j is the level 2 random effect.
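For readers who prefer a single expression, substituting the level 2 equation into the level 1 equation gives the combined (mixed) form of the same model. The distributional statements and the variance symbols τ00 and σ² are not given in the text; they are the conventional assumptions for a random-intercept model and are included here only as an illustration.

$$ \begin{aligned} Y_{ij} &= \beta_{00} + \pi_{1}\left(\text{treatment}_{ij}\right) + \pi_{2}\left(\text{Student\_cov\_1}_{ij}\right) + \cdots + \pi_{n}\left(\text{Student\_cov\_n}_{ij}\right) \\ &\quad + r_{0j} + e_{ij}, \qquad r_{0j} \sim N\left(0, \tau_{00}\right), \quad e_{ij} \sim N\left(0, \sigma^{2}\right). \end{aligned} $$

In this form the school-level random effect r0j shifts the intercept for every student in school j, while the treatment and covariate coefficients are the same across schools.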

For each outcome model, the coefficient (π1) associated with the treatment indicator at the student level was of primary interest, as it reflected adjusted mean differences between treatment and comparison group students on the specific outcome variable (Model A). Two-tailed tests (α = 0.05) were used to determine statistical significance. Hedges’ g effect sizes for the main impact were calculated by dividing the HLM coefficient for the intervention’s effect by the pooled treatment and control group standard deviation (What Works Clearinghouse, 2017). In addition, treatment-by-moderator interactions were added to the HLM one at a time to examine whether the treatment effect varied across subgroups (Wang & Ware, 2013). Moderators were internalizer status (Model B) or externalizer status (Model C). The effect size among internalizers (Model B) was calculated by dividing the estimated difference between CBITS internalizers and comparison internalizers from the HLM interaction model by the pooled standard deviation of those two groups. Similarly, the effect size among externalizers (Model C) was calculated by dividing the estimated difference between CBITS externalizers and comparison externalizers from the HLM interaction model by the pooled standard deviation of those two groups.
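The effect size calculation described above can be sketched as follows. This is a hedged illustration rather than the authors' code: all function names and input values are hypothetical, the small-sample correction factor 1 − 3/(4N − 9) is the one commonly paired with Hedges’ g in What Works Clearinghouse procedures but is not stated in the text, and the subgroup difference assumes the moderator is coded 1 for the subgroup of interest so that the difference equals the treatment coefficient plus the interaction coefficient.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of the treatment and comparison groups."""
    return math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2))


def hedges_g(adjusted_diff, sd_t, n_t, sd_c, n_c, small_sample_correction=True):
    """Adjusted mean difference (e.g., an HLM coefficient) divided by the pooled SD."""
    g = adjusted_diff / pooled_sd(sd_t, n_t, sd_c, n_c)
    if small_sample_correction:
        # Assumed correction factor; not described in the source text.
        g *= 1 - 3 / (4 * (n_t + n_c) - 9)
    return g


def subgroup_difference(treatment_coef, interaction_coef):
    """Treatment effect within the subgroup coded 1 on the moderator (assumed coding)."""
    return treatment_coef + interaction_coef


# Hypothetical inputs only; these are not values from the study.
g_uncorrected = hedges_g(-2.0, 10.0, 50, 10.0, 50, small_sample_correction=False)
g_corrected = hedges_g(-2.0, 10.0, 50, 10.0, 50)
print(round(g_uncorrected, 3), round(g_corrected, 3))
```

For the subgroup models, the same `hedges_g` function would be applied to `subgroup_difference(...)` using the standard deviations and sample sizes of the two subgroup cells (for example, CBITS externalizers and comparison externalizers).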

Results

Attrition Analysis

Although randomizing students to conditions should result in statistically equivalent groups, high overall attrition or differential attrition between treatment and control groups may jeopardize the initial balance and bias the impact estimate (What Works Clearinghouse, 2017). Our data analysis therefore began with an attrition analysis. Across the 16 outcomes at posttest, the treatment group attrition rate ranged from 4% to 15%, the control group attrition rate ranged from 4% to 12.5%, and the differential attrition rate ranged from 0% to 2.5%. Across the 16 outcomes at follow-up, the treatment group attrition rate ranged from 5% to 23%, the control group attrition rate ranged from 9% to 26%, and the differential attrition rate ranged from 3% to 4%. ICCs ranged from 0.03 to 0.11 across the 16 outcomes at posttest and follow-up. According to the WWC standards (2017), the overall and differential attrition rates are low for this study.
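The attrition quantities reported above can be illustrated with a short sketch. The function names are hypothetical, and the counts of students with observed outcomes are invented for the example; only the randomized group sizes (152 and 144) come from the text, and the exact denominators used by the authors are not stated.

```python
def attrition_rate(n_randomized, n_with_outcome):
    """Share of randomized students who are missing the outcome."""
    return (n_randomized - n_with_outcome) / n_randomized


def attrition_summary(n_t, obs_t, n_c, obs_c):
    """Group-level, overall, and differential attrition for one outcome."""
    rate_t = attrition_rate(n_t, obs_t)
    rate_c = attrition_rate(n_c, obs_c)
    overall = attrition_rate(n_t + n_c, obs_t + obs_c)
    return {
        "treatment": rate_t,
        "control": rate_c,
        "overall": overall,
        "differential": abs(rate_t - rate_c),
    }


# 152 and 144 are the randomized group sizes; 140 and 134 are hypothetical
# counts of students with an observed outcome, used for illustration only.
summary = attrition_summary(152, 140, 144, 134)
for name, value in summary.items():
    print(f"{name}: {value:.1%}")
```

Differential attrition here is the absolute difference between the two group rates, which is the quantity compared against the overall rate when judging attrition under WWC standards.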

Baseline Equivalence Analysis

After the attrition analysis, a descriptive analysis was conducted for CBITS students and comparison students. Table 1 presents the student background characteristics (gender, race, mental health service usage, and internalizer or externalizer status), pretest scores, and baseline equivalence test results of the participants in the CBITS and comparison groups. Statistical significance of the difference between the two groups at baseline was determined from HLM analysis. CBITS participants were not significantly different from comparison students on demographics or baseline assessment scores, except on the TSCC Anger and Depression subscales. Students in the CBITS intervention group self-reported significantly more symptoms of anger (p < 0.05, g = 0.28) and depression (p < 0.01, g = 0.32) on the TSCC at baseline.

Intent-to-Treat Analysis Results

Primary estimates of the CBITS impacts were derived from the ITT analyses. Table 2 shows that among the overall sample (Model A), students in the CBITS intervention group reported significantly fewer post-traumatic stress (PTS) symptoms (p < 0.05, g = −0.21) and marginally fewer self-reported internalizing (YSR) symptoms (p = 0.06, g = −0.19) at posttest than the comparison group. No significant differences were detected between groups among the overall sample on any emotional–behavioral (Table 2) or academic outcomes, including direct assessments (WJ III) and engaged time observations (Table 3), at the 1-year follow-up interval.

Table 2 HLM estimating treatment impact on TSCC and YSR scores at posttest and 1-year follow-up
Table 3 HLM estimating treatment impact on WJ III scores and AET at posttest and 1-year follow-up

Our moderation analysis showed that the effect of CBITS differed according to whether students evidenced externalizing behavior problems in the clinical range at baseline. The HLMs with the externalizer-by-treatment interaction effect (Model C) suggest that CBITS students evidencing externalizing behavior problems in the clinical range at baseline improved on multiple emotional–behavioral outcomes to a greater degree than their counterparts in the comparison group at posttest. For example, among the students who experienced externalizing behavior problems at baseline, students in the CBITS group reported significantly fewer symptoms of post-traumatic stress (p < 0.05, g = −0.55), dissociation (p < 0.05, g = −0.48), and anger (p < 0.05, g = −0.48) on the TSCC, and fewer internalizing behavior problems (p < 0.05, g = −0.49) and total behavior problems (p < 0.05, g = −0.52) on the YSR, than the students in the comparison group. However, teachers rated students with externalizing behaviors in the CBITS group as having significantly greater externalizing problems on the TRF than their counterparts in the comparison group (p < 0.05, g = 0.30). Further, on the WJ III Letter-Word Identification subtest, students with externalizing behaviors in the CBITS group showed significantly greater improvement in their performance than their counterparts in the comparison group at both posttest (p < 0.05, g = 0.30) and follow-up (p < 0.05, g = 0.24), and marginally greater improvement on WJ III Applied Problems at follow-up (p = 0.06, g = 0.24).

Moderation analyses examining the differential impact of CBITS by internalizer status revealed that students evidencing internalizing behavior problems in the clinical range at baseline showed a significantly greater reduction in YSR Externalizing scores at follow-up than their peers without internalizing behavior problems (p < 0.05, g = −0.01). Additionally, at 1-year follow-up, significant interaction effects were detected between students with internalizing behaviors in the CBITS and comparison groups on academic outcomes. On the WJ III Calculations and Applied Problems mathematics subtests, students with internalizing behaviors in the CBITS intervention group showed significantly greater improvement in their performance than those in the comparison group at follow-up (p < 0.001, g = 0.26 and p < 0.001, g = 0.23, respectively).

Discussion

In this randomized controlled trial, our research team sought to determine the efficacy of a targeted school-based intervention with middle school students who suffer from elevated traumatic stress. Specifically, this study examined whether students who participated in the CBITS intervention significantly improved on measures of emotional–behavioral symptoms and academic achievement. Results indicated, as hypothesized, that students in the CBITS group self-reported significantly fewer traumatic stress symptoms and internalizing behavior problems (key targets of the intervention) than the comparison group at posttest. These significant reductions in emotional–behavioral problems are consistent with previous CBITS research (Stein et al., 2003a, 2003b), as are some of the nonsignificant findings on these same measures at follow-up. In an early study (Stein et al., 2003b), experimental groups did not differ significantly in symptoms of PTSD or depression at 6-month follow-up. However, the nonsignificant findings among the overall sample on academic outcomes (AET and WJ III) are somewhat inconsistent with previous research (Kataoka et al., 2011), which reported significant increases in CBITS participants’ passing grades for English courses.

In the present study, additional analysis of subpopulations within the experimental conditions revealed significantly reduced emotional–behavioral symptoms on multiple subscales as well as improved performance on a WJ III literacy task (with strong effect sizes) for students in the CBITS group with externalizing behaviors as compared to their counterparts at posttest. An unexpected outcome for this group of students was the significant difference in teacher-rated externalizing behaviors. Students in the CBITS group with externalizing behaviors were rated as having significantly more externalizing behavior problems at posttest than students with externalizing behaviors in the comparison group (p = 0.04, g = 0.30).

No significant differences were detected at posttest between the CBITS and comparison groups for students who self-reported internalizing behavior problems at baseline. However, at 1-year follow-up, students with internalizing behaviors in the CBITS condition showed significantly greater improvement on WJ III math tasks than their counterparts in the comparison group, with moderate effect sizes.

The fact that youth with internalizing behavior problems who participated in CBITS did not report fewer psychiatric symptoms than their counterparts after participating in the intervention—while youth with externalizing problems did—is counterintuitive. If CBITS is less efficacious among highly symptomatic students with internalizing behavior problems, it may indicate that the youth in our sample who were withdrawn and difficult to engage may not benefit as much from this group-based therapeutic modality. Future research should examine how students with internalizing symptoms engage in school-based group therapy, perhaps revealing more details about the dynamics and unique makeup and contexts of each small group, including the proportion of those with externalizing and internalizing symptoms that may encourage and sustain student engagement.

By contrast, students who self-reported externalizing behavior problems reported significantly fewer psychiatric symptoms after the intervention than their counterparts not in the CBITS program. Clinically, CBITS is a structured, symptom- and skill building-focused program, which includes training in relaxation, coping, and social problem-solving techniques and communication strategies to help process trauma experiences. Students with behavior problems who are prone to aggression, classroom disruption, and hyperactivity may also experience anxiety, and the intervention may have built fundamental skills that help relieve the distress and symptoms most salient to these students with outward-facing behavior problems. However, students who self-reported externalizing behavior problems were rated by teachers as having significantly more externalizing behavior problems than their counterparts at posttest. While CBITS teaches relaxation and coping strategies, it also uses typical cognitive behavior therapy strategies that require the student to recount past traumatic experiences in a trauma narrative. This experience can lead to upsetting thoughts and feelings, and these emotions may be difficult for students with externalizing behavior problems, who may express their frustration by acting out in class.

Past research has suggested that psychiatric symptoms may influence academic outcomes in youth (Farmer & Bierman, 2002; McLeod & Kaiser, 2004; Needham, 2009; Needham et al., 2004). Evidence from this study shows that significant reductions in symptoms may be related to academic outcomes for students in the CBITS group with externalizing behaviors at posttest and follow-up. Additionally, students in the CBITS group with internalizing behaviors demonstrated only one significant difference from comparison students, in self-reported externalizing behavior (with a near-zero effect size), yet evidenced greater performance on math subtests of the WJ III at follow-up.

The differential findings at posttest and follow-up for emotional and behavioral symptoms and academic outcomes for students exhibiting internalizing or externalizing behaviors at baseline warrant additional research. The links between emotional and behavioral disorders and lower academic achievement among children and youth have been firmly established (Benner et al., 2008; Wagner et al., 2005). However, the directionality of those links (that is, whether externalizing and internalizing problems lead to poor academic outcomes or academic difficulties lead to behavior problems) is less understood (Algozzine et al., 2011; Kulkarni et al., 2020; Masten & Cicchetti, 2010; Moilanen et al., 2010; Okano et al., 2020). Regardless of the directionality, recognizing and understanding how traumatic stress impacts psychiatric symptoms and school functioning are essential to addressing the intense needs of students affected by trauma (Perfect et al., 2016).

It was hypothesized, and confirmed in this study, that students in the CBITS group would show reduced emotional and behavioral symptoms at posttest and that these effects could dissipate at follow-up, as found in previous research. It was also hypothesized that students in the CBITS group would improve their performance on standardized measures of academic achievement at 1-year follow-up more than adolescents in the comparison condition. While we did not find significant academic outcomes at posttest or follow-up for the full sample, we did find several significant academic outcomes for students in the CBITS group who self-reported internalizing or externalizing behavior problems when compared with their counterparts in the control group. Future research should replicate this study design and include skills-based mediators of academic treatment effects.

Limitations

Minor implementation challenges introduced some limitations to the current study. First, the CBITS program includes two parent education sessions to introduce participating parents to the CBITS program content and format, simple relaxation techniques, and helpful coping and communication strategies. Throughout each implementation cohort, SSWs reported significant challenges to holding parent education meetings even when resources (e.g., childcare, transportation, meals) were provided to incentivize parents’ participation; as a result, SSWs collapsed the meetings into one session and reviewed the information individually with parents who could not otherwise attend. The extent to which this reduced parent involvement affected outcomes in this study is unknown.

This study was also limited by some measurement sensitivity issues. Although we collected implementation fidelity data for each session indicating SSWs’ adherence to the CBITS manual, we did not collect detailed information on the engagement of student participants in the lessons. It is possible that students with externalizing behaviors engaged in lessons differently than students with internalizing behaviors, which may have contributed to their differential behavioral and academic outcomes. It may also be important for future research to consider how group composition (i.e., the proportion of students with externalizing or internalizing behaviors in each CBITS group) might affect overall implementation and outcomes.

One goal of this study was to assess whether CBITS impacted academic outcomes as measured by brief direct observations of academic engaged time in the classroom (AET) and standardized academic assessments (WJ III). It was theorized that academic engagement in the classroom could be a precursor to distal academic gains. However, no group differences were found on the AET at posttest or 1-year follow-up. The lack of differences on AET could reflect an engagement measure and protocol that were not sufficiently sensitive or accurate, or it could indicate that the CBITS intervention did not significantly affect engaged time in the classroom setting where the observations were conducted. Finally, although the WJ III is a well-established, norm-referenced academic measure, it was not particularly sensitive to change over short periods. Future research studies may consider using a more sensitive or accurate engagement measure or adding assessments of cognitive processes to demonstrate other mechanisms by which CBITS may facilitate improvements in academic outcomes within a one-year period.

Implications

Each year, more than 5,000,000 children in our country experience some extreme traumatic event, such as abuse and neglect, community violence, war and refugee experiences, poverty, health and medical issues, or the loss of a loved one (Spitalny et al., 2002). The present study found that approximately one out of seven students experienced elevated traumatic stress. At baseline, participating students self-rated their internalizing behavior in the borderline to clinical range (YSR mean T = 62.1); however, teachers of these same students rated them in the normal range (TRF mean T = 51.7). The significant difference between these ratings (t = 11.76, p = 0.001, d = 0.65) illustrates the importance of conducting systematic screening to identify students who may need school-based mental health interventions similar to CBITS. Relying on teacher or staff referral alone is not sufficient to identify students with internalizing disorders, as these students are often overlooked because their behaviors do not disturb other students or challenge the teacher’s authority, and often actually meet the teacher’s behavioral expectations (Gresham & Kern, 2004).

The purpose of the current study was to examine the short- and long-term efficacy of CBITS on the psychosocial and educational outcomes among a diverse population of urban middle school students. While the study did not systematically randomize students with internalizing and externalizing behaviors to treatment and comparison groups, the design allowed for subgroup analyses that revealed differential effects for these two groups of students. These findings can inform practitioners and clinicians at multiple phases of the intervention. Practitioners can modify their outreach and identification efforts to ensure they are reaching all students in need. Groups can be formed to best meet the needs of the individual students, for example, by including a balanced ratio of students with externalizing and internalizing behaviors in a group or even by forming groups with only students with internalizing behaviors. Additionally, the context and format for the intervention can be customized depending on students’ needs and profiles. If practitioners are aware that a group includes students with significant externalizing behaviors, they could pre-correct behaviors and develop a group-based behavior management plan to keep students on task and engaged. Alternatively, additional activities to engage students in the lessons may be needed if a group includes students with withdrawn or anxious internalizing behaviors. These adaptations could strengthen and further improve the outcomes for students with additional challenges in the groups.

Finally, this study adds to the evidence base on CBITS and illustrates how it can be an effective and important approach in urban, low-income school settings where students experience greater trauma exposure and exhibit poorer behavioral health and academic outcomes.