Introduction

Although it is estimated that one in six children experience mental health problems, there are significant disparities in access to treatment (Whitney & Peterson, 2019). In a national prevalence study including 46.6 million children, about 7.7 million were estimated to experience mental health problems (Whitney & Peterson, 2019). Of those, approximately 49% did not receive needed treatment. Given the potential impact for mental health problems to influence individuals’ well-being and safety, which may also lead to family and community disruption (Tacoma-Pierce County Health Department, 2016), it is essential that effective treatment be provided in accessible settings. One accessible setting is community mental health (CMH), which provides the majority of Medicaid-funded care for children (Brooks-LaSure & Tsai, 2021; Marchette et al., 2018). In Washington State, for example, 63,815 children received CMH services in 2020 (Substance Abuse & Mental Health Services Administration, 2020). Evidence-based practices (EBPs) are treatments supported by empirical studies and incorporate research evidence, clinical expertise, as well as patient values and preferences (Marchette et al., 2018). Despite the promise of CMH for providing access to effective care, EBPs are underutilized in community settings (Dorsey et al., 2013) due to well-documented barriers, such as the need for greater treatment flexibility and the ability to deliver EBPs in contexts where high comorbidity is present (Marchette et al., 2018; Peterson et al., 2018; Stein et al., 2013). In order to improve outcomes for children receiving care in CMH, it is essential to find ways to support EBP delivery in CMH in a way that is responsive to the setting and client population.

Most EBP models are “focal treatments” (i.e., training and interventions designed to treat a specific mental health condition), which pose challenges to implementation in real-world settings, including CMH. A major challenge to EBP implementation in CMH is the high degree of comorbidity experienced by clients (Barnett et al., 2013, 2019; Galla et al., 2012; Rohde et al., 2004; Weisz et al., 2012). In a U.S. sample of 10,123 children aged 13 to 18, 40% of all children with a mental health disorder met diagnostic criteria for an additional disorder (Merikangas et al., 2010). The high prevalence of comorbidity has also been demonstrated in other studies (Ackerman et al., 1998; Kessler & Wang, 2008; Kessler et al., 2005; Vasileva et al., 2020). To appropriately treat comorbidity in CMH clients, training approaches with greater flexibility are needed. Flexible and multi-problem trainings may also address fiscal and administrative challenges experienced by CMH organizations, given the costs of supporting staff to attend multiple trainings for multiple EBPs (Stewart et al., 2016).

The Washington State CBT+ Initiative provides training in EBPs for four mental health targets: Cognitive Behavioral Therapy (CBT) for depression, CBT for anxiety, Trauma-Focused Cognitive Behavioral Therapy (TF-CBT) for posttraumatic stress (PTS), and Behavioral Management Training (BMT) for behavioral difficulties (Dorsey et al., 2016). These treatment targets were selected because they represent the four most common child mental health problems in the State of Washington and cover 70% of youth seeking treatment in CMH (Burley, 2009). CBT+ was informed by modular treatment approaches (Chorpita et al., 2005) such as the Modular Approach to Therapy for Children with Anxiety, Depression, Trauma, or Conduct Problems (MATCH-ADTC; Weisz et al., 2012) and relies on initial target categorization as a basis for the selection and application of treatment components. Clinicians identify a primary clinical target and follow a flowchart that outlines the clinical components for that target. However, to address the high comorbidity of CMH clients, clinicians are trained to add elements to address co-occurring treatment targets as needed, such as depression and behavioral problems (Dorsey et al., 2016). CBT+ evaluations have demonstrated improved clinician self-reported skill for all EBPs (Dorsey et al., 2016; Triplett et al., 2020). Recently, the CBT+ training approach has been expanded to Maryland, New York, Maine, and Oklahoma, where it has been merged into the Partnering for Success Model to serve children through child welfare systems (Kerns et al., 2022). In their CBT+ evaluation, Kerns and colleagues (2022) found significant clinical improvements across all treatment targets. However, in that study, baseline scores for children were at or barely above the clinical cutoffs for all targets but trauma, and the study did not evaluate whether time or number of treatment sessions influenced outcomes. Aside from Kerns et al. (2022) and an evaluation focused on adults (Peterson et al., 2018), there is limited research examining clinical outcomes for children treated by clinicians trained in multi-problem treatment approaches, as part of routine care, in CMH.

The current study examines the mental health outcomes of children who were treated by a CBT+  trained clinician during their participation in the Washington State CBT+  Initiative. Given previous studies showing improvements in clinician self-reported skill following CBT+  participation (Dorsey et al., 2016) and other research on the effectiveness of flexible, multi-problem treatment approaches on child symptoms (e.g., Chorpita et al., 2005; Kerns et al., 2022; Weisz et al., 2012, 2017), we hypothesized that children treated by CBT+  trained clinicians would demonstrate symptom reduction over time. Additionally, extending the work of Kerns et al. (2022), we sought to examine outcomes for CBT+  with a more clinically severe sample that is more typical for community mental health, and examine if children who received CBT+  showed greater improvements with greater length of time in treatment and/or higher number of treatment sessions.

Method

CBT+ Initiative

The CBT+ Initiative was inspired by MATCH-ADTC and other multi-problem or transdiagnostic interventions targeting the most common mental health conditions in an integrated fashion (Weisz et al., 2012). The CBT+ Initiative was funded by the Washington State Division of Behavioral Health and Recovery beginning in 2009, and its goal was to simplify and integrate training and support for clinicians in CMH, enabling them to address the most common mental health issues using training acquired through one integrated program. Until the COVID-19 pandemic, the CBT+ Core Team provided five in-person, three-day trainings each year to child-focused CMH clinicians and clinical supervisors across Washington State. The Core Team was composed of University of Washington-affiliated faculty and staff as well as CBT+-experienced CMH supervisors, who were trained to be co-trainers through a train-the-trainer initiative (Triplett et al., 2020). Following the pandemic, training was delivered remotely, though data for this study are from pre-pandemic years. After training, clinicians participated in six months of twice-monthly, group-based phone or video consultation led by the CBT+ Core Team or CMH-supervisor co-trainers. During the consultation calls, clinicians were supported in applying CBT+ with their clients. To obtain a certificate of completion, clinicians were expected to: complete the TF-CBT web training (a 10-hour online course required before CBT+ training), participate in the three-day CBT+ training, participate in nine of 12 consultation calls, present one or more of their cases on a group consultation call during the six-month consultation period, and document delivery of CBT+ for at least two clients in an online measurement feedback system used by the CBT+ Initiative (EBP Toolkit; described in more detail in the next section). One client’s treatment was required to have a TF-CBT focus (for the clinician to be eligible for TF-CBT national certification) and another client’s treatment could focus on one of the other problem areas (depression, anxiety, or behavioral difficulties). To support adherence to the CBT+ model and examine client response, clinicians were expected to administer standardized symptom measures and document them in the measurement feedback system at baseline and at a follow-up point, as well as document session content for six or more sessions. This project uses data from three years of the CBT+ Initiative: 2016–2019, during which 498 clinicians were trained, and clinical data from their cases were used for analysis.

Client Symptom Measurement Feedback System

CBT+ uses the Evidence-based practice Toolkit (EBP Toolkit; EBP Toolkit, 2022), an online measurement feedback system that enables clinicians to document de-identified clinical data. EBP Toolkit was used to track delivery of treatment elements administered for each client and to record client-reported symptoms across time by inputting standardized assessments for each case, to allow use of measurement-based care to guide treatment. At baseline prior to initiating CBT+  treatment with clients, clinicians administered symptom measures to assess all four potential treatment targets (depression, anxiety, PTS, and behavioral difficulties; see Measures section). Following baseline assessment, clinicians specified the primary mental health condition for the client and the EBP(s) they planned to deliver, typically decided upon from the symptoms they observed using symptom measures (e.g., if a child screened positive for anxiety, they typically selected CBT for anxiety). For cases with multiple targets present (e.g., comorbid anxiety and trauma), clinicians used their clinical judgment to choose the primary treatment focus and EBP, with the CBT+  approach allowing flexibility for comorbidity. Once a treatment approach was selected to address a primary target, the CBT+  Initiative only required clinicians to re-administer a symptom measure assessing that target, to keep clinician and client burden low. However, clinicians were free to re-assess as many targets as they wished to during treatment. To receive the CBT+  certificate and encourage routine symptom monitoring, clinicians were required to assess the target at least twice after the baseline measurement.

Study Inclusion Criteria

For children’s de-identified data to be included in the study, they were required to have: (1) received one of the primary EBPs; (2) a symptom measure associated with the selected EBP at baseline and at a follow-up point at least two weeks post-baseline, termed “post-treatment” from here forward; (3) a minimum of two treatment sessions documented in EBP Toolkit; (4) baseline and post-treatment scores completed by the same responder (i.e., either the child or their caregiver); and (5) an age of 21 years or younger, per the National Institutes of Health (NIH) categorization of youth, given that development continues past age 18, so that all youth are included in the sample (NIH, 2015). We used case-wise deletion for individuals with missing number-of-sessions data (n = 129; 9.5%), as number of sessions was essential to our research questions. There were no exclusions for comorbidity. For the 2016–2019 years, CBT+ trained clinicians entered data into EBP Toolkit for 2,475 children. Of these, 1,219 children met inclusion criteria. The largest share of these 1,219 cases (44%) received TF-CBT.
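The inclusion criteria above can be read as a sequence of filters applied to each case. The following is a minimal sketch of that logic, not the procedure actually used with EBP Toolkit exports. The `CaseRecord` fields, the EBP labels, and the reading of "two weeks" as 14 days are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical labels for the four primary EBPs named in the text.
PRIMARY_EBPS = {"CBT-Anxiety", "CBT-Depression", "TF-CBT", "BMT"}


@dataclass
class CaseRecord:
    """Hypothetical per-case record; field names are illustrative only."""
    ebp: str
    age: int
    baseline_date: Optional[date]
    followup_date: Optional[date]
    baseline_responder: Optional[str]
    followup_responder: Optional[str]
    n_sessions: Optional[int]  # None models missing session counts


def meets_inclusion(case: CaseRecord) -> bool:
    """Apply criteria (1)-(5) plus the missing-sessions exclusion."""
    if case.ebp not in PRIMARY_EBPS:                       # criterion 1
        return False
    if case.baseline_date is None or case.followup_date is None:
        return False                                       # criterion 2: both measures exist
    if (case.followup_date - case.baseline_date).days < 14:
        return False                                       # criterion 2: assumed 14-day minimum
    if case.n_sessions is None or case.n_sessions < 2:
        return False                                       # criterion 3 and case-wise deletion
    if case.baseline_responder != case.followup_responder:
        return False                                       # criterion 4: same responder
    return case.age <= 21                                  # criterion 5
```

Applying `meets_inclusion` to every record and keeping the cases that return `True` would mirror the reduction from all entered cases to the analytic sample, under the assumptions stated above.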

Measures

For most measures, both child and caregiver reports were available. For some targets (depression, anxiety, PTS, and behavioral difficulties), multiple measures were available to administer, allowing flexibility based on client’s age, need, and clinician’s preference. For example, clinicians could use either the Short Mood and Feelings Questionnaire (SMFQ) or the Patient Health Questionnaire-9 (PHQ-9) to measure symptoms of depression. Measures are organized by primary clinical target below.

Trauma Exposure and Posttraumatic Stress Symptoms

The Child and Adolescent Trauma Screen (CATS; Sachser et al., 2017) was used to measure trauma symptoms for those with a clinical target of PTS. This measure consisted of two sections: (a) a traumatic event exposure screening and (b) a PTS symptomology section. The exposure screening section consisted of a 15-item survey with dichotomous yes or no answer choices, completed by either the caregiver or the child, to assess whether the child had experienced any of a list of significant traumatic events. The symptomology section varied in length depending on the child’s age: 16 items for children between the ages of three and six, and 20 items for children between the ages of seven and 17. The respondent (child or caregiver) answered using a four-point Likert scale (0 = never to 3 = almost always). The cutoff score that warranted treatment was 15 or higher for children between the ages of seven and 17, and 12 or higher for children between the ages of three and six. The symptomology section of the CATS has been demonstrated to have good to excellent reliability, medium to strong correlations with measures of depression and anxiety, and low to medium correlations with externalizing symptoms (Sachser et al., 2017).

Anxiety Symptoms

The Screen for Child Anxiety Related Emotional Disorders (SCARED; Birmaher et al., 1997) Brief Assessment was used to screen for anxiety symptoms in children between the ages of eight and 17 with a clinical target of anxiety. The SCARED was completed by the child and consisted of five statements rated on a three-point Likert scale (0 = not true or hardly ever true, 2 = true or often true). Scores totaling three or higher suggested a positive screen for anxiety symptoms. The SCARED brief measure has demonstrated good internal consistency, test–retest reliability, discriminant validity between children with anxiety versus non-anxiety disorders, and moderate parent–child agreement (Birmaher et al., 1997, 1999).

The General Anxiety Disorder 7 (GAD-7; Spitzer et al., 2006) was a seven-item child self-report anxiety questionnaire that screened for and measured Generalized Anxiety Disorder (GAD) in children 12 years of age and above with a clinical target of anxiety. The seven items were rated on a four-point Likert scale (0 = not at all, 3 = nearly every day). Item scores were summed, and a total of 10 or higher indicated a positive screen for anxiety symptoms. The GAD-7 has been shown to have excellent internal consistency, good test–retest reliability, and good procedural validity (Spitzer et al., 2006).

Depression Symptoms

The Short Mood and Feelings Questionnaire (SMFQ; Angold et al., 1995) was a child self-report measure used to screen for depression among children between the ages of eight and 17 with a clinical target of depression. The SMFQ contained 13 items that gauged how the child had been feeling and acting during the past two weeks. Each item was rated on a three-point Likert scale (0 = not true, 2 = true). A positive screen for depressive symptoms was met if the sum of the 13 item scores was 11 or higher. The SMFQ has acceptable internal reliability and discriminant validity (Angold et al., 1995) as well as content validity and satisfactory sensitivity to change (Thabrew et al., 2018).

The Patient Health Questionnaire-9 (PHQ-9; Kroenke et al., 2001) was a child self-report measure used to assess depression symptomology and severity over the past two weeks in children above the age of 12 with a clinical target of depression. The questionnaire consisted of nine items rated on a four-point Likert scale (0 = not at all, 3 = nearly every day). Children screened positive for depressive symptoms if the sum of the nine item scores was above 10. The PHQ-9 has been shown to have good internal reliability in a primary care clinic as well as in an obstetrics and gynecology clinic (Kroenke et al., 2001). It has also been shown to have acceptable test–retest reliability and internal consistency in adolescents (Tsai et al., 2014).

Behavioral Difficulties Symptoms

The Pediatric Symptom Checklist-17 (PSC-17; Gardner et al., 1999) measured caregivers’ perceptions of their children aged four to 17 with a clinical target of behavioral difficulties, using three subscales: externalizing, internalizing, and attention problems. For our study, we only utilized the externalizing and attention subscales because internalizing symptoms were captured in other clinical measures. The PSC-17 consisted of 17 items rated on a three-point Likert scale (0 = never, 2 = often). The child’s scores met the clinical cutoff if they had a score of seven or greater on any of the subscales. The PSC-17 has been shown to have high internal consistency for each subscale (Gardner et al., 1999).
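For readers who want the screening rules in one place, the cutoffs described in this Measures section can be collected into a small lookup. This is only an illustrative sketch: the dictionary keys and the function name are hypothetical labels, and the thresholds and comparison directions are transcribed from the descriptions above (including the "above 10" wording for the PHQ-9).

```python
import operator

# (threshold, comparison) for each measure, as described in the text.
# Keys are hypothetical labels chosen for this sketch.
CUTOFFS = {
    "CATS_7_17": (15, operator.ge),            # 15 or higher
    "CATS_3_6": (12, operator.ge),             # 12 or higher
    "SCARED_brief": (3, operator.ge),          # three or higher
    "GAD7": (10, operator.ge),                 # 10 or higher
    "SMFQ": (11, operator.ge),                 # 11 or higher
    "PHQ9": (10, operator.gt),                 # described as "above 10"
    "PSC17_externalizing": (7, operator.ge),   # seven or greater
    "PSC17_attention": (7, operator.ge),       # seven or greater
}


def screens_positive(measure: str, total_score: float) -> bool:
    """Return True if the total score meets the cutoff for that measure."""
    threshold, compare = CUTOFFS[measure]
    return compare(total_score, threshold)
```

The same lookup can be applied to group means, for example to check whether a mean score falls above or below a measure's cutoff.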

Treatment Characteristics

The research team extracted de-identified demographic information (e.g., age), primary diagnosis, clinical target, administered EBP, and all standardized assessments (including date of administration) for both baseline and post-treatment time points using EBP Toolkit. CJRN, AV, and VG extracted the total number of sessions documented by trained clinicians between baseline and post-treatment for each child meeting inclusion criteria.

Analytic Approach

We used IBM SPSS version 19 to calculate descriptive and inferential statistics for all study variables. We grouped children by their primary clinical target so that no child was included in more than one group. It was possible for a child to have multiple analyzed measures for their primary clinical target. For example, a child whose primary EBP was CBT for anxiety could have analyzed data for both the SCARED and the GAD-7 if the clinician administered both measures, provided the study inclusion criteria were met for each measure (i.e., a symptom measure at baseline and at a follow-up point at least two weeks post-baseline). However, if a child’s primary EBP was CBT for anxiety and the child had measures assessing both anxiety and trauma, only the anxiety measures were analyzed, because the primary EBP was the basis for grouping. We conducted nine paired sample t-tests, one for each measure within each EBP, to compare mean differences between baseline and post-treatment for children receiving each of the four EBPs. We also calculated Cohen’s d to evaluate the magnitude of change between baseline and post-treatment scores. The assumption of normality was assessed prior to conducting the paired sample t-tests, and results suggested the data were normally distributed. We also performed simple linear regressions to investigate whether children who received CBT+ treatment showed greater improvements with increased length of time between symptom measures (i.e., time in treatment in days) as well as increased number of sessions. These models controlled for baseline severity using pre-treatment scores as a covariate. Time between baseline and post-treatment was highly skewed. Therefore, we log-transformed the days between symptom measures. This transformation reduces skew and helps the assumptions of linear regression to be met more readily. Our significance level (α) for both the paired sample t-tests and the linear regression analyses was set at 0.05.
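To make the paired comparison concrete, the sketch below computes a paired-samples t statistic and a paired effect size from raw baseline and post-treatment scores using only the Python standard library. It is an illustration, not the SPSS procedure used in the study. It assumes one common paired-samples convention for Cohen's d (mean difference divided by the standard deviation of the differences, which equals t divided by the square root of n); the text does not state which variant of d was computed. The function name is hypothetical, and a p-value would require the t distribution (for example from a statistics package), which is omitted here.

```python
import math
from statistics import mean, stdev


def paired_t_and_d(baseline, followup):
    """Return (t statistic, degrees of freedom, paired Cohen's d).

    Differences are baseline minus follow-up, so positive values
    indicate symptom reduction.
    """
    if len(baseline) != len(followup) or len(baseline) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - f for b, f in zip(baseline, followup)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("differences have zero variance")
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    d_paired = mean_diff / sd_diff  # assumed convention; equals t / sqrt(n)
    return t_stat, n - 1, d_paired


# Made-up scores for five hypothetical children, for illustration only.
t_stat, df, d = paired_t_and_d([12, 15, 9, 14, 10], [8, 9, 7, 10, 6])
```

Under this convention the effect size depends on the variability of individual change scores rather than on the baseline standard deviation, so it can differ from a d computed with a pooled or baseline SD.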

Results

Child Participants

Missing symptom measures reduced our sample size from 2,475 children who had baseline data to 1,219 children who had both baseline and post-treatment data entered into EBP Toolkit and met our inclusion criteria. We performed an attrition analysis for age, race, and gender, comparing the whole sample (2,475 children, including our study sample) to the sample that met our inclusion criteria. This analysis did not reveal differences. A little over half of the included cases were female (58%) and white (57%), and about one-fourth lived only with their biological mother (24%; see Table 1 for demographic characteristics by EBP received). Participants’ mean number of days between baseline and post-treatment assessments was 123 (SD = 83) and mean number of sessions was 11 (SD = 5.3) (see Table 2).

Table 1 Demographics of study sample by Evidence Based Practice received (N = 1219)
Table 2 Number of sessions and time from baseline to post-treatment by evidence-based practices (EBPs) and symptom measure

CBT-Anxiety

GAD-7: One hundred thirty-two children who received CBT for anxiety were assessed at baseline and post-treatment using the GAD-7 child report. There was a significant decrease in child-reported anxiety-related symptoms on the GAD-7 from baseline (M = 12.86, SD = 4.70) to post-treatment (M = 7.97, SD = 5.03); t(131) = 10.76, p < 0.001, 95% CI [3.98, 5.76], with a large effect size (ES) of 0.94. The mean score at baseline was above the GAD-7’s specified clinical cutoff for anxiety-related symptoms, and the mean score at post-treatment was below the GAD-7’s specified clinical cutoff for anxiety-related symptoms. Neither elapsed time between baseline and post-treatment nor number of sessions was significantly associated with post-treatment scores (see Table 3).

Table 3 Linear regression for number of session and log-transformed time variables on clinical outcomes by evidence-based practices (EBPs)

SCARED: One hundred thirty-one children who received CBT for anxiety were assessed at baseline and post-treatment using the SCARED child report. There was a significant decrease in child-reported anxiety-related symptoms on the SCARED from baseline (M = 4.85, SD = 2.30) to post-treatment (M = 3.21, SD = 2.14); t(130) = 7.21, p < 0.001, 95% CI [1.20, 2.10], with a medium ES of 0.63. The mean scores at baseline and post-treatment were above the SCARED’s specified clinical cut off for anxiety-related symptoms. Neither elapsed time between baseline and post-treatment nor number of sessions was significantly associated with post-treatment scores (see Table 3).

TF-CBT

Caregiver-Report of PTS Symptoms

CATS (ages 3–6) caregiver report: Twenty-two children who received TF-CBT were assessed at baseline and post-treatment using the CATS (ages 3–6) caregiver report. There was a significant decrease in caregiver-reported trauma-related symptoms on the CATS (ages 3–6) caregiver report from baseline (M = 40.27, SD = 20.30) to post-treatment (M = 27.91, SD = 17.97); t(21) = 3.64, p = 0.002, 95% CI [5.29, 19.44], with a medium ES of 0.78. The mean score at baseline and post-treatment were above the CATS (ages 3–6) caregiver report’s specified clinical cut off for trauma-related symptoms. Log-transformed time between measures was significantly associated with post-treatment scores. For each 10% increase in children’s days in treatment, their caregiver reported PTS symptoms reduced by 0.02 points. Number of sessions was not significantly associated with post-treatment scores (see Table 3).
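Because time between measures entered the regression on a log scale, its coefficient is naturally read in terms of proportional changes in days. The sketch below shows the arithmetic behind a statement of the form "for each 10% increase in days, scores change by X points". It assumes a natural-log transformation, which the text does not specify, and the coefficient used in the example is a made-up value, not one from Table 3.

```python
import math


def change_per_pct_increase(beta_log_days: float, pct_increase: float) -> float:
    """Predicted change in post-treatment score for a given percent
    increase in days, in a model containing beta_log_days * ln(days).

    Increasing days by p percent adds ln(1 + p/100) to ln(days), so the
    prediction shifts by beta_log_days * ln(1 + p/100).
    """
    if pct_increase <= -100:
        raise ValueError("percent increase must be greater than -100")
    return beta_log_days * math.log(1.0 + pct_increase / 100.0)


# Hypothetical coefficient of -5.0 on ln(days), for illustration only.
example_change = change_per_pct_increase(-5.0, 10)
```

With a negative coefficient the result is negative, meaning lower predicted symptom scores as days in treatment increase, holding the other predictors constant.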

CATS (ages 7–17) caregiver report: One hundred seventeen children who received TF-CBT were assessed at baseline and post-treatment using the CATS (ages 7–17) caregiver report. There was a significant decrease in caregiver-reported trauma-related symptoms on the CATS (ages 7–17) caregiver report from baseline (M = 57.35, SD = 24.72) to post-treatment (M = 42.58, SD = 25.46); t(116) = 5.38, p < 0.001, 95% CI [9.33, 20.21], with a medium ES of 0.50. The mean scores at baseline and post-treatment were above the CATS (ages 7–17) caregiver report’s specified clinical cut off for trauma-related symptoms. Neither elapsed time between baseline and post-treatment nor number of sessions was significantly associated with post-treatment scores (see Table 3).

Child-Report of PTS Symptoms

CATS (ages 7–17) child report: Three hundred ninety-seven children who received TF-CBT were assessed at baseline and post-treatment using the CATS (ages 7–17) child report. There was a significant decrease in child-reported trauma-related symptoms on the CATS (ages 7–17) child report from baseline (M = 64.37, SD = 24.65) to the post-treatment (M = 42.44, SD = 26.99); t(396) = 15.88, p < 0.001, 95% CI [19.21, 24.65], with a large ES of 0.80. The mean scores at baseline and post-treatment were above the CATS (ages 7–17) child report specified clinical cut off for trauma-related symptoms. Neither elapsed time between baseline and post-treatment nor number of sessions was significantly associated with post-treatment scores (see Table 3).

CBT-Depression

SMFQ: One hundred six children who received CBT for depression were assessed at baseline and post-treatment using the SMFQ child report. There was a significant decrease in child-reported depression-related symptoms on the SMFQ from baseline (M = 14.02, SD = 6.19) to post-treatment (M = 7.77, SD = 5.84); t(105) = 9.69, p < 0.001, 95% CI [4.97, 7.52], with a large ES of 0.94. The mean score at baseline was above the SMFQ’s specified clinical cutoff for depression-related symptoms. The mean score at post-treatment was below the SMFQ’s specified clinical cutoff for depression-related symptoms. Neither elapsed time between baseline and post-treatment nor number of sessions was significantly associated with post-treatment scores (see Table 3).

PHQ-9: One hundred seventy-nine children who received CBT for depression were assessed at baseline and post-treatment using the PHQ-9 child report. There was a significant decrease in child-reported depression-related symptoms on the PHQ-9 from baseline (M = 15.12, SD = 6.18) to post-treatment (M = 9.97, SD = 6.16); t(178) = 9.95, p < 0.001, 95% CI [4.13, 6.18], with a medium ES of 0.74. The mean score at baseline was above the PHQ-9’s specified clinical cutoff for depression-related symptoms. The mean score at post-treatment was below the PHQ-9’s specified clinical cutoff for depression-related symptoms. Neither elapsed time between baseline and post-treatment nor number of sessions was significantly associated with post-treatment scores (see Table 3).

BMT

PSC-17 Externalizing: One hundred twenty children who received BMT for behavioral difficulties were assessed at baseline and post-treatment using the PSC-17 externalizing subscale caregiver report. There was a significant decrease in caregiver-reported behavior-related symptoms on the PSC-17 externalizing subscale from baseline (M = 9.33, SD = 2.92) to post-treatment (M = 7.15, SD = 3.36); t(119) = 7.71, p < 0.001, 95% CI [1.62, 2.74], with a medium ES of 0.70. The mean scores at baseline and post-treatment were above the PSC-17 externalizing subscale’s specified clinical cut off for behavioral problem-related symptoms. Neither elapsed time between baseline and post-treatment nor number of sessions was significantly associated with post-treatment scores (see Table 3).

PSC-17 Attention: One hundred twenty children who received BMT for behavioral difficulties were assessed at baseline and post-treatment using the PSC-17 attention subscale caregiver report. There was a significant decrease in caregiver-reported behavior-related symptoms on the PSC-17 attention subscale from baseline (M = 6.67, SD = 2.43) to post-treatment (M = 5.78, SD = 2.50); t(119) = 4.18, p < 0.001, 95% CI [0.47, 1.30], with a small ES of 0.38. The mean scores at baseline and post-treatment were below the PSC-17 attention subscale’s specified clinical cutoff for attention-related symptoms. Number of sessions was significantly associated with post-treatment scores. For each 10% increase in children’s number of sessions in treatment, behavioral symptoms increased by 0.01 points. Elapsed time between baseline and post-treatment was not significantly associated with post-treatment scores (see Table 3).

Discussion

This study was the first to examine symptom change over time for children who received treatment as part of routine care in the Washington State CBT+  Initiative, in which clinicians received a flexible training for depression, anxiety, PTS, and behavioral difficulties. From both child and caregiver reports, children receiving treatment by CBT+  trained clinicians demonstrated a statistically significant reduction in mean scores on symptom measures for child’s primary clinical problem after an average of 11 treatment sessions and 123 days (about four months) between their baseline and post-treatment assessments. Children’s symptoms significantly decreased between the two assessment points, and effect sizes were mostly in the medium to large range. For the majority of our symptom measures, neither increased length of time nor number of treatment sessions between administered symptom measures were related to greater symptom improvement for children.

To our knowledge, this is one of few studies to examine child clinical outcomes for flexible training approaches in CMH routine care, outside the context of a funded research study. The majority of children in the sample had high clinical severity at the first assessment point and were well above clinical cutoff scores that indicate need for treatment. As such, it is not altogether surprising that post-treatment scores also remained above the clinical cutoff. For PTS, children’s baseline scores were approximately 2.35–3.29 times above the clinical cutoff. These scores were particularly high in contrast to the only other study evaluating the CBT+ training approach in children involved in the child welfare systems (Kerns et al., 2022). Considering that baseline scores were substantially higher than those in Kerns et al. (2022), we would expect children to also have higher post-treatment scores: meaningful clinical improvement may not always produce reductions to below clinical cutoffs, and more severe symptoms may be associated with less favorable symptom outcomes (Compton et al., 2014). Yet, for nearly half of our measures (4 of 9), mean scores at post-treatment were below the specified clinical cutoff (see Figs. 1, 2, 3 and 4). Further, in comparison to a randomized controlled trial of MATCH-ADTC evaluating common elements approaches in children with similar clinical targets (Weisz et al., 2012), the effect sizes from our study were comparable to, and sometimes greater than, those found in MATCH-ADTC. For all but one symptom measure, PSC-17 (attention), effect sizes in this sample were in the medium to large range. Children mostly demonstrated a 13% to 45% reduction of symptoms, with child-report measures appearing to have the greatest reductions. Even when children remained above the clinical cutoff, these substantial changes in symptomology could facilitate meaningful improvements in their functioning (Krause et al., 2021). 
We found the greatest symptom reductions for anxiety and depression, followed by PTS (particularly when looking at child reports). In their review of 50 years of child treatment research, Weisz et al. (2017) found that treatments for anxiety had the strongest mean effect sizes, while treatments for depression showed the lowest mean effects. Thus, our findings for anxiety and PTS are relatively in line with their review. While different from Weisz et al.’s (2017) findings, our results for depression are promising, particularly because the depression measures were among those with medium or large effect sizes and mean post-treatment scores below the cutoff.
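The percentage reductions discussed above can be approximated from the group means reported in the Results section. The sketch below assumes the percentages are relative changes in mean scores, (baseline − post-treatment) / baseline, which the text does not state explicitly. The measure labels are shorthand chosen for this example.

```python
# Baseline and post-treatment means as reported in the Results section.
REPORTED_MEANS = {
    "GAD-7": (12.86, 7.97),
    "SCARED": (4.85, 3.21),
    "CATS 3-6 caregiver": (40.27, 27.91),
    "CATS 7-17 caregiver": (57.35, 42.58),
    "CATS 7-17 child": (64.37, 42.44),
    "SMFQ": (14.02, 7.77),
    "PHQ-9": (15.12, 9.97),
    "PSC-17 externalizing": (9.33, 7.15),
    "PSC-17 attention": (6.67, 5.78),
}


def percent_reduction(baseline_mean: float, post_mean: float) -> float:
    """Relative reduction in the mean score, as a percentage of baseline."""
    return 100.0 * (baseline_mean - post_mean) / baseline_mean


REDUCTIONS = {
    name: percent_reduction(baseline, post)
    for name, (baseline, post) in REPORTED_MEANS.items()
}

smallest = min(REDUCTIONS, key=REDUCTIONS.get)  # measure with least change
largest = max(REDUCTIONS, key=REDUCTIONS.get)   # measure with most change
```

Under this assumption the reductions span roughly 13% (PSC-17 attention) to 45% (SMFQ), consistent with the range described in the text.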

Fig. 1

Mean assessment scores at baseline and post-treatment for children receiving cognitive behavioral therapy-anxiety (CBT-Anxiety). Note: Clinical cut offs are indicated by the dotted red lines. Error bars represent standard deviations. GAD-7 indicates General Anxiety Disorder 7; SCARED, Screen for Child Anxiety Related Emotional Disorders

Fig. 2

Mean assessment scores at baseline and post-treatment for children receiving trauma-focused cognitive behavioral therapy (TF-CBT). Note: Clinical cutoffs are indicated by the dotted red lines. Error bars represent standard deviations. CATS indicates Child and Adolescent Trauma Screen

Fig. 3

Mean assessment scores at baseline and post-treatment for children receiving cognitive behavioral therapy-depression (CBT-Depression). Note: Clinical cutoffs are indicated by the dotted red lines. Error bars represent standard deviations. SMFQ indicates Short Mood and Feelings Questionnaire; PHQ-9, Patient Health Questionnaire-9

Fig. 4

Mean assessment scores at baseline and post-treatment for children receiving behavioral management training (BMT). Note: Clinical cutoffs are indicated by the dotted red lines. Error bars represent standard deviations. PSC-17 indicates Pediatric Symptom Checklist-17

As research on the CBT+ training approach grows, with several studies now supporting the approach (Dorsey et al., 2016; Kerns et al., 2022), it is important to examine potential moderators of its effects, including time in treatment and number of sessions delivered in routine implementation. Overall, for seven out of nine measures, children did not show greater improvements with increased length of time or increased number of sessions between assessment measures, which is in line with findings from the services literature that indicate children with greater mental health needs often need more sessions or services (Costello, 2016). While our findings indicate that children did show significant improvement on symptom measures, it is possible that we did not see meaningful effects of time in treatment because this variable does not necessarily denote active treatment, but instead reflects total time elapsed from the child’s first session to their last recorded session. For example, a child who was in treatment for three months with only five sessions would be treated the same as a child in treatment for three months with 12 sessions. Not seeing an association between symptom improvement and number of sessions was somewhat surprising, especially given the proportion of change we saw within symptom measures. However, children with greater need for treatment (i.e., those with more clinical severity) likely receive more sessions, obscuring any relation between more treatment sessions and treatment benefit. Alternatively, in a study examining early treatment response in children receiving TF-CBT, Wamser-Nanney et al. (2016) found that approximately 40% of children received all the benefits they ever would, regarding PTS symptoms, after four sessions. In other words, number of sessions may not always be relevant to improvement or clinical outcomes. Further, it is possible that some children may not experience improvements with CBT. Warren et al. (2010) compared child symptom change when receiving psychotherapy in CMH and managed care settings and found that in a CMH setting, 56% of cases showed a significant increase or no significant change in symptoms over time. It may also be possible that the children who were being seen by CBT+ trained clinicians remained in treatment beyond the post-treatment time point and potentially improved over time after assessment data stopped being entered into EBP Toolkit. In summary, we did not see an effect of length of time or number of sessions on treatment outcomes; future research should evaluate other potential moderators of treatment effectiveness in the CBT+ Initiative.
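The distinction drawn above between elapsed time in treatment and the amount of treatment actually delivered can be made concrete with a small sketch. The `Episode` class, its field names, the dates, and the sessions-per-30-days measure are all hypothetical and introduced only for illustration; they are not variables from EBP Toolkit or from the analyses reported here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Episode:
    """One child's treatment episode (hypothetical structure)."""
    first_session: date
    last_session: date
    n_sessions: int

    @property
    def elapsed_days(self) -> int:
        # Total time from first to last recorded session, regardless of
        # how many sessions took place in between.
        return (self.last_session - self.first_session).days

    @property
    def sessions_per_30_days(self) -> float:
        # A simple density measure; guard against a zero-day episode.
        return 30.0 * self.n_sessions / max(self.elapsed_days, 1)


# Two hypothetical children, both in treatment for about three months.
child_a = Episode(date(2018, 1, 8), date(2018, 4, 9), n_sessions=5)
child_b = Episode(date(2018, 1, 8), date(2018, 4, 9), n_sessions=12)

for label, ep in (("A", child_a), ("B", child_b)):
    print(f"Child {label}: {ep.elapsed_days} days elapsed, "
          f"{ep.sessions_per_30_days:.1f} sessions per 30 days")
```

Both hypothetical children have identical elapsed time, so a time-in-treatment variable defined this way cannot distinguish them even though one received more than twice as many sessions.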

The current study had several strengths. First, the data were drawn from a real-world sample of children that included children with comorbidities, and observations were collected in geographically diverse CMH sites across Washington State. Second, the study was conducted in the context of the CBT+ Initiative, which is an academic-community partnership that has been fully funded by the State. This gave us a unique opportunity to examine changes in clinical outcomes for children receiving EBPs in service settings from clinicians trained through routine, state-funded training approaches. While highly controlled clinical trials provide more accurate estimates regarding intervention effects, generalizability to community settings is limited due to restrictive inclusion criteria and the extra resources that clinical trials bring. Study criteria can screen out individuals with comorbidities that are often the norm in community settings (e.g., CMH; Merikangas et al., 2010; Stuart et al., 2016). Further, many clinical trials study EBPs delivered by highly trained doctoral-level mental health clinicians (Dorsey et al., 2017), which is less reflective of the range of providers in CMH. Our focus on a State-funded initiative, with clinicians and their clients in CMH settings, along with providing clinicians with options to use standardized measures based on children’s age, need, and clinicians’ preferences, may make our findings more generalizable. Another strength of this study was the online application, EBP Toolkit, which was a free and effective way to examine clinical outcomes in a large sample using routinely collected, de-identified data that include a more diverse client population; limited sample diversity is often an obstacle in randomized controlled trials (Blonde et al., 2018).

Limitations

While this study had many strengths, some notable limitations regarding data collection and analysis should be considered. First and most importantly, the current study had no control group. As such, we cannot be sure whether symptom change over time is due to receipt of treatment from the CBT+ trained clinicians or whether children simply improved due to the passage of time. The data were entered into the online EBP Toolkit measurement feedback system by the CBT+ trained clinicians. In this real-world setting, there was no mechanism to ensure quality control or completeness of data entry, which is a common limitation of evaluation data from community settings in comparison to data from studies performed within labs where researchers can monitor and correct for data entry errors (Blonde et al., 2018). Additionally, it is possible that the children remained in treatment beyond the post-treatment time point and that symptoms continued to improve or deteriorate, even after assessment data were no longer entered into EBP Toolkit. Further, several cases had missing data not only on demographic information but also on child and caregiver symptom measures. Given that cases with missing data typically lacked both demographic information and symptom measures, there were no variables on which we could reliably assess whether children who had missing data systematically differed from those who did not. While we performed an attrition analysis for the individuals who had demographic data in EBP Toolkit, comparing the whole sample to the final sample included in our study, it is important to note that this analysis was not perfect because the whole sample included the children in the final sample we analyzed for this paper. However, this analysis gives us a broad indication that the individuals who were included did not differ from those who were screened out of the analysis.
Finally, these data are from the years when CBT+ training was in-person (pre-pandemic; 2016–2019). Beginning in 2020, training was performed remotely, and materials were adapted given the effects of the pandemic. Thus, these findings may not generalize to the current online mode of CBT+ training, which began in 2020. However, many other elements of the CBT+ training remain the same (e.g., phone/video follow-up consultation, use of EBP Toolkit as a measurement feedback system). Future evaluations should compare current findings to those obtained after the online training began.

Conclusion

Based on prior studies, the CBT+ Initiative has shown promise in effectively training CMH clinicians across Washington State in four focal EBPs during one integrated training that emphasizes flexible delivery to target comorbidities. Little research has been done to evaluate flexible treatment approaches in CMH settings. Results of the current study suggest that children benefited from treatment by a clinician trained through the CBT+ Initiative, an approach that equips clinicians with the flexibility to address the mental health needs of children presenting with the most common mental health conditions and comorbidities in real-world CMH settings.