Introduction

Major depressive disorder is the most common mood disorder, with a lifetime prevalence rate of 6–25% in international community studies (Kessler et al. 2014; Kessler and Wang 2009). In the US National Comorbidity Survey Replication, the lifetime prevalence of major depression was 16.6% (Kessler et al. 2005). Depression is a significant, and often relapsing, mental health problem, which may have a large impact on a person’s quality of life and wellbeing (Klein and Allmann 2014). Depression, through the burden of care it imposes, can also significantly affect families and wider society (Murray et al. 2012).

Within a stepped-care model of service provision, common in Ireland and the UK (NICE 2009, 2012; Twomey and Byrne 2012), most people with depression are treated at primary care level by general practitioners. Here, depressed adults are offered antidepressant medication and/or self-help interventions such as bibliotherapy, and are referred to community-based support groups such as AWARE (http://www.aware.ie). If these interventions are insufficient to alleviate depressive symptoms, referral to secondary mental health facilities may occur, where specialist pharmacological and psychological services for depression are provided.

Cognitive behavior therapy (CBT) is widely recognized as one of the best supported evidence-based psychological interventions for depression in adulthood (NICE 2009). The UK NICE (2009) guidelines propose that group CBT (gCBT) occupies a distinctive position within a stepped care model of service provision for adults with depression. It bridges the divide between low intensity psychosocial interventions (such as guided self-help, bibliotherapy, and computerized CBT) and high intensity psychosocial interventions, such as individual psychotherapy.

Originally developed as an individual approach to psychotherapy, in recent years CBT has been offered in group settings to increase availability and reduce costs. In a review of 36 treatment studies of varying disorders in a range of populations, Tucker and Oei (2007) concluded that gCBT can be both cost-effective and time-efficient. In addition to addressing availability and cost issues, gCBT may also offer service-users support and normalization for their dysphoric experiences.

Previous meta-analyses point to the potential value of gCBT for depression. In a meta-analysis of 32 randomized controlled trials (RCTs) of gCBT compared with a range of comparison and control groups, Feng et al. (2012) found a small to moderate effect size immediately after treatment and at 6-month follow-up, but the effect size at later follow-up occasions was not significant. In a meta-analysis of 34 non-randomized effectiveness studies of outpatient individual and group CBT for depression, Hans and Hiller (2013) found that gCBT participants attended about half the number of sessions that were needed in individual CBT and had comparable results on most measures. Cuijpers et al. (2008) and Huntley et al. (2012) conducted meta-analyses of RCTs comparing group and individually based therapies for depression. Each of these meta-analyses included studies of gCBT. Both showed that group treatments were less effective than individual therapy when outcome was assessed immediately after treatment, but these differences were no longer present at follow-up. Huntley et al. (2012) found no difference in outcome at short, medium and long-term follow-up for group and individual CBT. They also found that gCBT was more effective than treatment as usual (TAU).

The present meta-analysis aimed to investigate the effects of gCBT as a treatment for depression in recent controlled trials, conducted since 2000, in which the first or second edition of the Beck Depression Inventory (Beck et al. 1961, 1996) was used to assess outcome. Many studies in previous meta-analyses are quite old (e.g., Hans and Hiller 2013) and so may not reflect contemporary gCBT practice. Other meta-analyses include studies that use a wide range of assessment instruments (e.g., Feng et al. 2012), which introduces measurement-method variability into outcome data. In addition, this study aimed to assess the impact of gCBT on depressive cognition and quality of life. Change in depressive cognition is a central treatment goal for CBT (Beck et al. 1979). Quality of life (as opposed to symptom change) is a key feature of third wave cognitive psychotherapies and therefore is an important factor to consider in the evaluation of gCBT.

Method

A protocol based on Cochrane guidelines (Higgins and Green 2011) and the PRISMA statement (Moher et al. 2009) was used for this meta-analysis.

Search Strategy

Four databases (PsychInfo, EMBASE, PubMed and Cochrane) were searched for articles describing gCBT evaluation studies published in English language peer reviewed journals between 2000 and 2016 using the following search terms: (group AND depression) AND (cognitive behavi* therapy OR CBT OR GCBT OR CBGT) AND (BDI OR Beck Depression Inventory). In total, 2996 articles were found. Reference lists of systematic reviews identified in the electronic database search were examined but did not lead to the identification of further potentially eligible studies.

Inclusion and Exclusion Criteria

Studies were included if they (1) evaluated outpatient gCBT for depression within the context of RCTs or non-RCTs using the BDI-I or BDI-II as an outcome measure; (2) contained participants who were adults aged 18–65 years of normal intelligence with a primary diagnosis of depression made by a psychiatrist or with a structured clinical interview that yielded a DSM-IV or 5 (American Psychiatric Association (APA) 2000, 2013) or ICD-10 (1992) diagnosis of depression; and (3) were published in English language academic journals between 2000 and 2016. Studies which did not meet these design, participant, or publication criteria were excluded. gCBT was defined as group CBT programs involving behavioral activation and cognitive restructuring as the core elements and excluding mindfulness based CBT.

Data Extraction

Data were extracted from articles using a coding system covering study characteristics, participant characteristics, and means, standard deviations and effect sizes (if reported) on measures of depressed mood, depressive cognition and quality of life. Two members of the research team were involved in data extraction. Where there was ambiguity about how to code specific items, these were discussed and a consensus was reached.

Assessment of Risk of Bias

The quality of studies was assessed using the Cochrane Collaboration’s domain-based evaluation tool for assessing risk of bias (Higgins and Altman 2008). Where there was ambiguity about how to rate specific items two members of the research team discussed the item and a consensus rating was made. In addition, funnel plots were drawn using study outcome data on depressed mood, depressive cognition and quality of life.

Data Analysis

The Comprehensive Meta-Analysis Version 2 (CMA, Borenstein et al. 2005) software package was used for data analyses. Means and standard deviations on measures of depressed mood, depressive cognition and quality of life at post-treatment and follow-up from gCBT and comparison groups were used to compute Cohen’s d effect sizes. In combining effect sizes from sets of studies, they were weighted by the inverse of the variance (Borenstein et al. 2009). Before analyzing effect size data from multiple studies, the Q test for heterogeneity was conducted to determine if there was significant variation between studies, and the I² index was calculated to determine the degree of heterogeneity (Huedo-Medina et al. 2006). Where groups of effect sizes were homogeneous, a fixed effects model was used for data analysis, and a random effects model was used for analyzing data when the Q test was significant (Borenstein et al. 2009). For each set of effect sizes analyzed, a main effects test was conducted to determine if the mean effect size differed from zero. Following Cohen’s (1977) interpretation guidelines, effect sizes of 0.2 were considered small, 0.5 medium, and 0.8 large.
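The steps in this paragraph can be illustrated with a minimal sketch in Python. It is not the CMA implementation; it assumes the standard textbook formulas for Cohen's d and its variance, a DerSimonian-Laird estimate of between-study variance for the random effects model, and a significance level of 0.05 for the Q test (the text does not state the level used). The function names and the sign convention (control minus treatment, so that lower symptom scores after gCBT give a positive d) are illustrative assumptions.

```python
import math
from statistics import NormalDist


def cohens_d(mean_tx, sd_tx, n_tx, mean_ctl, sd_ctl, n_ctl):
    """Cohen's d and its approximate variance from group summaries.

    Assumed sign convention: control minus treatment, so a lower
    symptom score in the treated group yields a positive d.
    """
    pooled_sd = math.sqrt(
        ((n_tx - 1) * sd_tx ** 2 + (n_ctl - 1) * sd_ctl ** 2)
        / (n_tx + n_ctl - 2)
    )
    d = (mean_ctl - mean_tx) / pooled_sd
    var = (n_tx + n_ctl) / (n_tx * n_ctl) + d ** 2 / (2 * (n_tx + n_ctl))
    return d, var


def chi2_sf(x, df):
    """Upper-tail chi-square probability for integer df (closed form)."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)            # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y) = erfc(sqrt(y))
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def pool(effects, variances, alpha=0.05):
    """Inverse-variance pooling with Q, I-squared and model selection."""
    weights = [1.0 / v for v in variances]
    fixed = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - fixed) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    p_q = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    tau2, model = 0.0, "fixed"
    if p_q < alpha:
        # Significant heterogeneity: switch to a random effects model
        # (DerSimonian-Laird estimate of between-study variance).
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        weights = [1.0 / (v + tau2) for v in variances]
        model = "random"

    estimate = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    # Main effects test: does the pooled effect differ from zero?
    p = 2.0 * (1.0 - NormalDist().cdf(abs(estimate / se)))
    return {"model": model, "estimate": estimate, "se": se, "p": p,
            "Q": q, "p_Q": p_q, "I2": i2, "tau2": tau2}
```

For example, `pool([0.5, 0.5, 0.5], [0.1, 0.1, 0.1])` keeps the fixed effects model because Q is zero, whereas widely scattered effects with small variances trigger the random effects branch. The input values here are made up for illustration and are not taken from the included studies.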

Meta-analyses were conducted for three dependent variables: depressed mood, depressive cognition, and quality of life. For each of these dependent variables, data comparing the effects of gCBT with all comparison groups were analyzed first. Then, where data were available, three further analyses were conducted in which the effects of gCBT were compared with waiting list control groups (WLC), treatment as usual (TAU), or well defined alternative treatments (ALT). For each of the three dependent variables, separate analyses were conducted on post-treatment data, follow-up data (where available), and combined post-treatment and follow-up data (where available) or post-treatment data only (if follow-up data were unavailable).

Results

Selection Procedure

A flow chart of the study selection process is shown in Fig. 1. Using the search strategy described above, 2996 records were identified. Of these, 2944 records and 17 duplicates were excluded after reading titles and abstracts. Of the remaining 35, a further 15 were excluded after assessing the full text of articles for eligibility.

Fig. 1 Flow chart of study selection process

Assessment of Publication Bias and Study Quality

In order to minimize publication bias, a comprehensive literature search was conducted using both electronic databases and traditional sources. Two of the ten studies identified (Andersson et al. 2013; Zettle et al. 2011) reported negative effect sizes for gCBT, which suggests an absence of publication bias due to under-reporting of non-significant or negative results. Asymmetry was evident in funnel plots for depressed mood, depressive cognition and quality of life, suggesting the presence of publication bias. However, due to the relatively small number of studies, this finding should be interpreted cautiously.

Quality ratings of studies were made with the Cochrane Collaboration’s domain-based evaluation tool for assessing risk of bias (Higgins and Altman 2008). Total scores for the ten studies in this meta-analysis ranged from 0 to 7, indicating that there was considerable variability in their adherence to Cochrane recommendations for reducing bias. In this group of studies there was a higher risk of selection, performance and detection bias than of attrition or reporting bias. On the negative side, this was due to lack of information on random sequence generation and allocation concealment, and the absence of blinding. On the positive side, it was due to thorough reporting of results and attrition rates.

Study Design Characteristics

Study design characteristics are summarized in Table 1. Studies were carried out in nine different countries and varied in size from N = 18 to 347. Dropout rates varied from 0 to 30%. All studies except one (Zamirinejad et al. 2014) were RCTs. In three studies gCBT was compared with a WLC group (Embling 2002; Wong 2008a, b); in two gCBT was compared with TAU (Chiang et al. 2015; Muktar and Oei 2011); while in the other five, there was an ALT comparison group. The alternative therapies were positive psychotherapy (Asgharipoor et al. 2012), mutual support group therapy (Baker and Neimeyer 2003), group acceptance and commitment therapy (Zettle et al. 2011), group resilience training (Zamirinejad et al. 2014), and guided internet CBT (Andersson et al. 2013). While all studies reported post-treatment outcomes, only four included follow-up results (Chiang et al. 2015; Zettle et al. 2011; Zamirinejad et al. 2014; Andersson et al. 2013).

Table 1 Study design and gCBT treatment programme characteristics

Intervention Characteristics

gCBT treatment program characteristics are given in Table 1. gCBT groups contained 4–9 participants. In six of ten studies therapy was facilitated by mental health professionals with 2–8 years of experience, whereas in two studies graduate students or paraprofessionals along with experienced therapists facilitated treatment. gCBT programs spanned 8–12 sessions. In seven studies sessions were offered on a weekly basis; in two they were offered more frequently; and in one they were offered weekly initially, and then fortnightly. Sessions lasted between 60 and 180 min.

Participants’ Demographic and Clinical Characteristics

Demographic and clinical characteristics of participants are given in Table 2. The ten studies evaluated 923 participants aged between 19 and 64 years. In studies where data were available, average percentages of demographic and clinical characteristics were as follows: 69% of participants were female (k = 9); 48% were married (k = 6); 16% were unemployed (k = 4); 64% had a history of recurrent depression; and 72% were on antidepressant medication (k = 8). Mean scores prior to treatment on the first and second editions of the BDI ranged from 23.9 to 39.90. This indicates that participants were moderately or severely depressed. Moderate depression is indicated by scores of 19–29 on the BDI-I and 20–28 on the BDI-II. Scores of 30–63 on the BDI-I and 29–63 on the BDI-II indicate severe depression.
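The severity bands quoted above can be written as a small lookup. This is a sketch that encodes only the moderate and severe bands given in the text; the function name, the "below moderate" label, and the handling of non-integer mean scores (assigned to the band whose lower bound they have reached) are illustrative assumptions.

```python
# Lower bounds of the bands quoted in the text; both scales run from 0 to 63.
BDI_CUTOFFS = {
    "BDI-I": {"moderate": 19, "severe": 30},
    "BDI-II": {"moderate": 20, "severe": 29},
}


def bdi_severity(score, edition):
    """Classify a BDI total (or group mean) as moderate or severe."""
    if not 0 <= score <= 63:
        raise ValueError("BDI totals range from 0 to 63")
    cutoffs = BDI_CUTOFFS[edition]
    if score >= cutoffs["severe"]:
        return "severe"
    if score >= cutoffs["moderate"]:
        return "moderate"
    # Bands below 'moderate' are not described in the text.
    return "below moderate"
```

Applied to the range of baseline means reported here, `bdi_severity(23.9, "BDI-I")` falls in the moderate band and `bdi_severity(39.90, "BDI-II")` in the severe band.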

Table 2 Participants’ demographic and clinical characteristics

Effect of gCBT on Depressed Mood

Results of analyses of the effect of gCBT on depressed mood are given in Table 3 and Fig. 2. At post-treatment (d = 1.61) and in the analysis of combined post-treatment and follow-up data (d = 1.63), large significant (p < .001) pooled effect sizes were found for gCBT compared with all comparison conditions (WLC, TAU and ALT), indicating that participants who received gCBT fared better than approximately 95% of those in comparison conditions on measures of depressed mood. There were ten studies in each of these analyses, and individual effect sizes ranged from small and negative (d = −0.10) to large and positive (d = 4.86).
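The "fared better than approximately 95%" statement appears consistent with Cohen's U3, the proportion of a normally distributed comparison group that falls below the mean of the treated group, which equals the standard normal cumulative distribution function evaluated at d. The text does not name the conversion used, so the sketch below is an assumption about how such percentages can be derived; the function name is illustrative.

```python
from statistics import NormalDist


def percent_outperformed(d):
    """Cohen's U3 as a percentage, assuming normal distributions
    with equal variance in both groups."""
    return NormalDist().cdf(d) * 100.0


# Pooled effect sizes for depressed mood reported in this paragraph.
for d in (1.61, 1.63):
    print(f"d = {d:.2f}: about {percent_outperformed(d):.0f}%")
```

Both values round to 95%, matching the figure given above. The same conversion can be applied to the other pooled effect sizes reported in this section.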

Table 3 Results of meta-analysis for depressed mood assessed with the Beck Depression Inventory (Beck et al. 1961) and Beck Depression Inventory II (Beck et al. 1996) for gCBT compared with waiting list control (WLC), treatment as usual (TAU) and alternative treatment (ALT) comparison groups at post-treatment, follow-up and combined outcomes at post-treatment and follow-up

Fig. 2 Forest plots for studies of gCBT v WLC + TAU + ALT with effect sizes based on combined post-treatment and follow-up data for depressed mood, depressive cognition, and quality of life

There were depressed mood follow-up data for four studies, in which individual effect sizes ranged from medium and negative (d = −0.73) to large and positive (d = 4.60), and the pooled effect size (d = 1.47, p < .001) was large and significant. In three studies gCBT was compared with an ALT treatment and in one with TAU. Follow-up periods ranged from 2 months to 3 years. These results indicate that at follow-up participants who received gCBT fared better than approximately 93% of those in ALT and TAU comparison conditions on measures of depressed mood. However, this finding requires cautious interpretation due to the large variability in results of studies in this analysis.

In three studies the effect of gCBT compared with WLC on depressed mood was evaluated at post-treatment. Individual effect sizes ranged from medium (d = 0.74) to large (d = 2.52). There was a large statistically significant pooled effect size (d = 1.20, p < .001) indicating that participants who received gCBT fared better than approximately 88% of those in WLC groups on measures of depressed mood.

In two studies the effect of gCBT compared with TAU on depressed mood was evaluated. Across the post-treatment and combined post-treatment and follow-up analyses individual effect sizes were large and ranged from d = 3.37 to 4.86. There was a large statistically significant pooled effect size in both analyses (d = 4.28 and 4.64, p < .001) indicating that participants who received gCBT fared better than approximately 99% of those who received TAU on measures of depressed mood.

In five studies the effect of gCBT compared with ALT treatments on depressed mood was evaluated. At post-treatment (d = 0.60, p < .001) and in the analysis of combined post-treatment and follow-up data (d = 0.53, p < .05), medium significant pooled effect sizes were found for gCBT compared with ALT treatments indicating that participants who received gCBT fared better than approximately 70–73% of those in ALT treatment on measures of depressed mood. Individual effect sizes in these five studies ranged from small and negative (d = −0.10) to large and positive (d = 1.90).

There were depressed mood follow-up data for three studies that compared gCBT and ALT treatments. In these, individual effect sizes ranged from medium and negative (d = −0.74) to large and positive (d = 1.75), and the pooled effect size was not significantly different from zero (d = 0.41, p > .05). Follow-up periods in these studies ranged from 2 months to 3 years.

Effect of gCBT on Depressive Cognition

Results of analyses of the effect of gCBT on depressive cognition are given in Table 4 and Fig. 2. Depressive cognition was assessed with the Automatic Thoughts Questionnaire (Hollon and Kendall 1980) in three studies (Baker and Neimeyer 2003; Chiang et al. 2015; Zettle et al. 2011), and the Dysfunctional Attitudes Scale (Weissman 1979) in three studies (Wong 2008a, b; Zettle et al. 2011). At post-treatment (d = 2.34, p < .01) and in the analysis of combined post-treatment and follow-up data (d = 2.66, p < .001), large significant pooled effect sizes were found for gCBT compared with all comparison conditions indicating that participants who received gCBT fared better than approximately 99% of those in WLC and ALT comparison conditions on measures of depressive cognition. There were five studies in each of these analyses and individual effect sizes ranged from small and negative (d = −0.45) to large and positive (d = 17.43).

Table 4 Results of meta-analysis for depressive cognition assessed with the Automatic Thoughts Questionnaire (Hollon and Kendall 1980) and the Dysfunctional Attitudes Scale (Weissman 1979) for gCBT compared with waiting list control (WLC) and alternative treatment (ALT) comparison groups at post-treatment, follow-up and combined outcomes at post-treatment and follow-up

Because the effect sizes of d = 12.57 in the post-treatment analysis and d = 17.43 in the combined analysis were so much larger than the other effect sizes in these analyses, additional analyses were conducted in which the study containing the very large effect sizes (Chiang et al. 2015) was excluded. These analyses had the advantage of not being unduly influenced by the effects of a large outlying data point. At post-treatment (d = 0.52, p < .05) and in the analysis of combined post-treatment and follow-up data (d = 0.55, p < .01), medium significant pooled effect sizes were found for gCBT compared with comparison conditions. This showed that, when the study with exceptionally positive results was excluded from analysis, participants who received gCBT fared better than approximately 71% of those in WLC and ALT comparison conditions on measures of depressive cognition.
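Excluding one study and re-pooling the rest is a form of leave-one-out sensitivity analysis. The sketch below shows the idea with a simple inverse-variance weighted mean; it is not the model used in the reported analyses, and the effect sizes and variances are hypothetical values chosen only to show how a single extreme study can dominate a pooled estimate.

```python
def inverse_variance_mean(effects, variances):
    """Pooled effect size with weights equal to 1 / variance."""
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, variances):
    """Pooled estimate recomputed with each study omitted in turn."""
    estimates = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        estimates.append(inverse_variance_mean(kept_effects, kept_variances))
    return estimates


# Hypothetical data: three moderate effects and one extreme outlier.
effects = [0.4, 0.5, 0.6, 12.0]
variances = [0.1, 0.1, 0.1, 0.1]

pooled_all = inverse_variance_mean(effects, variances)
pooled_without_outlier = leave_one_out(effects, variances)[3]
```

With these made-up inputs the pooled estimate falls from 3.375 to 0.5 once the outlying study is dropped, which mirrors the direction of the change described above.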

There were depressive cognition follow-up data for two studies, and the pooled effect size (d = 9.43, p > .05) for these was large but not significant. The result was not significant mainly because of the large variability in the results of the two studies in this analysis. In a comparison of gCBT and TAU, Chiang et al. (2015) obtained a large positive and significant effect size for depressive cognition at one-year follow-up (d = 19.05, p < .001). In contrast, Zettle et al. (2011) obtained a small negative non-significant effect size (d = −0.08, p > .05) for depressive cognition at 2-month follow-up when gCBT was compared with acceptance and commitment therapy.

In two studies the effect of gCBT compared with WLC on depressive cognition was evaluated at post-treatment. Individual effect sizes were medium (d = 0.44) and large (d = 0.88). There was a medium statistically significant pooled effect size (d = 0.53, p < .001) indicating that participants who received gCBT fared better than approximately 70% of those in WLC groups on measures of depressive cognition.

In two studies the effect of gCBT compared with ALT treatments on depressive cognition was evaluated. The pooled effect size was small and not significant (d = 0.36, p > .05). This overall finding masks the wide variability in results from these two studies. Baker and Neimeyer (2003) found that gCBT was more effective in ameliorating depressive cognition than a support group, reflected in a large significant post-treatment effect size (d = 0.94, p < .01). In contrast, Zettle et al. (2011) found that gCBT and acceptance and commitment therapy did not differ in their effects on depressive cognition, reflected in a small to medium negative effect size (d = −0.45, p > .05).

Effect of gCBT on Quality of Life

Results of analyses of the effect of gCBT on quality of life are given in Table 5 and Fig. 2. There were three different measures of quality of life in these studies. Wong (2008a) used the Abbreviated Quality of Life Enjoyment and Satisfaction Questionnaire (Endicott et al. 1993); Andersson et al. (2013) used the Quality of Life Inventory (Frisch et al. 1992); and Asgharipoor et al. (2012) used the Subjective Wellbeing Scale (Golestanibakht 2007). At post-treatment (d = 0.13, p > .05) and in the analysis of combined post-treatment and follow-up data (d = 0.15, p > .05), there were small non-significant pooled effect sizes for gCBT compared with WLC and ALT comparison conditions. There was considerable variability in effect sizes, which ranged from small and negative (d = −0.41) to medium and positive (d = 0.61). Wong (2008a) found a significant post-treatment positive medium effect size (d = 0.61, p < .001) for quality of life when gCBT was compared with WLC. In contrast, studies by Asgharipoor et al. (2012) comparing gCBT and positive psychotherapy, and by Andersson et al. (2013) comparing gCBT and guided internet CBT, yielded non-significant effect sizes.

Table 5 Results of meta-analysis for quality of life assessed with the Subjective Wellbeing Scale (Golestanibakht 2007), the Abbreviated Quality of Life Enjoyment and Satisfaction Questionnaire (Endicott et al. 1993), and the Quality of Life Inventory (Frisch 1988) for gCBT compared with waiting list control (WLC) and alternative treatment (ALT) comparison groups at post-treatment, follow-up and combined outcomes at post-treatment and follow-up

Discussion

This study aimed to evaluate the effectiveness of gCBT in ameliorating depressed mood assessed with the BDI, depressive cognition, and quality of life by conducting a meta-analysis of data from recent controlled trials. There were four key findings. First, for depressed mood assessed with the BDI, large significant post-treatment effect sizes were found for gCBT compared with WLC and TAU, and a medium effect size was found for comparisons of gCBT and ALT treatments. Second, at follow-up periods ranging from 2 months to 3 years, improvements in depressed mood made during gCBT were maintained in comparison with TAU and ALT treatment. However, this conclusion is tentative due to the small number of studies (k = 4) and the wide variability in their results. Third, compared with comparison groups, especially WLC comparison groups, gCBT yielded significant post-treatment effects on depressive cognition. Overall these effects were large. The magnitude of the effect size was partly influenced by the extremely positive results of one study. A medium effect of gCBT on depressive cognition was found when this outlier was excluded. Improvements in depressive cognition found after treatment were not maintained at follow-up. However, this finding should be interpreted cautiously due to the small number of studies with depressive cognition follow-up data (k = 2) and the wide variability in their results. Fourth, there was no evidence that gCBT improved quality of life. However, this finding requires cautious interpretation due to the small number of studies containing quality of life data (k = 3), and the fact that all three studies used different measures.

The overall effect sizes for depressed mood in the present study were more positive than those of previous meta-analyses, and this is an important novel finding. Feng et al. (2012) found a small to moderate effect size on a range of measures of depressed mood for gCBT after treatment (d = 0.40, k = 16) and at 6-month follow-up (d = 0.38, k = 5) but a negligible effect after 6 months (d = 0.06, k = 4). Huntley et al. (2012) found a moderate effect size on a range of measures of depressed mood for gCBT plus TAU compared with TAU after treatment (d = 0.55, k = 14), and at short and long-term (d = 0.47, k = 3) follow-up. Another novel meta-analytic finding was the positive effect of gCBT on depressive cognition.

The principal limitation of the current meta-analysis was the small number of very high quality studies with follow-up data and with data on depressive cognition and quality of life. Its main strength was its adherence to Cochrane (Higgins and Green 2011) and PRISMA (Moher et al. 2009) guidelines.

For more informative meta-analyses on the effects of gCBT to be possible, more high quality RCTs (Higgins and Altman 2008) are required in which follow-up data are collected for periods of at least six months. These studies should also include measures of depressive cognition and quality of life, as well as measures of symptomatic improvement. Future gCBT RCTs should also include cost-effectiveness data especially taking into account the high cost of depression to individuals, families and society (Kessler 2012).

The principal clinical implications of this study are that gCBT may be used either alone or as part of a multimodal intervention involving antidepressant medication in the treatment of mild to moderate major depressive disorder.