Introduction

Although numerous lower back pain investigations have established benefits of psychological interventions on various outcomes [1,2,3,4], the incremental benefit of cognitive behavioral therapy (CBT) among patients undergoing lumbar spine surgery remains unclear. Lumbar spine surgery is increasingly common and aims to alleviate chronic low back and radicular pain [5,6,7,8]. Despite the efficacy of many spine surgery procedures, postoperative recovery can be suboptimal, and patients may complain of unrelenting pain or poor functionality [9, 10]. Patient reports of minimal improvement and various complications are associated with reoperation rates as high as 23% at 8–10 years following lumbar decompression [11].

To improve surgical outcomes, there is a push toward optimizing modifiable risk factors through prehabilitation [12, 13]. This approach is thought to have potential advantages including shorter recovery times, less postoperative pain, greater independence, fewer complications, and reduced costs [14,15,16,17,18]. Prehabilitation includes improving preoperative levels of physical activity, anxiety reduction, and nutrition optimization [12], and it may improve presurgical beliefs, such as kinesiophobia [19], catastrophizing [20], and fear avoidance. These beliefs have been observed to correlate with inferior outcomes following lumbar surgery [21,22,23,24,25,26,27]. CBT can target these beliefs to help patients succeed after surgery. The purpose of CBT is to provide support to patients experiencing feelings of distress by teaching coping mechanisms [28]. CBT is a “problem-oriented” therapy that aims to teach patients cognitive reframing techniques to confront current distress [28]. CBT has demonstrated efficacy in treating depression [29, 30], generalized anxiety disorder, panic disorder, pain, and post-traumatic stress disorder [30].

Clinicians have started exploring CBT’s utility in primary care and in surgery. CBT has shown efficacy as a preoperative intervention for patients undergoing bariatric surgery [31, 32], cardiac surgery [33], and lumbar surgery [10]. Several randomized controlled trials (RCTs) have investigated pre- and postoperative CBT among lumbar spine surgical candidates. Results of these RCTs are varied, and no summary of these findings is available. The purpose of this study is to evaluate the effect of CBT on patient-reported outcome (PRO) improvement following lumbar spine surgery.

Methods

General study design

Each RCT was classified as comparing CBT either to protocol therapeutic alternatives (PTA) or to usual conventional (UC) pre- and postoperative care. PTAs involved specific interventions including education or exercises focused on improving outcomes following surgery. We evaluated outcome instruments that were measured at baseline before beginning the intervention of interest, as well as at short- and long-term postoperative time points. To preserve comparability to other CBT-focused systematic reviews, short-term time points were selected in the range of 2–3 months postoperatively. We collected the final outcome assessment in each study as our long-term outcome measurement. We assessed PROs in terms of five outcome categories: disability, back pain, leg pain, quality of life, and psychological outcomes. These outcomes were based on a variety of specific outcome measurement tools utilized among the included studies at the specified time points.

Search strategy and information resources

To gather relevant randomized studies with a reproducible study design, we followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (PRISMA, Fig. 1). We consulted with a medical library and information sciences expert to conduct our search. Our reproducible search strategy was refined and recorded (“Appendix 1” in ESM). We searched the following seven databases in December 2019: PubMed/MEDLINE, Scopus, CINAHL, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, PsycINFO, and Google Scholar.

Fig. 1 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) diagram

Study selection

Studies were selected based on their evaluation of a clinical impact of CBT on lumbar spine surgery outcomes. While descriptions of CBT can be heterogeneous [34], we defined CBT as an active psychologically informed intervention delivered by a trained therapist to identify and alter a patient’s maladaptive thoughts and behaviors [35]. Articles were included if they had the following characteristics: (1) if they were RCTs, (2) if patients underwent lumbar spine surgery, (3) if they included CBT interventions (pre- or postoperatively), and (4) if they contained PROs. The search was completed with both controlled vocabulary (i.e., MeSH terms) and keywords in the title or abstract. Studies were excluded if no English translation was available, if the article existed solely as a study protocol, if CBT was not conducted with a health professional, or if the study was only published in abstract form. We did not place restrictions on the publication date, geography, or participant age.

Data extraction and study quality assessment

We used Covidence (Melbourne, Australia) to synchronize our study assessment and evaluation. Duplicates were removed based on titles, abstracts, or matching identifiers. Each study was assessed by title and abstract by two independent reviewers (JMP, MSP). Full texts were then screened and entered into a standardized form. We gathered the following information: demographics, surgical and psychological interventions (type, duration, timing, and provider), number of CBT sessions, and PROs (such as Oswestry Disability Index [ODI], Tampa Scale for Kinesiophobia [TSK], the Pain Catastrophizing Scale [PCS], and the European Quality of Life 5 Dimensions Questionnaire [EQ-5D]). To assess risk of bias (RoB), two reviewers used the Cochrane Back Review Group (CBRG) criteria, which evaluate selection bias, performance bias, detection bias, attrition bias, reporting bias, and all other biases [36]. Reviewers gave each category a rating of “low,” “high,” or “unclear.” When reviewers did not agree, a second review was conducted until consensus was reached.

Synthesis and analysis

Statistical analysis was performed with Stata SE 16.1 (College Station, TX, USA). The purpose of data synthesis was to numerically estimate the possible effect of CBT on spine surgery outcomes (disability, back pain, leg pain, quality of life, and psychological assessments). Secondary aims included exploring heterogeneity between studies and conducting sensitivity analysis. Standard deviations were collected and, if absent, were calculated [37, 38]. Forest plots visually depicted effect size, confidence intervals, and the effect of study heterogeneity. I2 statistics were used to describe the variability of effect size estimates, and previously established I2 statistic numerical cutoffs for heterogeneity estimation were used: Up to 40% heterogeneity may have been unimportant; 30–60% moderate heterogeneity; 50–90% substantial heterogeneity; and 75–100% considerable heterogeneity [39]. I2 statistics were compared with a Chi-squared test, and statistical significance was evaluated based on the associated p value (p < 0.005).
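The heterogeneity statistics described above can be illustrated with a minimal sketch. It assumes the conventional definitions of Cochran's Q (inverse-variance weighted squared deviations from the pooled estimate) and I2 = (Q − df)/Q; the function names and the three-study input are hypothetical and are not taken from the included trials. In practice Q is compared against a Chi-squared distribution with k − 1 degrees of freedom, which this sketch leaves out.

```python
def cochran_q(effects, variances):
    """Cochran's Q and its degrees of freedom from inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


def i_squared(q, df):
    """I^2 as a percentage: variability beyond chance, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Overlapping interpretation bands as quoted in the text above.
BANDS = [
    (0.0, 40.0, "may be unimportant"),
    (30.0, 60.0, "moderate"),
    (50.0, 90.0, "substantial"),
    (75.0, 100.0, "considerable"),
]


def heterogeneity_labels(i2):
    """Return every band label whose range contains the I^2 value."""
    return [label for low, high, label in BANDS if low <= i2 <= high]


# Hypothetical effect sizes and variances for three studies.
q, df = cochran_q([-0.2, -0.5, -1.1], [0.04, 0.05, 0.06])
i2 = i_squared(q, df)
print(round(i2, 1), heterogeneity_labels(i2))
```

Because the bands overlap, a single I2 value can carry two labels; the sketch returns both rather than choosing one.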

Our primary effect was the standardized mean difference (SMD) of CBT arms compared to control arms (i.e., PTA or UC) for all outcomes. Effect size was examined at three time points: preintervention baseline, postoperative short-term, and long-term. We used a random effects model with Cohen’s d estimators for our meta-analyses. Our outcome subgroup analysis was conducted according to five outcome categories: disability, back pain, leg pain, quality of life, and psychological assessment instruments. An outcome needed to be reported by at least three studies to be assessed in the meta-analysis. When necessary, outcome instrument scales were reversed by subtracting the mean score from the scale maximum.
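The effect size pipeline described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' Stata procedure: the text does not name the between-study variance estimator, so the DerSimonian-Laird estimator is used here as an assumption, the variance of Cohen's d uses a common large-sample approximation, and all function names and input numbers are hypothetical.

```python
import math


def reverse_score(mean, scale_max):
    """Reverse a scale by subtracting the mean from the scale maximum."""
    return scale_max - mean


def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (group 1 minus group 2) and its approximate variance."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    )
    d = (m1 - m2) / pooled_sd
    var = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return d, var


def random_effects_pool(effects, variances):
    """Pooled SMD, 95% CI, and tau^2 (DerSimonian-Laird; an assumption)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Hypothetical study: CBT arm vs control arm on a 0-100 disability scale.
d, var = cohens_d(20.0, 10.0, 50, 25.0, 10.0, 50)
pooled, ci, tau2 = random_effects_pool([-0.2, -0.5, -1.1], [0.04, 0.05, 0.06])
print(d, var, pooled, ci, tau2)
```

Reversing a scale before computing d keeps the sign convention consistent, so that a given direction of SMD means the same thing across instruments within an outcome category.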

Sensitivity analysis was conducted by comparing the difference of effect size between the two groups of studies, i.e., those that assessed CBT vs PTA as compared to those that investigated CBT versus UC. A meta-regression assessed the influence of moderators on heterogeneity. Moderators that were assessed were the postoperative duration until final assessment, duration of CBT, number of CBT sessions, and instrument category.
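A meta-regression of the kind described above can be sketched as an inverse-variance weighted regression of effect size on one moderator. The sketch makes simplifying assumptions that the text does not confirm: a single moderator, fixed-effect weights (no between-study variance term), and a normal-approximation z-test for the slope. The function name and the four data points are hypothetical; only the range of 3 to 18 sessions echoes the included studies.

```python
import math


def weighted_meta_regression(effects, variances, moderator):
    """Weighted least squares of effect size on one moderator.

    Returns (intercept, slope, se_slope, two_sided_p) using
    inverse-variance weights and a z-test on the slope.
    """
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, moderator)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, moderator))
    sxy = sum(
        wi * (xi - x_bar) * (yi - y_bar)
        for wi, xi, yi in zip(w, moderator, effects)
    )
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    z = slope / se_slope
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return intercept, slope, se_slope, p


# Hypothetical data: SMD becomes more negative as session count rises.
sessions = [3, 6, 12, 18]
smd = [0.1 - 0.05 * s for s in sessions]
intercept, slope, se_slope, p = weighted_meta_regression(
    smd, [0.04, 0.04, 0.04, 0.04], sessions
)
print(intercept, slope, se_slope, p)
```

A random-effects meta-regression would add an estimated between-study variance to each study's variance before weighting, which generally widens the standard error of the slope.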

Results

Systematic review

In total, our search returned 366 articles. After 125 duplicates were removed, 241 articles were screened by two independent reviewers, and 29 were selected for a full-text review (Fig. 1). The eleven studies eligible for qualitative analysis were conducted between 2003 and 2019 and included 735 lumbar spine surgery patients (Table 1). Of the eleven studies, seven evaluated lumbar fusion, three analyzed lumbar disk surgery, and one investigated laminectomy procedures. The primary intervention, CBT, was utilized preoperatively in four of the eleven studies. Postoperatively, ten of the eleven studies conducted some form of CBT. The total number of CBT sessions differed substantially, ranging from 3 to 18 sessions (Table 2). CBT sessions were completed in groups and individually in person, with the exception of one computer-based CBT study. Practitioner type varied, with the most frequent provider being physiotherapists (7 studies), followed by multidisciplinary teams (3 studies) (Table 2). The most frequently investigated PROs were disability (82%, ODI), pain (55%, visual analog scale), quality of life (55%, EQ-5D; 55%, Short Form-36), pain catastrophizing (45%, Pain Catastrophizing Scale), and fear of movement (45%, TSK). PROs demonstrated improvement with CBT intervention (vs. control) in six of eleven studies (Table 3).

Table 1 Descriptive characteristics by study
Table 2 Cognitive behavioral therapy protocol by study
Table 3 Randomized controlled trial cognitive behavioral therapy outcomes by study

Quality assessment

Study population sample sizes ranged from 86 to 130, with an average age ranging from 42.8 to 58.4 years. Reporting styles and quality ranged considerably, with over one-third of all quality assessment ratings (37.6%) being either “poor” or “unclear.” Further analysis of the risk of bias assessment revealed that 91% (n = 10) of the included studies carried a low risk of bias and 9% (n = 1) carried an unclear risk of bias due to inadequate generation of a randomized sequence. In terms of risk of bias due to inadequate concealment of allocations prior to assignment, 100% of the included studies carried a low risk of selection bias. Performance bias due to inadequate blinding of participants and personnel carried the highest risk, with 91% (n = 10) of studies carrying a high risk and only 9% (n = 1) carrying a low risk. Risk of detection bias due to inadequate blinding of outcome assessors was variable, with 54.5% (n = 6) carrying a low risk, 9% (n = 1) carrying an unclear risk, and 36.4% (n = 4) carrying a high risk. Review of the included studies also demonstrated a more favorable risk of attrition bias, with 64% (n = 7) demonstrating a low risk and 36% (n = 4) an unclear risk of attrition bias due to handling of incomplete outcome data. Reporting bias due to selective outcome reporting had 91% (n = 10) of studies carrying a low risk and 9% (n = 1) with an unclear risk. Lastly, 27.3% (n = 3) of studies carried a low risk, 54.5% (n = 6) carried an unclear risk, and 18.2% (n = 2) carried a high risk of bias due to other sources. The most common bias category to incur a “high” risk rating was “blinding of participants and personnel.” Seven studies had two or more categories with a “high” risk of bias. A detailed summary of study bias is shown in Fig. 2.

Fig. 2 Risk of bias assessments for included studies. (Top) Breakdown of each component by study. (Bottom) Proportion of studies contributing each level of bias by category

On second review, four of the 11 studies originally included were removed due to a high risk of bias. Three of the removed studies reported results from the same RCT [40,41,42]. The investigators disclosed that treatment integrity within this RCT was likely compromised due to “issues” with adherence to study protocol [40]. The fourth study was removed because it evaluated a computer-automated CBT application [43].

Meta-analysis

A total of seven studies from the previous eleven, with 531 patients, were included in the meta-analysis. Five compared CBT to UC and two compared CBT to PTA. The SMD improved from baseline measurements during both short-term and long-term evaluations for all five outcome categories (i.e., disability, back pain, leg pain, quality of life, and psychological) for both the PTA and UC comparison groups (Table 4). Forest plots of all long-term outcome categories indicated effect sizes that favored CBT in comparison with control groups (Figs. 3, 4).

Table 4 Summary of meta-analysis results with pooled effect sizes (95% confidence intervals)
Fig. 3 Effect of cognitive behavioral therapy on disability, back pain, leg pain, quality of life, and overall psychological outcomes

Fig. 4 Effect of cognitive behavioral therapy on psychological subgroups: global mental health, self-efficacy, fear of movement, and catastrophizing

All studies included in the meta-analysis reported at least one instrument for disability, back pain, quality of life, and psychological outcomes. Only one study did not report leg pain. Four psychological outcomes were reported with a high enough frequency (n ≥ 3) to be compared in further subgroup analysis. Psychological subgroups included catastrophizing, fear of movement, global mental health, and self-efficacy. At the short-term follow-up, the majority of studies reported outcome difference effects favoring CBT over control groups (small: 20% [n = 7], moderate: 6% [n = 2], large: 34% [n = 12]). At long-term follow-up, the number of outcomes reflecting effect sizes favoring CBT increased (small: 31% [n = 11], moderate: 23% [n = 8], large: 20% [n = 7]).

At short-term follow-up, while all outcomes favored CBT, disability (SMD = −0.73 [95% CI −1.46, 0.01], p < 0.001, I2 = 93.5%) and back pain (SMD = −0.42 [95% CI −0.98, 0.14], p < 0.001, I2 = 89.5%) had the largest effect sizes for outcome differences. At the long-term follow-up, psychological health outcomes (SMD = 0.61 [95% CI 0.28, 0.94], p < 0.001, I2 = 89.7%) and overall quality of life (SMD = 0.55 [95% CI 0.05, 1.05], p < 0.001, I2 = 86.7%) outcome differences were most suggestive of a statistically significant effect size. The largest effect sizes within the psychological outcome subgroup analysis favoring CBT were observed for differences in fear of movement (SMD = 0.67 [95% CI 0.02, 1.32], p < 0.001, I2 = 91.7%) and self-efficacy (SMD = 0.27 [95% CI 0.02, 0.52], p < 0.001, I2 = 13.2%).
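To read pooled estimates such as those above, it can help to recover the standard error and an approximate test of the overall effect from the reported interval. The sketch below assumes the 95% confidence intervals are symmetric normal-approximation (Wald) intervals built with a 1.96 multiplier, which the text does not state; the function name is hypothetical. The two inputs are the short-term disability and long-term quality of life estimates quoted above.

```python
import math


def z_test_from_ci(estimate, lower, upper, z_crit=1.96):
    """Back-calculate SE, z, and a two-sided p value from a 95% CI.

    Assumes a symmetric normal-approximation interval (an assumption).
    """
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    return se, z, p


# Short-term disability: SMD = -0.73 [95% CI -1.46, 0.01]
se_dis, z_dis, p_dis = z_test_from_ci(-0.73, -1.46, 0.01)

# Long-term quality of life: SMD = 0.55 [95% CI 0.05, 1.05]
se_qol, z_qol, p_qol = z_test_from_ci(0.55, 0.05, 1.05)

print(se_dis, z_dis, p_dis)
print(se_qol, z_qol, p_qol)
```

Under these assumptions, an interval that crosses zero, as the short-term disability interval does, corresponds to an overall-effect p value above 0.05, whereas an interval that excludes zero corresponds to a p value below 0.05.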

Comparing CBT versus PTA versus UC

Our analysis of the two study designs (i.e., CBT vs. PTA and CBT vs. UC) revealed that both designs favored CBT interventions for all assessed outcome groups (Table 4). In particular, CBT demonstrated the largest effect compared to UC for disability at the short-term time point (SMD = −1.05 [95% CI −2.15, 0.05], I2 = 95.01). The largest effect of CBT compared to PTA was demonstrated for back pain at the short-term time point (SMD = −0.54 [95% CI −1.41, 0.34], I2 = 86.72). At the long-term time point, CBT demonstrated its largest effect compared to UC for back pain (SMD = −0.87 [95% CI −2.45, 0.71], I2 = 97.48), whereas, compared to PTA, the largest effect by CBT was observed for disability (SMD = −0.39 [95% CI −0.68, −0.10], I2 = 0.00). Only leg pain could not be assessed among PTA studies due to insufficient reporting.

When assessing the long-term follow-up for CBT versus UC studies, considerable heterogeneity was observed for disability (I2 = 95.80), back pain (I2 = 97.48), leg pain (I2 = 94.13), and psychological outcomes (global mental health, fear of movement, and catastrophizing) (I2 = 93.82; p < 0.001, all). For long-term follow-up of CBT versus PTA studies, heterogeneity among studies was assessed as unimportant. When assessing all studies included in the meta-analysis, considerable heterogeneity was observed among disability (I2 = 92.75), back pain (I2 = 95.80), leg pain (I2 = 91.47), psychological outcomes overall (I2 = 89.71), and within the global mental health, fear of movement, and catastrophizing subgroups (p < 0.001, all).

Sensitivity analysis

When incorporating our study design comparison into a sensitivity analysis, our analysis revealed no significant differences among subgroups for disability, back pain, leg pain, quality of life or overall psychological outcomes (Fig. 3). Among the psychological subgroups, the PTA versus UC subgroups were only observed to have a significant group difference for global mental health (p < 0.001, Fig. 4). When analyzing all outcomes as a part of a meta-regression, the only two statistically significant moderators were the frequency of CBT sessions and months until final follow-up (p < 0.001).

Discussion

At the time of our search, this was the only study summarizing RCT findings regarding the effect of CBT on outcomes following lumbar spine surgery. The results suggest that, when combined with lumbar spine surgery, CBT is associated with a clinically significant improvement in postsurgical outcomes as compared to UC or PTA. The pooled effects in our meta-analysis indicated that, of all included RCT outcomes (n = 34), the majority demonstrated significant effect sizes favoring CBT interventions at both short-term (62% [n = 21]) and long-term (76% [n = 26]) intervals. While the average final follow-up time period of the studies varied from 6 months to 3 years following surgery, our meta-analysis suggests that the effects of CBT are sustained over time.

Within our meta-analysis, there was considerable variation regarding the type, interval, and frequency of outcome reporting. All seven studies reported disability with ODI. Back pain was also reported by seven studies and was variously measured among studies using VAS back, low back pain rating scale (LBPRS), back pain inventory (BPI), and the numerical rating scale (NRS). Leg pain was reported by six of the seven studies in our meta-analysis and all studies reported outcomes analogous to their back-pain counterparts (i.e., studies utilizing VAS back to report back pain used VAS Leg to report leg pain). While each study reported quality of life outcomes, all were unique (e.g., EQ-5D VAS, EQ-5D Index, SF-12, Patient-reported functioning, and SF-36). Psychological outcomes were reported with the most variability. Even among category subgroups, reported outcomes differed (e.g., global mental health, self-efficacy, catastrophizing, pain avoidance, etc.).

Although each investigation used validated outcomes, overall quality might have been improved with adherence to current RCT literature guidance. For example, the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials guidelines recommend specific validated outcomes to assess global areas including pain, physical functionality, emotional functionality, etc. [44, 45]. While three of the RCTs included in the meta-analysis mentioned that their recruitment protocols adhered to the Consolidated Standards of Reporting Trials guidelines, heterogeneity might have been reduced further with better adherence to these guidelines [46].

Relation to other literature and strengths

The results of this study are similar to previous studies that have demonstrated improvement of PROs after CBT intervention [47, 48]. While previous meta-analyses have investigated CBT in relation to chronic back pain [49,50,51], our study is the first to specifically analyze the efficacy of CBT interventions on PROs in the setting of lumbar spine surgery. Furthermore, this meta-analysis only reviewed RCTs that were prescreened to evaluate bias using the CBRG criteria. Although the reviewed studies utilized differing CBT methodologies, most sessions were led by physiotherapists, which may have controlled for a number of potential confounders (provider type, training, experience level, etc.). Similar PRO investigations also allowed for comparative improvement among treatment and control groups.

There was substantial heterogeneity with regard to the implementation of CBT in each study. Among eleven articles in the systematic review, four included preoperative CBT and all of the articles made use of postoperative CBT. The overall number of sessions varied from as few as three sessions to 18 sessions. Session lengths varied between studies and within each protocol, and ranged from 30 min to 3 h. Settings of CBT ranged from in person to over the phone and from one-on-one sessions with a therapist to group therapy with multiple therapists. Finally, the overall treatment duration ranged from 4 to 12 weeks. With such a heterogeneous pool of CBT protocols, it is challenging to assess which treatment variables were most influential. Our sensitivity analysis and meta-regression revealed that frequency of CBT sessions was a statistically significant moderator for overall outcome effect size. These findings are generally aligned with current research, though less heterogeneity between studies may have allowed for a greater number of variable comparisons.

In general, research has demonstrated that increased frequencies of psychotherapy sessions can result in greater symptom improvement [52,53,54]. Among patients with depression, a meta-analysis found that symptoms of depression were inversely proportional to the number of CBT sessions per week [55]. In current mental healthcare practice, CBT is typically held once a week. However, the original CBT manual recommends beginning with two sessions per week [35]. When looking at the studies in this meta-analysis, only one study conducted twice-weekly CBT sessions, and interestingly, this study demonstrated the greatest overall PRO improvement effects [56]. Our results suggest CBT’s utility as an evidence-based treatment for depression may also be applied to surgical candidates. Furthermore, this may indicate a similarity between the core issues typically addressed by CBT and those associated with poor postsurgical outcomes.

The modes of CBT delivery can have numerous implications for outcome effect. CBT can be administered one-on-one or in group settings. Both can have potential advantages and disadvantages [57]. One-on-one therapy is ideal when both patient and provider are supportive and therapeutically aligned [58]. Particularly in the context of therapist availability and insurance policy limits, the resource-intensive nature of one-on-one therapy can hinder its application. None of the assessed RCTs utilized psychiatrists for individual sessions. With customized training, however, providers of differing backgrounds can administer CBT [59]. While having more providers qualified to administer CBT could improve access to care, varied technical training might also contribute to an array of CBT methodologies and limit outcome analysis.

Moreover, the impact of therapist experience on CBT outcomes is both nuanced and controversial. Although therapist experience level is often considered to influence the effect of CBT on anxiety, depression, and pain, several studies have revealed that less experienced therapists can still achieve meaningful clinical results [60, 61]. Others have observed that more years of general CBT clinical experience has a significant influence on lessening patient anxiety or depressive symptoms [62,63,64].

Group therapy is financially appealing and, importantly, addresses the social isolation that many patients feel in the perioperative setting [57]. On functional magnetic resonance imaging, for example, feelings of social isolation have been associated with the same brain regions that activate during physical pain experiences [65]. Stigma, whether imposed by the public or by the self, that the individual may be socially undesirable or unacceptable may be a barrier to participation in group therapy [66]. One therapeutic possibility could be restructuring preoperative group therapy to include education and realistic expectations of postoperative rehabilitation. This could facilitate group therapy motifs while potentially reframing the purpose of the sessions to diminish the stigma associated with a “group therapy” session. The social interactions fostered during group CBT sessions have helped people gain comfort with a common experience and focus on aspects of life outside of their pain experience [67].

This meta-analysis and the studies reviewed had several strengths and limitations. Our review expands on research regarding the influence of CBT on lower back pain by examining a surgical population. This study included an information science expert to assemble a reproducible study search string, incorporated multidisciplinary input, adhered to the PRISMA guidelines, assessed RoB, and conducted a meta-analysis on clinically relevant outcomes in RCTs. The investigations we assessed should be commended for adhering to a sequence generation protocol, strong allocation concealment, and low-risk selective outcome reporting. Another strength is that they assessed patient samples using an intent-to-treat methodology.

Limitations

Although CBT appears to be beneficial, this is only one approach to psychotherapy. One limitation is that our search focused on CBT and did not include other potentially useful therapy mediums such as operant behavioral therapy, biofeedback, mindfulness, or meditation. Additionally, it is important to consider how to maximize CBT effectiveness and cost utility. Although it would have been helpful to evaluate the effect of CBT during the preoperative time period compared to the postoperative time period, neither of the CBT versus PTA investigations utilized preoperative CBT. This limited our ability to discern whether a difference in effects occurred due to the PTA study design or the lack of preoperative CBT. Further, since none of the included studies examined preoperatively administered CBT without subsequent postoperative CBT interventions, it is difficult to assess the independent contributions of preoperative vs postoperative CBT on improvements in PROs.

While our study focused on lumbar spine surgery, outcomes including disability, pain, quality of life, and psychological measures have complex perioperative determinants. Given the numerous surgical variables that could be related to these outcomes, quality may have been added if the RCTs reported perioperative characteristics including surgical levels, operative duration, blood loss, complications, surgical history, or comorbidities. Without this information, the ability of our investigation to control for the effects of these characteristics on outcomes between groups and/or studies was limited. For example, if study groups differed significantly on the basis of comorbidities, these might present a significant confounding effect. Other limitations may have stemmed from the lack of therapist heterogeneity and short, non-specific CBT training regimens. This may limit the generalizability of the results to CBT interventions facilitated by other types of practitioners. To some extent, all of the studies reviewed utilized physiotherapists to administer CBT and only two of the studies included clinical psychologists. None of the investigations attempted to discuss potential advantages associated with therapist training.

Conclusion

Our meta-analysis indicates that CBT arms had improved lumbar spine surgery outcomes as compared to UC or PTA. Additionally, CBT session frequency may be associated with CBT efficacy. Overall, the quality of the literature is high with a number of published RCTs and generally low risk of bias, although the available studies regarding CBT in lumbar spine patients are relatively heterogeneous. While we could not determine whether pre- or postoperative CBT is more beneficial, the majority of the RCTs in our meta-analysis utilized some combination. Our observations suggest that both pre- and postoperative CBT approaches may be effective and expand on the benefits of CBT in association with lumbar spine health. Among appropriately selected patients, use of pre- and postoperative CBT, alone or in combination, could become an important asset to optimize patients before surgery to improve disability, pain, quality of life, and mental health.