Introduction

Internationally, a rising trend in the number of spinal fusion procedures is observed [1, 2]. Over the years, technical advancements have been translated into higher radiographic success rates of bony fusion and sagittal alignment [3, 4]. In contrast, the clinical success rate remains only modest with up to 40% of patients reporting persistent pain, suboptimal functional improvement and dissatisfaction [5,6,7,8], and a work resumption in only half of the patients below normal retirement age [7]. Therefore, an urgent need exists to optimize clinical outcomes after lumbar fusion.

Rehabilitation has been put forward as a window of opportunity to enhance the value of spine care [9, 10]. However, the gold standard of rehabilitation for lumbar fusion remains largely unclear. This is reflected by extensive variation in everyday practice. For example, no consensus regarding timing and content of rehabilitation was found between surgeons in the Netherlands and Sweden [11]. This considerable variability in physiotherapy practice was also demonstrated in Australia and the United Kingdom [12, 13].

Previous reviews in this field have several shortcomings: firstly, they focused on either the pre- or postoperative period but not on the entire care continuum [9, 10, 14, 15]; secondly, some extrapolated or included evidence from other types of lumbar surgery [10, 15]; and thirdly, some were outdated [9, 14]. Hence, an updated review and meta-analysis of the effectiveness of rehabilitation strategies for lumbar fusion across the entire care continuum was warranted.

Therefore, the primary aim of this systematic review and meta-analysis was to assess and compare the effectiveness of unimodal and multimodal rehabilitation strategies on disability, pain, and pain-related fear in patients undergoing lumbar fusion surgery for degenerative conditions and (adult) isthmic spondylolisthesis. The secondary aim was to assess the effectiveness on return-to-work.

Methods

This systematic review followed the methods of the Cochrane Handbook for Systematic Reviews of Interventions [16], and is reported in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [17]. The protocol has been prospectively registered in PROSPERO (CRD42018083422).

Search strategy

Our search strategy included lumbar, fusion, rehabilitation, randomized controlled trial and related terms. To optimize the sensitivity of the search, no terms related to relevant outcomes were applied. This search strategy was developed in conjunction with a research librarian, peer reviewed for completeness within our team (LB, TT, TWS, LJ), and validated by testing whether it could identify eight relevant studies in PubMed and Embase. The full search strategy is outlined in Appendix A. We searched PubMed, Embase, Web of Science, PEDro, CINAHL and Cochrane Library from inception until April 28, 2021. To identify ongoing research, ClinicalTrials.gov was additionally searched. Thereafter, we scanned references of identified articles and relevant reviews. Our search output was managed in EndNote X9, which facilitated removal of duplicates in a stepwise manner [18]. After deduplication, two reviewers with complementary methodological and clinical expertise (LB, TT) independently screened titles and abstracts (phase 1) and full texts (phase 2) using blinded Rayyan software [19]. In case of disagreement, consensus was obtained after each phase by discussion and, if necessary, mediation by a third reviewer (LJ).

Eligibility criteria

RCTs investigating the effect of specified rehabilitation in the pre-, peri- and/or postoperative period of lumbar fusion on disability, pain and/or pain-related fear were eligible for inclusion (Table 1). Outcomes were narrowed from our registered protocol, representing most of the components of the International Classification of Functioning, Disability and Health (ICF) framework: pain (function), disability (activities), return-to-work (participation) and pain-related fear (personal factors), except for environmental factors as an a priori exploratory search indicated that these were not reported in this context. A pilot test was used to ensure that the eligibility criteria were applied consistently between the reviewers.

Table 1 Eligibility criteria for inclusion

Risk of bias

The quality of the included RCTs was independently assessed as ‘low’, ‘uncertain’ or ‘high’ risk of bias by two reviewers (LB, TT), using the Cochrane Collaboration Revised Risk of Bias tool for RCTs (RoB 2.0, version 22 August 2019, facilitated by Cochrane RoB 2: Learning Live series) [20, 21]. Given the nature of rehabilitation interventions, blinding of participants was not feasible. Therefore, this domain was not considered in the overall summary risk of bias judgment, which is in line with previous reviews of rehabilitation interventions [22].

Data extraction and synthesis

Data extraction was completed by two reviewers (LB, CA), using a predefined extraction form based on the TIDieR checklist (for details, see Table 2) [23]. Consistent data extraction by the two extracting authors was ensured by piloting the extraction form (on two articles).

Primary outcomes were patient-reported disability, pain, and pain-related fear at short term (≤ 6 months postoperatively) and/or long term (≥ 1 year postoperatively). The secondary outcome was return-to-work at short and/or long term. If studies reported multiple follow-up moments, data closest to three months and one year postoperatively were used for meta-analyses for short term and long term, respectively.
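The follow-up selection rule can be sketched in a few lines. This is an illustrative Python sketch, not the code used in this review (the analyses were run in R). The function name `pick_follow_up` is hypothetical, follow-up times are assumed to be expressed in months after surgery, and because the text does not state how ties were broken, the first listed time point is assumed to win.

```python
from __future__ import annotations

SHORT_TERM_MAX_MONTHS = 6    # short term: <= 6 months postoperatively
LONG_TERM_MIN_MONTHS = 12    # long term: >= 1 year postoperatively
TARGET_MONTHS = {"short": 3, "long": 12}


def pick_follow_up(months: list[float], window: str) -> float | None:
    """Return the follow-up (in months) used for pooling, or None if none qualifies.

    Hypothetical helper: keeps only time points inside the window, then picks
    the one closest to the target (3 months or 12 months).
    """
    if window == "short":
        eligible = [m for m in months if m <= SHORT_TERM_MAX_MONTHS]
    elif window == "long":
        eligible = [m for m in months if m >= LONG_TERM_MIN_MONTHS]
    else:
        raise ValueError("window must be 'short' or 'long'")
    if not eligible:
        return None
    target = TARGET_MONTHS[window]
    # Assumption: on a tie, min() keeps the first listed time point.
    return min(eligible, key=lambda m: abs(m - target))
```

For a trial reporting follow-up at 1.5, 3, 6, 12 and 24 months, this rule would select 3 months for the short-term and 12 months for the long-term analysis.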

Across all outcomes, random-effects meta-analyses were conducted of studies that were sufficiently homogeneous in terms of the rehabilitation procedure, procedure of the comparator and outcome measurement (by LB, TWS, LJ). Effect estimates were reported as relative risks (RR) with 95% confidence intervals (95% CI) for dichotomous outcomes and standardized mean differences (SMD) with 95% CI for continuous outcomes. An SMD was applied, since different valid measurement scales of the same continuous outcomes were used across studies (e.g., for pain). Based on Cohen’s interpretation of effect size, an SMD of ≥ 0.2, ≥ 0.5 and ≥ 0.8 represents a small, moderate, and large effect, respectively. Post-rehabilitation measurements were used for effect size estimation as these yield a more precise analysis for the included trials than change-from-baseline measurements (i.e., the correlation coefficient of change scores was less than 0.5) [16]. Inverse variance weighting was used for pooling, which gives studies with more precise results (narrower confidence intervals) more weight. If the sample mean and standard deviation could not be retrieved upon request from the corresponding authors, they were estimated from the reported CI or from the median and range. If multiple randomized arms were included in one RCT, each comparison was separately included but with the shared control group divided evenly among the comparisons [16]. Outliers were defined as studies in which the 95% CI of the study’s effect size was outside the 95% CI of the pooled effect size. If an outlier was detected, a sensitivity analysis was conducted by pooling the effect size again, this time excluding the identified outlier. Statistical heterogeneity among the included studies was assessed with the I² statistic, with 75% as the boundary for high heterogeneity. High statistical heterogeneity did not preclude meta-analysis, but it downgraded ratings of the quality of evidence.
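The per-study calculations described above can be illustrated with a minimal Python sketch. It is not the code used in this review, and several choices are assumptions: the SMD is computed as Hedges' g with a commonly used variance approximation, the standard deviation is recovered from a 95% CI of a group mean with the large-sample formula SD = √n × (upper − lower) / 3.92 (a t-based divisor is preferable for small groups), and all function names are hypothetical. Estimation from median and range is omitted because the method is not specified in the text.

```python
import math


def sd_from_ci(lower, upper, n, z=1.96):
    """SD of one group from the 95% CI of its mean (large-sample formula)."""
    return math.sqrt(n) * (upper - lower) / (2 * z)


def split_control(n_control, n_comparisons):
    """Divide a shared control group roughly evenly among comparisons.

    Means and SDs of the control group stay unchanged; only n is split.
    """
    base, extra = divmod(n_control, n_comparisons)
    return [base + (1 if i < extra else 0) for i in range(n_comparisons)]


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference (Hedges' g, an assumption) and its variance."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)          # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return j * d, j ** 2 * var_d


def cohen_label(smd):
    """Label |SMD| with the thresholds quoted in the text (0.2, 0.5, 0.8)."""
    size = abs(smd)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "below small threshold"   # label not defined in the text
```

Applied to the pooled short-term disability estimate for exercise versus usual care reported in the Results (SMD −0.41), `cohen_label` returns "small".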
Publication bias could not be explored with funnel plots, since fewer than ten studies were included in our meta-analyses. All statistical analyses and visualizations of data were performed in R software (version 4.0.3), using the meta package [24,25,26].
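The pooling, heterogeneity and outlier rules described in this section can be sketched as follows. This Python sketch is illustrative only and does not reproduce the R analysis. The between-study variance is estimated with the DerSimonian-Laird method, which is an assumption because the estimator is not stated in the text, and the function names are hypothetical.

```python
import math

HIGH_I2 = 75.0  # percent; boundary for high heterogeneity used in this review


def pool_random_effects(effects, variances, z=1.96):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, assumed)."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q and the derived tau^2 and I^2
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Random-effects weights add tau^2 to each study variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {"pooled": pooled, "ci": (pooled - z * se, pooled + z * se),
            "tau2": tau2, "i2": i2}


def find_outliers(effects, variances, pooled_ci, z=1.96):
    """Indices of studies whose 95% CI lies entirely outside the pooled 95% CI."""
    low, high = pooled_ci
    outliers = []
    for i, (y, v) in enumerate(zip(effects, variances)):
        half_width = z * math.sqrt(v)
        if y + half_width < low or y - half_width > high:
            outliers.append(i)
    return outliers


def sensitivity_analysis(effects, variances):
    """Pool all studies, then pool again without any detected outliers."""
    full = pool_random_effects(effects, variances)
    dropped = find_outliers(effects, variances, full["ci"])
    kept = [i for i in range(len(effects)) if i not in dropped]
    reduced = pool_random_effects([effects[i] for i in kept],
                                  [variances[i] for i in kept])
    return full, dropped, reduced
```

With invented inputs in which one of seven studies reports a much larger effect than the rest, `sensitivity_analysis` flags that study, and re-pooling without it gives a smaller pooled effect and lower I², which is the pattern a sensitivity analysis of this kind is designed to reveal.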

Certainty of evidence

The certainty of evidence was evaluated for each pooled estimate according to the GRADE system, as high, moderate, low, or very low [27]. The GRADE profile was downrated from high quality by one level for each of the following limitations: low methodological quality, inconsistency, indirectness, imprecision, or publication bias (operational rules are outlined in Table 3).
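The downrating rule stated above amounts to a simple step function, sketched here in Python for illustration. The names `grade_certainty` and `LIMITATIONS` are hypothetical, and the operational rules in Table 3 that decide whether a given limitation is present are not reproduced.

```python
LEVELS = ["high", "moderate", "low", "very low"]
LIMITATIONS = {
    "low_methodological_quality",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication_bias",
}


def grade_certainty(limitations):
    """Start at 'high' and downrate one level per limitation present."""
    present = set(limitations)
    unknown = present - LIMITATIONS
    if unknown:
        raise ValueError(f"unknown limitation(s): {sorted(unknown)}")
    # The rating cannot fall below 'very low'.
    return LEVELS[min(len(present), len(LEVELS) - 1)]
```

For example, a pooled estimate judged to have low methodological quality and imprecision would be rated "low".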

Results

A total of 4425 records were identified through electronic database searching (Fig. 1). After removal of duplicates, 2085 titles and abstracts were screened; and subsequently 86 full-text articles were reviewed for eligibility. Finally, 21 articles, reporting data from 18 different RCTs were included, with a total of 1402 participants (mean age 43–61 years, 57% female). Indications and fusion techniques varied across and within studies. The most frequently described indications for lumbar fusion surgery were degenerative disc disease (39%) and spondylolisthesis (25%) (Appendix B). All articles were published in 2003 or later, and the trials were conducted in Europe (n = 15), Asia (n = 2) or Africa (n = 1).

Fig. 1 Study selection flowchart, according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) diagram. RCT randomized controlled trial. *When multiple publications reported data from the same RCT, the first publication was referred to as primary publication and any additional publications as companion reports. Companion reports without relevant outcomes were excluded

Table 2 provides an overview of the data extraction. The 18 included trials investigated 21 different rehabilitation interventions in total.

Table 2 Description of study characteristics and key intervention items according to TIDieR guide

As Fig. 2 shows, rehabilitation was either initiated preoperatively (n = 6); postoperatively within three months (n = 8), from three months (n = 6), or unspecified (n = 1), yet all rehabilitation interventions included a postoperative rehabilitation component. Ten trials provided follow-up beyond one year.

Fig. 2 Overview of the timing, duration, intensity and outcomes of the included studies. Timing and duration are visualized by the number of weeks, intensity is indicated by the number of sessions (and duration per session in minutes)

Nine interventions consisted of multimodal rehabilitation [28,29,30,31,32,33,34,35,36]. Of these, eight compared this multimodal rehabilitation to exercise therapy alone, and were included for meta-analyses [28,29,30,31,32,33,34,35]. All multimodal rehabilitation interventions featured exercise training, most often combined with cognitive behavioral therapy (CBT) (n = 5), with fear-avoidance counseling (n = 1), case manager guidance (n = 1) or education and peer support (n = 1). Despite their multimodal nature, these interventions were mostly provided by physiotherapists (n = 5) rather than by a multidisciplinary rehabilitation team (n = 3).

On the other hand, six interventions including exercise therapy alone were compared to usual care, and were included for meta-analyses [37,38,39,40,41]. Although they shared similar durations of at least six weeks, the exercise methods varied. All exercise interventions used strength training, and in two studies this was combined with cardiovascular conditioning [37, 39].

Due to heterogeneity in the content of the remaining six unimodal interventions (i.e., occupational therapy, psychological therapy, peer support and three different types of passive physiotherapy), these could not be included in a meta-analysis [37, 42,43,44,45].

Risk of bias

As shown in Fig. 3, the overall risk of bias was scored as unclear (n = 13; 72%) or high (n = 5; 28%). A high proportion of studies had an unclear or high risk of selective outcome reporting. This was mainly explained by a lack of registered protocols in the majority of RCTs (n = 11; 61%).

Fig. 3 Risk of bias assessment using RoB 2.0. Since blinding of participants is not feasible in rehabilitation interventions, thereby leading to a high risk of bias in outcome measurement, this domain was not considered in the overall risk of bias assessment, as outlined in the Methods section

Certainty of evidence and sensitivity analysis

A summary of pooled effect sizes and GRADE quality ratings is provided in Table 3. A sensitivity analysis showed that one outlier in the meta-analysis of the effect of multimodal rehabilitation on disability and pain, Monticone et al., contributed strongly to heterogeneity and possibly led to an overestimation of the effect size. This could partly be explained by clinical variation between the intervention of Monticone et al. and other multimodal rehabilitation interventions (i.e., a more dose-intense rehabilitation program and a less well-described population). Therefore, this outlier was excluded [32], leading to a decrease in pooled effect size and a reduction from high to low heterogeneity.

Table 3 Overview of estimated effect of rehabilitation interventions according to their content and GRADE assessment

Effects on disability and pain (primary outcomes)

Effects on disability were reported for 13 interventions (five exercise and eight multimodal interventions), using the Oswestry Disability Index (ODI) [28,29,30, 32,33,34,35,36, 38, 40,41,42] or the Roland Morris Disability Questionnaire (RMDQ) [39]. Effectiveness on pain was measured with Visual Analog Scale (VAS) [29, 30, 33, 35, 38,39,40,41, 45], Numeric Rating Scale (NRS) [32] or Low Back Pain Rating Scale (LBPRS) [28, 34, 36, 37], for six exercise and eight multimodal interventions.

Exercise vs usual care

There is low-quality evidence that an exercise intervention was more effective than usual care for reducing disability at short term (four trials with a total of five interventions and 180 participants, SMD with 95%CI: −0.41 [−0.71; −0.10]) (Fig. 4). Only one study with a high overall risk of bias investigated the long-term effect of exercise treatment on disability, and reported no significant differences between exercise and usual care (SMD with 95%CI: −0.10 [−0.85; 0.66]) [40].

Fig. 4 Forest plot for the meta-analysis of the effectiveness of exercise versus usual care for reducing disability and pain. All studies are ordered from most to least effective. Random-effects model was used. Negative effect sizes favor exercise therapy

Low-quality evidence from five studies (235 participants) indicated significantly more pain reduction after rehabilitation with an exercise component (SMD with 95%CI: −0.36 [−0.65; −0.08]). The pooled results of two studies (82 participants) provided low-quality evidence for no difference in the long term (SMD with 95%CI: −0.10 [−0.53; 0.34]).

Multimodal rehabilitation vs exercise

Participants who received a multimodal rehabilitation intervention (n = 255), which was in more than half of the patients initiated preoperatively, showed less disability at short-term follow-up than those who received only exercise therapy (n = 235) (SMD with 95%CI: −0.31 [−0.49; −0.13], low-quality evidence, six trials) (Fig. 5).

Fig. 5 Forest plot for the meta-analysis of the effectiveness of multimodal rehabilitation versus exercise alone for reducing disability and pain. All studies are ordered from most to least effective. Random-effects model was used. Negative effect sizes favor multimodal rehabilitation

In the long term, the pooled result of five trials (including 394 participants) provided low-quality evidence for no significant effect on disability (SMD with 95%CI: −0.18 [−0.49; 0.14]).

For pain, low-quality evidence suggests no significant effect of multimodal rehabilitation compared to exercise alone at both short term (SMD with 95%CI: −0.23 [−0.51; 0.04], five trials with 450 participants) and long-term follow-up (SMD with 95%CI: −0.16 [−0.37; 0.05], four trials with 350 participants) (Fig. 5).

Peer support, occupational therapy, psychological intervention, or passive physiotherapy vs usual care

Christensen et al. compared a postoperative ‘back café’ to usual care. There was no group difference in back pain at two-year follow-up, and whereas peer support improved the ability to raise a chair, carry a bag and take stairs, no superiority was reported for the other daily functions [37]. Also, occupational therapy guided by a questionnaire in the immediate postoperative period was not associated with better daily functioning performance [44]. In contrast, Reichart et al. demonstrated that participants receiving a short perioperative psychological intervention to increase their self-efficacy reported less pain and better functionality than those receiving usual care [45].

Two trials investigated the effectiveness of passive, postoperative physiotherapeutic interventions. More specifically, Elsayyad et al. [42] reported less disability and pain when myofascial release or neural mobilization (in the form of longitudinal traction) was added to stabilization exercises compared to stabilization exercises only. On the other hand, Zhao et al. [43] favored acupuncture over complete bedrest for six weeks to improve functioning, although the difference did not reach the minimal clinically important difference (MCID) for the Japanese Orthopaedic Association (JOA) score. Due to this striking contrast in comparator between both RCTs, those interventions were excluded from meta-analysis.

Effects on pain-related fear (primary outcome)

The effects on pain-related fear were reported in seven studies including five multimodal, one psychological and one exercise alone intervention, using the Tampa Scale of Kinesiophobia (TSK) [29, 30, 32, 33, 41] or Fear-Avoidance Beliefs Questionnaire (FABQ) [28, 45].

Exercise vs usual care

One study of uncertain quality including 37 participants showed no significant difference in pain-related fear between exercise and usual care at six weeks postoperative (SMD with 95%CI: −0.25 [−0.90; 0.40]) and attributed this partly to the absence of a longer follow-up [41].

Multimodal rehabilitation vs exercise

Participants who received a multimodal intervention showed less pain-related fear at short term, compared to those who received exercise therapy alone (four RCTs with 412 participants; observed SMD with 95%CI ranging from −0.02 [−0.40; 0.37] to −1.10 [−1.47; −0.73], low-quality evidence). At long-term follow-up, however, no significant difference in pain-related fear was present between participants of the multimodal intervention or those of the exercise intervention (four RCTs, including 409 patients; observed SMD with 95%CI ranging from 0.00 [−0.40; 0.40] to −1.91 [−2.33; −1.50], low-quality evidence) (Fig. 6). Both estimates were imprecise owing to the low absolute sample sizes, as indicated by the width of the confidence interval. High statistical heterogeneity across trials was present, yet no outlier was detected, and an additional sensitivity analysis was not performed because of the low number of trials.

Fig. 6 Forest plot for the meta-analysis of the effectiveness of multimodal rehabilitation versus exercise alone for reducing fear-avoidance. All studies are ordered from most to least effective. Random-effects model was used. Negative effect sizes favor multimodal rehabilitation

Psychological intervention vs usual care

At short-term follow-up, Reichart et al. described a trend towards an increase in fear-avoidance beliefs after usual care and a decrease after a psychological intervention (p = 0.11). This study was limited by an uncertain risk of bias, sample of 39 participants and a follow-up of only 6 weeks [45].

Effects on return-to-work (secondary outcome)

Four studies evaluated the efficacy of specific rehabilitation on return-to-work at long-term follow-up [28, 29, 35, 37]. Taken together, the estimated relative risk for return-to-work tends to favor rehabilitation modes of various content (i.e., peer support, occupational therapy, exercise, multimodal rehabilitation). However, this difference was not statistically significant (pooled RR with 95%CI: 1.30 [0.99–1.69]) (Fig. 7).

Fig. 7 Relative risk (RR) of return to work at long-term follow-up (Rolving et al. at 1 year, the remaining studies at 2 years postoperative) of rehabilitation interventions versus control group
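For readers less familiar with relative risks, the following Python sketch shows how an RR and its 95% CI are conventionally obtained for a single two-arm comparison, and why an interval that includes 1 is read as not statistically significant. The counts in the example are invented, the function names are hypothetical, and the sketch does not reproduce the pooled analysis reported here.

```python
import math


def relative_risk(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """RR of an event (e.g., return to work) with a CI built on the log scale."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, (low, high)


def ci_excludes_null(low, high, null_value=1.0):
    """True when the interval lies entirely on one side of the null value."""
    return low > null_value or high < null_value


# Invented example: 30 of 50 returned to work after rehabilitation
# versus 20 of 50 in the control group.
rr, (low, high) = relative_risk(30, 50, 20, 50)
```

In this invented example the RR is 1.5, but the interval still spans 1. The same reading applies to the pooled estimate reported above: `ci_excludes_null(0.99, 1.69)` is false, so the tendency towards more return-to-work after rehabilitation is not statistically significant.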

Discussion

The results of this systematic review and meta-analysis indicate that exercise is likely to reduce disability and pain up to six months after lumbar fusion. Moreover, multimodal rehabilitation combining exercise training with CBT, peer support or counseling, is associated with a greater reduction in disability and pain-related fear than exercise alone. It is uncertain, however, which effects of exercise and multimodal rehabilitation persist in the long term and to what extent they remain beneficial. Also, since multimodal rehabilitation was compared to exercise, the magnitude of effect of multimodal rehabilitation compared to no rehabilitation remains unclear.

Exercise therapy reduces pain up to six months after lumbar fusion, when compared to usual care. ‘Usual care’ varied between studies but mostly consisted of providing information and postoperative mobilization. It is unclear whether exercise-induced hypoalgesia is the mechanism that explains the pain reduction. While in healthy persons pain and pain sensitivity decrease during and shortly after exercise, the evidence for exercise-induced hypoalgesia in patients with chronic pain is less substantiated [46]. Multimodal rehabilitation has no additional effect on pain when compared to exercise in isolation. For disability, multimodal rehabilitation seems to be more effective than exercise alone at short-term follow-up.

Greenwood et al. included two RCTs in their meta-analysis and concluded that multimodal rehabilitation reduces disability and pain-related fear in both short and long-term follow-up. The current findings confirmed this beneficial effect of multimodal rehabilitation at short term [9]. In contrast, no significant benefit of multimodal rehabilitation in the long term was detected in our meta-analysis. Greenwood’s conclusion was skewed by inclusion of Monticone et al., while the current review clearly identified this study as an outlier.

In patients undergoing lumbar surgery, greater fear of movement is associated with higher levels of pain, more disability and poorer quality of life [47,48,49]. Several authors, therefore, have pointed to fear-avoidance as a potential treatment target in rehabilitation after lumbar surgery [47, 50]. Recently, Hanel et al. [22] demonstrated in their meta-analysis that exercise training effectively reduces fear-avoidance in a population with chronic low back pain. A single study included in our review could not confirm a fear-reducing effect of exercise alone in patients undergoing lumbar fusion [41]. However, the combination of exercise with psychosocial modalities was associated with less fear-avoidance up to six months after lumbar fusion. Given the high prevalence of fear-avoidance in patients with chronic musculoskeletal pain (56%) [51], a multimodal framework should be considered for patients undergoing lumbar fusion. In particular, patients with pain-related fear and, by extension, other interfering psychological components outlined in the fear-avoidance model of Vlaeyen and Linton (e.g., anxiety and depression) [52] could benefit from multimodal rehabilitation tailored to their specific characteristics and needs. Besides avoidance of activities, persistence in pain-provoking activities, or a combination of persistence and avoidance, is also a well-known maladaptive coping strategy that may guide therapeutic approaches. It should be pointed out, however, that none of the included multimodal interventions preselected patients based on their psychological profile or coping strategy.

Compared to prehabilitation in other orthopedic interventions such as hip and knee replacement, prehabilitation for lumbar fusion is still in its infancy. The fact that the majority (71%) of RCTs in this review skipped the preoperative period and only started rehabilitation postoperatively may partly be an expression of prehabilitation being “unknown, unloved”. Four RCTs started preoperatively with CBT, but could not demonstrate less disability at last follow-up, which is in line with a recent meta-analysis that provided very low to low-certainty evidence that preoperative CBT is not effective for disability in patients undergoing lumbar surgery [15]. Nevertheless, preoperative physiotherapy and psychological therapy improved pain after lumbar fusion surgery in the studies of Nielsen et al. and Reichart et al., respectively [39, 45]. Overall, we hope to set the scene for new (needed) studies rethinking rehabilitation across the entire care continuum of lumbar fusion to unravel opportunities for value improvement. Given that all interventions that started preoperatively also continued postoperatively, we were not able to distinguish between prehabilitation and postoperative rehabilitation. Consequently, the optimal rehabilitation period (preoperatively, postoperatively or both) remains unclear and in need of further investigation.

One unexpected finding is the variability of reported restrictions in the included trials, reflecting uncertainty among authors about whether and which restrictions are necessary following lumbar fusion. Restrictions ranged from prohibition of sports for three or six months [29, 37], or postoperative bracing [41], to six weeks of complete bedrest [43]. Notably, overgeneralizing (unnecessary) restrictions may fuel iatrogenic pain-related fear and fear of movement, which are reported barriers to physical activity [53]. Restrictions not tailored to patient- and technique-specific factors may thereby jeopardize the effects of rehabilitation interventions and a timely return-to-work. Hence, there is a need for evidence on the impact of postoperative restrictions, and future research should clearly report the restrictions that were imposed.

Our results suggest a tendency towards a higher return-to-work ratio after participation in a rehabilitation intervention compared to the control condition in the long run. It would be interesting to also map out the time to return to work; however, this was precluded by underreporting of return-to-work at short-term follow-up in the included studies. Even small improvements in return-to-work timeframes may have a large impact on patients and society. In this light, future rehabilitation trials should consistently measure return-to-work, starting shortly after lumbar fusion surgery.

Based on our meta-analysis, exercise as a centerpiece of a multimodal framework is suggested. To translate this framework into a more detailed blueprint ready for clinical use, perspectives from the important stakeholders, such as patients, their caregivers, and policy makers, need to be included.

Study limitations

This study has several limitations. First, the small number of eligible trials, all with an unclear (72%) or high (28%) risk of bias, limited the level of evidence to low. Nonetheless, 15 additional RCTs were identified since the previous meta-analysis of Greenwood et al. [9]. Second, due to language other than English or Dutch, one record could not be retrieved, and one full-text article was excluded. Third, most trials were conducted in European countries (83%). Six author groups were affiliated with the same university in Denmark [28, 34,35,36,37, 44], thereby potentially limiting generalizability to other settings.

Fourth, rehabilitation interventions and comparisons were often insufficiently described. To enhance transparency and enable replication of exercises and other modalities, future studies should follow description guidelines [23, 54, 55]. Moreover, transparency of trials also requires prospective protocol registration, which was only present in a minority of included trials.

Finally, the comparison of multimodal rehabilitation with exercise had a large degree of statistical heterogeneity, as indicated by an outlier and a large I² statistic. Inclusion in the meta-analysis was based on sufficient clinical homogeneity in terms of rehabilitation modality. Remaining clinical heterogeneity could be related to differences in timing, duration, intensity and setting of the rehabilitation. Additionally, it is possible that failure to achieve surgical goals (e.g., unsuccessful fusion, alignment or decompression) interferes with the long-term effects of rehabilitation. The inclusion of different fusion techniques and indications across RCTs may imply variable structural success rates. Surprisingly, four included RCTs reported non-instrumented fusion [28, 29, 39, 44], which increases the risk of pseudarthrosis. Given the paucity of surgical success data in the included studies, we could not correct for this variability. One study with uncertain risk of bias and no description of the fusion technique used reported an effect size on disability and pain much larger than any of the other included studies. This result is presumably attributable to the very high intensity of the rehabilitation program [32]. Exclusion of this outlier from the meta-analyses substantially reduced heterogeneity and the magnitude of the summary effect sizes. This observation raises the question of whether rehabilitation shows a dose–response effect, which should be investigated in future research.

Conclusions

The results of this systematic review with meta-analysis encourage exercise for all patients undergoing lumbar fusion given the positive impact on disability and pain up to six months postoperative. Embedding exercise in a multimodal rehabilitation context is suggested given the additional positive effect on disability and pain-related fear, compared to exercise alone. It remains uncertain if these beneficial effects of exercise and multimodal rehabilitation persist in the long term. Additional high-quality research is needed to evaluate these long-term functional and work-related outcomes and to establish the optimal period (pre-, postoperative or both) and dose of rehabilitation.