Introduction

Lumbar degenerative disc disease (LDDD) is the leading cause of low back pain, a major health problem that carries a significant economic burden [1, 2]. Lumbar fusion has been a gold standard procedure for treating LDDD that is unresponsive to conservative treatment. However, stiffness and adjacent segment degeneration may lead to poor clinical outcomes and more complications in the long term [3, 4]. As an alternative technique, total disc replacement (TDR) has received increasing attention in recent years. This procedure is designed to maintain the motion of the operated level and to prevent adjacent segment degeneration [5, 6]. However, whether TDR or lumbar fusion is the optimal procedure for treating LDDD remains unclear.

Several randomised clinical trials (RCTs) have been conducted to compare TDR with lumbar fusion, but their findings conflict regarding which procedure is better [7–14]. In light of this, many meta-analyses of RCTs, representing the highest level of evidence, have been published to compare these two procedures for the treatment of LDDD. However, these overlapping meta-analyses also showed discordant findings [15–19]. For example, Nie et al. [17] concluded that TDR shows significant superiority over fusion for treating LDDD, whereas Yajun et al. [18] did not demonstrate significant superiority of TDR over lumbar fusion. These inconsistent findings have created uncertainty for decision makers regarding the surgical treatment of LDDD.

In recent years, systematic reviews of overlapping meta-analyses have been widely published in many medical fields [20–23]. These studies help to select the highest-quality evidence for decision-making by evaluating meta-analyses with discordant results on a given topic [20–23]. However, to the best of our knowledge, there is no systematic review of overlapping meta-analyses investigating the relative effects of TDR and fusion for LDDD. The objective of this study was to perform a systematic review of overlapping meta-analyses regarding TDR versus lumbar fusion for the treatment of LDDD, to assist decision makers in selecting among conflicting meta-analyses, and to provide treatment recommendations based on the best available evidence.

Materials and methods

This study was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [24]. The design of this study was based on previous similar publications [20–23].

Literature search

On July 10, 2015, the PubMed, EMBASE, and Cochrane Library databases were systematically searched. The following keywords were used: lumbar, arthroplasty, prosthesis, replacement, arthrodesis, fusion, low back pain, intervertebral disc degeneration, degenerative disc disease, systematic review, and meta-analysis. The search was performed independently by two authors and was limited to articles in English. The references of the included studies were also checked to find potential meta-analyses. The titles and abstracts were reviewed first, and the full texts were obtained if the information was insufficient. Disagreements were settled by discussion, and a third author was consulted when necessary.

Eligibility criteria

The inclusion criteria of this systematic review were: (1) comparison of TDR with fusion for treating LDDD; (2) meta-analysis exclusively including RCTs; (3) at least one reported outcome (e.g., functional scores and complications). Narrative reviews, meeting abstracts, correspondence, meta-analyses comprising non-RCTs, and systematic reviews without a meta-analysis were excluded.

Data extraction

Two authors independently extracted the following data from the included studies: first author, year of publication, databases searched, primary study design, the number of RCTs included, heterogeneity or subgroup analyses of primary studies, and meta-analysis results. When disagreements occurred between the two authors, a third author was consulted.

Quality assessment

The methodological quality was evaluated using the Oxford Levels of Evidence [25] and the Assessment of Multiple Systematic Reviews (AMSTAR) instrument [26]. AMSTAR has been shown to be a methodological assessment tool with good reliability, validity, and responsiveness [27, 28]. It is widely used to assess the quality of systematic reviews [20–23]. Two authors independently evaluated the quality of the included meta-analyses. Disagreements between authors were settled by discussion, and a third author was consulted if necessary.
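As a minimal sketch of how an AMSTAR total is commonly tallied, the snippet below counts the "yes" answers on the 11-item checklist and summarises several reviews by their median. The function name `amstar_score` and the two example checklists are hypothetical and are not the ratings assigned in this study.

```python
from statistics import median

AMSTAR_ITEMS = 11  # the AMSTAR checklist has 11 items


def amstar_score(answers):
    """Return the number of 'yes' answers on one completed checklist."""
    if len(answers) != AMSTAR_ITEMS:
        raise ValueError(f"expected {AMSTAR_ITEMS} answers, got {len(answers)}")
    return sum(1 for answer in answers if answer.strip().lower() == "yes")


# Hypothetical ratings for two reviews (illustration only, not study data).
review_a = ["yes"] * 8 + ["no"] * 3
review_b = ["yes"] * 6 + ["can't answer"] * 2 + ["no"] * 3

scores = [amstar_score(review_a), amstar_score(review_b)]
print("scores:", scores, "median:", median(scores))
```

Counting only "yes" answers treats "no", "can't answer", and "not applicable" alike, which is a simplifying assumption of this sketch.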

Application of Jadad decision algorithm

The Jadad decision algorithm was applied to investigate the sources of inconsistency among systematic reviews, comprising differences in clinical question, inclusion and exclusion criteria, data extraction, quality assessment, data pooling, and statistical analysis [29]. It has been widely used to provide treatment recommendations among meta-analyses with discordant results [20–23, 29]. This algorithm was applied independently by three authors, who reached a consensus regarding which meta-analysis provided the best available evidence.

Results

Literature search

A flow chart of the study selection is shown in Fig. 1. A total of 502 titles were identified by the literature search. Five meta-analyses met the inclusion criteria [15–19]. The characteristics of these included studies are listed in Table 1. These studies were published between 2010 and 2015. The primary studies of the included meta-analyses were published between 2005 and 2011, and the number of primary trials ranged from 5 to 7 (Table 2).

Fig. 1 The flow chart of study selection

Table 1 The characteristics of the included meta-analyses
Table 2 Primary studies included in meta-analyses

Search methodology

Three of the included meta-analyses included only English-language literature [15–17], and the other two had no language restriction [18, 19]. Embase and Medline (PubMed) were searched in all included meta-analyses, whereas the use of the Cochrane Library, OVID, and BIOSIS in the search strategy was inconsistent among the studies. The search methodology used in the included meta-analyses is shown in Table 3.

Table 3 Search methodology of the included studies

Methodological quality

All meta-analyses included only RCTs and were classified as Level-II evidence according to the Oxford Levels of Evidence (Table 4). Only one meta-analysis reported using GRADE [19]. The AMSTAR scores of the included meta-analyses are listed in Table 5 and ranged from 6 to 9 (median 7). A Cochrane review with an AMSTAR score of 9 was the highest-quality study [19].

Table 4 Methodological information for the included studies
Table 5 AMSTAR scores for the included studies

Heterogeneity assessment

The I² statistic, a measure of between-study variability, was used to evaluate heterogeneity in each meta-analysis (Table 6) [15–19]. Three studies performed sensitivity analyses according to methodological quality [16, 18, 19] (Table 4). One meta-analysis did not conduct sensitivity or subgroup analysis [15] (Table 6).
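For readers unfamiliar with this heterogeneity statistic, the sketch below computes it from Cochran's Q under a fixed-effect, inverse-variance weighting, using the usual definition I² = max(0, (Q − df)/Q) × 100%. The function name `i_squared` and the example effect sizes are hypothetical and are not data from the included meta-analyses, which may have used different software and models.

```python
def i_squared(effects, variances):
    """Percentage of variability across studies attributed to heterogeneity.

    effects   -- per-study effect estimates (e.g., mean differences)
    variances -- per-study variances of those estimates (all > 0)
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Inverse-variance weights and the fixed-effect pooled estimate.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1

    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical example: two studies with equal variance.
print(i_squared([0.0, 1.0], [0.25, 0.25]))
```

Negative values of (Q − df)/Q are truncated to zero, so the statistic always lies between 0% and 100%.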

Table 6 Heterogeneity or subgroup analyses of primary studies

Results of Jadad decision algorithm

The Jadad decision algorithm [29] was followed to determine which of the five included meta-analyses represented the best available evidence. The results of the included meta-analyses are shown in Fig. 2. Because the included studies investigated the same question but did not comprise the same trials, and their selection criteria were discordant, the Jadad decision algorithm indicated that the best available evidence should be chosen according to the publication status and methodological quality of the primary trials, language restrictions, and analysis of data on individual patients. Hence, a high-quality Cochrane review was selected (Fig. 3) [19]. This review concluded that statistically significant differences were observed between TDR and fusion for LDDD regarding disability, pain relief, and pain in the short term, but that these differences did not exceed the thresholds for clinical importance. The preventive effects on adjacent segment disease and facet joint degeneration, stated by the manufacturers as the primary goal of adopting TDR, were not appropriately evaluated [19].
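The branching of the decision algorithm can be sketched as a small function. This is a simplified illustration, not the authors' implementation: the function name, parameter names, and returned labels are hypothetical, and only the final branch (same question, different trials, different selection criteria) is described in this section. The other branches are written from the general description of the algorithm and should be checked against the original publication [29].

```python
def jadad_next_step(same_question, same_trials=False,
                    same_quality=False, same_selection_criteria=False):
    """Return what to do or compare next for a set of discordant reviews."""
    if not same_question:
        return ["select the review closest to the clinical question of interest"]

    if same_trials:
        if same_quality:
            # Same trials and same quality: look at how data were handled.
            return ["data extraction", "heterogeneity testing", "data synthesis"]
        return ["select the review with the highest methodological quality"]

    if same_selection_criteria:
        # Different trials despite the same criteria: look at the searches.
        return ["search strategies", "application of selection criteria"]

    # Different trials and different selection criteria.
    return [
        "publication status of primary trials",
        "methodological quality of primary trials",
        "language restrictions",
        "analysis of data on individual patients",
    ]


# The path reported in this section: same question, different trials,
# discordant selection criteria.
for factor in jadad_next_step(same_question=True, same_trials=False,
                              same_selection_criteria=False):
    print(factor)
```

The function only names the factors to compare; weighing them to pick a single review remains a consensus judgement by the assessors.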

Fig. 2 Results of the included meta-analyses

Fig. 3 The flow chart of the Jadad decision algorithm

Discussion

To the best of our knowledge, this is the first systematic review of overlapping meta-analyses regarding TDR versus fusion for LDDD. This study may help surgeons to understand the current best evidence on this topic and assist decision makers in selecting among conflicting meta-analyses. Five meta-analyses [15–19] were included following a comprehensive literature search. Most of the meta-analyses identified by the literature search were published within a similar period, but they did not comprise the same primary trials and did not reach the same conclusions regarding the treatment of LDDD [15–19]. According to the Jadad decision algorithm, Jacobs et al. [19] was selected as the current best available evidence on this topic. This systematic review of overlapping meta-analyses suggests that TDR may be an effective intervention for selected patients with LDDD and is at least equal to fusion in the short term. However, given that disadvantages may appear after years, spine surgeons should be cautious about performing TDR on a large scale.

Our study demonstrated that there were discordant results among the included meta-analyses. Some meta-analyses [18, 19] showed that TDR did not have significant superiority over fusion for the treatment of LDDD. Therefore, the benefits of motion preservation cannot yet be established. However, the other meta-analyses [15–17] concluded that TDR showed significant safety and efficacy comparable to lumbar fusion. The possible sources of inconsistency among meta-analyses have been analysed and reported by Jadad et al. [29], including the clinical question, study selection and inclusion, data extraction, assessment of study quality, assessment of the ability to combine studies, and statistical methods for data synthesis. Moreover, a decision algorithm was designed to choose the highest-quality evidence from discordant systematic reviews [29]. This decision tool, which was adopted in this study, has been widely used to find the best available evidence among overlapping systematic reviews [20–23].

Jacobs et al. [19] was the current best available evidence on the comparison of TDR and fusion for LDDD. It demonstrated that TDR was superior to lumbar fusion in Oswestry disability index, visual analogue scale, back pain, patient satisfaction, implant motion, and subsidence [19]. There were no differences between TDR and lumbar fusion in leg pain, proportion of full-time and part-time work, reoperation rate, blood loss, radiographic loosening, and adjacent segment and facet joint degeneration. However, Jacobs et al. [19] considered that adjacent segment and facet joint degeneration had not been properly evaluated, which is a shortcoming because preventing such degeneration is the reason the disc prosthesis was developed. Thus, Jacobs et al. [19] concluded that, because disadvantages may appear after years, spine surgeons should be cautious about applying TDR on a large scale, even though TDR may be an effective technique for treating selected patients with LDDD and is at least equal to fusion in the short term.

Although Jacobs et al. [19] provided the best evidence, it should be recognised that several factors may have influenced their findings. First, most of their results were pooled from fewer than four primary RCTs, although six studies were included in their review. This may be mainly because different studies adopted different outcome measures, so the results could not be pooled using all the data in the included studies. Therefore, more RCTs with similar outcome assessments should be performed in the future. Second, many new techniques, such as minimally invasive spine surgery [30, 31], have been increasingly used in treating LDDD in recent years. These procedures may further improve the outcomes of spine surgery for the treatment of LDDD [31, 32]. However, these new techniques could not be adequately discussed in their review because only a few RCTs were included. Third, their review was published in 2012 and the most recent included RCT was published in 2011. It could not include newer RCTs published in recent years, which may strengthen or weaken the conclusions. Therefore, the meta-analysis regarding TDR versus fusion for the treatment of LDDD should be updated in the future.

The included meta-analyses were published in five journals, including three orthopaedic journals: European Spine Journal, International Orthopaedics, and Archives of Orthopaedic and Trauma Surgery. The study by Jacobs et al. [19], which was chosen according to the Jadad decision algorithm, was published in the Cochrane Database of Systematic Reviews. This is a specialist journal in the field of evidence-based medicine and has the highest impact factor among the included journals. This indicates that some high-level studies in evidence-based orthopaedics may not necessarily be published in specialist orthopaedic journals.

There are several limitations in this study. First, the literature search was limited to articles published in English. Non-English literature could not be included in this systematic review, although multiple databases were searched. Second, to obtain the highest level of evidence, only meta-analyses comprising RCTs exclusively were included in this study. However, all the included studies were Level-II evidence. Therefore, this systematic review could not provide treatment recommendations based on Level-I evidence. Third, this study could not assess long-term results, because almost all of the primary studies only have data for 2 years, whereas long-term complications, such as adjacent segment disease, may only be adequately assessed with more than 10 years of follow-up.

Conclusion

This is the first systematic review of overlapping meta-analyses comparing TDR with fusion for the treatment of LDDD. This systematic review showed that there are conflicting results among these overlapping meta-analyses. Based on this systematic review of overlapping meta-analyses, the best available evidence indicated that TDR compared with fusion for LDDD had statistically, but not clinically, significant superiority regarding disability, pain relief, and quality of life in a selected group of patients in the short term. The prevention of adjacent segment and facet joint degeneration, noted by the manufacturers as the primary reason for adopting TDR, was not appropriately evaluated. Hence, considering that disadvantages may appear after years, spine surgeons should be cautious about applying TDR on a large scale, even though TDR may be an effective technique for treating selected patients with LDDD and is at least equal to lumbar fusion in the short term.