Introduction

The current first-line treatment for chronic low back pain (CLBP) is rehabilitation, preferably multimodal with cognitive behavioural therapy included. If this fails, there is the option of lumbar surgery for a subgroup of patients with identified disc degeneration and level-specific pain. Lumbar spinal fusion has long been considered the gold standard for surgical treatment of CLBP due to degenerative disc disease (DDD) [1, 2].

However, four randomized controlled studies could not determine whether spinal fusion or non-operative treatment is preferable; differences among the non-operative regimes make comparisons even more difficult [3–7].

The purpose of fusion is to eliminate painful motion in an assumed symptomatic motion segment; however, decreased segmental mobility is also thought to increase the stress on the neighbouring segments, particularly the adjacent segment [8–11]. This could lead to recurrence of symptoms, known as adjacent segment disease (ASD), and a subsequent need for further surgery. Total disc replacement (TDR) has been developed to avoid the negative effects of arthrodesis.

In previously performed randomized studies comparing disc arthroplasty with fusion, clinical outcome was similar for the two methods, with disc prosthesis having a slight advantage [12–15]. The results of one recently performed randomized controlled trial (RCT) indicated that TDR may yield a better clinical outcome than a multimodal rehabilitation program [16].

There is a lack of high-level evidence concerning long-term follow-up for both fusion and TDR, particularly TDR; the follow-up rate at 5 years was very low in the two reported RCTs that compared fusion and TDR [14, 17]. Moreover, TDR has been associated with complications, including types of complications that are not seen after fusion surgery, such as malposition and subsidence [18].

Recently, a Cochrane review of TDR for CLBP in the presence of DDD concluded that high-quality evidence for long-term results is missing. Although significant differences have been seen, differences in parameters that are generally accepted as being of clinical importance, such as short-term pain relief, disability, and quality of life have not yet been shown [19]. Thus, there is a need for further investigations that evaluate the long-term clinical results of fusion and TDR and the incidence of ASD.

The objective of this study was to evaluate the long-term clinical outcome of the two surgical interventions, TDR and lumbar spinal fusion. This study reports the 5-year results from an RCT that was previously reported on after 2 years of follow-up [15].

Materials and methods

Study design

The original study was a previously published RCT [15] comparing TDR with instrumented lumbar spinal fusion and was approved by the Ethics Committee of the Karolinska Institute in 2003 (03-268). Detailed information about the study design and randomization process can be found in the previously published material [15]. Inclusion and exclusion criteria are listed in Table 1.

Table 1 Inclusion and exclusion criteria

Demographics

Out of 152 patients included in the original study (90 women and 62 men), 80 patients were treated with TDR and 72 patients with instrumented lumbar fusion. More detailed information is listed in Table 2.

Table 2 Patient demographics and intraoperative data (mean values)

Study interventions

Detailed information about the surgical technique was described previously [15]. TDR patients were randomized to one of three devices: Charité (DePuy Spine, Raynham, MA, USA), ProDisc (Synthes Spine, West Chester, PA, USA), or Maverick (Medtronic, Memphis, TN, USA).

Outcome measures

The primary outcome was global assessment (GA) of back pain, classified by the patients themselves into five categories: “total relief,” “much better,” “better,” “unchanged,” or “worse.” Secondary outcomes included low back pain visual analogue scale (VAS), disease-specific pain and disability measured by Oswestry Disability Index (ODI) version 2.0 [20], EQ-5D (score ranges from −0.59 to 1) [21], SF-36 [22], and leg pain VAS. The term “ODI-success” was used according to Blumenthal et al. [12]. Work status, complications, and reoperations were registered.

Patients attended follow-up visits at 1, 2, and 5 years after surgery. All of the self-assessed outcome measures were recorded on specific forms completed by each patient before surgery and at 1, 2, and 5 years after surgery. All of the data were collected by and stored at the independent Swedish Spine Register SweSpine. Radiographs were obtained before surgery and at all follow-up occasions. The results concerning motion preservation and fusion healing at the 2-year follow-up were presented elsewhere [23].

Power calculation and statistical analysis

Statistical analyses were performed with Statistica version 10 (StatSoft Inc., Tulsa, OK, USA). Results are given as means, standard deviations, and ranges. For comparisons between treatment groups and for some sub-group analyses, two-tailed Mann–Whitney U test and Wilcoxon rank sum tests were used. For ordinal data, Student’s t test was used, and for categorical data, such as GA, Spearman’s R, Fisher’s exact test, and χ² tests were used. Statistical significance was defined as p ≤ 0.05. All analyses were performed according to intention to treat (ITT), and there were no crossovers between the groups before the 2-year follow-up. The follow-up rate at 5 years was 99 %. The patients who underwent a reoperation remained in their starting groups for further analysis according to ITT.
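As an illustration of how a categorical comparison of this kind can be computed, the following is a minimal sketch of a two-sided Fisher's exact test for a 2×2 table, written with the Python standard library only. It is not the Statistica procedure used in the study; the function name `fisher_exact_two_sided` and the tie tolerance are assumptions made for the example. The counts in the usage lines are the "totally pain free" counts reported in the Results (30 of 80 for TDR, 11 of 71 for fusion).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that ties are not lost to rounding.
    p_value = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Rows: TDR, fusion. Columns: totally pain free, not totally pain free.
p_pain_free = fisher_exact_two_sided(30, 80 - 30, 11, 71 - 11)
print(f"two-sided p = {p_pain_free:.4f}")
```

The exact test is a natural choice for small cell counts; for larger tables a χ² test, as also named above, would typically give a similar answer.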

Results

Primary outcome: global assessment

Both groups showed clinical improvement at the 5-year follow-up (Table 3). In the TDR group, 38 % (30/80) reported being totally pain free compared to 15 % (11/71) in the fusion group (Table 3). In the TDR group, 72.5 % (58/80) reported being either totally pain free or much better (defined as clinical success) compared to 66.7 % (48/72) in the fusion group (n.s.). Six patients in the TDR group classified themselves as worse, and three patients as unchanged. In the fusion group, three patients considered themselves worse after surgery, and six patients considered themselves unchanged (Fig. 1).
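The dichotomization of the global assessment into clinical success (totally pain free or much better) versus the remaining categories can be sketched as follows. This is an illustrative sketch, not the authors' analysis code: the names `GA_CATEGORIES`, `SUCCESS` and `success_rate` are hypothetical, and the "much better" (28) and "better" (13) counts for the TDR group are not reported directly but are derived here by subtraction from the reported totals.

```python
from collections import Counter

# The five global assessment (GA) categories used in the study.
GA_CATEGORIES = ("total relief", "much better", "better", "unchanged", "worse")
# Categories counted as clinical success in the text above.
SUCCESS = {"total relief", "much better"}


def success_rate(responses):
    """Return (successes, n, percent) for a list of GA responses."""
    counts = Counter(responses)
    unknown = set(counts) - set(GA_CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected GA categories: {sorted(unknown)}")
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no responses")
    successes = sum(counts[c] for c in SUCCESS)
    return successes, n, 100 * successes / n


# TDR group at 5 years: 30 totally pain free, 58 successes in total,
# 3 unchanged and 6 worse out of 80 (the other counts are derived).
tdr = (
    ["total relief"] * 30
    + ["much better"] * 28
    + ["better"] * 13
    + ["unchanged"] * 3
    + ["worse"] * 6
)
successes, n, percent = success_rate(tdr)
print(f"TDR clinical success: {successes}/{n} = {percent:.1f} %")
```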

Table 3 Primary and secondary outcomes at 1, 2, and 5 years after surgery
Fig. 1 Typical posterolateral fusion

Secondary outcomes

Back pain

The improvement in back pain from baseline to 5-year follow-up was significantly greater in the TDR group (Table 3). The distribution of VAS difference 5 years from baseline in each category of global assessment of back pain is displayed in Table 4. In each of the five categories, there were no VAS differences between the TDR group and the fusion group, thus the two groups seem to have the same opinion about what is a relevant change of pain status.

Table 4 The distribution in the whole material regarding self-assessed VAS difference from baseline to 5 years presented for each category of global assessment of back pain

ODI

Significantly more patients in the TDR group had a lower level of disability at the 5-year follow-up. The mean rate of improvement was 60.3 % in the TDR group and 43.6 % in the fusion group (p = 0.006). In the TDR group, 77.5 % (62/80) achieved the limit for ODI success with at least 25 % improvement compared to 64.8 % (46/71) (p = 0.08) in the fusion group.
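The ODI success criterion used above (at least 25 % improvement from baseline) can be written out as a short sketch. The function names and the example patients are hypothetical and are not study data; only the 25 % threshold is taken from the text.

```python
def odi_relative_improvement(baseline: float, follow_up: float) -> float:
    """Percent improvement in ODI relative to baseline.

    Positive values mean less disability at follow-up.
    """
    if baseline <= 0:
        raise ValueError("baseline ODI must be positive")
    return 100 * (baseline - follow_up) / baseline


def is_odi_success(baseline: float, follow_up: float,
                   threshold_pct: float = 25.0) -> bool:
    """True if the relative improvement reaches the success threshold."""
    return odi_relative_improvement(baseline, follow_up) >= threshold_pct


# Hypothetical patients as (baseline ODI, 5-year ODI); not study data.
examples = [(40, 16), (40, 32), (50, 55)]
for baseline, follow_up in examples:
    change = odi_relative_improvement(baseline, follow_up)
    verdict = "success" if is_odi_success(baseline, follow_up) else "no success"
    print(f"ODI {baseline} -> {follow_up}: {change:.0f} % improvement, {verdict}")
```

A relative threshold means that patients with a low baseline ODI need only a small absolute change to count as a success, which is worth keeping in mind when comparing success rates with mean improvements.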

EQ-5D

The TDR group had a score of 0.76 ± 0.30, and the fusion group had a score of 0.68 ± 0.30 (p = 0.026). There was no significant difference between the groups in change from baseline.

SF-36

There were no differences between the two groups concerning SF-36 at the 1- and 2-year follow-ups. The 5-year follow-up showed a significant difference between the two groups concerning the subscale for pain. Back pain at 5 years was 67.6 ± 31.8 in the TDR group and 56.8 ± 27.3 in the fusion group. The difference between back pain at baseline and at 5 years was significant, with a 39.0-point difference in the TDR group compared to a 27.8-point difference in the fusion group.

Patient satisfaction

After 5 years, 79 % of the TDR group was satisfied with the results of the operation compared to 69 % in the fusion group (p = 0.14).

Consumption of analgesics

In the TDR group, 59 % (47/80) were totally free from pain medication at the 5-year follow-up compared to 38 % (27/71) in the fusion group (p = 0.01).

Work status

There was no difference in the percentage of patients with sickness benefits between the groups. In the TDR group, 84 % (67/80) had no sickness benefits at the 5-year follow-up compared to 83 % (57/69) in the fusion group. At the start of the study, 41.9 % of the participants were at work, full- or part-time: 36.8 % in the TDR group and 47.2 % in the fusion group. This proportion had increased to 83.4 % in total at the 5-year follow-up. At the 5-year time-point, 77.5 % (62/80) were working full- or part-time in the TDR group compared to 90 % (64/71) in the fusion group (p = 0.04). Between the 2- and 5-year follow-ups, the percentage of patients at work rose from 72 % to 90 % in the fusion group and from 76 % to 78 % in the TDR group.
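The work-status figures above can be recomputed from the reported counts with a few lines of arithmetic. The sketch below is only a consistency check; the helper name `percent` is hypothetical, and the 2-year values (72 % and 76 %) are read from the reported ranges as the starting points of the change, which is an interpretation of the text.

```python
def percent(k: int, n: int) -> float:
    """Percentage k/n rounded to one decimal place."""
    if n <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * k / n, 1)


# Patients working full- or part-time at 5 years: (count, group size).
working_5y = {"TDR": (62, 80), "fusion": (64, 71)}
for group, (k, n) in working_5y.items():
    print(f"{group}: {k}/{n} = {percent(k, n)} %")

# Percentage at work at the 2-year and 5-year follow-ups, as reported.
at_work = {"fusion": (72, 90), "TDR": (76, 78)}
for group, (two_year, five_year) in at_work.items():
    change = five_year - two_year
    print(f"{group}: {two_year} % -> {five_year} % (+{change} percentage points)")
```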

Complications and reoperations

The treating surgeon reported both complications and reoperations at 1 and 2 years after treatment began. In addition, medical records were retrospectively scrutinised for information. Complications were equally common in both groups (Table 5).

Table 5 Complications registered 5 years after surgery

The reoperation rate at the index level performed within 5 years was 6/72 (8.3 %) for the fusion group (excluding operations due to complaints of suspected screw irritation) and 5/80 (6.3 %) for the TDR group (excluding fusions performed at the TDR level). The total numbers of operations performed both at the index level and at a new level in the lower back in the two groups are listed in Table 6.

Table 6 Total number of operations 5 years after surgery at index level

In general, patients who underwent a reoperation had significantly higher levels of VAS back pain than those who did not (36.0 ± 29.0 vs. 23.7 ± 27.7, p = 0.009). This group had a mean improvement in ODI of 15.0 ± 21.4, compared to 23.4 ± 17.3 among patients who did not undergo additional surgery (p = 0.01).

In the group that underwent a reoperation that consisted of fusion at the TDR level, only one out of eight classified themselves as much better or totally pain free 5 years after surgery. In the group that had their devices extracted, 12 out of 21 reached a level of much better or totally pain free.

Discussion

This randomized trial with long-term follow-up comparing TDR with fusion surgery showed a significant difference in the primary outcome variable after 5 years. Five years after surgery, a higher proportion of TDR patients were totally pain free. The same pattern was consistent at all follow-up occasions. However, the significant difference did not remain between the groups when the GA was dichotomized into totally pain free or much better vs. better, unchanged or worse. Most of the patients in this study seemed to benefit from surgery, but the TDR patients had a significantly better probability of becoming totally pain free (Fig. 2).

Fig. 2 Typical TDR of one of the brands included in the study

In general, there was little deterioration over time, but a group of patients remained unsatisfied and, in some cases, reported being even worse off than before surgery. This group remained relatively constant from the 1-year follow-up onward: 11.25 % (9/80) in the TDR group and 12.7 % (9/71) in the fusion group. Further analysis is needed to understand whether nonresponders, irrespective of group, are comparable in terms of individual and sociodemographic factors. If this is the case, it might indicate a need to take these factors into account before offering surgery. A separate analysis of this group and the predictors for good clinical outcome is planned and will be published in the near future.

We also found significant differences in favour of TDR with regard to all secondary outcome variables. Considering VAS for back pain, there was a relatively large difference between the two groups at the 5-year follow-up. The difference between the two groups became greater over the years following surgery, and the reason for this remains unclear. However, because the amelioration of back pain is of fundamental importance when deciding whether to recommend surgery for CLBP, this finding is particularly relevant.

Quality of life and function measured by ODI appeared to be better at the 5-year follow-up compared to 2 years after surgery in the TDR group, supporting our findings for the primary outcome variable. In former reports on this topic, a clinical success level for ODI was set, with the approval of the United States Food and Drug Administration, as an improvement of 25 % or a reduction of 15 units. In this study, 77.5 % of patients in the TDR treatment arm reached this level and 64.8 % reached it in the fusion arm. This is in line with previous findings, although Zigler et al. [12, 13, 16] used a different, insufficiently validated version of ODI, making comparisons difficult.

Even though the groups differed significantly in ODI improvement, the difference in ODI success was not significant. However, both groups reached a high success rate, which makes this information important in general decision-making concerning surgery.

It appears that the fusion group has a higher probability of working part- or full-time 5 years after surgery. This difference did not exist at the 2-year follow-up, and the reason for it remains unknown. It has previously been reported that patients who have been out of work for 2 years have very little chance of returning to work, but there was no difference between the groups at baseline regarding this matter.

However, there was a difference between the groups in work rate at the start of the study, although this is probably not the whole explanation. Analysing the different subgroups, one can see that the patients who have not returned to work are evenly distributed across all categories of global assessment. This finding supports the existing knowledge that work status is subject to many possible confounding factors, for example, socioeconomic situation and psychological factors [25, 26].

Because of the design of our study, the number of reoperations was higher than usual. When the study began, there was a general consensus that fusion was the gold-standard treatment, and that patients ought at least to have the right to undergo this procedure if they felt that their initial results were unsatisfactory. The patients, therefore, were informed that they could undergo fusion or have their fusion implants extracted if they were unhappy with the results at the 2-year follow-up. In total, 29 patients decided to avail themselves of this option; in general, their results were poorer than those of the rest of the group. Given what we currently know about the different treatment options for CLBP, we believe that most of these reoperations were unnecessary and should be avoided in the future, except in highly selected cases.

Strengths and limitations

Our study has several strengths. It was randomized and had no patients who crossed over from one treatment to the other. The follow-up rate remained extraordinarily high (99.3 %), even after 5 years had passed since the index operation. Collection of the self-assessed data was independent and handled by the Swedish Spine Register. The study also had an independent design with a low risk of financial bias.

One limitation of our study was the lack of a non-operative control group. Considering the fact that there is still some controversy about the superiority of fusion vs. conservative treatment, a comparison of the effectiveness of TDR and fusion vs. nonsurgical treatment would be highly relevant. Hellum et al. recently published a study that compared TDR with a conservative treatment regime; data from that study suggest that operative treatment is slightly superior [17].

Other methodological limitations of our study were the fact that three different brands were used in the TDR operations and both posterolateral interbody fusion and instrumented posterolateral fusion were allowed. Regarding the slightly different operations in the fusion group, previously published studies did not report significant differences between these two methods [24].

The patients included in this study consisted of a group of selected individuals, with restrictive inclusion and exclusion criteria. When trying to extrapolate our results to a more general population, one must be aware of this fact. However, a registry study undertaken by the authors of the 2-year results from this study supports the present results, but in a less selective patient group [27].

Conclusion

There is a lack of long-term follow-up of surgery for DDD in which fusion surgery is compared with TDR. This randomized controlled study, with a follow-up rate of 99.3 %, found that in general the results were good 5 years after surgery. However, there were more than twice as many patients in the TDR group who were totally pain free 5 years after surgery. This group also had significantly better outcomes in almost all of the outcome variables measured. Although further studies are needed on the topic, it seems that the majority of DDD patients might benefit from surgery in general and that there is little if any deterioration over time.