Introduction

In the field of cervical degenerative disc disease (DDD), the current trend is to consider both fusion and cervical disc replacement (CDR) as surgical treatment options. Although the outcomes documented in the literature are still relatively short-term, the increasing acceptance of CDR [1] is primarily supported by the concept that maintaining mobility of the treated segment and of the cervical spine could, with time, protect adjacent levels better than fusion [2]. While more long-term follow-up studies are awaited, an accumulation of short- and intermediate-term studies [3–6] and some long-term studies [7, 8] attempts to demonstrate the legitimacy of CDR. However, most studies to date consider single-level disc replacement, or a mixed population of single- and multi-level cases without comparison between the two groups. Confronted with multi-level pathology and the well-known morbidity and risk factors of extensive fusion techniques [9–11], some authors advocate multi-level disc replacement [12, 13].

The aim of this study was to compare the clinical and radiological outcomes of CDR between single- and multi-level patients.

Population and methods

Clinical study design

A prospective and multicenter study was designed to assess the safety and the efficacy of the Mobi-C® CDR device in the treatment of cervical DDD. This study was performed across nine French Centers, and involved 12 surgeons.

Three hundred and eighty-four (384) patients were enrolled and underwent surgery between November 2004 and August 2009.

The indications were DDD at one or more levels between C3 and T1, leading to radiculopathy and/or myelopathy. Surgery was performed only after failure of appropriate conservative medical treatment. DDD was confirmed through cervical X-rays, CT or MRI.

Exclusion criteria included age (>65 years old), non-compliance with the study protocol, osteoporosis, metabolic bone disease, congenital or post-traumatic deformity, infection, neoplasia, instability of the intersomatic space, or a narrow canal (<12 mm).

Previous cervical spine surgery (including surgery at the index level) and worker’s compensation were not exclusion criteria. Patients were eligible for enrolment regardless of their pre-operative Neck Disability Index/Visual Analog Scale (NDI/VAS) score (no minimal value was required). We excluded neither the learning-curve cases of the study nor the patients with severe disc degeneration.

Follow-up (FU) evaluation was performed at 1, 3, 6, 12, and 24 months after surgery.

Methods

Self-assessment questionnaires were completed by the patient before surgery and at each FU visit. They included VAS for both neck and arm pain (VAS 0–100 mm), the NDI functional score (NDI 0–100%), and the SF-36 quality of life score (Physical and Mental Component Scales). Analgesic requirement, employment status, and subject satisfaction, as well as complications, re-operations, and additional surgeries, were also documented.

Radiological evaluation considered dynamic lateral X-rays at each time-point. Range of Motion (ROM) was measured as the difference of intervertebral angle between maximal flexion and maximal extension. The intervertebral angle was defined as the angle between the two bony endplates of the intersomatic space. For multi-level cases, ROM was measured similarly and was expressed as the average of all implanted levels.
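The ROM definition above can be illustrated with a minimal sketch. The function names, the signed-angle convention, and the example angles are assumptions made for illustration only; they are not taken from the study protocol.

```python
from statistics import mean


def segment_rom(flexion_angle_deg, extension_angle_deg):
    """ROM of one segment, in degrees: difference of the intervertebral
    angle between maximal flexion and maximal extension.

    Assumption: both angles are signed with the same convention, so the
    absolute difference is reported.
    """
    return abs(flexion_angle_deg - extension_angle_deg)


def patient_rom(levels):
    """Average ROM over all implanted levels of one patient.

    `levels` is a list of (flexion_angle_deg, extension_angle_deg) pairs,
    one pair per implanted level (a single pair for single-level cases).
    """
    return mean(segment_rom(f, e) for f, e in levels)


# Hypothetical two-level patient (angles invented for illustration).
two_level_patient = [(-2.0, 6.0), (1.0, 7.0)]
print(patient_rom(two_level_patient))
```

For a single-level patient the average reduces to the ROM of the one implanted segment.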

The occurrence of heterotopic ossification (HO) was evaluated on neutral and dynamic X-rays by a senior spine surgeon (T.V.), who was unaware of the clinical outcomes and not involved in the clinical data collection. We used the classification established by McAfee, as described previously [3, 14, 15].

We considered the NDI as the primary outcome criterion. The overall success rate was a composite criterion: among the patients with a pre-operative NDI score ≥30%, success was defined as an absolute improvement of the NDI score of at least 15% at 2 years compared to the pre-operative baseline value, together with the absence of any re-operation at the index level during the 2-year follow-up interval.
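The composite success criterion can be written out as a short sketch. The function names, the tuple layout, and the patient values are hypothetical; only the two thresholds (baseline NDI of at least 30% and improvement of at least 15 points) and the re-operation condition come from the definition above.

```python
NDI_BASELINE_MIN = 30.0      # eligibility: pre-operative NDI >= 30%
NDI_IMPROVEMENT_MIN = 15.0   # absolute improvement, in NDI percentage points


def is_eligible(pre_ndi):
    """Only patients with a pre-operative NDI >= 30% enter the analysis."""
    return pre_ndi >= NDI_BASELINE_MIN


def is_success(pre_ndi, ndi_24m, reoperated_at_index_level):
    """Success = NDI improved by >= 15 points at 2 years AND no
    re-operation at the index level during follow-up."""
    improvement = pre_ndi - ndi_24m
    return improvement >= NDI_IMPROVEMENT_MIN and not reoperated_at_index_level


def success_rate(patients):
    """`patients` is a list of (pre_ndi, ndi_24m, reoperated) tuples.
    Returns the fraction of eligible patients classified as a success."""
    eligible = [p for p in patients if is_eligible(p[0])]
    if not eligible:
        raise ValueError("no patient with pre-operative NDI >= 30%")
    return sum(is_success(*p) for p in eligible) / len(eligible)


# Hypothetical cohort (values invented for illustration).
cohort = [
    (40.0, 20.0, False),  # improved by 20 points, no re-operation
    (50.0, 40.0, False),  # improved by only 10 points
    (35.0, 10.0, True),   # improved, but re-operated at the index level
    (20.0, 0.0, False),   # baseline below 30%: not counted
]
print(success_rate(cohort))
```

Patients below the baseline threshold are removed from the denominator rather than counted as failures, which is how the definition above is read here.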

The Chi-square test or Fisher’s exact test was used to compare categorical data between patients treated at one or multiple levels. The Student’s t test for independent samples or the non-parametric Mann–Whitney test was used to compare continuous data between both groups, depending on whether the data were normally distributed or not (according to the Kolmogorov–Smirnov normality test). In each group, comparisons between pre- and post-operative continuous data were performed using the parametric Student’s t test for paired data or the non-parametric Wilcoxon signed-rank test. The McNemar test was used for within-group comparison of categorical data. All statistical tests were two-sided. P values <0.05 were considered statistically significant.
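As an illustration of the within-group comparison of categorical data, the following is a minimal sketch of a McNemar test on paired yes/no observations (for example, a status recorded before surgery and again at 2 years). It uses the exact binomial form of the test, which is an assumption: the variant actually used in the analysis is not stated. The function name and the example data are hypothetical.

```python
from math import comb


def mcnemar_exact(pairs):
    """Two-sided exact McNemar p-value for paired yes/no observations.

    `pairs` is an iterable of (before, after) booleans, one per patient.
    Only discordant pairs (status changed between the two time-points)
    contribute to the test.
    """
    pairs = list(pairs)
    b = sum(1 for before, after in pairs if before and not after)
    c = sum(1 for before, after in pairs if not before and after)
    n = b + c
    if n == 0:
        return 1.0  # no change in any patient: no evidence of a shift
    # Under the null hypothesis, discordant pairs split 50/50.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical data: 5 patients changed from "no" to "yes", 3 stayed "yes".
example_pairs = [(False, True)] * 5 + [(True, True)] * 3
print(mcnemar_exact(example_pairs))
```

Concordant pairs, such as the three unchanged patients in the example, do not affect the p-value.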

Description of the population

Among the 384 patients enrolled in this study, a total of 231 patients had been treated with CDR and had completed their 24-month FU evaluation at the time the database was closed; they were included in this analysis. They were divided into two groups: 175 patients treated at 1 level, and 56 treated at 2 levels or more.

Single- and multi-level procedures were performed by the same surgeons, using the same operative procedure, and during the same time interval. All procedures were performed using the same device (Mobi-C®, LDR Médical, Troyes, France), described previously [3].

Results

In the single-level group, 175 patients underwent surgery (175 implanted CDR devices). In the multi-level group, 56 patients underwent surgery at 2–4 levels (118 implanted CDR devices).

The demographics of the single- and multi-level groups were broadly comparable (Table 1), except that the age of the patients was significantly higher in the multi-level group than in the single-level group. As expected, mean length of surgery and time since symptom onset were greater in the multi-level group than in the single-level group.

Table 1 Demographics of the two groups: single- versus multi-level

In each group, the self-assessment outcomes showed significant improvement at all time-points after the surgery compared to pre-operative baseline (Fig. 1).

Fig. 1 Clinical outcomes pre-operatively and over 2 years post-operative follow-up. Results are expressed as mean scores ± SEM at each time-point, in the single-level group (black) and in the multi-level group (grey). a NDI, b radicular VAS, c cervical VAS

Additionally, mean NDI (Fig. 1a), radicular VAS (Fig. 1b) and cervical VAS (Fig. 1c) were comparable pre-operatively between both groups (p > 0.05 for the 3 comparisons), and they decreased significantly at all post-operative time-points compared to pre-operative values, to the same extent. At 24 months post-op, there was no significant difference between both groups regarding the NDI score (p = 0.713), the radicular VAS score (p = 0.790), and the cervical VAS score (p = 0.593).

Improvement of those clinical outcomes at 2 years compared to pre-op baseline was similar in both groups: this absolute improvement averaged 24.0 and 22.8% for NDI (p = 0.668), 43.1 and 35.9 mm for radicular VAS (p = 0.149); 29.5 and 26.7 mm for cervical VAS (p = 0.572) in single- and multi-level groups, respectively.

The SF-36 quality of life score also showed in both groups a strong improvement compared to pre-op baseline during the follow-up period (Fig. 2).

Fig. 2 Evolution of the SF-36 quality of life pre-operatively and over 2 years of post-operative follow-up. Results are expressed as mean scores at each time-point, in the single-level group (black line) and in the multi-level group (grey line). a Physical component scale, b mental component scale

The return to work (pooled part-/full-time) was analyzed specifically in the patients who reported sick leave before the CDR surgery. At 2 years FU, 70% of them had returned to work in the single-level group (vs. 46% in the multi-level group), 13% were still on sick leave (vs. 21% in the multi-level group), and 6% were on sick leave because of another pathology (vs. 7% in the multi-level group); the others were inactive.

In both groups, the rate of return to work increased significantly at 2 years FU compared to pre-op (p < 0.0001). However, the difference between both groups regarding professional status after 2 years was not significant (p = 0.0985).

After surgery, the return to work occurred after an average of 4.8 months (single-level group) versus 7.5 months (multi-level group) (p = 0.079). Notably, the mean duration of the sick leave before the surgery differed significantly between the two groups (7.0 and 15.6 months in the single- and multi-level groups, respectively, p = 0.009).

Analgesic use was also analyzed specifically in patients who used analgesics before the surgery. Among this population, in the single-level group, 68% did not use analgesics at 2 years follow-up, while 32% still used some analgesics (daily or occasionally). In the multi-level group, 47% no longer used analgesics at 2 years follow-up, while 53% still used some. The difference between both groups at 2 years was statistically significant (p = 0.029).

After 2 years of follow-up, 94.2% of the patients in the single-level group and 94.5% of the patients in the multi-level group reported that they would undergo the procedure again. There were 3.2% in the single-level group and 5.5% in the multi-level group who answered that they would not (the others being undecided). The difference between both groups was not statistically significant (p = 0.428).

Regarding complications and re-operations, the number of patients experiencing at least one complication/re-operation did not differ significantly between both groups (p = 0.109) (Table 2). The rate of dysphagia/dysphonia was significantly higher in the multi-level group (9/56, 16% vs. 6/175, 3.4% in the single-level group, p = 0.0024), but in all cases, those events resolved spontaneously.
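The dysphagia/dysphonia comparison can be reproduced from the counts given above (9/56 vs. 6/175) with a two-sided Fisher’s exact test. The sketch below is a plain standard-library implementation written for illustration; it assumes that Fisher’s exact test is the test behind the reported p value for this comparison, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of events
    n = a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        # Probability that the first group contains x of the col1 events.
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Counts from the text: multi-level 9 of 56, single-level 6 of 175.
p_value = fisher_exact_two_sided(9, 56 - 9, 6, 175 - 6)
print(p_value)
```

With these counts the computed p value falls well below the 0.05 threshold, in line with the significant difference reported above.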

Table 2 Summary of the complications and re-operations reported in single- and multi-level groups during the 2 years of follow-up

The success rate, taking into account both functional improvement and absence of revision surgery, was 69 and 66% in the single- and the multi-level groups, respectively (p = 0.727).

Given the potential positive influence of the CDR to preserve the adjacent segments, additional surgeries for adjacent level degeneration were also monitored.

In the single-level group, four patients (2.3%) underwent a secondary surgery during the 2-year interval following CDR (2 fusions; 2 CDR).

In the multi-level group, two patients (3.6%) underwent a secondary cervical surgery, with implantation of a third Mobi-C device. The treated degeneration was present at the initial evaluation in at least one of these cases, but was considered as asymptomatic.

Radiographic analysis included motion measurement and HO occurrence.

Evaluation of HO was performed at 2 years follow-up (Table 3). A total of 165 segments were analyzed in the single-level group, and 111 in the multi-level group (missing data are attributed to radiographs that were not available or could not be analyzed due to image quality). Overall, considering the five grades, the difference between both groups was significant, and suggests a lower incidence of HO in the multi-level patients.

Table 3 Distribution of the implanted levels, 2 years after CDR, in single- and multi-level groups, according to the McAfee Classification for HO

Additionally, compared to grade 0, motion decreased significantly from grade II to grade IV (Table 3).

Considering motion, our evaluation has shown that mean ROM of the treated segment was preserved or improved after CDR (Fig. 3).

Fig. 3 Evolution of the flexion/extension range of motion before and at the different post-operative time-points. Results are expressed as mean ± SEM, in single- (black line) and in multi-level (grey line) groups

In the single-level group, mean ROM increased after surgery, from 7.2° to 9.5°, and the increase compared to pre-operative value was still significant at 24 months (p = 0.0002).

In the multi-level group (Fig. 4), similarly, mean ROM increased after CDR, from 6.1° to 7.9°, and the increase was still significant at 24 months (p = 0.002).

Fig. 4 Representative example of a 3-level CDR patient. Top: pre-operative dynamic radiographs. Bottom: after CDR with Mobi-C at C4–C5/C5–C6/C6–C7 levels, dynamic and lateral bending radiographs performed 2 years after the surgery, showing mobility at all treated levels

When a segment with ROM ≥ 2° was considered mobile, 85.6% (95/111) of the implanted segments in the multi-level group were mobile after 2 years, versus 85.5% (142/166) in the single-level group (p = 1.000).

In some patients, comparison between pre-op and 24 months post-op ROM measurement was not possible due to missing images or poor quality that did not allow for analysis.

One hundred and five (105) patients of the single-level group and 37 patients (76 segments) of the multi-level group had paired ROM measurements at both pre-op and 2 years FU.

The absolute ROM improvement between pre-op and 2 years FU did not differ significantly between both groups (2.8° and 2.2° in single- vs. multi-level groups, respectively, p = 0.521).

Additionally, 63.8% of the segments in the single-level group had a higher ROM value at 2 years compared to the baseline, versus 64.6% in the multi-level group (p = 1.000).

Discussion

In surgical treatment of multi-level cervical DDD, the place of multi-level CDR in the continuum of care is gaining credibility.

The published literature provides biomechanical arguments in support of this approach. DiAngelo et al. [16] suggested that CDR may allow more physiological load transfer and kinematics at adjacent levels when compared with fusion. Elsawaf [17] showed in a study of 20 anterior fusion cases that the ROM at the adjacent level was increased in six cases. Five of them became symptomatic with time (mean FU 28 months). Four other cases showed only MRI signs of degeneration without symptoms. Furthermore, Adjacent Segment Disease (ASDi), i.e. adjacent degeneration with symptoms, was significantly correlated with the increase of mobility. Phillips [18] showed in a cadaveric study that a two-level CDR allows near-normal mobility at index and adjacent levels. Laxer [19], in another in vitro study, showed that adjacent discs experience substantially lower pressure after two-level disc replacement than after two-level anterior fusion (ACDF).

Several clinical trials have shown good results in two-level CDR. Cheng [20] carried out a randomized study with 31 two-level CDR versus 39 fusion cases (autograft + plating) and concluded that CDR was reliable and safe in two-level DDD. Two U.S. FDA IDE (Investigational Device Exemption) studies are currently ongoing in the United States for the two-level indications in CDR. Early results of the Prestige LP IDE are reported by Lanman et al. [21], with 97 patients randomized to double-level CDR versus 83 receiving allograft spacers and plates. In these intermediate results, differences appear in the improvement of the primary clinical outcome criterion (NDI) and also for SF-36 and VAS, although they are not significant. Furthermore, six secondary surgical procedures were needed in the fusion group versus only one in the CDR group. The author concluded that the results for CDR in multi-level cervical disease were encouraging. The second IDE study compares the Mobi-C device to fusion for single- and two-level cases. Hoffman [22] reported the preliminary single-site results. The author concluded that disc replacement at one and two levels offered comparable outcomes to those of ACDF, with favorable results for CDR regarding severity of dysphagia, return to work, and patient satisfaction.

Other studies, with a lower level of evidence, examine outcomes of CDR in mixed populations with single- and multi-level indications. They sometimes include severe spondylosis with radiculopathy and/or myelopathy. Wang [13] looked into the possibility of treating multi-level cervical spine myelopathy with multi-level CDR. Goffin [7], with a small number of two-level cases, showed a persistence of the good outcomes previously reported, with a slight reduction of ROM in the two-level cases after 6 years.

To our knowledge, two studies attempt to compare clinical outcomes between single- and multi-level disc replacement. The first is the Pimenta study [12]. In that prospective consecutive series of 140 patients, 229 devices were implanted (158 multi-level in 69 patients vs. 71 single-level). The author compared the clinical outcomes at a mean FU of 26 months. There was no radiological control in that study. It showed that the primary criterion (NDI) improved significantly more in the multi-level group than in the single-level group. Pimenta stated in the title the superiority of multi-level disc replacement compared to single-level cases. Our study is the second one. We did not exclude severe spondylotic patients nor those with previous fusion. In contrast to Pimenta, we did not find a significant difference between the two groups at two years regarding the primary clinical criterion (NDI) or the overall success rate. The overall complication rate was not significantly different in the two groups, but the rate of dysphagia/dysphonia was significantly higher in the multi-level group. The relation between dysphagia occurrence and the number of treated levels in anterior cervical surgery has been documented by Riley et al. In a retrospective study with 454 patients evaluated 3 months after ACDF, the authors reported a rate of 19.8% for 1-level procedures, 33.3% for 2-level procedures, and 39.1% for 3-level procedures [23]. 21.3% were persisting at 2 years follow-up.

However, our study was not specifically designed to demonstrate superiority or non-inferiority of multi-level disc replacement with Mobi-C compared to single-level. Such a comparison is difficult, because the indications and populations are slightly different (age, time since symptom onset, baseline professional situation). It remains an exploratory study, performed in “real life conditions”, and in contrast to the Pimenta study, it is not a consecutive case series, which avoids that kind of selection bias. Because of the sample size, and because this is not a randomized controlled study, it is underpowered to establish statistical significance. Moreover, despite some significant differences (regarding analgesic use, dysphagia/dysphonia occurrence, return to work or HO occurrence), the p values suggest a trend towards similar results in the outcomes of single- versus multi-level populations.

Conclusion

Multi-level DDD with radiculopathy and/or myelopathy is a challenging indication to treat surgically in the cervical spine. The extensive use of fusion and the documented outcomes have shown the limitations of this treatment option, including revision surgeries, major complications, and the effects on adjacent segment degeneration. Building on the increasing accumulation of favorable results with CDR all over the world, this alternative treatment modality continues to be validated through clinical studies. We present here the results of the second study to date comparing the clinical and radiological outcomes of single- versus multi-level disc replacement. No significant difference was observed between the two groups regarding the major clinical outcomes. The number of revision surgeries at the index levels was very low in the multi-level group. Further studies are needed to learn more about the impact of multi-level CDR, especially on the adjacent segments, but these results are encouraging and lend credibility to the technique in selected indications of multi-level cervical DDD.