Introduction

Cervical disc disease (CDD) is a condition involving degenerated and protruded discs of the cervical spine, causing neck pain with or without radicular arm pain, myelopathy, and alteration of the overall biomechanics of the cervical spine [1, 2]. Symptomatic CDD is considered one of the main causes of incapacity for work [1]. Published studies project an increasing trend in cervical surgery in the coming years, above all in the population between 45 and 54 years old [2, 3].

Since Robinson and Smith first reported it in 1958, anterior cervical discectomy with fusion (ACDF) has been widely used for treating CDD, including cervical disc herniation (CDH), and has classically been the gold standard procedure. In this technique, direct anterior decompression and restoration of physiological sagittal alignment are achieved with a cage inserted into the intervertebral space [4]. Many studies have found ACDF to be a successful procedure that provides excellent symptom relief and significant improvement in quality of life [5,6,7]. For this reason, it has been reported that 84.3% of orthopedic surgeons perform ACDF as the standard technique for CDD and CDH [6]. However, ACDF also has some well-characterized complications, such as pseudoarthrosis or nonunion and instrumentation failure. The most problematic, because it leads patients to undergo secondary surgery, is that a solid bony fusion can change the range of motion (ROM) and the mechanical load of adjacent segments, which can cause subsequent adjacent segment disease (ASD) [8, 9]. Hilibrand et al. found that symptomatic adjacent segment disease may affect more than a quarter of all patients within ten years after an ACDF [10]. Lee et al. found that after ACDF, secondary surgery in adjacent segments occurred at a relatively constant rate of 2.4% per year (95% confidence interval (CI), 1.9–3.0). Kaplan–Meier analysis predicted that 22.2% of patients would require reintervention in adjacent segments at 10 years postoperatively [11].

To avoid these risks, an alternative treatment, cervical disc arthroplasty (CDA), emerged in the 1990s, with the introduction of a mobile division between the vertebrae [12]. CDA has the advantage of preserving physiological motion, maintaining disc height and segmental lordosis, and preserving the biomechanical properties of the cervical spine. It can also prevent the need for future reoperations [13,14,15]. Based on these advantages, the use of CDA has increased in recent years [3]. However, CDA also has some drawbacks, the most common being heterotopic ossification, implant failure, and bone loss [16, 17]. Several previous meta-analyses have compared the advantages and disadvantages of ACDF and CDA with inconclusive results, mostly at short-term follow-up (2 years) [7, 14,15,16, 18,19,20,21], with few studies analyzing mid-term efficacy (5 years of follow-up) [22]. With the hypothesis that differences can be seen at long-term follow-up, the purpose of this meta-analysis was to compare the long-term efficacy of ACDF and CDA in terms of clinical, radiological, and surgical outcomes in randomized clinical trials with a minimum follow-up of 7 years.

Material and methods

Literature search strategy

The present meta-analysis was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement [23], and the systematic review followed the recommendations of the Cochrane Handbook for Systematic Reviews of Interventions [24]. An electronic search was performed on PubMed, EMBASE, and Cochrane Collaboration Library up to 30 November 2020 for randomized trials of ACDF versus CDA. The following keywords were used in the search strategy: “anterior cervical discectomy and fusion” or “ACDF”, and “cervical disc arthroplasty” or “CDA”, and “randomized controlled trial”. The bibliographic search was restricted to the English language.

Eligibility criteria

Inclusion criteria for study identification were as follows: (1) randomized controlled trials (RCTs), (2) comparison between two surgical procedures (CDA and ACDF), (3) follow-up time of more than 84 months, (4) reporting at least one surgery-related outcome and (5) full-text reports in English. Those studies with (1) lack of comparative data, (2) insufficient follow-up, (3) biomechanical or in vitro studies, and (4) conference presentations, editorials, and abstracts were excluded from the meta-analysis.

Study selection

Two authors assessed the search results for eligibility. Intensive reading of the full text was performed when the studies met the inclusion criteria. If there was a conflict between the two reviewers, a third reviewer was consulted to reach a decision.

Data extraction

Data were extracted from the main texts and supplementary appendices of the trials. The following data were extracted from the enrolled studies: (I) general characteristics, such as first author, year of publication, clinical trial number (NCT), enrolled patients, age, sex, surgical levels, type of prosthesis, and follow-up duration; (II) clinical outcome measures, including clinical overall success rate, neck disability index (NDI), neck pain, radicular arm pain, and the 36-item Short Form (SF-36) Health Survey (summary of physical and mental components); (III) radiological outcome measures, including fusion rate, heterotopic ossification rate, range of motion (ROM), superior adjacent syndrome, and inferior adjacent syndrome; and (IV) surgical outcome measures, such as adverse events rate and reoperation rate.

Quality assessment

The quality of the RCTs was assessed with the risk of bias tool in Review Manager (RevMan) version 5.3 software (The Nordic Cochrane Centre, The Cochrane Collaboration, Copenhagen, 2014). The assessment covered the following domains: random sequence generation, allocation concealment, blinding, incomplete outcome data, and selective outcome reporting. Judgments in these domains were distilled into an overall risk of bias for a given RCT: (I) “low risk of bias”; (II) “some concerns”; or (III) “high risk of bias”.

Statistical analysis

Descriptive statistics were reported as mean and standard deviation (SD) for continuous variables and as count and percentage for categorical variables. Meta-analysis was performed using Review Manager software (Version 5.3) from the Cochrane community. For binary variables, the odds ratio (OR) was used for evaluation, while for continuous variables, the standardized mean difference (SMD) with a 95% confidence interval (CI) was applied. The heterogeneity of the studies was estimated using the I² statistic. A random-effects inverse variance model was applied. Statistical significance was defined as a two-tailed p-value of < 0.05.
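As an illustration of the pooling approach described above, the following is a minimal sketch of random-effects inverse variance pooling of odds ratios, assuming the DerSimonian-Laird estimator of between-study variance (the text does not name the estimator used). The function name `pool_odds_ratios` and the 2×2 counts are hypothetical and are not data from the included trials.

```python
import math

def pool_odds_ratios(tables):
    """Random-effects inverse variance pooling of odds ratios.

    Assumption: DerSimonian-Laird estimate of between-study variance (tau^2).
    tables: list of (events_a, total_a, events_b, total_b), one tuple per study.
    Returns the pooled OR, its 95% CI, and I^2 in percent.
    """
    log_ors, variances = [], []
    for events_a, total_a, events_b, total_b in tables:
        a, b = events_a, total_a - events_a
        c, d = events_b, total_b - events_b
        if 0 in (a, b, c, d):
            # Continuity correction when a cell is empty
            a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
        log_ors.append(math.log((a * d) / (b * c)))
        variances.append(1 / a + 1 / b + 1 / c + 1 / d)

    # Fixed-effect weights are needed for Cochran's Q
    w = [1 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = len(tables) - 1
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se),
        "ci_high": math.exp(pooled + 1.96 * se),
        "i2": i2,
    }

# Hypothetical counts: (events CDA, n CDA, events ACDF, n ACDF)
example = [(10, 100, 20, 100), (15, 150, 30, 150), (8, 120, 18, 110)]
result = pool_odds_ratios(example)
print(result)
```

An OR below 1 in this layout means fewer events in the first group. I² is derived from Cochran's Q as the share of total variability attributable to between-study heterogeneity.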

Results

Literature review

The initial database search identified 2834 articles (PubMed: 1452, Embase: 1020, Cochrane Collaboration Library: 362) and the detailed literature selection is described in the flowchart in Fig. 1. A total of 1231 studies were removed because they were duplicates, 1553 studies were excluded based on their titles and abstracts, and 41 studies were excluded for other reasons. As a result, 9 studies were included for further evaluation [25,26,27,28,29,30,31,32,33]. Figure 2 provides the summary of the risk of bias.
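The counts in the selection flow can be cross-checked with simple arithmetic. The short sketch below only re-adds the figures reported in this paragraph; the variable names are illustrative.

```python
# Records identified per database, as reported in the text
identified = {"PubMed": 1452, "Embase": 1020, "Cochrane Collaboration Library": 362}

# Records removed at each screening stage, as reported in the text
excluded = {"duplicates": 1231, "titles and abstracts": 1553, "other reasons": 41}

total_identified = sum(identified.values())
included = total_identified - sum(excluded.values())

print(total_identified, included)  # prints: 2834 9
```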

Fig. 1

PRISMA flowsheet illustrating the number of articles excluded at different stages of the screening process

Fig. 2

Risk of bias summary. The review authors’ judgments about each risk of bias item for each included study: green is “low risk of bias”, red is “high risk of bias”, yellow is “unclear risk of bias”

Study characteristics

The general characteristics of each study are shown in Table 1. The meta-analysis included a total of 2664 patients, of whom 1464 underwent CDA and 1200 underwent ACDF [25,26,27,28,29,30,31,32,33]. The mean age was 44.2 years (SD 1.8) in the CDA group and 44.8 years (SD 1.6) in the ACDF group, with no statistically significant difference (SMD = −0.08, 95% CI: −0.18 to 0.02, p = 0.12) [26, 28,29,30, 33]. There were 674 men and 746 women in the CDA group, and 552 men and 601 women in the ACDF group, with no statistically significant difference (OR = 1.02, 95% CI: 0.87–1.19, p = 0.85) [25, 26, 28,29,30, 32, 33]. Eight studies compared CDA and ACDF for single-level replacement [25,26,27, 29,30,31,32,33], and two studies compared them for two-level replacement [28, 30]. Three studies used the BRYAN® Cervical Disc (Medtronic, Minneapolis, MN) [29, 31, 32], two studies used the Prestige® Cervical Disc (Medtronic, Minneapolis, MN) [25, 28], two studies used the ProDisc-C® (Depuy-Synthes Spine, Raynham, MA) [26, 27], one study used the Mobi-C® Cervical Disc (Zimmer Biomet, Warsaw, IN) [30], and one study used the SECURE-C Cervical Disc (Globus Medical, Audubon, Pennsylvania) [33].

Table 1 Baseline characteristics of studies included in the meta-analysis

Clinical outcomes

The CDA group had a significantly higher overall success rate (p < 0.001), greater improvement in the neck disability index (NDI) (p = 0.02), less arm pain on the visual analog scale (VAS) (p = 0.01), and a better SF-36 physical component (p = 0.01) than the ACDF group. There were no significant differences between the CDA and ACDF groups in the neck pain scale (p = 0.11) or the SF-36 mental component (p = 0.10).

The overall success rate was reported in 6 studies that included 1370 patients in the CDA group and 1106 patients in the ACDF group [25, 26, 28, 30, 32, 33]. Pooled results showed that the overall success rate in the CDA group was significantly higher than in the ACDF group (OR = 1.98, 95% CI: 1.57–2.49, p < 0.001), with moderate heterogeneity (I² = 36%, p = 0.16) (Fig. 3a). NDI data were provided in 4 studies that included 790 patients in the CDA group and 579 patients in the ACDF group [25,26,27, 30]. Significant differences in the NDI in favor of CDA were found (SMD = −0.21, 95% CI: −0.38 to −0.04, p = 0.02), with substantial heterogeneity (I² = 51%, p = 0.08) (Fig. 3b).
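A reported p-value can be checked against its point estimate and 95% CI. The sketch below uses a normal approximation and is offered only as a consistency check, not as the method used by the analysis software; the helper name `p_from_ci` is hypothetical. Ratio measures such as the OR are handled on the log scale, where their confidence intervals are symmetric.

```python
import math

def p_from_ci(estimate, ci_low, ci_high, log_scale=False):
    """Approximate two-sided p-value from an estimate and its 95% CI.

    Assumes a normally distributed estimate (on the log scale for ratios).
    """
    if log_scale:
        estimate, ci_low, ci_high = (math.log(v) for v in (estimate, ci_low, ci_high))
    se = (ci_high - ci_low) / (2 * 1.96)  # CI half-width divided by 1.96
    z = estimate / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

# Values reported above for NDI (SMD) and overall success rate (OR)
p_ndi = p_from_ci(-0.21, -0.38, -0.04)
p_success = p_from_ci(1.98, 1.57, 2.49, log_scale=True)
print(round(p_ndi, 3), p_success < 0.001)
```

Under this approximation, the NDI interval corresponds to a p-value between 0.01 and 0.02, in line with the reported p = 0.02, and the overall success rate interval corresponds to p < 0.001.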

Fig. 3

a Forest plot of overall success rate. b Forest plot of neck disability index (NDI). c Forest plot of VAS neck pain. d Forest plot of VAS arm pain. e Forest plot of SF-36 physical component. f Forest plot of SF-36 mental component. 95% CI indicates 95% confidence interval; ACDF, anterior cervical discectomy and fusion; CDA, cervical disc arthroplasty; Std, standard; OR, odds ratio

Neck pain and arm pain scales were analyzed in the same 5 studies, with 941 patients in the CDA group and 719 patients in the ACDF group [25,26,27, 30, 33]. Pooled results showed no significant difference in the neck pain scale between the 2 groups (SMD = −0.17, 95% CI: −0.37 to 0.04, p = 0.11), with high heterogeneity (I² = 74%, p = 0.002) (Fig. 3c). The arm pain scale significantly favored the CDA group (SMD = −0.16, 95% CI: −0.29 to −0.04, p = 0.01), with moderate heterogeneity (I² = 31%, p = 0.20) (Fig. 3d). Regarding the 36-item Short Form health questionnaire (SF-36), 4 studies analyzed the physical component [25, 26, 30, 33], with a total of 919 patients in the CDA group and 697 in the ACDF group, and 3 studies analyzed the mental component [26, 30, 33], with 643 and 432 patients, respectively. The physical SF-36 component significantly favored the CDA group (SMD = 0.13, 95% CI: 0.03–0.23, p = 0.01), with very low heterogeneity (I² = 1%, p = 0.40) (Fig. 3e). No significant difference was found between the 2 groups in the mental SF-36 component (SMD = 0.19, 95% CI: −0.03 to 0.41, p = 0.10), with substantial heterogeneity (I² = 69%, p = 0.02) (Fig. 3f).
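For readers less familiar with the SMD scale used in these comparisons, the sketch below shows how a single study's SMD can be computed from group means and SDs. It assumes the Hedges' adjusted g form with a common large-sample variance approximation; the text does not state which SMD variant was used. The function name `hedges_g` and the input values are hypothetical, not data from the included trials.

```python
import math

def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (Hedges' adjusted g) with a 95% CI."""
    df = n_a + n_b - 2
    # Pooled within-group standard deviation
    pooled_sd = math.sqrt(((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df)
    d = (mean_a - mean_b) / pooled_sd
    g = d * (1 - 3 / (4 * df - 1))  # small-sample correction factor
    # Large-sample approximation of the standard error of g
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + g ** 2 / (2 * (n_a + n_b)))
    return g, g - 1.96 * se, g + 1.96 * se

# Hypothetical pain scores: group A mean 20 (SD 15), group B mean 24 (SD 15), 100 patients each
g, ci_low, ci_high = hedges_g(20, 15, 100, 24, 15, 100)
print(round(g, 3), round(ci_low, 3), round(ci_high, 3))
```

For a pain or disability score where lower is better, a negative SMD favors the first group, which matches the sign convention of the NDI and pain results above.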

Radiological outcomes

A heterotopic ossification rate of 10.3% was observed in the CDA group, and a fusion rate of 94.06% was found in the ACDF group. The pooled results indicated a significantly higher motion rate (p < 0.001) and less adjacent syndrome (p < 0.05) in the CDA group. The motion rate with SD was reported in 2 studies that included 267 patients in the CDA group and 187 patients in the ACDF group [26, 30]. Pooled results showed that the motion rate in the CDA group was significantly higher (SMD = 1.86, 95% CI: 1.63–2.08, p < 0.001), with no observed heterogeneity (I² = 0%, p = 0.52) (Fig. 4a). Rates of superior adjacent syndrome were reported in 4 studies (with a total of 832 patients in the CDA group and 594 in the ACDF group) [29, 30, 32, 33]. Less superior adjacent syndrome was reported in the CDA group (OR = 0.33, 95% CI: 0.17–0.65, p = 0.001), with substantial heterogeneity (I² = 81%, p < 0.001) (Fig. 4b). Only one study reported the rate of inferior adjacent syndrome [30]. Less inferior adjacent syndrome was reported in the CDA group (OR = 0.31, 95% CI: 0.15–0.66, p = 0.002), with substantial heterogeneity (I² = 75.1%, p = 0.05) (Fig. 4c).

Fig. 4

a Forest plot of motion rate. b Forest plot of superior adjacent syndrome. c Forest plot of inferior adjacent syndrome. 95% CI indicates 95% confidence interval; ACDF, anterior cervical discectomy and fusion; CDA, cervical disc arthroplasty; OR, odds ratio

Surgical outcomes

The pooled results indicated no significant difference in adverse events (p = 0.42) between the groups and a significantly lower reoperation rate (p < 0.001) in the CDA group. Adverse events were reported in 7 studies that included 1420 patients in the CDA group and 1153 patients in the ACDF group [25, 26, 28,29,30, 32, 33]. There was no difference in the rate of adverse events between CDA (33.1%) and ACDF (38.9%) (OR = 0.84, 95% CI: 0.56–1.27, p = 0.42), with moderate heterogeneity (I² = 37%, p = 0.13). The reoperation rate was documented in 7 studies, with 1172 patients in the CDA group and 932 in the ACDF group [25,26,27,28, 30, 31, 33]. Reoperations occurred in 4.4% of CDA patients, a significantly lower rate than the 15.6% in the ACDF group (OR = 0.26, 95% CI: 0.19–0.37, p < 0.001), with no observed heterogeneity (I² = 0.0%, p = 0.97). Figure 5 summarizes the adverse events and the reoperation rate.
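The relationship between the reported percentages and the odds ratios can be illustrated with a crude calculation. The sketch below computes unstratified odds ratios directly from the overall percentages given above. These crude values are not the pooled estimates, which weight each study separately, so close but not exact agreement is expected. The helper name `odds` is illustrative.

```python
def odds(proportion):
    """Convert a proportion (0 to 1) into odds."""
    return proportion / (1 - proportion)

# Overall percentages reported in the text (CDA vs ACDF)
crude_or_reoperation = odds(0.044) / odds(0.156)
crude_or_adverse_events = odds(0.331) / odds(0.389)

print(round(crude_or_reoperation, 2), round(crude_or_adverse_events, 2))
```

The crude reoperation OR comes out close to the pooled OR of 0.26, while the crude adverse events OR is somewhat lower than the pooled OR of 0.84, as can happen when study-level weighting differs from simple aggregation.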

Fig. 5

a Forest plot of adverse events rate. b Forest plot of reoperations rate. 95% CI indicates 95% confidence interval; ACDF, anterior cervical discectomy and fusion; CDA, cervical disc arthroplasty; OR, odds ratio

Discussion

Numerous meta-analyses in recent years have compared the clinical and radiological results of ACDF and CDA to identify the advantages and disadvantages of each; most have reported inconclusive results and are limited to short- or medium-term follow-up [7, 14,15,16, 18,19,20,21,22]. In this meta-analysis, we aimed to compare both treatments (CDA and ACDF) for CDD at long-term follow-up. Our main findings were that, with a minimum follow-up of 7 years, CDA showed a statistically better overall success rate, better improvement in NDI, less VAS arm pain, a better SF-36 physical component, a better motion rate, less adjacent syndrome, and fewer reoperations than ACDF. No significant differences were found in adverse events, neck pain scale, or mental SF-36 component.

Some limitations of the present study should be taken into account. First, the number of included studies was small, which may lead to insufficient evidence; however, only RCTs were included, which provide a stronger level of evidence. Second, as in all RCTs, inclusion and exclusion criteria were applied, so the patients analyzed were eligible for either CDA or ACDF, and patients suitable for ACDF but not for CDA were excluded. Third, some results have moderate heterogeneity, which can introduce bias. Fourth, although patients in both the ACDF and CDA groups had CDD, the interventions were not uniform across studies. Some studies involved surgery at one level and others at two levels. In addition, the fusion constructs used in the ACDF groups differed: interbody bone graft (autograft vs. allograft) or implants. Likewise, the CDA groups used different types of artificial intervertebral discs, including Mobi-C, Bryan, Prestige, SECURE-C, and Prodisc-C. These differences can affect the accuracy of the conclusions. Further high-quality, large-sample studies are needed to verify our results.

In terms of age (p = 0.12) and sex (p = 0.85), no statistically significant differences were found between the groups. This is important because it can limit demographic bias. Study selection and study homogeneity play an important role in quality control when performing a meta-analysis. RCTs can optimize follow-up and data quality, with low selection bias and confounding [34]. For example, Saifi et al., after a retrospective analysis of a national database, found that the revision burden of CDA was more than double that of ACDF (5.9% vs 2.3%), which was not taken into account in the initial patient demographics [3].

Regarding clinical outcomes, no significant differences were found between CDA and ACDF for neck pain and the SF-36 mental component. In contrast, NDI, radicular pain, and the SF-36 physical component were significantly better in the CDA group, as was overall surgical success. These findings are similar to those of some meta-analyses [14, 15, 18]; however, they differ from others, such as Luo et al. [21] and Gao et al. [7], who found lower neck and arm pain scores in the CDA group than in the ACDF group (p < 0.05) and similar NDI in both groups (p > 0.05). Gendreu et al., in their meta-analysis, did not find statistically significant differences in NDI (p = 0.37), VAS neck pain (p = 0.79), or VAS arm pain (p = 0.66) [35]. Zhang et al. found that at short-term and midterm follow-up, patients treated with CDA had improved NDI and higher NDI success rates than those treated with ACDF. Regarding pain relief, they found that the CDA group had lower neck pain scores and lower arm pain scores at both short-term and midterm follow-up. Furthermore, they found higher overall success rates in the CDA group [15]. This discrepancy in the results may be due to the heterogeneity of the groups [7, 14,15,16, 18, 21, 35]. The result may indicate, as Zhang et al. noted, that different types of cervical arthroplasties might have different efficacy [15]. Due to the limited number of included articles, subgroup analyses stratified by prosthesis type could not be performed for the other outcomes. However, taking all these data into account, both techniques may be useful for improving pain management and health-related quality of life.

A significantly higher motion rate and less adjacent syndrome were found in the CDA group in our study. The higher range of motion is a consistent finding in other meta-analyses, which means that motion at the CDA level persists over time [7, 15, 16, 18, 20,21,22, 36]. Regarding adjacent syndrome, Zhu Y et al., in a meta-analysis of 14 RCTs, showed that there were significantly fewer adjacent segment reoperations in the CDA groups (hazard ratio 0.47) than in the ACDF groups with a follow-up of 2 to 7 years [20]. Luo et al. found that CDA had a significantly lower incidence of adjacent syndrome (AS) (OR = 0.57, 95% CI: 0.44–0.73, p < 0.00001) with no obvious heterogeneity (I² = 17%, p = 0.26) [37]. Dong et al. also found that the rate of adjacent segment reoperation in the CDA group was significantly lower than with ACDF (p < 0.01), and that the advantage of CDA in reducing adjacent segment reoperation increased with longer follow-up (p < 0.01) [36]. Xu et al. also found lower rates of adjacent segment degeneration and reoperation with CDA than with ACDF, a superiority that may become more apparent over time [38]. AS rates were significantly lower in the CDA group than in the ACDF group, which altogether may suggest that CDA reduces or prevents adjacent syndrome [7, 18, 20,21,22, 36,37,38]. However, the assumption that adjacent segment disease arises from the iatrogenic motion restriction of ACDF is currently under debate. Some investigators have hypothesized that adjacent segment disease reflects the natural history of spinal segmental degeneration [39].

No significant difference was found in adverse events between CDA and ACDF (OR = 0.84, 95% CI: 0.56–1.27, p = 0.42), which is consistent with previous studies [22, 40]; however, CDA had a lower reoperation rate than ACDF. In our meta-analysis, reoperations occurred in 4.4% of CDA patients, a significantly lower rate than the 15.6% in the ACDF group (OR = 0.26, 95% CI: 0.19–0.37, p < 0.001). This finding is similar to those of other randomized clinical trials and meta-analyses [7, 18,19,20, 22, 35, 37, 41]. Zhu R et al. [41] and Zhang et al. [22], in their respective meta-analyses, showed that the rate of index-level secondary surgery in the CDA group was significantly lower than in the ACDF group (RR = 0.47, 95% CI: 0.36–0.63, p < 0.05, and OR = 0.41, 95% CI: 0.25–0.69, p = 0.001, respectively). Luo et al. also found a lower incidence of reoperations (OR = 0.43, 95% CI: 0.29–0.64, p < 0.0001). Xu et al. and Dong et al. likewise found lower reoperation rates with CDA than with ACDF, a superiority that may become more apparent over time [36, 38]. Although reoperation rates were significantly lower in the CDA group than in the ACDF group in most studies [7, 18,19,20, 22, 35,36,37,38, 41], some meta-analyses have not found a statistically significant difference [4].

In conclusion, for the treatment of CDD in patients suitable for either ACDF or CDA, CDA is superior to ACDF in terms of a better overall success rate, better improvement in NDI, less VAS arm pain, a better SF-36 physical component, a higher motion rate, less adjacent syndrome, and a lower reoperation rate. No significant differences were found between the two treatments in the neck pain scale, SF-36 mental component, or adverse events.