Introduction

Anterior lumbar interbody fusion has been an established surgical treatment for degenerative low back pain since its first description by Lane and Moore [18]. Simultaneous combined anterior and posterior fusion was first carried out by O’Brien in 1960 [26]. Supplementation with posterior fixation affords greater stability and a more favourable environment for fusion [11, 29, 31], and is now well established in modern practice with excellent clinical and radiological outcomes. Evidence for the intervertebral disc as the pain generator in low back pain is increasing [2, 6, 7, 9, 32], suggesting a need for removal of the disc for successful treatment of discogenic low back pain.

Following the introduction of interbody cages, their use has become widespread [38], with authors citing impressive clinical results in prospective cohort studies [28]. Femoral ring allograft (FRA) used for interbody fusion has also proven successful in meeting the surgical goals set for the treatment of degenerative low back pain [11, 13, 19, 33], but there has been criticism of its use [28]. One prospective, randomised controlled trial of anterior lumbar interbody fusion compares a titanium cylindrical threaded fusion device to FRA [35]. In this study, stand-alone titanium cylindrical threaded interbody fusion cages were reported to have a higher fusion rate when compared to a stand-alone anterior FRA; however, improvements in clinical outcome were similar in both groups. The cost of titanium cages (TCs), however, is rarely mentioned in papers written by the proponents of their use, even though the implant cost of a TC may be tenfold that of an FRA.

We were interested to know, in the setting of a prospective, randomised controlled trial, whether there was an observed difference in clinical outcome following circumferential lumbar fusion with the femoral ring (current practice) (Fig. 1) or the TC (Fig. 2). The null hypothesis, therefore, stated that there was no difference in the clinical outcome between these two methods of circumferential fusion. Also, if there was an observed difference in favour of the TC, was the additional cost of the TC justified?

Fig. 1

Anterior–posterior radiograph (a) and lateral radiograph (b). Circumferential fusion with femoral ring allograft/translaminar screws at L5/S1

Fig. 2

Anterior–posterior radiograph (a) and lateral radiograph (b). Circumferential fusion with titanium cage/translaminar screws at L4/5

Materials and methods

Study design

Local ethical committee and institutional research and development departmental approval were obtained for the study. A single-centre, multi-surgeon, prospective, randomised controlled trial was conducted. The inclusion criteria (Table 1) were degenerative disc disease between L3 and S1 with a maximum of two consecutive motion segments to be instrumented, pain or functional deficit present preoperatively for a minimum period of 6 months and failure to respond to conservative treatment modalities for at least 3 months. The diagnostic criteria for inclusion required radiographic evidence of sclerosis, osteophyte formation, degenerative changes of the facet joints or greater than 50% collapse of the interspace (as determined by measurement across the middle of the endplate on the lateral radiograph); 3.5 mm or more movement on flexion/extension radiographs; MRI evidence of dehydration of the lumbar disc with or without reactive sclerosis of the adjacent vertebral body; and discographic evidence of abnormal disc morphology with concordant pain reproduction on provocation. The exclusion criteria (Table 2) were skeletal immaturity or age over 70 years, more than two vertebral levels involved, previous spinal fusion, Meyerding grade II or greater spondylolisthesis, active or systemic infection, osteoporosis or the presence of active malignancy. After full informed consent, patients were randomised in a 1:1 ratio by sealed envelope opened just prior to surgery.

Table 1 Inclusion criteria
Table 2 Exclusion criteria

Implants

The titanium implant used in this study was the SynCage (Synthes, Oberdorf, Switzerland). It is wedge shaped with convex toothed surfaces to match the vertebral body endplates and has a built-in lordosis. The central area of the cage has a web-like structure allowing the autograft to be packed within, offering a large surface area centrally for graft incorporation whilst maintaining structural support peripherally. It is inserted after distraction of the intervertebral space and sizing. The cost of the SynCage is presently approximately £1,200 (€1,724).

The FRA was obtained from the National Blood Service (Edgware, England). It is a cortical femoral ring sterilised with ethylene oxide. Donors are tested for hepatitis B surface antigen and antibodies to hepatitis C, HIV 1 and 2. The FRA is shaped with a high-speed burr at the time of surgery to fit the disc space and incorporate lordosis. The cost of the femoral ring is presently approximately £120 (€172).

Surgical technique

The surgical technique followed that described by Kumar et al. [17], with the posterior procedure being carried out first. In this study, we did not use the graft harvested from the vertebral body. Two techniques of posterior fixation were performed in the study, namely, translaminar screw fixation or pedicle screw fixation (ClickX, Synthes). Translaminar screws were placed via stab incisions as described by Montesano [24]. Subperiosteal dissection was carried out through a midline posterior approach to expose the facet joints, which were decorticated and packed with bone graft harvested from the posterior iliac crest. Further harvested bone graft was kept sterile while the patient was turned supine and re-draped. The lumbar spine was exposed anteriorly via a retroperitoneal approach and visualisation of the disc was aided by Steinmann pins or Synframe retractors (Synthes). The surgical level was identified using intraoperative radiographs and a complete discectomy was carried out. The vertebral body endplates were prepared by curetting until point bleeding was seen. Trial implants were used for sizing the TC. Measuring calipers and a depth gauge were used to size the FRA. An allograft larger than the measured disc space was chosen and burred down to the correct size. The previously harvested autograft was then packed into the implant before insertion. The FRA was secured with a 6.5 mm large fragment cancellous screw and washer inserted into the superior vertebral body at each fused level to act as a buttress to prevent anterior migration of the graft. This measure was not necessary for the TC due to its design. Three doses of intravenous antibiotics were given, one preoperatively and two further doses postoperatively at 8 and 16 h, respectively. TED stockings were worn for 6 weeks postoperatively, with mobilisation commencing the day after surgery. We did not use lumbar orthoses.

Outcome measures

The Oswestry Disability Index (ODI) questionnaire, Visual Analogue Score (VAS) for back and leg pain (maximum 10 points) and the Short-Form 36 (SF-36) questionnaire were completed preoperatively and at 6, 12 and 24 months postoperatively. The minimum clinically important differences for the outcome measures were established from previously published data: ODI 10 points, VAS 2 points [10] and SF-36 7 points in each domain [27]. Radiographs of the lumbar spine were taken at the same time intervals and are the subject of a separate study.

Power of the study

The number of participants sufficient to detect a clinically relevant difference in functional outcome was calculated using the following formula, as used in previous studies [3]:

$$ N = (n_{1} + n_{2}) = (t_{2a} + t_{b})^{2} \times 4 \times (\text{SD}^{2} / d^{2}) $$

where N is the total number of participants in the two groups, and n1 and n2 are the numbers of participants in each group. The risk of a Type I error (t2a) was set to 5% (t2a=1.96) and the risk of a Type II error (tb) was set to 20% (tb=0.842). The standard deviation (SD) of the observation, that is, the SD of the ODI, was 16 and was derived from more than 100 patients in a previous database. The symbol ‘d’ indicates the clinically relevant difference (10 points) in the ODI.

$$ N = (1.96 + 0.842)^{2} \times 4 \times (16^{2} /10^{2} ) = 80 $$

This equates to 40 patients in each arm of the trial.
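
The arithmetic of this calculation can be checked with a short script. The following is a minimal sketch that assumes the formula exactly as stated above; the function name `total_sample_size` is hypothetical and is not part of the trial's analysis.

```python
def total_sample_size(t_2a: float, t_b: float, sd: float, d: float) -> float:
    """Total N for two equal groups: (t_2a + t_b)^2 * 4 * (SD^2 / d^2)."""
    return (t_2a + t_b) ** 2 * 4 * (sd ** 2 / d ** 2)


# Values quoted in the text: 5% Type I risk, 20% Type II risk,
# SD of the ODI = 16 points, clinically relevant difference = 10 points.
n_total = total_sample_size(t_2a=1.96, t_b=0.842, sd=16, d=10)
n_per_arm = n_total / 2

print(f"total N = {n_total:.1f}, per arm = {n_per_arm:.1f}")
```

The unrounded result is slightly above 80, which corresponds to the figure of 40 patients per arm quoted above.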

Statistical analysis

Statistical analysis was carried out using SPSS (Version 11). Independent t-test, paired t-test and Pearson chi-squared test were used to establish differences between the groups.

Results

Between February 1998 and October 2002, 83 patients were recruited for the trial: 45 were randomised to receive the TC and 38 to receive the FRA. Entry and exit data were available on all 83 patients with a mean follow-up of 28 months (range 21–75).

Technical infringements

Four female patients requiring two-level fusion and randomised to the SynCage had a disc space that was too narrow to take the smallest size implant at one level; an intraoperative decision was made to insert FRA instead of the SynCage. These four patients, therefore, had one of each implant. One patient was found to have a disc space so narrow that we were unable to insert an FRA; this patient had autologous bone graft chips placed in the disc space. These patients were excluded from analysis, thereby leaving 41 patients in the TC group and 37 in the FRA group.

Patient demographics are set out in Table 3 and showed no significant differences between the groups with regard to sex, smoking history or level of degenerative disc disease. Age at operation was on average 3.6 years lower in the femoral ring group, although this difference was not statistically significant. In view of the difference in age between the two groups, the correlation between each outcome measure and the age of the patient at operation was analysed, but no significant correlation was found.

Table 3 Patient demographics

Translaminar screws were used in 68 patients (94 levels) and pedicle screws in 10 patients (18 levels).

Clinical outcome

Analysis of preoperative outcome measures showed no statistical difference between the two groups (Table 4) except in the vitality domain of the SF-36, which was higher in the TC group.

Table 4 Mean preoperative outcome measures (standard deviation)

Oswestry Disability Index

Two years postoperatively, both groups had significantly improved their ODI, with a greater improvement seen in the femoral ring group (mean 15 points, SD 20 points) compared to a mean of 6 points (SD 15 points) for the TC group (Table 5 and Fig. 3). Comparing the change in ODI, there was a significantly greater improvement in the FRA group than in the TC group (p=0.027). The FRA group reached the minimum clinically important difference (MCID) for ODI, whilst the TC group did not.
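
The between-group comparison of the change in ODI can be approximated from the summary figures alone. The sketch below is not the authors' SPSS analysis. It assumes an equal-variance independent t-test, the analysed group sizes of 37 (FRA) and 41 (TC), and the rounded means and standard deviations quoted above; `pooled_t_statistic` is a hypothetical helper name.

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic and degrees of freedom
    computed from summary statistics."""
    dof = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / dof
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, dof


# Assumed inputs: mean ODI improvement (SD) and analysed group sizes.
t_stat, dof = pooled_t_statistic(15, 20, 37, 6, 15, 41)  # FRA vs TC

print(f"t = {t_stat:.2f} on {dof} degrees of freedom")
```

Under these assumptions the statistic is about 2.26 on 76 degrees of freedom, which lies above the two-sided 5% critical value of roughly 1.99 and is therefore in line with the significant difference reported.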

Table 5 Mean scores for ODI, VAS back pain and VAS for leg pain
Fig. 3

Oswestry Disability Index preoperatively and at 24 months for femoral ring allograft (FRA) group and titanium cage (TC) group

Visual Analogue Score for back pain

Both groups again showed a significant improvement in mean VAS for back pain, with the FRA group improving by 2.0 points (SD 2.8) and the TC group by 1.1 points (SD 2.2) (Table 5 and Fig. 4). Again, the FRA group reached the MCID for VAS, whilst the TC group did not. There was, however, no significant difference in change of VAS for back pain between the two groups (p=0.188).

Fig. 4

Visual analogue score (back pain) preoperatively and at 24 months for FRA group and TC group

Visual Analogue Score for leg pain

The TC group had worse leg pain on the VAS than preoperatively, increasing by 0.4 points (SD 3.1) (Table 5 and Fig. 5). The FRA patients had a decrease in leg pain by 1.1 points (SD 2.5). The difference between the changes seen in this outcome measure in the two groups was significant (p=0.029).

Fig. 5

Visual analogue score (leg pain) preoperatively and at 24 months for FRA and TC groups

Short Form-36

Table 6 shows the scores for each domain preoperatively and at 2 years for both groups. Figure 6 shows the mean change in score for each domain between the preoperative assessment and 2 years for both groups. The FRA patients made clinically important and significant improvements in six of the eight domains (general health and emotional role did not reach >7.0 point improvement). The TC patients had consistently lower score improvements compared to the FRA group (except for the emotional role domain). Only two of the eight domains (physical function and bodily pain) reached statistically significant improvement in the TC group.

Table 6 Mean Short Form-36 scores
Fig. 6

Mean score changes for each domain of the SF-36. Improvement of +7.0 points for each domain is considered to be a clinically significant improvement. Any negative change indicates deterioration

Smokers did not have worse preoperative scores (p=0.641) than non-smokers, but as a whole, smokers had poorer outcomes at 2 years (p=0.030). With respect to the change in ODI, smokers did significantly worse than non-smokers within the TC group (p=0.014), but there was no significant relationship between the outcome and smoking status in the femoral ring group (p=0.292).

The presence of previous discectomy or decompression had no relationship with either the preoperative ODI or the postoperative change in ODI (p=0.879).

Adverse events

The complications encountered are outlined in Table 7. There was no difference in the complication rate between the two groups (p=0.316). No deep infection was seen in any patient in the trial. Superficial infection occurred in 1/37 patients in the femoral ring group and 1/41 patients in the TC group. Vascular injuries to the common iliac vein occurred in five cases (2/37 in the FRA group and 3/41 in the TC group) and were primarily repaired without further complication. Retrograde ejaculation was seen in 1/37 (2.7%) patients in the femoral ring group and 1/41 (2.4%) patients in the TC group. There were four dural tears noted during the insertion of translaminar screws, none of which were explored or repaired primarily. Four patients implanted with the TC subsequently developed breakage of the translaminar screws, which required revision with pedicle screw fixation. Two of these patients had single-level fusions and two had two-level fusions. There were no cases of broken translaminar screws in the femoral ring group. One FRA subsequently fractured, which required revision with posterior pedicle screw fixation.

Table 7 Adverse events

Discussion

This prospective, randomised controlled trial shows superior clinical outcome when an FRA is used compared to when a TC is used for circumferential lumbar fusion. This is the first prospective, randomised controlled study to compare these two implants.

A detailed radiological analysis looking at intervertebral height, lordosis and evidence for fusion will be the subject of a further study. Controversy still exists over the relationship between fusion and clinical outcome.

Our results show that the femoral ring group achieved a mean of 15 points improvement in ODI, a mean of 2.0 points on the VAS for back pain and greater than 7.0 points in six of the eight domains of the SF-36 2 years following surgery. For this group, the majority of patients achieved the MCID that authors have previously defined [10, 27].

By contrast, the TC group achieved a mean of 6 points improvement in ODI, a mean of 1.1 points on the VAS for back pain and greater than 7.0 points in only two of the eight domains of the SF-36 2 years following surgery. For this group, the majority of patients did not achieve the MCID previously described [10, 27]. The VAS for leg pain actually worsened in this group by a mean of 0.4 points.

Femoral ring allograft patients who improved their ODI score by 10 points or more had a significantly lower preoperative ODI than those who failed to reach the MCID (p=0.044). The same did not hold true for the TC group (p=0.427).

The standard deviation of the preoperative ODI in both our groups was 14 points. The pre-trial power calculation was based on database information showing a standard deviation of 16 points. Therefore, the trial has a power of well in excess of 80% with the numbers we recruited. In view of the difference in age found between the two groups, an analysis of the correlation coefficient for each outcome measure and the age of patient at operation was performed, but no significant correlation was found.
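
The effect of the smaller standard deviation on power can be illustrated with a normal-approximation sketch. This is not the authors' calculation. It assumes a two-sided 5% significance level, a 10-point difference in ODI, and equal groups of 39 patients (the 78 analysed patients split evenly, whereas the actual groups were 41 and 37); `approx_power` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(sd: float, d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Normal-approximation power of a two-sided, two-sample comparison
    of means with equal group sizes."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # about 1.96 for alpha = 0.05
    standard_error = sd * sqrt(2 / n_per_group)
    return z.cdf(d / standard_error - z_crit)


planned = approx_power(sd=16, d=10, n_per_group=40)   # pre-trial assumptions
observed = approx_power(sd=14, d=10, n_per_group=39)  # SD seen in the trial

print(f"planned power = {planned:.2f}, power with SD 14 = {observed:.2f}")
```

Under these assumptions the planned design gives a power of roughly 0.80, and the observed standard deviation raises it to roughly 0.88.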

The use of FRA in circumferential fusion is well established and was the subject of previous retrospective studies [11, 13, 19, 26]. Liljenqvist et al. [19] retrospectively reviewed 41 patients with circumferential fusion using FRA and reported a fusion rate of 95%, with 83% of the patients satisfied or highly satisfied with the outcome of the surgery. Sarwat et al. [33] reported fusion rates of 100% for one level and 93% for two levels using this technique, but by using allograft chips instead of cancellous autograft with FRA. Sasso et al. [35], in their randomised trial of a threaded TC versus FRA, observed a significantly higher fusion rate in their cage patients (97 vs 40%), but similar clinical outcomes in both the groups. This trial [35] used the femoral ring as a stand-alone implant without posterior fixation, which is known to result in a lower fusion rate as shown by Holte et al. [11], a study in which the fusion rate was increased from 75 to 98% with the addition of translaminar screws to anterior lumbar interbody fusion.

Our clinical results are similar to those previously reported in prospective cohort studies of lumbar spine fusion for discogenic back pain [20]. Pavlov et al. published a prospective cohort study of the same TC used in our trial [28]. Clinical results were extremely promising; however, it is readily accepted that randomised controlled trials seldom show such impressive results when compared to prospective cohort studies. Age or previous operation status had no influence on outcome in this trial, which is at variance with other studies [4, 33, 36].

Posterior fixation techniques (translaminar vs pedicle screws) have previously been compared in circumferential fusion [12]. The authors showed no difference in the rate of fusion, but reported a higher incidence of myofascial pain with pedicle screws. Our study failed to show any difference between the translaminar screws and the pedicle screws with respect to change in ODI (p=0.286). Injury to the left common iliac vein occurred in 2/37 (5.4%) cases in the femoral ring group and 3/41 (7.3%) cases in the TC group, which is comparable with previous studies [14, 21]. We found no cases of postoperative deep vein thrombosis. Breakage of translaminar screws was seen more frequently in the TC group, although this did not reach statistical significance (p=0.178). One patient in each group (2.7% of the femoral ring group and 2.4% of the TC group) reported retrograde ejaculation, again in keeping with previously reported incidences with a retroperitoneal approach [34, 37].

Although smoking has been shown to influence fusion rates, its effect on the functional outcome is not always seen [1]; smokers in our study had poorer outcomes.

There are several theories that we believe may explain the difference in clinical outcomes between these two groups. While accepting the central role played by the degenerate disc in producing back pain, a more mechanistic concept is proposed by Mulholland and Sengupta [25] and McNally et al. [23]. Their view is that the altered degenerate disc no longer acts as an isotropic structure, and hence transfers loads abnormally, producing high areas of focal load on the endplate and supporting cancellous bone. The pattern of loading is affected by position in the normal disc, which would not be the case if the disc were isotropic. So, for example, when the spine is flexed, the anterior endplate and vertebrae are loaded excessively, and hence the pain on bending, which is a common feature in patients with back pain. Using finite element analysis, it has also been shown that loads below a cage, which is load bearing, may be 500% higher than loads below a normal disc [16, 30]. The conclusion by McAfee [22] that pain relief following cage fusions seems to be little different from other methods of fusion, despite much better rates of fusion, may be a reflection of loading problems below some cages that remain weight bearing. If the mechanistic concept suggested by Mulholland and Sengupta is accepted, then the explanation of the different results may be that the femoral rings allow the development of organised weight-bearing bone blending with the bone of the femoral ring, which transfers load in an increasingly normal pattern as the bone remodels according to Wolff’s law. Young’s modulus of titanium is ten times higher than that of cortical bone and may lead to point loading of the endplate. The cage, whilst integrated and providing a ‘union’ in so far as there is no movement, still transfers the load through the metal, producing high loads in a small area as opposed to the dispersal of load produced by the developing ‘ray’ of bone from the femoral ring.

It is suggested that the disc material itself (nucleus) and inner annulus are important pain generators in low back pain. However, in both types of fusion, the discs are excised identically, yet the fact that there was a difference in pain relief must cast doubt on the importance of these structures as pain generators.

FRA undergoes creeping substitution over time and the initial disc space distraction gained during anterior interbody fusion may be lost by 1 year [8, 15]. Studies have reported that interbody fusion with FRA may take up to 18 months to fuse [5]; the time to fusion for the TC used in this trial has yet to be reported. The remodelling that can occur in a fusion achieved with FRA in combination with endplate settling may result in better sagittal alignment of the lumbar spine and a more normal pattern of loading. The restoration of normal lumbar lordosis may improve the clinical outcome. To achieve this goal, O’Brien et al. [26] changed their surgical technique to perform the anterior surgery before posterior fixation, but there are no results comparing the clinical outcomes of these groups.

The insertion of the TC can be performed without fear of damage to the cage itself during the insertion process. FRA is potentially prone to fracture during insertion and, therefore, may persuade the surgeon to ‘undersize’ the implant reducing disc distraction; this may explain the increased leg symptoms seen in the ‘potentially over-distracted’ TC group. Radiological analysis may clarify this concern.

In conclusion, we have found the clinical results of FRA to be superior to those of TCs when used as interbody spacers in circumferential fusion of the lumbar spine 2 years after surgery. The TC is ten times more expensive than the FRA and its use appears not to be justified on the basis of clinical outcome.