Introduction

Lumbar/lumbosacral spinal fusion is an established procedure for the treatment of many degenerative diseases and spondylolisthesis [1, 2]. However, fusion of a lumbar spine segment leads to compensatory biomechanical changes at adjacent levels. Experimental and clinical studies have identified increased mobility (range of motion) and intradiscal pressure on the adjacent segments after lumbar spinal fusion. Increased motion of adjacent level facet joints has also been described [3,4,5]. These biomechanical changes may lead to the development of adjacent segment degeneration (ASD) of the cranial and caudal segments, described as a late effect of lumbar spinal fusion in many studies [6,7,8]. However, Malakoutian et al. [9] report in a review article that according to some studies, no kinematic changes were described at the cranial and caudal segments after spinal fusion. Nevertheless, they reported that some patients develop excessive kinematic changes after spinal fusion, leading to instability in the cranial adjacent segment.

Dynamic spinal implants were developed to prevent the manifestation of ASD. Dynamic stabilization devices are based on pedicle screw fixation coupled with a flexible longitudinal connecting system [10]. These systems provide dynamic stabilization of the treated segment and in some cases the adjacent segment, reducing loads and preventing hypermobility. Biomechanical studies report reduced range of motion and load sharing of the adjacent segment when using dynamic stabilization. The adjacent level is thereby protected from hypermobility, which might otherwise cause ASD [11, 12]. However, long-term clinical studies have reported no clinical benefit from dynamic implants [13, 14]. In recent years, dynamic spine implants have nevertheless gained popularity. As a result, purely dynamic or hybrid (semi-rigid) implants have been introduced.

Prior to market launch, new spinal implants are biomechanically tested according to standards. The International Organization for Standardization (ISO) 12189 standard (Fig. 1) was introduced to investigate the mechanical properties of spinal devices under dynamic conditions. The American Society for Testing and Materials (ASTM) F2624 standard (Fig. 2) was developed for static, dynamic and wear testing of extra-discal spinal constructs. Both standards are intended to identify weak areas of the implants and guarantee safe application.

Fig. 1

ISO 12189 model for the testing of dynamic pedicle rod systems. Light grey: PE blocks, blue: rod implant with dynamic part (black spring), dark grey: pedicle screws, red springs: calibrated springs simulating the intervertebral disc

Fig. 2

ASTM F2624 model for the testing of dynamic pedicle rod systems. Light grey: PE blocks, blue: rod implant with dynamic part (black spring), dark grey: pedicle screws

In 1996, the ASTM proposed the first version of the F1717 standard (Fig. 3), introducing a test method to validate the mechanical characteristics of posterior spinal devices. This standard recommended an experimental bisegmental vertebrectomy model, consisting of two polyethylene (PE) blocks imitating vertebral bodies. It presented a “worst-case scenario” for the posterior spinal device. The dynamic test included a dynamic flexion test, dynamic extension test, dynamic torsion test, dynamic lateral bending test and fatigue tests.

Fig. 3

ASTM F1717 model, the standard test method for spinal implant constructs using a vertebrectomy model. Light grey: PE Blocks, light grey crossed: vertebrectomy, blue: rod implant, dark grey: pedicle screw

In 2008, ISO introduced the ISO 12189 standard. This test method specified methods for fatigue testing of spinal implant assemblies, for fusion or motion preservation, using an anterior support. The ISO standard is a bisegmental model consisting of three PE vertebrae, and this test method represents a physiological model. Three calibrated springs are located in the intervertebral space simulating the intervertebral disc.

The aim of this study was to analyse these current methods and tests, and question whether the current tests that are required prior to market launch are sufficient.

Methods

Three prospective clinical studies are presented, all carried out by the authors of this manuscript.

Study 1

A prospective clinical study was conducted. The study investigated clinical and radiological outcomes after lumbar spinal fusion with topping off using the CD Horizon BalanC Rod System (Medtronic, Minneapolis, USA). Inclusion criteria were:

  • Degenerative disease or spondylolisthesis of the lumbar spine with indication for lumbar fusion (osteochondrosis Modic grades I–III or spondylolisthesis Meyerding grades I–III).

  • Radiological signs of degeneration without instability at the level cranially adjacent to the intended fusion (Pfirrmann Grade 2–4) [15].

Exclusion criteria were:

  • No degeneration of the cranial adjacent level

  • Previous lumbar surgery

  • Motor deficiency

  • Scoliosis with Cobb angle > 25°

  • Spondylolisthesis > Meyerding grade III

  • Osteoporosis

  • Metabolic bone disease

The clinical data assessments were obtained from the German Spine Registry using the Operation 2011 form, the Core Outcome Measure Index (COMI) score, and a Visual Analogue Scale (VAS) score for back and leg pain. Forty-five patients were included in the study. Clinical and radiological examinations were performed over a 2-year follow-up. Two patients were lost to follow-up.

Study 2

This prospective study investigated clinical and radiological outcomes after lumbar spinal fusion with topping off, using the S4 Dynamic rod system (Aesculap AG, Tuttlingen, Germany). Inclusion criteria were radiological signs of degeneration without instability (Pfirrmann Grade 2–4) [15] in the cranial adjacent segment. Patients without radiological signs of degeneration of the cranial adjacent segment and those with previous lumbar surgery were excluded. Clinical data assessments were obtained from the Spine Tango Registry with the Operation 2011 form, the COMI score and a VAS score for back and leg pain. Six patients were included in the study. Clinical examinations and radiographs were performed preoperatively, and at 6 weeks, 3, 12 and 24 months after surgery.

Study 3

The third study was a randomized, therapy-controlled, 2-centre trial. Patients included in this study were randomized into two groups, a control group undergoing posterior lumbar interbody fusion (PLIF) and an intervention group undergoing PLIF with topping off. In the control group, the S4 system (Aesculap AG, Tuttlingen, Germany) was used. The patients in the intervention group were treated with the S4 Dynamic rod system (Aesculap AG, Tuttlingen, Germany). Patients were treated for degenerative disease or spondylolisthesis of the lumbar spine with lumbar spinal fusion and dynamic stabilization of the cranial adjacent level. Inclusion criteria were radiological signs of degeneration without instability (Pfirrmann Grade 2–4) [15] in the cranial adjacent segment. Patients without radiological signs of degeneration of the cranial adjacent segment and those with previous lumbar surgery were excluded. Patients were subjected to clinical examinations and radiographs preoperatively, and 6 weeks, 3, 12 and 24 months after surgery. Five patients were included in the study, before the study was terminated. The study protocol was published by Siewe et al. [16].

Data collection from the German Spine Registry and Spine Tango

The German Spine Registry is a national registry for the assessment of data regarding patients who are treated due to a spinal disease. The German Spine Registry has been developed on the platform of the European Spine Registry (Spine Tango). The German Spine Registry uses the same content as Spine Tango. The surgery and follow-up forms are equivalent in both registries (e.g. Operation form 2011 and COMI). As a result, assessment data from both registries can be compared.

The implants

  1. CD Horizon BalanC rod system (Medtronic, Minneapolis, USA; Fig. 4)

    Fig. 4

    PLIF model of the CD Horizon BalanC Rod System (Medtronic) with the C-shaped (PEEK) dynamic topping-off component

This system was introduced outside the USA in March 2012, providing an option to perform lumbar spinal fusion and dynamic stabilization of the cranial adjacent segment. The rod system had fusion and dynamic (topping off) components. The dynamic component was composed of a silicone hinge in a C-shaped polyetheretherketone (PEEK) frame, while the fusion component was made entirely of PEEK. The product was tested following the ASTM F2624-07 testing protocol.

  2. S4 Dynamic rod system (Aesculap AG, Tuttlingen, Germany; Fig. 5)

    Fig. 5

    Spring element of the S4® Dynamic rod, Aesculap AG, Tuttlingen

This system was introduced in two spine surgery centres in Germany in 2013. The objective of the studies was to examine the efficacy and safety of this new hybrid posterior stabilization device, the S4 Dynamic rod. The dynamic connecting rod consisted of a spring element, a short rod section on the upper side and a rod segment of variable length for fusion of 1–2 segments. The implant was made of titanium and forged titanium alloy (Ti6Al4V). The product had undergone intensive biomechanical testing for Conformité Européenne (CE) certification and was compared to other posterior dynamic stabilization (PDS) and rigid devices. The product was tested following ASTM F1798-97 [17]. The implant was only used under study or registry conditions (randomized controlled trials and Spine Tango registry).

Results

CD Horizon BalanC rod System

During routine follow-up, implant failure of the dynamic portion (topping off) occurred in five cases. This implant failure was found in 11% of the applied implants and was verified 4, 7, 9 months, 1 year and 2 years postoperatively (Figs. 10, 11). Four cases had undergone monosegmental and one case bisegmental lumbar spinal fusion with dynamic stabilization of the cranial adjacent level. An example of the material failure is presented in Figs. 6, 7, 8 and 9.

Fig. 6

Operatively removed broken C-shaped part of the CD Horizon BalanC rod

Fig. 7

Laboratory examination provided by Medtronic Co.: rod breakage near the centre of the C-shaped portion of each rod. Visual and optical examinations of the area of fracture initiation did not identify a pre-existing surface defect that could contribute to crack propagation

Fig. 8

Fracture surface examination identified a multi-modal fracture, with some evidence of convex-shaped progressive striations, initiating from the inside of the C shape and emanating outward, consistent with cyclic fatigue. The fatigue striations were followed by increased material disruption through the remaining cross-sectional area, consistent with overload

Fig. 9

Multiple witness marks around the outside of the C shape suggest that the MAS tulip head of the screw may have impinged against the C shape. This impingement condition may have created an undesired fulcrum, shifting the plane of maximum bending stress to a plane that is normal to the surface but intersects at the fulcrum

S4 Dynamic rod System

In the S4 Dynamic rod group, material failure of the dynamic portion was demonstrated in seven patients (64%). Four patients did not participate in follow-up examinations. Failure was demonstrated between 6 weeks and 1.5 years postoperatively (Figs. 12, 13).
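The failure proportions reported for the two implants can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the denominator of 11 for the S4 Dynamic group is an assumption (six patients in Study 2 plus five in Study 3, which is consistent with the reported 64%), and the Wilson score interval is added here merely to show how uncertain such small-sample proportions are. It is not part of the original analysis.

```python
import math

def wilson_interval(failures, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = failures / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Counts from the text; the S4 denominator is an assumption (6 + 5 patients).
groups = {
    "CD Horizon BalanC": (5, 45),
    "S4 Dynamic": (7, 11),
}

for name, (failures, n) in groups.items():
    low, high = wilson_interval(failures, n)
    print(f"{name}: {failures}/{n} = {failures / n:.0%} "
          f"(95% CI {low:.0%} to {high:.0%})")
```

With so few patients the intervals are wide, so the percentages should be read as rough indications rather than precise failure rates.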

Fig. 10

Postoperative anterior and lateral X-ray of the lumbar spine after PLIF L4–5 with topping off L3–4 using the CD Horizon BalanC rod System

Fig. 11

Anterior and lateral X-ray of the lumbar spine one year after the procedure revealing a breakage of the BalanC rod at L3/4 segment

Fig. 12

Anterior and lateral X-ray of the lumbar spine revealing breakage of the Dynamic rod at L3/4 segment

Fig. 13

Postoperative X-rays after 6 months did not show translation of the spring rod. Material breakage was detected at the 2-year follow-up

Due to the reported adverse events, all three studies were interrupted at an early stage. Reports were sent to the Federal Institute for Drugs and Medical Devices (BfArM) as well as the companies producing the implants. The S4 Dynamic rod System was withdrawn from the market. Following analysis of the broken implants and radiographs of the failed rods by Medtronic, the CD Horizon BalanC system was also withdrawn from the market.

Discussion

Both ISO and ASTM publish standards and guidelines for examination and evaluation of the mechanical properties of spinal devices. These standards supply repeatable methods with specific conditions as a protocol. In this way, the manufacturer, the notified body and clinicians have a reference to compare and examine spinal implants.

In general, spinal implants are medical products of risk category III; especially in Europe, the manufacturer or distributor of an implant is responsible for CE-certification. Before market launch, the implants must be approved by the notified body. The notified body verifies whether the implants fulfil the requirements of current standards. The manufacturer conducts tests based on the standards published by ISO and ASTM to ensure that the implants perform safely and reliably. However, these tests do not categorize the implants as “failed” or “successful”. The testing results should be interpreted by both the manufacturer and the clinician. Ultimately, the clinician is responsible for the decision as to whether the test results for a spinal device meet the requirements for the patient [17]. Standards for testing criteria have been recommended in a number of reports. Wilke et al. [18] proposed a standardized protocol for in vitro testing of spine implants that allowed researchers to compare their results. The protocol described recommended loading methods, specimen conditions and analysis parameters.

The introduction of dynamic or semi-rigid spine implants has led to the development of new testing methods to validate the mechanical performance of these devices. In this case, engineers and clinicians modified and revised test methods current at the time. This required an understanding of both new implant mechanics and the biology of the spine. ASTM F2624 is an example of such a revised standard (modified ASTM F1717). ASTM F1717 is a standard test method for spinal fusion implants. ASTM F2624 was developed to test non-fusion devices, such as extra-discal spinal motion preserving implants. ASTM F2624 makes use of a torsion actuator to generate flexion, extension and rotation motion controlled by an engineer [17].

Compared to rigid implants, dynamic spine implants show a reduced material or structural stiffness, leading to smaller decreases in mobility of the instrumented segment [11, 12, 19]. However, a biomechanical finite element study investigating the effect of implant stiffness has reported only a minor influence of stiffness on the range of motion (ROM) and intervertebral disc pressure of the lumbar spine. For flexion–extension, the intersegmental rotation increased with decreasing stiffness of the implant, such that a steady state was reached at an implant stiffness of 1000 N/mm [20]. In addition, the decreased mobility of the instrumented segment differs with different designs of the “flexible” or “dynamic” devices. Galbusera et al. found that ROM reduction varies widely between different rigid and dynamic device designs [19]. In a biomechanical investigation using human spines from cadavers and dynamic fatigue tests involving standardized test blocks, Ponnappan et al. and Gornet et al. [21,22,23] compared and evaluated dynamic PEEK rods and traditional titanium rods. In the cadaveric spine model, the PEEK rods showed no significant difference in ROM reduction compared to titanium rods. The dynamic compressive bending tests reached runout at 5 × 10⁶ cycles for rotational angles of 23° and 25°, respectively [21, 22]. However, to simulate the material behaviour in a patient after a certain service time, several physical effects have to be considered, such as the Bauschinger effect and cyclic hardening or softening [24]. Shrestha et al. [25] showed that PEEK reaches its cyclically stable condition after approximately 10⁴ cycles, which corresponds to only 1–2 days of walking. This effect has to be considered in terms of sufficient preconditioning of the implants for low cycle tests, static tests, as well as for finite element simulations.

Nevertheless, the introduction of a new standard cannot guarantee that no fatigue failure will occur. The in vivo performance of these implants cannot be exactly predicted through these tests. Failure rates of fatigue-related problems, such as screw and rod breakage, have been reported in the literature [26]. The international standards and test methods have been revised in recent years. Experimental studies have been performed on these test methods to gain more information and knowledge regarding reliability [6]. Limitations of the international standards have also been reported in the literature. La Barbera et al. investigated in a finite element study the appropriateness of the ASTM F1717 standard [27]. They concluded that under the consideration of a worst-case geometry of the anatomical spine model, the von-Mises stress acting on the pedicle screws and rods was higher (by 15.2% and 8.9%) than in the ASTM F1717 standard model. As a result, they suggested a revision of the ASTM F1717 standard [27]. In an experimental and finite element study, Villa et al. compared the standards of ISO and ASTM F1717 for the fatigue testing of posterior spinal fixation systems. They concluded that the ASTM standard needed revision [28]. For the vertebrectomy model ASTM F1717, an anterior column-supporting device should be added, as the lack of anterior support creates non-physiological conditions. The model is sufficient for the testing of rigid spine devices, but the effects of anterior support stiffness must be considered for dynamic devices with lower stiffness [28].
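The role of anterior support can be illustrated with a deliberately crude one-dimensional idealisation in which the anterior support and the posterior implant act as parallel springs under axial compression. All stiffness values and the function name below are hypothetical and chosen only for illustration. The model ignores bending, lever arms and the screw-bone interface, and it is not taken from ASTM F1717 or ISO 12189.

```python
def implant_load_share(k_implant, k_anterior):
    """Fraction of axial load carried by the implant for two parallel springs."""
    return k_implant / (k_implant + k_anterior)

# Hypothetical axial stiffness values in N/mm (illustrative assumptions only).
K_RIGID = 2000.0
K_DYNAMIC = 200.0

for k_anterior in (0.0, 500.0, 1500.0):  # 0.0 mimics a vertebrectomy model
    rigid = implant_load_share(K_RIGID, k_anterior)
    dynamic = implant_load_share(K_DYNAMIC, k_anterior)
    print(f"anterior {k_anterior:6.0f} N/mm: "
          f"rigid {rigid:.0%}, dynamic {dynamic:.0%}")
```

Without anterior support, both constructs carry the full load in this idealisation. Once support is added, the share carried by the more compliant device changes far more strongly, which is one way to read the recommendation that anterior support stiffness be considered for dynamic devices.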

A declaration of the load value applied at the test set-up for the device is also required. Compared to the unspecified load protocol, the standard guides for the evaluation of the fatigue failure risk do not stipulate a combined loading protocol in all bending and torsion directions (flexion–extension, lateral bending, axial rotation and compressive loads). However, a multiaxial stress state, as well as non-proportional loading, decreases the fatigue life of the device [27, 28]. Therefore, the simulation of daily activities, such as walking, should be included in the fatigue tests [28]. To investigate the physiological kinematics of the lumbar spine, Bible et al. [29] and Cobian et al. [30] performed in vivo studies to assess the total L1–S1 ROM of several daily motions, e.g. walking or tying shoelaces. In a previous in vivo study, Cobian et al. tracked 10 healthy volunteers over 3 days and found the median total lumbar ROM (L1–S1) to be 11.2° for flexion–extension, 10.3° for lateral bending and 12.5° for axial rotation, with an extrapolation of 3 × 10⁶–7 × 10⁶ cycles per year. However, the standard test method ASTM F2624-12 for dynamic extra-discal single level spinal constructs stipulates a minimum mission time of 10⁷ cycles before failure. Testing protocols with complex combined load bearing, e.g. flexion/extension, lateral bending, axial rotation and axial compression at the same time, can be included in the standard testing for spinal implants.
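To relate these cycle counts to time in vivo, the extrapolated 3 × 10⁶ to 7 × 10⁶ loading cycles per year can be converted into the time needed to accumulate a given number of test cycles. This is a rough sketch: it assumes that every daily motion counts as one test cycle regardless of amplitude or direction, which the standards do not state, and the function name is illustrative.

```python
CYCLES_PER_YEAR = (3e6, 7e6)  # extrapolated range quoted in the text

def years_to_reach(target_cycles, cycles_per_year):
    """Years of daily activity needed to accumulate target_cycles."""
    return target_cycles / cycles_per_year

for label, target in (("5e6 cycles (reported runout)", 5e6),
                      ("1e7 cycles (ASTM F2624-12 minimum)", 1e7)):
    slow = years_to_reach(target, CYCLES_PER_YEAR[0])
    fast = years_to_reach(target, CYCLES_PER_YEAR[1])
    print(f"{label}: {fast:.1f} to {slow:.1f} years")
```

Under these assumptions, 10⁷ cycles corresponds to roughly 1.4 to 3.3 years of daily activity, which is of the same order as the 2-year follow-up period of the studies presented here.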

In a biomechanical study, Wilke et al. [31] compared in vivo and in vitro load components of a spinal internal fixator. They suggested that the measured load bearing can differ between in vitro and in vivo conditions during extension of the spine. In this case, muscle forces play a significant role in the load borne by the spinal device. Volkheimer et al. [32] reviewed in vitro test protocols for investigating instrumented adjacent segment biomechanics, and reported that the in vivo kinematics of the spine cannot be exactly reproduced using the current test methods. In this sense, they questioned the validity of the current testing protocols. In a finite element study of the instrumented spine, standing, upper body flexion and extension in rigid instrumented lumbar spines (L2–L4) were simulated together with models of current standards, and compared to in vivo measurements [33]. The authors compared ASTM F1717 and ISO 12189. They report on whether there was sufficient simulation of the forces applied to the rigid devices during flexion–extension testing with a preload.

In addition, temperature and moisture are important aspects to consider during the performance of fatigue testing. A recent experimental study investigating the biomechanical properties of dynamic spine implants developed a test method in a vapour-filled chamber, simulating body temperature [34]. However, other biomechanical studies investigating the biomechanical properties of PEEK-based implants in static tests reported minimal influence of temperature [23, 32, 35].

Standards need user feedback so that they can be continually modified and stay up-to-date [17]. The incidents described in this study should raise concerns regarding the reliability of the current standards. In addition, high rates of implant failure for dynamic posterior stabilization devices have been reported in other studies. Reyes-Sanchez et al. [36] found an implant failure of 22% using a pedicle screw dynamic stabilization system, Accuflex (Globus Medical, Inc.).

Implant failure can have many possible causes. Testing standards could prevent implant-specific reasons for failure, such as an unacceptable level of material fatigue, but they clearly cannot prevent all causes of implant failure. For example, incorrect indications for surgery could lead to unintended load conditions, and inappropriate implantation can overload the implant or pedicle screws, leading to failure. These aspects are especially important for dynamic spine implants, as higher strains act on them. In a prospective clinical study, Hof et al. compared and evaluated the clinical and radiological outcome after dynamic stabilization alone and hybrid stabilization using the CD Horizon® Agile™. They showed implant failure of the dynamic part of the implant in 10 of 37 cases. They concluded that the implant failure was linked to increased posterior distraction of the operated segment, leading to larger disc height and increased shear forces on the implant [26]. Surgeons should be aware of contraindications for the use of dynamic spine implants and select their patients accordingly. In conjunction with this, manufacturers and distributors should define and report up-to-date indications and contraindications for implant use. To facilitate this, the standards organizations should offer clear instructions to the manufacturers or distributors on implant testing. In the three studies presented in this paper, inclusion criteria of the patients were in accordance with the indications for use, as reported by the manufacturer. However, insufficient application of the standards or misinterpretation of the testing results could lead to the development of inadequate implants.

Conclusion

Three prospective studies were terminated at an early stage due to material failure of dynamic spine implants. These studies suggest that the current standards for testing the load-bearing capacity and durability of dynamic spine implants might be insufficient. For patient safety and successful treatment, the reliability of these tests is crucial. Standards required prior to market release of an implant should be revised and developed to sufficiently consider the in vivo patterns of motion and load (e.g. complex combined load bearing). In turn, the surgeon, the manufacturer and the indications for surgery each play a significant role in the prevention of implant failure.