Introduction

Artificial intervertebral disc prostheses are being developed and marketed in increasing numbers. Some have been shown to result in a clinical outcome comparable to or better than fusion [1]. However, complications are also reported. These are associated either with the surgical procedure, with overloading of the adjacent anatomical structures such as the facet joints, or with mechanical failures such as implant subsidence, luxation, dislocation, or breakage of implant components [8, 15, 17].

Still, many lumbar artificial disc prostheses are composed of two metallic endplates with a polyethylene core in between. From hip and knee joint replacements, such metal-on-polyethylene components are known to produce wear. This can trigger an inflammatory cascade resulting in periprosthetic osteolysis [5, 7]. While rare, such reactions have also been described for disc replacement. In some cases, the presence of polyethylene wear was associated with osteolysis, implant loosening and subsidence, and sometimes even required revision surgery [14, 16].

In order to prevent such complications, wear testing is required before clinical approval. Various testing procedures are described in different international standards. Most often, wear testing is carried out according to ISO 18192-1:2008(E). According to this standard, ten million loading cycles are applied at a frequency of 1 Hz. Loading includes a cyclic axial force combined with rotational movements in flexion/extension, lateral bending and axial rotation. At 1 Hz, ten million loading cycles take about 4 months. Testing at higher frequencies would therefore be desirable. For this reason, the ISO standard also permits testing at 2 Hz; however, it states that the impact of test frequencies higher than 1 Hz on the implant material behaviour as well as on the accuracy of the test machine shall be investigated by the user. Data from such comparative testing, however, have not been available so far.

Therefore, in this study, the wear behaviour of a representative lumbar intervertebral disc replacement was tested according to ISO 18192-1:2008(E) at a loading frequency of 1 and 2 Hz. The aim was to investigate whether the testing frequency affects the polyethylene wear rate and the accuracy of the testing machine.

Materials and methods

Seven Prodisc-L lumbar intervertebral disc prostheses (Synthes Spine, USA) were tested in this study (Fig. 1). The Prodisc-L implant design is based on a ball and socket principle. It is composed of three implant components: two metal endplates made of cobalt–chrome alloy and a plastic core made of ultrahigh molecular weight polyethylene (UHMWPE, GUR 1020). The endplates have a central keel and small spikes for initial fixation to the vertebrae, and a plasma sprayed titanium coating on all bone-contacting surfaces to promote bony ongrowth.

Fig. 1

The Prodisc-L artificial total lumbar disc replacement is composed of two cobalt–chrome endplates and a polyethylene core

The polyethylene cores were supplied by the manufacturer (Synthes Spine) in a fully finished and packaged condition. As per the manufacturer's labelling, all components were gamma irradiated at a dose between 25 and 40 kGy. In this study, medium-sized components were used with a 14-mm intradiscal height and a superior component with a 6° lordosis angle. The tallest medium-sized implants, such as the one selected here, have been most commonly used for wear testing. Size M allows for comparisons between different implant designs, and the large height makes the specimens easier to handle.

The experimental setup was based on the ISO 18192-1:2008(E) standard (ISO 18192-1). Before testing, the cores were pre-soaked in calf serum at 37 ± 2°C for 43 days. After this pre-soak period, the seven implants were mounted in the specimen cups of a Spine Wear Simulator (MTS Systems Corp., USA) (Fig. 2). This simulator offers unconstrained translation in the horizontal plane. As prescribed by the ISO standard, it allows for application of an axial force simultaneously with rotational movements in flexion/extension, lateral bending and axial rotation. The endplates were oriented horizontally, which is a deviation from the ISO standard but is not expected to influence the differences between the loading frequencies. The implants were positioned such that their centre of rotation was situated at the centre of the axes of rotation of the test machine. This was accomplished using specimen holders that were individually adapted to the geometry of the implant on the one hand and to the position of the centre of rotation of the testing machine on the other.

Fig. 2

MTS Spine wear simulator. Six implants (1–6) are loaded three-dimensionally. A loaded soak control specimen (0) accounts for fluid uptake. Fz axial load, Mx lateral bending, My flexion/extension, Mz axial rotation

All testing was performed using bovine calf serum. The serum was diluted with de-ionized water to a concentration of 30 ± 2 g protein/l. To minimize microbial contamination, 10 ml Partricin/l medium was added. To minimize precipitation of calcium phosphate onto the bearing surfaces, 24 g EDTA/l medium was added.

Testing was performed according to the loading protocol shown in Fig. 3 of the ISO standard: A cyclic axial load (Fz) between 600 and 2,000 N, cyclic extension/flexion movements (My) between −3° and 6°, cyclic lateral bending movements between −2° and 2° and cyclic axial rotation movements between −2° and 2° were applied simultaneously. Loading and rotational displacement curves for the 1 and 2 Hz testing are shown in Fig. 3a, b, respectively. One of the seven implants was mounted to the loaded soak control station: this station was only loaded in axial direction (Fz) but no rotational movements were applied. Testing was carried out at a temperature of 37 ± 2°C.

Fig. 3

a Phasing and amplitudes according to ISO 18192-1. Testing frequency 1 Hz. b Phasing and amplitudes according to ISO 18192-1. Testing frequency 2 Hz

The experiment was composed of eight consecutive test intervals, which were all carried out on the same specimens to reduce the variability that could occur between specimens, including geometrical, material and processing variations (gamma irradiation dose). Four test intervals were carried out at a loading frequency of 1 Hz and four at 2 Hz in an alternating sequence (Table 1). The number of cycles was 1 million per interval at 1 Hz and 2 million at 2 Hz such that the duration was always approximately 12 days. This was also the period of time for renewal of the serum. This approach guaranteed that the time-dependent ageing of the serum was the same in both frequency groups. However, in the 2-Hz group the number of cycles was twice the number of cycles in the 1 Hz group, which could have an effect on the composition of the serum and finally on its lubricating effect.
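The interval plan can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original test protocol: the order of the intervals (two at 1 Hz, two at 2 Hz, repeated once) is taken from the analysis points and figure captions of this paper, and the function names are illustrative only.

```python
# Sketch of the eight-interval schedule: net running time per interval and
# cumulative cycle counts at which the machine is stopped for analysis.
SECONDS_PER_DAY = 24 * 60 * 60

# (frequency in Hz, cycles per interval); two 1 Hz intervals followed by
# two 2 Hz intervals, and this block of four is run twice.
schedule = ([(1, 1_000_000)] * 2 + [(2, 2_000_000)] * 2) * 2


def interval_days(freq_hz, cycles):
    """Net running time of one interval in days (stops not included)."""
    return cycles / freq_hz / SECONDS_PER_DAY


def checkpoints(plan):
    """Cumulative cycle counts (in millions) at the end of each interval."""
    total, marks = 0, []
    for _, cycles in plan:
        total += cycles
        marks.append(total // 1_000_000)
    return marks


days = [round(interval_days(f, n), 1) for f, n in schedule]
print(days)                   # 11.6 days for every interval
print(checkpoints(schedule))  # [1, 2, 4, 6, 7, 8, 10, 12]
```

Both interval types run for the same net time of about 11.6 days, which is why the serum age at renewal is matched between the frequency groups while the number of cycles per serum batch is not.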

Table 1 Loading conditions

Analyses were carried out every 12 days after 1, 2, 4, 6, 7, 8, 10 and 12 million loading cycles (Table 1). For this purpose the machine was stopped. The UHMWPE cores were removed from the endplates and cleaned and dried according to ASTM F 2025-06. The weight (AC 120S, Sartorius, measurement error < 0.1 mg) and the height of the cores (MDC-15SB, Mitutoyo, measurement error < 0.01 mm) were then measured. After analyses, the specimens were again mounted to the specimen cups, which were filled with fresh fluid test medium, and transferred to the testing machine. This washing-drying-cleaning procedure took about 8 h such that the viscoelastic components of the implants had time to recover after each test interval.

Heightloss and weight were corrected using the loaded soak control. For this purpose, the heightloss and the weight change of the loaded soak control specimen were measured and subtracted from those of the six wear specimens. This procedure accounts for fluid uptake of the PE core and is based on the assumption that wear is absent if only an axial load is applied. Mean and standard deviation were calculated for the corrected values. For the statistical analysis, the intervals with identical frequency were first compared using a one-way analysis of variance. The results showed that there was a difference between the first two and the second two intervals for both frequencies. Therefore, the first two intervals were evaluated independently of the second two: the wear rates from all six specimens after the first two intervals at 1 Hz were pooled and compared with those recorded at 2 Hz, which were pooled accordingly. The same was done for the second two intervals. This procedure was also applied to the heightloss. Wear and heightloss at 1 Hz and 2 Hz were compared with each other using a t test. The level of significance was 0.05.
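The correction and comparison steps can be illustrated as follows. This is a minimal sketch under stated assumptions: all numbers are hypothetical and are not measurements from this study, and because the text does not specify which variant of the t test was used, Welch's statistic for unequal variances is shown purely as an example. The function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def soak_corrected(specimen_changes, soak_change):
    """Subtract the soak-control change from each wear specimen's change."""
    return [c - soak_change for c in specimen_changes]


def welch_t(a, b):
    """Welch's t statistic for two independent samples (an assumed variant)."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / sqrt(var_a + var_b)


# Hypothetical weight changes in mg per million cycles (negative = loss).
raw_1hz = [-1.8, -2.1, -1.5, -2.4, -1.9, -2.0]
raw_2hz = [-3.9, -4.3, -3.6, -4.6, -4.0, -4.1]
soak_1hz, soak_2hz = 0.1, 0.1  # hypothetical fluid uptake of the control

# Wear is reported as a positive mass loss, so the sign is flipped.
wear_1hz = [-c for c in soak_corrected(raw_1hz, soak_1hz)]
wear_2hz = [-c for c in soak_corrected(raw_2hz, soak_2hz)]

print(round(mean(wear_1hz), 2), round(mean(wear_2hz), 2))
print(round(welch_t(wear_2hz, wear_1hz), 2))
```

Because the soak control gains weight through fluid uptake, subtracting its change makes the corrected mass loss of each wear specimen slightly larger than the raw value.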

Results

The weight of the PE cores increased by a mean of 0.8 mg (range 0.7–0.9 mg) during the pre-soak period.

The mean cumulative wear at 12 million cycles was 62.0 ± 7.6 mg (Fig. 4, Table 2). However, the wear rate (wear per million cycles) was not constant throughout testing. Towards the end it tended to become smaller. This phenomenon can be attributed to an adaptation of the PE-cores to the surface of the upper endplate, which takes place during the first million loading cycles. Due to this effect, the data for the first two loading intervals at each loading frequency were pooled separately from the second two intervals. After pooling, the wear rate proved to be higher at 2 Hz than at 1 Hz: the values were 5.6 ± 2.3 mg per million cycles at 1 Hz and 7.7 ± 1.6 mg per million cycles at 2 Hz at the beginning, and 2.0 ± 0.6 and 4.1 ± 0.7 mg per million cycles at the end (p < 0.05, excluding outliers) (Fig. 5).
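As a rough plausibility check, the pooled rates can be multiplied by the number of cycles run at each frequency and compared with the measured cumulative wear. The sketch below uses only the mean values reported above; since the pooled rates exclude outliers, exact agreement with the measured 62.0 mg is not expected. The variable names are illustrative.

```python
# Pooled mean wear rates in mg per million cycles, keyed by
# (half of the test, loading frequency in Hz).
rates = {
    ("first", 1): 5.6, ("first", 2): 7.7,
    ("second", 1): 2.0, ("second", 2): 4.1,
}
# Million cycles run at each frequency within each half of the test:
# two 1 Hz intervals of 1 million, two 2 Hz intervals of 2 million.
million_cycles = {1: 2, 2: 4}

cumulative = sum(r * million_cycles[f] for (_, f), r in rates.items())
diff_first = rates[("first", 2)] - rates[("first", 1)]
diff_second = rates[("second", 2)] - rates[("second", 1)]

print(round(cumulative, 1))                         # 62.4
print(round(diff_first, 1), round(diff_second, 1))  # 2.1 2.1
```

The reconstructed total of about 62.4 mg is close to the measured mean, and the difference between the 2 Hz and 1 Hz rates is the same in both halves of the test.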

Fig. 4

Cumulative wear of the PE-cores of the Prodisc-L. Individual values and mean with standard deviation of the six implants D-WE1 to D-WE6 after correction for the loaded soak control. Between 0 and 2, and 6 and 8 million cycles the testing frequency was 1 Hz, whereas it was 2 Hz between 2 and 6, and 8 and 12 million cycles

Table 2 Cumulative wear in mg of the PE-cores of the Prodisc-L
Fig. 5

Wear per million loading cycles at 1 Hz and 2 Hz for the first two test intervals (left) and the second two intervals (right) at each frequency. The single values (crosses) were pooled for each frequency and a mean value (green bar) was calculated. The groups were statistically compared using a t test. Red crosses represent outliers, which were not included in the evaluation

The mean cumulative heightloss was −0.32 ± 0.04 mm after 12 million cycles of testing (Fig. 6, Table 3). As with the wear rate, the rate of heightloss was not constant but decreased somewhat towards the end of testing, and it tended to be lower when components were tested at 1 Hz than at 2 Hz. Again, the data were pooled for the first two intervals and the second two intervals separately for both frequencies. The mean heightloss was −0.02 ± 0.02 mm per million cycles at 1 Hz but −0.04 ± 0.02 mm per million cycles at 2 Hz during the first six million loading cycles (p < 0.05, excluding outliers). These values were somewhat smaller during the second six million cycles, with −0.01 ± 0.01 mm per million cycles at 1 Hz and −0.02 ± 0.01 mm per million cycles at 2 Hz (p < 0.05, excluding outliers) (Fig. 7).

Fig. 6

Cumulative heightloss of the PE-cores of the Prodisc-L. Individual values and mean with standard deviation of the six implants D-WE1 to D-WE6 after correction for the loaded soak control. Between 0 and 2, and 6 and 8 million cycles the testing frequency was 1 Hz, whereas it was 2 Hz between 2 and 6, and 8 and 12 million cycles

Table 3 Cumulative heightloss in mm of the PE-cores of the Prodisc-L
Fig. 7

Heightloss per million loading cycles at 1 Hz and 2 Hz for the first two test intervals (left) and the second two intervals (right) at each frequency. The single values (crosses) were pooled for each frequency and a mean value (green bar) was calculated. The groups were statistically compared using a t test. Red crosses represent outliers, which were not included in the evaluation

The loading accuracy was similar for both loading frequencies. At 1 Hz, the largest deviation from the nominal value was 0.09% (1.8 N) for the maximum axial load and 0.18% (1.1 N) for the minimum load, which was measured in the loaded soak control station. At 2 Hz, this deviation was somewhat larger, 0.21% (4.2 N) and 1.42% (8.5 N), but still clearly within the tolerated limits suggested by the ISO standard. In case of the three rotations, which are measured for the six stations at the same time since they are mechanically coupled, the deviation from the nominal value was <0.005° at both loading frequencies. Additionally, the curves did not show any significant phase shifting between load and angular displacement at both frequencies.
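The percentages follow directly from the absolute deviations and the nominal axial loads of 2,000 N and 600 N. A minimal sketch of this arithmetic, using only the values reported above (the names are illustrative):

```python
# Nominal axial load limits of the loading protocol, in newtons.
NOMINAL_N = {"max": 2000.0, "min": 600.0}

# Largest measured deviations from the nominal load, in newtons,
# keyed by loading frequency in Hz.
deviation_n = {
    1: {"max": 1.8, "min": 1.1},
    2: {"max": 4.2, "min": 8.5},
}


def percent_deviation(dev_n, nominal_n):
    """Deviation expressed as a percentage of the nominal load."""
    return 100.0 * dev_n / nominal_n


for freq, devs in deviation_n.items():
    for limit, dev in devs.items():
        pct = round(percent_deviation(dev, NOMINAL_N[limit]), 2)
        print(f"{freq} Hz, {limit} load: {pct} %")
# Prints 0.09 and 0.18 % at 1 Hz, 0.21 and 1.42 % at 2 Hz.
```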

Discussion

The aim of this project was to investigate the effect of the loading frequency on the wear behaviour of a representative metal-on-polyethylene lumbar intervertebral disc prosthesis (Prodisc-L, Synthes Spine, USA). The results showed that testing at 2 Hz increased the polyethylene wear rate by about 2.1 mg/million cycles compared with 1 Hz. Also, the heightloss of the polyethylene core tended to be larger at 2 Hz than at 1 Hz.

The mean wear rate measured at 1 Hz, 5.6 ± 2.3 mg/million cycles at the beginning and 2.0 ± 0.6 mg/million cycles at the end of testing, lies within the range of rates reported in the literature. The active L implant, for example, was reported to have a wear rate of 2.7–2.85 mg/million cycles [6], and the Prodisc-L, the same implant as the one tested in the present study, had a rate of 4.64 mg/million cycles [12].

The wear rate of total joint replacement components is known to depend on various factors such as the friction moment, the lubrication, the test medium, the temperature, the loading scheme and the articulating materials. Also, in case of polyethylene, the type of sterilisation plays an important role.

All of the PE inlays were from a single lot and sterilised using gamma irradiation with a dose of 25–40 kGy. Components were packaged using barrier packaging materials and maintained in a nitrogen environment prior to testing. Previous studies have shown that gamma irradiation in air (oxygen) can adversely affect the mechanical properties and the wear behaviour of polyethylene due to oxidative degradation, especially in components that remain on the shelf for extended periods of time [4]. In contrast, components irradiated in an inert atmosphere have been shown to maintain their mechanical and wear properties for up to 10 years of shelf ageing. In the present study, all specimens were manufactured, packaged and sterilised in the same way. Therefore, these factors cannot be held responsible for the different wear rates found at the different loading frequencies.

The load and rotational displacement magnitudes of the specimens were also the same at both loading frequencies. The number of load components, their amplitudes and their phasing have been described to strongly influence the wear rates. Both ASTM and ISO have developed guidelines/methods to determine the wear rate of intervertebral disc prostheses. According to ASTM F 2423 intervertebral disc prostheses may be loaded in only one plane. Either flexion/extension or lateral bending or axial rotation movements are applied in combination with an axial load. In contrast, according to ISO 18192-1 all three movements are applied at the same time. Comparative wear tests revealed that a planar movement as suggested by ASTM F 2423 results in significantly less wear than multiaxial movements as suggested by ISO 18192-1 [6, 11]. It has also been shown that the actuation technique of the test machine can play a role. Testing according to ISO 18192-1 can result in different wear rates depending on the type of test machine [12]. However, again, the loading and rotational displacements were the same for both frequencies, which eliminates this factor as a possible reason for the frequency dependence found in the present study. This conclusion is further supported by the fact that the loading accuracy was almost identical and within the required limits for both loading frequencies.

The test medium was calf serum with a protein content of 30 g/l. EDTA was added to prevent precipitation of calcium phosphate. This mixture is suggested by ISO 18192-1. It has also been used to investigate the wear behaviour of hip and knee joint replacements, although with a lower protein content. The use of calf serum for articulating joint wear characterisation is well established and has been applied for over a decade to evaluate total hip and knee joints of various material combinations. However, the knee and hip are joints with synovial fluid, which is not the case for the intervertebral disc. Whether serum is adequate for intervertebral disc prostheses is therefore still under debate. An alternative would be to use saline solution. Its effects, however, are still not well understood. Some report that saline solution increases the friction coefficients and, thus, the wear rates compared with serum [2], while others suspect the opposite due to the lack of proteins.

Similar to the loading protocol, the medium was held constant for all specimens and both frequencies. The test period was always 12 days before renewal of the serum, so the age of the serum at the end of each test interval was identical in both frequency groups. However, the doubled number of shearing cycles and the doubled sliding speed (in flexion/extension approximately 4.7 mm/s at 1 Hz and 9.4 mm/s at 2 Hz) in the 2-Hz group compared with the 1-Hz group could have led to protein denaturation and, thus, may have decreased the lubricating effect of the serum and increased the wear rate. However, whether this denaturation takes place to a significant degree during a testing period of 12 days with 2 million loading cycles at the given sliding speed still has to be investigated.

The temperature was held constant during testing using a temperature control system. It is reported that increasing temperature decreases the wear rates but increases the precipitation of proteins. These precipitates, if adherent to the articulating surfaces, probably decrease the friction coefficients [10]. However, again, the temperature was held constant throughout testing irrespective of the loading frequency. It should be noted that locally, at the surface of the articulating components, the temperature might have been somewhat higher at 2 Hz than at 1 Hz. This is suspected since frictional heating increases with increasing sliding speed. However, this would have caused the inverse effect, with lower wear rates at 2 Hz than at 1 Hz. Differences between the testing temperatures are therefore unlikely to be the reason for the frequency-dependent wear rates.

The lubrication of the articulating surfaces, in contrast to all factors discussed so far, may have differed between the two loading frequencies. Possibly, at 2 Hz, the serum was rapidly squeezed out of the area between metal and polyethylene and did not have time to become completely re-entrained. This is in contrast to Stribeck’s model, which says that a higher velocity increases the fluid film and, thus, reduces friction. However, compared with knee or hip joint prostheses, the amount of movement is small for a total disc replacement in relation to the size of its articulating surface. This relation may cause the serum to be squeezed out as described above and its re-entrainment to be cut off, especially at higher frequencies. If so, this could have increased the friction moment at 2 Hz compared with 1 Hz and could therefore be responsible for the higher wear rates at this loading frequency.

The loading frequencies in vivo are probably lower than 2 Hz. Testing at 2 Hz would therefore be less physiological than testing at 1 Hz. The authors therefore suggest testing metal-on-polyethylene disc prostheses at 1 Hz. This will probably yield more realistic wear rates than testing at 2 Hz. However, even if carried out at 1 Hz, the current wear testing parameters described in the ASTM guideline and ISO standard may not produce clinically relevant wear rates and particles. Standard testing, for example, produces symmetrical wear patterns, but this may not be the case in vivo [13]. However, some parallels can also be found: Kurtz et al., for example, described adhesive/abrasive wear at the dome and at the rim of retrieved Charité III implants as the dominant wear patterns. Similar patterns are also described for samples tested according to ISO 18192-1 [6, 9]. Also, the heightloss of 0.1–0.9 mm, depending on the time of implantation (1.8–16 years), reported by Kurtz et al. is in the same range as the 0.3 mm measured after 12 million cycles in the present study. Wear testing therefore requires comparative data from competing implants for interpretation.

The materials of the articulating surfaces have been shown to strongly influence the wear behaviour. Specifically, wear rates of metal-on-metal joints theoretically are inversely affected by testing rate, leading to a reduced wear rate at higher frequencies [3]. Therefore, the results of the present study should not be generalized to other articulating joint material combinations until actual testing has been performed.

In summary, the present study shows that wear testing of lumbar intervertebral disc prostheses composed of cobalt–chrome endplates and a polyethylene core results in higher wear rates if carried out at 2 Hz than at 1 Hz. The loading frequency of the spine in vivo is mostly below 1 Hz, at least for the loading amplitudes prescribed by the ISO standard. The maximum axial load of 2,000 N, for example, is not reached during walking (approximately 1,000 N) but only when climbing stairs two steps at a time, holding an additional weight, or standing up from a chair [18]. Even so, in the absence of retrieval studies it is difficult to decide which frequency produces more physiological wear patterns. In view of these restrictions, the authors suggest testing polyethylene-on-metal couplings at 1 Hz even in cases where the accuracy of the testing machine would allow testing at higher frequencies.