Introduction

Multiple studies have established the ability of radiostereometry (RSA) to predict later failure of hip [4, 13, 14, 18, 26] and knee [19, 21] prostheses. Measurement of early implant migration performed with high-resolution RSA has therefore been recommended as an important part of a stepwise introduction of new implants [15]. Some studies have evaluated the predictive value of early implant migration based on followup of the same cohort of patients [13, 14, 21], whereas others merged short-term RSA data with data on revision rates extracted from registry studies [4, 18, 19]. In a recent evaluation of this kind, van der Voort et al. analyzed 24 RSA studies and 56 studies on survival of cemented and uncemented hip stems [26]. The authors suggested that a mean stem subsidence of more than 0.15 mm in 2 years predicts poor performance of a composite-beam cemented stem, defined as a revision rate for aseptic stem loosening greater than 5% at 10 years.

The Spectron® stem (Smith & Nephew, London, UK), a composite-beam cemented stem, originally had an overall matte surface (Ra: 0.76 μm) and the same length (130 mm) for all stem diameters. The clinical outcome in a randomized study was good with at least 96% survival at 11 years and no stem revisions for aseptic loosening during that period [7]. In 1989 this design was changed to incorporate a rougher proximal surface (Ra: 2.8 μm) and a slightly smoother distal surface (Ra: 0.7 μm). This design, called Spectron EF, also demonstrated a low revision risk [22]. In 1995 the design was changed a second time and the name was changed to Spectron EF Primary. In this version, the neck was polished, stem length was scaled with decreasing length for smaller stem diameters, and two new smaller sizes were introduced. All versions of the Spectron stem were made of cobalt-chromium alloy. The two EF versions had the same surface treatment distal to the collar.

In total approximately 11,700 Spectron EF Primary stems were inserted in Sweden between 1995 and 2013. After 7 to 8 years in service, it became obvious that the new design performed worse than its predecessor [5, 22]. Early RSA studies indicated a possible, but not statistically significant, increase in subsidence for the smallest stem size [17]. Thien and Karrholm [24] showed in a registry study that the smallest stem size as well as high-offset stems with the longest neck length had a higher revision risk compared with larger sized and standard offset stems. The early subsidence of the Spectron EF Primary stem has been reported to be small and has been considered safe [12]. In addition, early RSA studies of this stem with short-term followup could not establish any clear influence of stem design on the pattern of subsidence [17]. Because of these contradictory results, we found it important to evaluate the Spectron EF Primary stem in the long term and in a larger number of hips to determine whether RSA could be of any value in predicting failure of this design.

Our primary question was if early stem subsidence and rotation measured with RSA within a defined study group could be used to predict risk for later revision resulting from loosening, osteolysis, or implant fracture of the Spectron EF Primary stem. A secondary question was to study if the smallest stem sizes and high femoral stem offset contributed to the risk of stem failure.

Patients and Methods

We identified all patients in our department who had received a cemented Spectron EF Primary stem and been recruited to studies in which stem fixation was measured with RSA. In total 279 THAs (236 patients) had been included in four different clinical studies at our department. The studies evaluated different cement types, cup fixation methods, and polyethylene types. The patients were operated on between 1996 and 2005. Only patients with RSA data available at 2 years were included, leaving 247 THAs (209 patients) (Fig. 1). All included patients were followed until revision, death, or the end of the observation time for this study, which occurred on November 30, 2015. The median followup was 14 years (range, 3–18 years). Fifty-seven of the included patients (59 hips [24%]) died during the followup period. We extracted followup data from our hospital’s patient records and the Swedish Hip Arthroplasty Register. No patients were lost to followup. All types of cup designs (cemented or uncemented), cement types, and polyethylene inserts (ethylene-oxide/low-dose irradiated/high-dose irradiated) were included (Table 1). There were 159 cemented and 88 uncemented cups.

Fig. 1

The flowchart shows the patient selection procedure

Table 1 Basic demographic data on the 247 hips included in the study

At the operation all patients received a cemented Spectron EF Primary stem with 0.8-mm tantalum markers attached to one distally and two proximally placed short metal towers made of titanium. In addition, the femoral head center constituted a fixed landmark at the RSA evaluation [1]. The proximal femur was marked with up to nine 0.8- or 1.0-mm tantalum markers. The cup and periacetabular pelvis were also marked with tantalum beads.

Conventional radiographic and RSA examinations were performed within the first postoperative week and then at 3 and 6 months and at 1, 2, 3, 5, 7, 10, 12, 13, 15, and 17 years followup. Exact followup varied slightly because of different study protocols; however, all patients were scheduled for a 2-year examination. Patients enlisted for reoperation also had a preoperative conventional radiographic examination before surgery.

At least three stable RSA reference points in each segment (that is, at least three stable femoral markers or at least two stable stem markers combined with the femoral head center) were required for a valid RSA analysis of rotation. We accepted a maximum mean rigid body error of 0.35 mm and a maximum condition number (RSA rigid body elongation) of 130 for stem rotation [25]. For subsidence measurements, at least two stem markers with a constant distance between them throughout the followup period had to be available, one at the tip of the stem and one at the proximal part of the stem including the femoral head center. At the 2-year followup, there were 24 stems in which only two markers were available for analysis of subsidence. Because an increasing number of markers tend to loosen with time, mean errors of rigid body fitting, condition numbers, and markers available for analysis are presented for the last followup examination (Table 1). Radiostereometric analysis was performed with the UmRSA suite (RSA Biomedical, Umeå, Sweden), Version 6.0. RSA analysis was performed by three of the authors (P-EJ, BS, JK).

The precision of RSA was evaluated with 224 double examinations at the postoperative examination. The patient was repositioned between the two examinations [25]. Precision was calculated at the 99% confidence level with the formula:

$$99\%\ \text{precision limit} = \sqrt{\frac{\sum_{1}^{n} \left(\text{measured difference}\right)^{2}}{n - 1}} \times \text{T-factor}_{99\%\ (\text{two-tailed})},$$

assuming that the true difference between the double examinations is zero. Based on these computations, we could in the individual case detect stem subsidence of 0.146 mm (99% confidence limits of the error). The corresponding detection limits for AP tilt, anteretrotorsion, and varus-valgus movement of the stem were 0.48°, 1.02°, and 0.20°, respectively.
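As a minimal sketch of the precision computation described above, the following Python function evaluates the formula for a list of double-examination differences. The function name, the example differences, and the t-factor passed in are illustrative assumptions and are not taken from the study; in practice the t-factor would be looked up for the actual number of double examinations.

```python
import math

def precision_limit(differences, t_factor):
    """Precision limit following the formula in the text.

    differences: measured differences between paired double examinations
                 (mm for translations, degrees for rotations)
    t_factor:    two-tailed t value for the chosen confidence level
                 (supplied by the caller; an assumption of this sketch)
    """
    n = len(differences)
    if n < 2:
        raise ValueError("at least two double examinations are needed")
    # The true difference is assumed to be zero, so the squared differences
    # are summed directly without subtracting a mean.
    sum_of_squares = sum(d * d for d in differences)
    return math.sqrt(sum_of_squares / (n - 1)) * t_factor

# Hypothetical differences in mm, not data from the study.
example_differences = [0.05, -0.04, 0.06, -0.03, 0.02]
# 2.6 is an illustrative t-factor only.
example_limit = precision_limit(example_differences, t_factor=2.6)
```

A measured migration smaller than the returned limit cannot be distinguished from measurement error in an individual case at the chosen confidence level.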

All reoperations were performed at Sahlgrenska University Hospital. The patient reports of all reoperations were examined for the exact reason for revision as well as for findings of unanticipated stem loosening during revision surgery. Loose stems, fractured stems, and occurrence of periprosthetic fracture through an osteolytic area, presumed to have been initiated by abrasive wear from stem loosening inside the cement mantle, were defined as failures resulting in reoperation with the potential to be predicted with RSA.

The last available conventional radiographic examinations (AP, lateral, pelvic view) of all nonrevised stems were analyzed by one of the authors (JK).

No Spectron EF Primary stems subsequently revised showed a complete radiolucent zone at the cement-bone interface. Only one revised stem had a radiolucent line exceeding 50% of the cement-bone interface. A common finding in the reoperations was large amounts of grayish hypertrophied synovial tissue in the articular cavity. None of the cases had developed a pseudotumor expanding outside the joint capsule. All extracted stems showed a more or less severely abraded surface. Forty-one of 220 unrevised stems showed radiological signs of stem subsidence within the cement mantle. They were not classified as failures because there were no or only minimal radiolucencies in regions 1 and 7. Minor osteolysis in Gruen zones 1 and 7 was present in some of these cases and also in cases without any signs of stem subsidence. Radiological failure was therefore conservatively defined as presence of osteolysis in Gruen regions 2 to 6 with a minimum size of 4 × 10 mm set arbitrarily. In a few cases osteolytic lesions of a smaller size could be suspected in regions 2 to 6, but because these changes were difficult to separate from bone atrophy as a result of stress shielding, we did not think that they should be regarded as failures. A few stems still in situ at the last followup had cement-bone radiolucencies extending to the proximal part of regions 2 and 6 but in none of them did the radiolucency extend to regions 3 or 5. Thus, the cement-bone radiolucencies did not reach 50% of the interface in any of these cases.

Thirty-two hips were classified as failures with 27 stem reoperations (21 loose stems, five loose stems with implant fracture, one periprosthetic fracture through an osteolytic area) and five radiological stem failures. The crude reoperation rate resulting from stem failure as defined previously was 11%. Inclusion of radiological failure increased this figure to 13%. The 10-year stem survival rate using our stem failure definition was 94% (95% confidence interval [CI], 90%–96%) with 188 patients remaining (Fig. 2).
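The crude rates follow directly from the counts above, taking all 247 included hips as the denominator:

$$\frac{27}{247} \approx 10.9\% \approx 11\%, \qquad \frac{27 + 5}{247} = \frac{32}{247} \approx 13.0\% \approx 13\%.$$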

Fig. 2

The graph shows a Kaplan-Meier survival curve with an endpoint of stem failure for all 247 included cases. The number remaining at 15 years was 82
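The product-limit (Kaplan-Meier) estimate behind a curve of this kind can be sketched as follows. This is a minimal illustration in plain Python with hypothetical followup times; it uses neither the study data nor the SPSS or Stata routines used for the analysis, and the function and variable names are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct failure time.

    times:  followup time for each hip (years)
    events: 1 if the hip failed at that time, 0 if it was censored
            (for example death or end of observation without failure)
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        leaving = 0
        # Group all hips that share the same followup time.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            leaving += 1
            i += 1
        if failures > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - failures / n_at_risk
            curve.append((t, survival))
        # Failed and censored hips both leave the risk set after time t.
        n_at_risk -= leaving
    return curve

# Hypothetical example: five hips, three failures, two censored.
example_curve = kaplan_meier([2, 3, 5, 5, 8], [1, 0, 1, 0, 1])
```

Censored hips contribute to the number at risk until their last followup but never lower the survival estimate themselves, which is why the estimate differs from a crude failure proportion.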

Statistics

Stem subsidence and stem rotation were presented for stems classified as failures and nonfailures. Comparison between the groups at 2 years was made using nonparametric statistics.

In addition, we performed Cox regressions on continuous stem subsidence as well as stem subsidence dichotomized at 0.15 mm [26] adjusting for age, sex, polyethylene type (conventional/highly crosslinked), stem size (1/2/3 or larger), and femoral head offset (high/standard). The total number of hips in the adjusted analysis for stem subsidence is 244 because of missing data for femoral offset (n = 1) and stem size (n = 2) with no failures in hips with missing data.
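The dichotomization at 0.15 mm can be illustrated with a short sketch. It assumes the sign convention of the migration plots, in which distal translation is negative, so that subsidence is the negated proximal/distal translation; how proximal migration was handled is not stated in the text, and here it is simply treated as below the threshold. The function and constant names are hypothetical.

```python
SUBSIDENCE_THRESHOLD_MM = 0.15  # threshold suggested by van der Voort et al. [26]

def exceeds_subsidence_threshold(translation_mm, threshold_mm=SUBSIDENCE_THRESHOLD_MM):
    """Return True if 2-year subsidence is at or above the threshold.

    translation_mm: proximal(+)/distal(-) translation of the stem in mm
    """
    # Subsidence is distal migration, i.e. the negated signed translation.
    subsidence_mm = -translation_mm
    return subsidence_mm >= threshold_mm

# The group means reported in the Results fall on opposite sides of the threshold.
failure_group_mean_exceeds = exceeds_subsidence_threshold(-0.30)
nonfailure_group_mean_exceeds = exceeds_subsidence_threshold(-0.11)
```

The resulting binary variable is what would enter the Cox model in place of the continuous subsidence value.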

A Cox regression including stem rotations around the three principal axes, adjusted for age, sex, polyethylene type, stem size, and offset, was used to identify the axis of rotation that was the strongest predictor for later stem failure. At 2 years, stem rotation could be studied in 224 stems with a femur condition number ≤ 130 and at least two stable stem markers combined with the femoral head center (31 stems in the stem failure group and 193 stems in the nonfailure group). The total number of hips in the adjusted analysis for stem rotation is 221 because of missing data for femoral offset (n = 1) and stem size (n = 2) with no failures in hips with missing data.

No interactions could be analyzed as a result of the limited number of events. The proportional hazards assumption was checked using the Schoenfeld test and graphical methods. In all three Cox regressions, polyethylene type violated the proportional hazards assumption and was therefore entered as a stratifying factor. The predictors were checked for collinearity with a Spearman test.

Bilateral cases were treated as independent [20]. A sensitivity analysis was performed excluding the second hip in each bilaterally operated patient.

Because 57 of the included patients (59 hips [24%]) died during followup, we performed a sensitivity analysis using competing risks Cox regression [2, 8].

Results are, when appropriate, presented as mean values with a 95% CI. Accordingly, p values ≤ 0.05 were regarded as significant.

Statistical analysis was performed with the SPSS Version 23 statistic suite (SPSS Inc, Chicago, IL, USA) and Stata IP Version 13 (Stata Corp, College Station, TX, USA). Each study was originally approved by the local ethics committee. The ethical committee also approved crossmatching with register data (003-16).

Results

Does Early Subsidence Predict Later Failure?

Stem subsidence measured by RSA at 2 years was associated with an increased risk of revision (hazard ratio [HR], 6.0; 95% CI, 2.5–15; p < 0.001, adjusted Cox regression). Use of the stem subsidence threshold 0.15 mm as suggested by van der Voort et al. [26] revealed an increased risk of failure if 2-year stem subsidence was ≥ 0.15 mm compared with < 0.15 mm (HR, 5.1; 95% CI, 2.2–12; p < 0.001, adjusted Cox regression). At 2 years the mean (median, SD) subsidence reached −0.30 mm (−0.29, 0.21) and −0.11 mm (−0.07, 0.33) in the stem failure and the nonfailure groups, respectively (p < 0.001, Mann-Whitney). Sixteen stems in the failure group and 157 stems in the nonfailure group had RSA data at both 7 and 10 years. In both groups the subsidence increased between these two occasions (7 versus 10 years: p < 0.001, Wilcoxon signed-rank test) (Fig. 3A).

Fig. 3

A–D The figures display RSA measurements of stem migration (mean and standard error) with use of femoral bone markers as a fixed reference segment. All available RSA data at each time point are included. Probability values (Mann-Whitney test) refer to statistical comparison between failed and nonfailed stems at 2 years. (A) Proximal(+)/distal(−) translation; (B) anterior(+)/posterior(−) tilt; (C) ante(+)/retro(−)torsion; (D) valgus(+)/varus(−) tilt

Does Early Rotation Predict Later Failure?

Analysis of stem rotation measured by RSA at 2 years showed that only rotation about the y-axis corresponding to retrotorsion was associated with stem failure (HR x axis, 0.90; 95% CI, 0.37–2.2; p = 0.82; HR y axis, 1.7; 95% CI, 1.1–2.5; p = 0.018; HR z axis, 0.27; 95% CI, 0.03–2.6; p = 0.26) (Table 2). The stem failure group showed increased retrotorsion (failure group: mean 1.1° [median 0.91, SD 1.2]; nonfailure group: 0.41° [0.30, 0.83], p < 0.001, Mann-Whitney) and increased varus angulation (failure group: −0.12° [−0.08, 0.21]; nonfailure group: 0.03° [−0.01, 0.69], p = 0.003, Mann-Whitney). There was a minimal mean (median, SD) posterior tilt (rotation about the transverse axis) in both groups (failure group: −0.14° [−0.12, 0.33]; nonfailure group: −0.07° [−0.06, 0.37], p = 0.97, Mann-Whitney) (Fig. 3B–D).

Table 2 Cox regression analysis of stem subsidence and rotation with associated hazard ratios

Additional Factors Associated With Failure

We found that sex and stem size were independent risk factors for stem failure; stem offset was also an independent risk factor in two of the three adjusted analyses (Table 2).

Sensitivity Analyses

Exclusion of the second hip in bilaterally operated patients did not alter any of our main conclusions (data not shown). A competing risks regression corresponding to our three adjusted analyses did not alter our main conclusions (data not shown).

Discussion

Many well-intentioned, seemingly small modifications of existing implants have led to inferior results [11, 16]. Earlier research suggests that this also has occurred with the Spectron EF Primary stem [22, 24]. The clinical performance of the original Spectron stem was good [6]. Thanner et al. [23] demonstrated close to zero subsidence for the second generation of Spectron stems (Spectron EF) when used with Palacos® cement (Heraeus, Hanau, Germany). The mean subsidence at 1 and 2 years was −0.03 and −0.06 mm, respectively, which is less than half the values recorded in the present study (−0.08 and −0.14 mm). Because the clinical results of the second-generation Spectron design (EF) were much better than those of the third generation (EF Primary) [22], it seems reasonable to conclude that the tolerance for subsidence of these stems with a rough surface is very small and close to zero. We sought to determine whether the clinical problems observed in association with this stem could be predicted with use of RSA.

Our study has some limitations. We included THAs from several studies with different inclusion criteria, types of cups, polyethylene, bone cements, and with procedures performed by different surgeons during different time periods. This was necessary to obtain sufficient statistical power. Although our material represents one of the largest RSA studies published with followup for longer than 10 years, the number of stems included was still too small to observe sufficient number of failures to allow adjustment for all confounding factors, including possibly important interactions. We also lack information about comorbidities, body mass index, bone quality, and activity levels, all potentially important confounders. Despite these limitations, our study represents the best current knowledge regarding this commonly used hip implant.

Furthermore, the partly atypical loosening pattern of the Spectron EF Primary stem prompted a definition of radiological stem failure that is arbitrary and not validated. Harris et al. [10] suggested that a continuous radiolucent line surrounding the entire bone-cement interface is an indication of definite failure and a corresponding line visible along 50% to 99% of the interface would indicate a probable failure. Gruen et al. [9] proposed that radiological stem failure included radiolucent lines at the cement-bone and cement-stem interfaces as well as visible cement fractures. At the most recent followup, none of the stems still in situ and only one in the revision group had a radiolucent line extending over more than 50% of the interface. Thus, probable or definite loosening of the cement mantle from the bone did not occur except in one case revised after followup at 5 years. Most commonly failures seemed to have started as debonding from the cement mantle, which in some cases proceeded to more or less severe abrasive wear with varying degrees of osteolysis. Because minor osteolyses in Gruen regions 1 and 7 were common and sometimes very difficult to distinguish from stress shielding, we decided to restrict the radiographic failure criteria to osteolysis in regions 2 to 6. Not all stems found to be loose at revision showed osteolysis. Thus, our radiological failure definition probably underestimates the true stem failure rate, because we could not decide in a reproducible way from conventional radiographs if some stems were macroscopically loose.

We found that early micromotion measured with RSA increased the risk of clinical failure, consistent with previous studies [4, 13, 14, 18, 19, 21, 26]. However, the total mean stem subsidence for the Spectron EF Primary stem at 2 years did not exceed the threshold of 0.15 mm suggested by van der Voort et al. [4, 13, 26] despite the fact that the 10-year stem survival rate was below 95%. This discrepancy indicates that migration thresholds may vary between different design concepts and that, for some designs, migration patterns may need to be assessed over a longer period (perhaps up to 3 years after stem insertion), which adds complexity to the interpretation of early RSA migration studies. Our patients were followed prospectively, which means that signs of radiographic failure were noted at an early stage, which in turn may explain why 0.15 mm was a valid internal discrimination threshold in our study.

The two smallest stem sizes, in particular size 1, were associated with an increased risk of stem failure, consistent with the findings of Thien and Karrholm [24] who also demonstrated a corresponding increased revision risk for the smallest Lubinus SP2 (Link, Hamburg, Germany) stem size. The small stem sizes presumably fail as a result of a contact area/rotational torque ratio that is below the limit necessary to maintain a stable stem-cement interface. Stems with high offset also tended to fail slightly more than those with standard offset. This was a new option added to the EF Primary stem system, which probably resulted in increased torque forces that facilitated failure when the area of the stem-cement interface became reduced below a critical level with smaller stems. Thien and Karrholm [24] demonstrated a markedly elevated revision risk for Spectron EF Primary stems with a combination of high offset and longest neck length and to a lesser degree also for the Lubinus SP2 stem. Male sex was an independent risk factor for stem failure in our study. This could be explained by a wider cortex and narrower femoral canal in men, restricting them to relatively smaller implants compared with females and also with relatively higher offset. In general, males did however receive larger stem sizes than did females. Patients with failing stems were also younger than those with stable stems, even if age was not an independent factor for failure. As mentioned in our limitations, we did not have access to body mass index and activity level data in this study. Contrary to our findings, Thien and Karrholm [24] did not find any increased risk of failure associated with male sex. The reason for this discrepancy is unknown.

In 41 of the hips classified as successes by us, there was stem-cement separation at the shoulder of the prosthesis, suggesting that the stem had at some point subsided and debonded from the mantle at Gruen zone 1. In these 41 hips, four (10%) had a combination of high offset and the smallest stem size compared with 0.6% in other nonfailing stems. It may be that these stems had achieved a certain degree of secondary stabilization preventing further motion during activity and subsequent abrasive wear. However, it must be presumed that these stems are at increased risk for clinical failure with longer followup.

The cost for an RSA evaluation is approximately USD 300 to 400. Four to five RSA examinations during the first 2 to 3 years after surgery would imply a maximum cost of USD 2000. In 2003, Crowe et al. [3] calculated that the average cost for a THA revision was USD 21,224. Use of the two most well-documented cemented stems in Sweden would save at least three to five revisions per 100 procedures over 10 to 15 years. In Sweden almost 12,000 Spectron EF Primary stems were used in primary THAs. Thus, RSA is cost-effective in this crude analysis even if this type of monitoring is used in conjunction with other outcome measures. RSA examination facilities can be established with relatively little effort at most centers, and if necessary, images can be sent to specific centers for analysis.
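As a rough worked version of this arithmetic, using only the figures quoted above and assuming no discounting or additional costs:

$$5 \times \text{USD } 400 = \text{USD } 2000 \text{ per patient monitored with RSA},$$

$$3 \times \text{USD } 21{,}224 = \text{USD } 63{,}672 \quad \text{to} \quad 5 \times \text{USD } 21{,}224 = \text{USD } 106{,}120 \text{ in revision costs avoided per 100 procedures}.$$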

In conclusion, we found that stem subsidence and amount of retrotorsion up to 2 years as measured with RSA could predict later aseptic failure of the Spectron EF Primary stem. The comparatively low tolerance level for early micromotion of this stem may be related to debonding at the stem-cement interface and abrasive wear between the rough stem surface and the cement mantle. The increased failure rate of the third generation of the Spectron stem could also be related to some of the new design features added such as introduction of smaller sizes and the high-offset option. As noted, seemingly minor design changes of a well-established implant can have a substantial influence on its performance. As a result, it seems prudent to recommend clinical trials with RSA on a restricted number of cases whenever an implant undergoes even minor design changes. Such premarket trials may delay marketing and increase costs for manufacturers, but can prevent large-scale use of implants with inferior performance.