Introduction

Metal-on-metal (MOM) bearings attracted renewed interest after a high incidence of wear-induced failures with metal-on-polyethylene bearings was observed in young and active patients [7, 8]. Following encouraging early results of MOM hip replacements in the early 2000s, they were widely adopted for clinical use throughout the world. In 2008, MOM bearings were used in approximately 35% of all hips replaced in patients in the United States [4]. During the last few years, there has been increasing concern about adverse reactions to metal debris associated with the MOM articulation [1, 3, 15]. Use of modern MOM bearing surfaces has not been associated with an overall increased risk of cancer in the short term [19, 27]. However, severe cardiac and neurologic manifestations have been reported in patients with extremely high systemic metal ion levels [28].

One of the major MOM hip replacement designs, the Articular Surface Replacement (ASR™; DePuy, Warsaw, IN, USA), was recalled by its manufacturer in 2010 because of high failure rates reported from multiple sources [2, 14, 23]. Because a patient with an adverse reaction to metal debris can be asymptomatic [30], may have low metal ion levels [14], and may even have normal cross-sectional imaging findings [13], diagnosing an adverse reaction is challenging. Further, the exact clinical implications of different radiologic findings (eg, cystic pseudotumors) in patients with MOM hip replacements are not well understood [12]. Adverse reactions to metal debris are seen with all implant types, and in approximately 50% of cases they are not associated with increased wear [20]. The true prevalence of adverse reactions to metal debris is not known in any cohort of patients with MOM hips because determining it would require a mass screening program using clinical evaluation, laboratory tests (blood metal ion measurement), and cross-sectional imaging of all patients in that cohort. To the best of our knowledge, such studies have not been reported. Furthermore, information on risk factors for adverse reactions to metal debris is scarce [15], and it is unclear whether the same risk factors for adverse reactions apply to hip resurfacing procedures and large-diameter head MOM THAs.

The primary aims of our study were (1) to analyze and report the prevalence of adverse reactions to metal debris among patients who received a small-headed (< 50 mm) ASR™ prosthesis during a hip resurfacing procedure or an ASR™ XL acetabular system during THA at our institution, and (2) to investigate whether the risk factors for the adverse reactions differ between hip resurfacing procedures and THAs. To achieve these goals, we used data obtained from a mass screening program implemented at our institution for these patients.

Patients and Methods

DePuy Orthopaedics voluntarily recalled their ASR™ MOM hip system in August 2010, and the UK Medicines and Healthcare Products Regulatory Agency announced a medical device alert regarding ASR™ hip arthroplasty implants in September 2010 [21]. After the announcement, our institution established a mass screening program to identify possible articulation-related complications in patients who had received either an ASR™ prosthesis during a hip resurfacing procedure or an ASR™ XL prosthesis during THA at our institution. All patients attending the screening received an Oxford hip score questionnaire [5], underwent a thorough clinical examination (including the Harris hip score [11], which covers pain, function, absence of deformity, and ROM) at our outpatient clinic, and were referred for measurement of whole blood cobalt and chromium levels. In addition, AP and lateral radiographs of the hip and an AP pelvic radiograph were taken before each visit. Patients with a femoral head size less than 50 mm were considered to be at high risk of having an adverse reaction develop, and thus all were referred for metal artifact reduction sequence MRI [9, 15]. If metal artifact reduction sequence MRI was contraindicated or could not be done because of patient-related factors (such as claustrophobia), the patient underwent ultrasonography of the affected hip.

One thousand thirty-six ASR™ MOM hip arthroplasties were performed in 887 patients at our institution between March 2004 and December 2009. In 482 operations (424 patients), a femoral head size less than 50 mm was used. One hundred forty-two patients (168 hips) received an ASR™ hip resurfacing prosthesis and 281 patients (312 hips) received an ASR™ XL THA prosthesis. Stems manufactured by DePuy were used in all ASR™ XL THAs: a proximally coated Summit® stem in 233 (74%), a hydroxyapatite-coated Corail® stem in 54 (17%), an S-ROM® stem in 24 (8%), and a short Proxima™ stem in two (1%) operations. One patient received an ASR™ hip resurfacing prosthesis in one hip and an ASR™ THA prosthesis in the other hip; this patient was excluded from analyses comparing implant groups. Twelve patients also had an ASR™ hip arthroplasty in the contralateral hip but with a femoral head size greater than 50 mm. All 386 living patients (442 hips; one patient had bilateral implants, of which one was revised before screening and one was available for screening) with a femoral head size less than 50 mm who had not undergone revision surgery were asked to participate in the screening program, and 379 (98%) agreed to do so (Fig. 1). The mean age of patients in the hip resurfacing group was 53 years (range, 14–77 years), and in the THA group it was 58 years (range, 15–79 years) (p < 0.001; Table 1). Minimum followup for the whole cohort was 2.9 years (mean ± SD, 4.9 ± 1.7 years; range, 2.9–8.1 years). Informed consent for participation in this study was obtained from all patients. We obtained permission to perform this study from the ethical committee of the hospital district in which the study was conducted.

Fig. 1

The study flow chart is shown. Eight patients were excluded from the study. One patient had bilateral implants, of which one hip was revised before screening* and the contralateral hip was available for screening.

Table 1 Demographic data for the patients

All primary operations were performed by or under the direct supervision of seven experienced hip surgeons (JP, TP, PH, PK, TM, UP, HS) according to the standard protocol at our institution. A posterior approach was used in all cases; the external rotators were detached along the incision of the posterior capsule and reattached with absorbable sutures through drill holes to the greater trochanter. Postoperatively, patients were allowed immediate full weightbearing with crutches and without any major restrictions of movement.

Failure was defined as a revision operation secondary to an adverse reaction to metal debris. Revision surgery was considered if (1) there was a clear pseudotumor observed on cross-sectional imaging regardless of symptoms or whole blood metal ion levels; or (2) the patient had elevated whole blood metal ion levels and hip symptoms despite a normal finding on cross-sectional imaging; or (3) the patient had a continuously symptomatic hip or progressive symptoms regardless of imaging findings or metal ion levels. Symptoms included hip pain, discomfort, a sense of instability, impaired function of the hip, and/or sounds from the hip (clicking, squeaking). Whole blood metal ion levels were regarded as being elevated if either chromium or cobalt exceeded 7 ppb [22]. Diagnosis of adverse reactions to metal debris was based on perioperative findings. Failure was classified as being secondary to adverse reactions to metal debris if the following criteria were met: (1) metallosis or macroscopic synovitis was present in the joint; and/or (2) a pseudotumor was found during revision; and/or (3) a moderate to high amount of perivascular lymphocytes along with tissue necrosis and/or fibrin deposition was seen in the histopathologic sample; and (4) perioperatively there was no evidence of component loosening or periprosthetic fracture. Furthermore, infection was ruled out by multiple (at least five) bacterial cultures obtained during revision surgery.
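
The three conditions under which revision surgery was considered can be read as a simple decision rule. The following is a minimal sketch of that reading, not the clinical protocol itself: the class and field names are hypothetical, and the second criterion is simplified by treating any hip without a pseudotumor as having a normal imaging finding.

```python
from dataclasses import dataclass

# Threshold stated in the text: chromium or cobalt above 7 ppb counts as elevated.
ELEVATED_PPB = 7.0


@dataclass
class ScreeningFinding:
    """Hypothetical record of one hip's screening results (names are illustrative)."""
    pseudotumor_on_imaging: bool
    chromium_ppb: float
    cobalt_ppb: float
    symptomatic: bool
    persistent_or_progressive_symptoms: bool


def ions_elevated(f: ScreeningFinding) -> bool:
    """Elevated if either whole blood ion level exceeds the 7 ppb threshold."""
    return f.chromium_ppb > ELEVATED_PPB or f.cobalt_ppb > ELEVATED_PPB


def revision_considered(f: ScreeningFinding) -> bool:
    """Apply the three criteria in the order they are listed in the text."""
    if f.pseudotumor_on_imaging:            # (1) regardless of symptoms or ion levels
        return True
    if ions_elevated(f) and f.symptomatic:  # (2) elevated ions plus hip symptoms
        return True
    return f.persistent_or_progressive_symptoms  # (3) regardless of imaging or ions


# An asymptomatic hip with raised cobalt but no pseudotumor does not meet any criterion.
print(revision_considered(ScreeningFinding(False, 2.0, 9.0, False, False)))
```

Note that this rule only flags hips for which revision was considered; the diagnosis of an adverse reaction itself rested on the perioperative and histopathologic criteria described above.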

Of all 379 patients attending screening, 368 patients (97%) underwent cross-sectional imaging. MRI was performed in 319 patients (370 hips) and ultrasonography in the remaining 49 patients (51 hips). Three patients (three hips) did not have imaging owing to comorbidities. Imaging in one patient with a well-functioning implant (chromium and cobalt < 3 ppb; Oxford hip score, 48 points) was postponed because of the patient’s decision. MRI was performed with two 1.5-T machines (Siemens Magnetom Avanto 1.5 T; Siemens Healthcare, Erlangen, Germany; and GE Signa HD 1.5 T; General Electric Healthcare, Waukesha, WI, USA). All examinations were done with metal artifact reduction sequences: coronal and axial T1-weighted fast spin echo and coronal, axial, and sagittal short tau inversion recovery. Metal artifact reduction sequence MR images were analyzed by one of the authors (PE) and two senior musculoskeletal radiologists (PM, AP).

Whole blood metal ion levels were available for all patients attending the screening protocol. Blood samples were taken from the antecubital vein using a 21-gauge needle connected to a Vacutainer™ system (Becton, Dickinson and Company, Franklin Lakes, NJ, USA) and trace element blood tubes containing sodium EDTA. The first 10 mL of blood was used for other laboratory tests such as C-reactive protein and erythrocyte sedimentation rate measurement. The second 10 mL was used for cobalt and chromium analysis. At the Finnish Institute for Occupational Health, standard operating procedures were established for cobalt and chromium measurement using dynamic reaction cell inductively coupled plasma (quadrupole) mass spectrometry (Agilent 7500 cx, Agilent Technologies, Santa Clara, CA, USA).

Student’s t-test was used when comparing continuous variables between groups and noncontinuous variables were compared using the Mann-Whitney U test. Continuous variables were divided into appropriate subgroups. For age, a cutoff value of 50 years was used. A cutoff value of 40 years used by others [9] would have led to subgroups that were too small, since only 36 patients in our study group were younger than 40 years. Femoral diameter was analyzed as a continuous variable. Acetabular inclination was divided into two groups: 50° or less and greater than 50°. Preoperative ROM was divided into two groups based on the mean value minus ½ SD, which yielded the following distribution: less than 110° and 110° or greater in the THA group and less than 130° and 130° or greater in the hip resurfacing group. To appropriately study the influence of cup position on the risk of adverse reactions to metal debris, head size and acetabular inclination were combined and cup coverage was used in the adjusted Cox regression analysis. Cup coverage is equal to the lateral acetabular component edge (γ), as previously described [6]. The subtended acetabular component angle or functional arc (α) was obtained from the ASR™ cup templates in AGFA software Version 11.6 (Agfa, Greenville, SC, USA). Assessment yielded a biphasic distribution of the arcs. Functional arc correlated significantly with cup size in sizes from 39 mm to 47 mm (slope, 0.75°/mm; r² = 0.9959). In larger cups, the correlation also was significant (slope, 0.50°/mm; r² = 0.9953). Because there was no relevant correlation between cup coverage and femoral diameter (r² = 0.081, p = 0.051), the latter also was included in the multivariable analysis. In addition to the aforementioned categorical variables, gender and implant type (hip resurfacing versus THA) were studied as risk factors for adverse reactions to metal debris.
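
The text does not spell out how acetabular inclination and the functional arc are combined into cup coverage. One geometric reading, assumed here and not confirmed by the source, is that the lateral edge angle γ equals half the functional arc α minus the radiographic inclination. The sketch below illustrates that assumption together with the 50° inclination grouping; the function names and the example values are hypothetical.

```python
def lateral_edge_angle_deg(functional_arc_deg: float, inclination_deg: float) -> float:
    """Cup coverage (gamma) under the ASSUMED relation gamma = alpha / 2 - inclination.

    functional_arc_deg: subtended angle of the cup (alpha), in degrees.
    inclination_deg: radiographic acetabular inclination, in degrees.
    """
    return functional_arc_deg / 2.0 - inclination_deg


def inclination_group(inclination_deg: float) -> str:
    """Dichotomize inclination as in the text: 50 degrees or less versus greater than 50."""
    return "<=50" if inclination_deg <= 50.0 else ">50"


# Illustrative values only: a 160-degree arc at 45 degrees of inclination.
print(lateral_edge_angle_deg(160.0, 45.0))  # 35.0
print(inclination_group(52.0))              # >50
```

Under this reading, a steeper cup or a smaller functional arc both reduce coverage, which is why a single combined variable can stand in for head size and inclination in the regression.
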
Cox regression analysis was used to estimate the unadjusted (crude) and adjusted risk ratios of different variables on the risk of adverse reactions to metal debris-related failure. Comparison of survivorship by strata factor was performed using the log-rank test. The Wald test was applied to calculate p values for data obtained from the Cox multiple regression analysis. Because femoral diameter is known to be smaller in female patients, we estimated collinearity between variables used in the Cox regression model by calculating the variance inflation factor. Variance inflation factors were obtained by multivariable regression analysis using followup as a dependent variable. The significance level was set to 0.05. Statistical analyses were conducted with IBM SPSS Statistics Version 19.0 (SPSS, Chicago, IL, USA).
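
For readers unfamiliar with the variance inflation factor, it is defined for each predictor as 1 / (1 − R²), where R² comes from regressing that predictor on the remaining ones. The sketch below covers only the two-predictor special case, in which R² reduces to the squared Pearson correlation; it is an illustration with hypothetical function names, not the SPSS procedure used in the study.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_from_r_squared(r_squared: float) -> float:
    """Variance inflation factor for a given R^2: 1 / (1 - R^2)."""
    return 1.0 / (1.0 - r_squared)


def vif_two_predictors(x: list[float], y: list[float]) -> float:
    """VIF when the model has exactly two predictors (R^2 equals r^2)."""
    r = pearson_r(x, y)
    return vif_from_r_squared(r * r)


# With the r^2 of 0.081 reported between cup coverage and femoral diameter,
# a model containing only those two predictors would have a VIF of about 1.09.
print(round(vif_from_r_squared(0.081), 3))
```

A value of 1 means a predictor is uncorrelated with the others; values close to 1 indicate little collinearity.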

Results

At the time of last followup, 162 hips in 131 patients had undergone revision surgery (including those revised before the screening program). This represented 16% of the population of ASR™ arthroplasties we performed, at a mean of 5 years. Adverse reactions to metal debris were diagnosed in the majority (n = 138 [85%]) of these revisions (Table 2). The prevalence of adverse reactions to metal debris was 31% in the ASR™ XL THA group and 25% in the hip resurfacing group. With any revision as the end point, cumulative 7-year survivorship was 51% (95% CI, 45%–57%) for the hip resurfacing group and 38% (95% CI, 33%–44%) for the THA group (p = 0.001) (Fig. 2). With revision for adverse reactions to metal debris as the end point, cumulative 6-year survivorship was 73% (95% CI, 69%–78%) for the hip resurfacing group and 61% (95% CI, 57%–65%) for the THA group (p = 0.003).
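
Cumulative survivorship with revision as the end point is typically obtained with a product-limit (Kaplan-Meier) estimate. The text does not name the estimator, so the following sketch is an assumption about the general technique rather than a reproduction of the study's analysis; the followup times and event flags are invented toy data.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Product-limit survival estimate.

    times: followup in years for each hip.
    events: 1 if the hip was revised at that time, 0 if censored.
    Returns (time, survival) pairs at each distinct revision time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        revised = 0
        leaving = 0
        # Group all hips sharing the same followup time.
        while i < len(data) and data[i][0] == t:
            revised += data[i][1]
            leaving += 1
            i += 1
        if revised:
            survival *= 1.0 - revised / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # revised and censored hips both leave the risk set
    return curve


# Toy data: five hips, three revisions, two censored observations.
for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
    print(t, round(s, 2))
```

Censored hips contribute to the number at risk until their last followup, which is what distinguishes this estimate from a simple proportion of unrevised hips.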

Table 2 Causes of revisions in ASR™ hip resurfacing and THA cohorts
Fig. 2

The graph shows the overall survivorship for ASR™ hip resurfacing (HR) and THA cohorts with any revision as the end point.

Reduced cup coverage was an independent risk factor for adverse reactions to metal debris in both the THA cohort (p < 0.001; Table 3) and the hip resurfacing cohort (p = 0.019; Table 4). High preoperative ROM (risk ratio [RR], 1.92; p = 0.04), use of the Corail® stem (RR, 1.86; p = 0.03), and female gender (RR, 2.79; p = 0.003) were associated with an increased risk of adverse reactions to metal debris only in patients undergoing THA (Table 3). The variance inflation factor ranged from 1.137 to 1.450 in the THA group and from 1.057 to 1.219 in the hip resurfacing group, indicating that there was no considerable collinearity between predictor variables. Patients who had THAs had significantly higher whole blood cobalt levels than patients who had hip resurfacing (Table 5). This difference was evident in patients with unilateral (p = 0.002) and bilateral (p < 0.001) hip arthroplasties. However, there was no difference in chromium levels between the hip resurfacing and THA cohorts (Table 5). Whole blood chromium and/or cobalt level exceeded 7 ppb in 18% of patients who had unilateral hip resurfacing and in 37% of patients who had unilateral THA. A pseudotumor was found in 42 (10%) hips by cross-sectional imaging (Table 5). There were no differences in clinical scores between the hip resurfacing and THA cohorts (Table 5).

Table 3 Results of unadjusted and adjusted survival analysis in the THA group
Table 4 Results of unadjusted and adjusted survival analysis in the hip resurfacing group
Table 5 Clinical, laboratory, and cross-sectional imaging findings of patients attending the screening program

Discussion

During the last few years, increasing concern has arisen around MOM hip arthroplasties regarding adverse reactions to metal debris associated with the MOM articulation [1, 3, 15]. Information regarding risk factors for adverse reactions to metal debris is scarce [15], and it is not known whether the same risk factors for adverse reactions to metal debris apply to hip resurfacings and large-diameter head MOM THAs. We therefore aimed to use a systematic screening program to determine (1) the prevalence of adverse reactions to metal debris among patients who underwent small-headed (< 50 mm) ASR™ hip resurfacing procedures and ASR™ XL THAs at our institution, and (2) the risk factors for these adverse reactions and whether they differ between hip resurfacings and THAs.

A major limitation in our study was inadequate assessment of cup orientation. Extremes of cup version are known to be associated with an increased risk of adverse reactions to metal debris-related failure [17]. We did not calculate cup version in this study because we lacked appropriate tools to measure version accurately. We also included patients with unilateral and bilateral hip arthroplasties. It is debatable whether it is appropriate to include bilateral implants in the survival analyses. It can be assumed that patients with bilateral implants who have experienced possible metal hypersensitivity-related failure on one side are also at an increased risk of failure of the other hip for the same reason, even if the components were properly implanted. Furthermore, it is debatable whether systemic exposure to metal ions affects the contralateral hip. This may cause unobserved heterogeneity, which could be addressed by applying a shared frailty model in the regression analysis [24]. We did not use a shared frailty model because of the small number of bilateral revisions owing to adverse reactions to metal debris (six patients).

Finally, we studied only one implant with a known design flaw that predisposes the bearings to edge-loading and the patients with these hip replacements to adverse reactions to metal debris. Presumably, this is the reason the ASR™ implant has been withdrawn from the market. However, several facts suggest that these results can most likely be generalized to other MOM implants as well. First, the design flaw of the ASR™ prosthesis (ie, poor cup coverage) was only one of the risk factors for adverse reactions to metal debris: the effects of high ROM and gender, for instance, are not implant-dependent. Second, adverse reactions to metal debris are seen with all implant types, and approximately 50% of failed MOM hip implants have low wear rates of the bearing surfaces [20].

In the current study, the prevalence of adverse reactions to metal debris was higher than that reported by Langton et al. [14]. In their study, the prevalence was 14% in the hip resurfacing group and 29% in the THA group. Survival rates in our study also were worse than those reported by Langton et al. [14]. We believe this is because our cohort included only patients with a small femoral head size, meaning that our patients were more prone to edge-loading as a result of a reduced functional arc [10]. Therefore, failure resulting from increased wear originating from the bearing surface instead of the taper is likely to be more prevalent in our cohort.

The median whole blood chromium level of patients who had unilateral THA was comparable to that of patients who had unilateral hip resurfacing, whereas the median cobalt level was significantly higher in patients who had THA than in patients who had hip resurfacing. Furthermore, the median cobalt level exceeded the reported safe upper limit for unilateral large-diameter MOM prostheses in THAs [29]. The ratio of cobalt to chromium was two in the THA cohort and one in the hip resurfacing cohort. This finding suggests that wear and/or corrosion at the taper-trunnion junction produces mainly cobalt ions, because it is highly unlikely that the wear pattern of the bearing surfaces would differ between THAs and hip resurfacing procedures.

We examined risk factors for adverse reactions to metal debris and found that reduced cup coverage was strongly associated with an increased risk of adverse reactions in hip resurfacing and THA cohorts. Small head diameter, by contrast, did not directly lead to an increased prevalence of adverse reactions. In our cohort, there were 32 resurfaced small-headed hips with cup coverage greater than 35°, and only two of these (6.3%) have been revised so far. Reduced cup coverage also was a significant risk factor for adverse reactions to metal debris-related failure in the THA cohort. Therefore, taper damage or taper corrosion appears not to be solely responsible for the high prevalence of adverse reactions in patients who received the ASR™ XL prosthesis during THA.

In the current study, patients who received a hydroxyapatite-coated Corail® stem in the primary THA were found to be at especially high risk for adverse reactions to metal debris-related failure. As Summit® and Corail® stems have identical 12/14 tapers, the wear or corrosion process in the taper-trunnion junction should not differ between them. The marked difference between these stem designs is their coating; whereas the Corail® stems have a hydroxyapatite coating, the Summit® stems are proximally porous-coated. Hydroxyapatite coating has been shown to degrade with time and result in hydroxyapatite flake release and presumably third-body wear [25]. If this is the mechanism, the problem with the Corail® stem would extend beyond the ASR™ bearing system, and a higher than expected failure rate should also be seen with other MOM bearing systems coupled with a Corail® stem or other hydroxyapatite-coated stem designs. Confirming the reason for this finding requires additional research, including clinical and retrieval analyses.

Risk factors associated with adverse reactions to metal debris in larger head sizes may differ from those established in our study with small heads. The functional arc of the cup increases with increasing head size [10], thus offering more cup coverage and decreasing the occurrence of edge-loading. Therefore, especially in patients with large-head implants in hip resurfacing, the prevalence of adverse reactions may be lower than in our current cohorts. Owing to increased cup coverage with larger head sizes, other factors may be more influential in the development of adverse reactions to metal debris. At the time of writing, almost all patients with large-head ASR™ prostheses have completed the screening program at our institution, and we will analyze and report the results of these patients in the future.

The infection rate was higher in the THA cohort (2.9%) than in the hip resurfacing cohort (0.6%). One explanation for this finding could be the significantly higher blood cobalt ion levels of the THA cohort compared with the hip resurfacing cohort: cobalt ions are known to cause tissue necrosis [26], and hematogenous bacteria may readily adhere to avascular necrotic tissue. However, patients in the THA cohort were substantially older and had a different distribution of preoperative diagnoses than patients in the hip resurfacing cohort (Table 1). Thus, the reason for the difference in infection rates between the cohorts is most likely multifactorial. Still, high cobalt ion levels may play a part in this phenomenon. Whether high metal ion levels predispose to prosthetic joint infections warrants further research.

We found a high rate of revision attributable to adverse reactions to metal debris in the hip resurfacing and THA groups. Although the implant we studied has been recalled, it is conceivable that our findings may apply more broadly to other designs. The main strength of our study was the inclusion of clinical, laboratory, and radiographic evaluations of the patients in this study. For instance, we identified several large pseudotumors by cross-sectional imaging in patients with an excellent clinical outcome or with nonelevated metal ion levels (< 7 ppb). This substantially enhanced the accuracy of our end point analysis, and enabled us to estimate the true prevalence of adverse reactions to metal debris in the study cohorts. We found several significant risk factors for adverse reactions in patients who had received ASR™ XL THAs; namely, female gender, stem type (Corail®), high preoperative ROM (> 110°), and reduced cup coverage (< 25°), the last being a risk factor for adverse reactions in the hip resurfacing cohort as well. This finding implies a more complicated failure mechanism in THAs compared with hip resurfacing procedures. The mechanisms leading to failure in patients with Corail® stems or patients with high preoperative ROM are unclear and warrant additional research. Furthermore, there is an unmet need for research that tracks the outcome of patients with other MOM implant designs and larger head sizes.