Introduction

Background

Metal-on-metal bearing (MoM) total hip arthroplasty (THA) has been evolving since 1990, with the development of resurfacing MoM THA [i.e., hip resurfacing (HR)] and stemmed MoM THA. These procedures have been used mainly in highly active young people, because they are expected to confer high dislocation resistance with low wear. However, starting in 2006, postoperative pain, osteolysis, and necrosis of soft tissues around the hip joint have been reported [1,2,3]. Such abnormal tissue reactions around the hip joint include aseptic lymphocyte-dominated vasculitis-associated lesions, adverse reactions to metal debris (ARMD), aseptic local tissue reactions (ALTRs), and pseudotumor. The incidence of ALTRs was found to be higher in stemmed MoM THA than in HR [1, 2], suggesting that metal corrosion at the head–neck junction, called taper junction failure or trunnionosis, is the cause [1, 3].

The timing of revision is important because ARMD and ALTRs, if left untreated, may lead to necrosis of the periarticular bone and muscles, resulting in repeated dislocation and periarticular muscle dysfunction even after revision. The U.S. Food and Drug Administration recommends early revision for advanced ALTRs [4]. In our experience, muscle reconstruction can be very difficult, and it is highly desirable to perform revision before soft tissue failure occurs. Additionally, in a previous report, pseudotumor was significantly more common in symptomatic cases [5]. In the UK, magnetic resonance imaging (MRI) was recommended in all symptomatic patients after MoM THA [6]. In our hospital, we also performed MRI for patients who had undergone MoM THA and were experiencing little pain in the affected area, and we monitored them carefully. Ongoing hip pain and pseudotumor enlargement over time, as determined by MRI, were indications for revision surgery.

Rationale

The U.K. Medicines and Healthcare Products Regulatory Agency published postoperative management guidelines for MoM THA in 2012 [6]. Elevated blood cobalt and chromium levels were identified as findings suggestive of a possible soft tissue reaction, and a threshold of 7 ppb was proposed. In contrast, Hart et al. reported that the sensitivity for diagnosing ALTRs was only 51% when 7 ppb was used as a cutoff value [7]. Furthermore, Kwon et al. stated that blood metal concentration alone is not an indication for surgery [8]. It remains controversial whether metal ion concentrations in postoperative blood are useful for monitoring the development and progression of ALTRs. Moreover, measurement of blood metal ion concentrations is not a routine test because it is not covered by insurance in some countries. Test effectiveness and efficiency also need to be improved. It also remains unclear whether postoperative blood metal ion concentrations can predict revision due to ALTRs.

The effects of the implant type, age, sex, head size, and cup placement angle on ARMD, ALTRs, and pseudotumor after stemmed MoM THA have been reported. However, there are conflicting reports that women aged < 40 years have a higher risk of pseudotumor [9] and that age and sex are not related to the occurrence of pseudotumor [10]. There are also reports that pseudotumors are caused by metal ion release due to edge loading resulting from differences in cup design or poor cup installation [11,12,13]. Regarding the influence of the implant type in HR, implants such as the Articular Surface Replacement™ (DePuy Orthopaedics, Inc., Warsaw, IN, USA), Conserve® Plus hip resurfacing system (Wright Medical Technology, Nashville, TN, USA), and Cormet™ Hip Resurfacing System (Corin, Cirencester, UK) are known to be prone to the occurrence of ALTRs, and the revision rate is relatively high [14, 15]. However, the occurrence of ALTRs and revision rate are not clearly specified for stemmed MoM THA. Regarding the effect of head size, the head size of stemmed MoM THA was reported to affect the occurrence of ALTR [1, 3, 10].

Among the stemmed THA materials, there is a choice of polyethylene and metal liners available from Zimmer-Biomet’s M2a line (Zimmer-Biomet, Warsaw, IN, USA), and the M2a-Magnum, which allows the use of a large head, was introduced later. Despite reports of the medium- to long-term outcomes of various implants in MoM THA, no studies have compared the use of polyethylene and metal liners or small and large femoral heads with the same surgeon and the same stem. Thus, the purpose of this study was to determine the (1) long-term survival rate of stemmed MoM THA compared with that of metal-on-polyethylene (MoP) bearing THA, (2) effect of head size and cup placement angle on revision rate, and (3) predictors of revision.

Materials and methods

Participants

A non-blinded, prospective, randomized clinical trial was designed to compare the survival rates of primary MoP THA and primary MoM THA.

Among the patients with osteoarthritic diseases who underwent primary THA performed by the same surgeon at our hospital between June 2006 and July 2010, 118 patients with 140 hips met the eligibility criteria for our study. The exclusion criteria were as follows: (1) endocrine disorders affecting bone metabolism (excluding primary osteoporosis), (2) osteoarticular diseases not affecting the hip joint, (3) history of trauma requiring rest for more than 1 month, (4) inability to walk without support (need for walkers or wheelchairs), and (5) presence of metal in the body before THA. First, patients were randomly assigned to THA with a MoP or MoM bearing based on their birth month. Patients born in odd-numbered months were assigned to the MoP bearing group, and patients born in even-numbered months were assigned to the MoM bearing group. Since April 2009, when M2a-Magnum became available, it has been used in MoM THA instead of M2a-Taper. All procedures were performed using a posterolateral approach, and the patient walked with their full weight on the first postoperative day. Among them, 110 patients with 130 hips who could be followed up for more than 2 years after surgery were finally included in the study (Fig. 1).

Fig. 1

Study flowchart. THA total hip arthroplasty
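The birth-month allocation described under Participants can be written out as a small rule. The sketch below is only an illustration; the function name assign_bearing and the string labels are hypothetical and do not come from the study protocol. Note that the rule is fully determined by the birth month.

```python
def assign_bearing(birth_month: int) -> str:
    """Return the bearing group for a patient, given the birth month (1-12).

    Odd-numbered months -> MoP bearing, even-numbered months -> MoM bearing,
    following the allocation rule stated in the text.
    """
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be between 1 and 12")
    return "MoP" if birth_month % 2 == 1 else "MoM"


# Illustrative use: a patient born in March and one born in October.
march_group = assign_bearing(3)
october_group = assign_bearing(10)
```

Within the MoM arm, the choice between M2a-Taper and M2a-Magnum depended on the date of surgery (before or after April 2009) and is not modeled here.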

Implants

We used the M2a-RingLoc with a metal-on-polyethylene bearing (M2a-MoP: group P), M2a-Taper (group T) with a MoM bearing, and M2a-Magnum (group M) with a MoM bearing, all manufactured by Zimmer-Biomet. The metal liners were made of forged cobalt–chromium, and the bearing surface was machined and polished. The liner was fixed with a tapered lock, and the back side of the M2a tapered liner was blasted. The system was available with two different head diameters, 28 mm and 32 mm, to match the cup diameter. M2a-Magnum had different head diameters depending on the cup size. The modular head was made of cobalt–chromium alloy, cast, and polished. All were secured with tapered locks. The stem was a cementless Bi-Metric XR (Zimmer-Biomet) designed for Japanese patients, which was porous proximally and available in four shapes (SPP, for champagne-fluted medullary cavity with a neck-shaft angle of 131.5°; RPP, for standard Japanese medullary cavity with two neck-shaft angles of either 131.5° or 126.5°; and CDH, for severe secondary osteoarthritis and stove-piped medullary cavity with a neck-shaft angle of 131.5°).

Blood tests

Metal ion concentrations in whole blood and bone metabolism markers in serum were measured preoperatively and at 6 months, 1 year, and 2 years postoperatively. Metal ion concentrations were also measured during the revision surgery. The measured metal ions were cobalt and chromium. Cobalt was measured by inductively coupled plasma analysis (standard value: 0.12–0.41 μg/L) and chromium by atomic absorption spectrophotometry (standard value: less than 10 μg/L). The bone metabolism markers were type I collagen cross-linked N-telopeptide (NTX) and bone alkaline phosphatase (BAP). NTX was measured by enzyme immunoassay (reference values: females after menopause 10.7–24.0 nmol BCE/L; females before menopause 7.5–16.5 nmol BCE/L; and males 9.5–17.7 nmol BCE/L). BAP was measured by chemiluminescent enzyme immunoassay (reference values: females after menopause 3.8–22.6 μg/L; females before menopause 2.9–14.5 μg/L; and males 3.7–20.9 μg/L).

Imaging evaluation

In all patients, plain radiography and computed tomography were performed preoperatively and postoperatively to evaluate loosening and osteolysis. In all 77 hips with MoM THA (44 hips in group T and 33 hips in group M), MRI of the hip joint was performed to evaluate ALTRs at least once a year starting 1 year postoperatively using a Siemens MAGNETOM Avanto 1.5T-MRI system. To reduce metal artifacts, MRI was performed using a metal artifact reduction sequence. The hip joint was evaluated mainly using the horizontal section. In group P, MRI was performed when the patient complained of pain.

Criteria for revision surgery

In the MoM THA group, when the patient continued to have dull pain and there was an increase in ALTR findings on MRI, we recommended THA revision to the patient and performed it with the consent of the patient.

Statistical analyses

Data were compared among the three groups (P, T, and M). Fisher’s exact probability test and the Kruskal–Wallis test were used to analyze the demographic data of the patients. The Kruskal–Wallis test and multiple comparisons (with Bonferroni adjustment) were used to compare blood metal ion concentrations and bone metabolism markers preoperatively and at 6 months, 1 year, and 2 years postoperatively. The cumulative survival rate was calculated using the Kaplan–Meier method with revision as the endpoint, and risk factors for revision were identified by multiple logistic regression analysis, with revision as the dependent variable. Receiver operating characteristics (ROC) curves were drawn for the identified risk factors, and the sensitivity and specificity were calculated. Significance was set at p < 0.05. The statistical analysis software used was IBM SPSS Statistics version 25 (IBM Corp., Armonk, NY, USA).
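The cumulative survival calculation can be illustrated with the product-limit (Kaplan–Meier) estimator. The following is a minimal pure-Python sketch, not the SPSS procedure used in the study; the function name kaplan_meier and the follow-up values are hypothetical and serve only to show the mechanics. Hips that reach the end of follow-up without revision are treated as censored.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival with revision as the endpoint.

    times  : follow-up in months for each hip
    events : 1 if the hip was revised at that time, 0 if censored
    Returns a list of (time, survival) pairs at each time with a revision.
    """
    revisions = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # hips leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = revisions.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # remove revised and censored hips
    return curve


# Hypothetical follow-up data (months), NOT taken from the study.
times = [24, 60, 60, 96, 120, 150]
events = [0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

With these made-up inputs, survival drops at 60 and 96 months only, because censored hips reduce the number at risk without lowering the estimate.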

Ethical considerations

This prospective study was performed in accordance with the ethical standards of the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. The study was approved by the institutional review board at our university (no. 1414), and all study participants provided informed consent to participate in the study and to publish the results.

Results

Demographic data and clinical outcomes

The demographic data and clinical outcomes for the study patients are shown in Table 1. The primary diseases were osteoarthritis in 101 patients with 118 hips and osteonecrosis of the femoral head in 9 patients with 12 hips. The mean age at the time of surgery was 63.1 ± 9.5 years (mean ± standard deviation) in 93 female and 17 male participants. Perioperative complications included dislocation in one hip and intraoperative fracture of the acetabular rim in one hip in group T, although no additional surgery was required. The mean postoperative follow-up period was 133.7 ± 39.1 months (approximately 11 years). When comparing the three groups, patients in group T tended to be younger and had a higher body mass index (BMI). The femoral head size was greater than 40 mm in all patients in group M. The cup inclination angle was less than 50° in all hip joints, and there was no difference among the groups (p = 0.686). The incidence of ALTRs in the MoM THA group was 31/44 hips (70.5%) in group T and 21/33 hips (63.6%) in group M, exceeding 60%, with no significant difference between the two groups (p = 0.583). The revision rate at the last observation was 1/53 hips (1.9%) in group P, 19/44 hips (43.2%) in group T, and 13/33 hips (39.4%) in group M. The revision rate was significantly higher in the MoM THA group (p < 0.001). The cause of revision was infection and dislocation in one case in group P and ALTRs in all cases in the MoM THA group. Among the revision cases, all cases in the MoM THA group had ALTR findings on MRI, and 12/44 hips in group T and 3/33 hips in group M had periprosthetic osteolytic changes on plain radiography and computed tomography (Table 1).

Table 1 Description of comparison groups in the study
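The revision and ALTR percentages quoted above follow directly from the reported hip counts. The short check below recomputes them; the counts are those given in the text, while the helper name pct and the pooled MoM figure are added here for illustration only.

```python
def pct(k: int, n: int) -> float:
    """Percentage of k out of n, rounded to one decimal place."""
    return round(100 * k / n, 1)


# (revised hips, total hips) per group, as reported in the text.
revisions = {"P": (1, 53), "T": (19, 44), "M": (13, 33)}
revision_rates = {group: pct(k, n) for group, (k, n) in revisions.items()}

# (hips with ALTRs, total hips) in the two MoM groups, as reported.
altrs = {"T": (31, 44), "M": (21, 33)}
altr_rates = {group: pct(k, n) for group, (k, n) in altrs.items()}

# Pooled MoM revision rate (groups T and M combined); derived, not reported.
pooled_mom = pct(19 + 13, 44 + 33)
```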

Whole blood metal ion concentrations

Whole blood metal ion concentrations were significantly increased in the MoM THA group after surgery. The metal ion levels were not normally distributed at each time point of blood collection in each group. The median cobalt concentrations were significantly higher in groups T and M than in group P during the postoperative period (p < 0.01). At 2 years postoperatively, the concentrations were 0.3 μg/L in group P, 1.7 μg/L in group T, and 1.3 μg/L in group M. The maximum cobalt concentration was 19.0 μg/L in group M at 1 year postoperatively. At the revision surgery, the median cobalt concentration was 3.2 μg/L in group T and 1.0 μg/L in group M, and the maximum cobalt concentration was 8.1 μg/L in group M. The cobalt levels increased significantly in many patients, even after 2 years postoperatively (Table 2, Fig. 2).

Table 2 Descriptive data for whole blood metal ion levels and serum bone metabolism marker levels
Fig. 2

Box plot showing metal ion levels in whole blood and outlier distribution for (a) cobalt and (b) chromium. The metal ion levels were significantly increased in the M2a-Taper and M2a-Magnum groups after surgery, especially for cobalt ions

The median chromium concentrations were also significantly higher in groups T and M than in group P during the postoperative period (p < 0.01). At 2 years after surgery, the values were 0.4 μg/L in group P, 1.3 μg/L in group T, and 1.4 μg/L in group M. The maximum chromium concentration was 5.2 μg/L in group M at 6 months and 1 year after surgery. At the revision surgery, the median chromium concentrations were 1.05 μg/L in group T and 0.95 μg/L in group M, and the maximum chromium concentration was 5.6 μg/L in group M. In nearly all patients, the chromium levels did not change significantly after 2 years postoperatively and remained within the reference value (Table 2, Fig. 2).

Serum concentrations of bone metabolic markers

Regarding the serum bone metabolic markers, the concentration of NTX decreased over time after surgery and was lower than the preoperative level at 2 years postoperatively. The concentration of BAP increased slightly from 6 months to 1 year postoperatively and then decreased to a level equal to or slightly lower than the preoperative level at 2 years postoperatively (Table 2, Fig. 3). The levels of bone metabolism markers were not associated with metal ion levels, occurrence of ALTRs, or revision.

Fig. 3

Box plot showing serum markers of bone metabolism and outlier distribution for (a) NTX and (b) BAP. The levels of bone metabolism markers were not associated with the metal ion levels. NTX type I collagen cross-linked N-telopeptide, BAP bone alkaline phosphatase

Long-term survival rates

In the survival analysis using the Kaplan–Meier method with revision as the endpoint, the survival rates were 100% in group P, 80.9% in group T, and 65.2% in group M at 10 years postoperatively. The survival rates were 96.2% in group P (14 years), 46.6% in group T (14 years), and 47.8% in group M (12 years) at the last observation (Fig. 4). When comparing the MoP THA and MoM THA groups, the survival rate was 100% vs. 74.4% at 10 years postoperatively and 96.2% vs. 41.7% at the last observation (p < 0.001). The survival rates were not significantly different between groups T and M (p = 0.083).

Fig. 4

Kaplan–Meier survival analysis of revision surgery due to any reason. (a) The plot shows that the survival rates for M2a-Taper and M2a-Magnum were much lower than those for M2a-MoP at more than 10 years postoperatively. (b) The survival rate in the MoM THA group was less than 50% at the maximum follow-up period. MoP metal-on-polyethylene bearing, MoM THA metal-on-metal total hip arthroplasty

Predictors of revision

The factors influencing revision were examined by multiple logistic regression analysis using the variable reduction method with the likelihood ratio. The dependent variable was “revision surgery,” and the explanatory variables that were entered into the analysis were “gender,” “age at surgery,” “BMI,” “implant,” “cup inclination angle,” “head size,” and “cobalt, chromium, NTX, and BAP levels at 2 years postoperatively.” The significant variables were “implant” and “cobalt levels at 2 years postoperatively.” The odds ratios were 33.0 (95% confidence interval [CI] 3.8–286.0) for M2a-Taper compared with M2a-MoP, 32.4 (95% CI 3.6–292.2) for M2a-Magnum compared with M2a-MoP, and 1.76 (95% CI 1.04–2.97) for whole blood cobalt levels at 2 years postoperatively (Table 3). The use of MoM THA itself appeared to markedly increase the risk of revision, whereas cobalt concentration was suggested to be only a weak risk factor.

Table 3 Multiple logistic regression for prediction of revision surgery
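The reported odds ratios and confidence intervals can be related back to the logistic regression coefficients on the log scale. The sketch below assumes a Wald-type 95% interval, OR = exp(β) with limits exp(β ± 1.96·SE); the text does not state how the interval was computed, so this is an assumption, and the function names are hypothetical. The coefficient and standard error it recovers are approximations derived from rounded published values.

```python
import math


def log_scale_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover an approximate coefficient and standard error from a
    reported odds ratio and its confidence limits (Wald interval assumed)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a coefficient and its SE."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


# Reported values for whole blood cobalt at 2 years: OR 1.76 (1.04-2.97).
beta, se = log_scale_summary(1.76, 1.04, 2.97)
or_back, low_back, high_back = odds_ratio_ci(beta, se)
```

Under this assumption, the cobalt coefficient is roughly 0.57 per 1 μg/L with a standard error of roughly 0.27, and the lower confidence limit sits just above 1, consistent with the description of cobalt as only a weak risk factor.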

The 77 hips in groups T and M were divided into two groups according to whether they had undergone revision, and the change in cobalt levels over time was examined. The median cobalt concentration increased until 1 year after surgery and then decreased to a median of 1.2 μg/L at 2 years after surgery in the group without revision but increased further to a median of 2.0 μg/L in the group with revision (p = 0.015). Similarly, when the data at 2 years postoperatively were compared between the two groups divided according to the presence or absence of ALTRs, a significant difference was noted (p = 0.019) (Fig. 5). When the ROC curve was drawn, with the dependent variable being revision and the explanatory variable being cobalt levels at 2 years postoperatively, the area under the curve was 0.644 (p = 0.018), with a sensitivity of 0.700 and a specificity of 0.651 when the cutoff value was set at 1.55 μg/L. Thus, as a predictor of revision, the whole blood cobalt concentration at 2 years postoperatively had low accuracy.

Fig. 5

Line graphs showing the change in whole blood cobalt levels over time in the MoM THA group divided by whether the patient had revision surgery or not (a) and by the presence or absence of ALTRs (b). The median cobalt level at 2 years postoperatively exceeded 1.5 μg/L in the patients who had revision surgery and ALTRs. MoM THA metal-on-metal total hip arthroplasty, ALTR aseptic local tissue reactions
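The sensitivity and specificity reported for the 1.55 μg/L cutoff come from classifying each hip by its cobalt level and comparing that with the revision status. The sketch below shows the calculation on made-up values; the data are hypothetical and not from the study, the function name sens_spec is illustrative, and treating a level at or above the cutoff as test-positive is an assumption.

```python
def sens_spec(values, revised, cutoff):
    """Sensitivity and specificity of 'value >= cutoff' for predicting revision.

    values  : cobalt levels (ug/L) at 2 years postoperatively
    revised : True if the hip was later revised, False otherwise
    """
    tp = sum(1 for v, r in zip(values, revised) if r and v >= cutoff)
    fn = sum(1 for v, r in zip(values, revised) if r and v < cutoff)
    tn = sum(1 for v, r in zip(values, revised) if not r and v < cutoff)
    fp = sum(1 for v, r in zip(values, revised) if not r and v >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cobalt levels and outcomes, NOT taken from the study.
values = [0.8, 1.2, 1.6, 2.0, 2.4, 1.0, 1.7, 3.1]
revised = [False, False, False, True, True, True, False, True]

sensitivity, specificity = sens_spec(values, revised, cutoff=1.55)
youden_j = sensitivity + specificity - 1  # a common criterion for choosing a cutoff
```

Repeating this calculation over all candidate cutoffs and plotting sensitivity against 1 − specificity gives the ROC curve; how the cutoff of 1.55 μg/L was selected is not stated in the text.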

Discussion

In our study, the THA survival rate with revision as the endpoint at the maximum postoperative follow-up was approximately half as high with the MoM bearing as with the polyethylene bearing. Thus, stemmed MoM THA had a very low survival rate (p < 0.001) and a high ALTR rate. Our study could not identify risk factors for revision such as the head size of the MoM bearing used in primary THA. Cobalt levels continued to increase postoperatively, although they were not accurate predictors of revision.

Recent studies showed the survival rates of stemmed MoM THA to be approximately 95% at 5 years and 90% at 10 years, and a report recorded a good survival rate of 91.4% at 22.8 years [16,17,18,19,20,21]. By contrast, Matharu et al. reported a high failure rate of 27.1% at 10 years postoperatively in 569 patients who underwent surgery using a combination of Pinnacle and a cementless Corail femoral stem [22]. As described above, the survival rate of stemmed MoM THA varies depending on the implant type and stem combination. In this study, groups T and M with MoM bearing surfaces had a survival rate that was below 50% approximately 15 years after THA; thus, the survival rate of M2a stemmed MoM THA was not favorable. Our strict revision criteria of dull pain and increasing ALTR findings on MRI images may have contributed to the lower survival rate; however, the difference in survival rates between the MoP THA and MoM THA groups was large, regardless of our strict revision criteria.

MoM THA allows the use of a relatively large ball head, which provides stability and a wide range of motion. However, the long-term results of MoM THA with a large head are not necessarily good, and some reports indicate that MoM THA with a smaller ball head provides better long-term results than MoM THA with a larger ball head [23,24,25,26]. Focusing on the implant models used in this report, in 2015, Lombardi et al. reported the outcome of large-head M2a MoM THA with an average of 7 years of follow-up, with an ALTR incidence of 3.3% and a revision rate of 7.5% [27]. Koutalos et al. reported a survival rate of 85.3% in 79 joints with the M2a-Magnum model at 8.8 years postoperatively [28]. A study of 53 joints using the 38-mm head M2a-Taper reported a 98% survival rate at 10 years and a 74% survival rate at 13 years [29]. The survival rate of 159 joints using the M2a-Taper with a 28-mm or 32-mm head was 91.6% and 82.9% at 168 months with osteolysis and revision as the endpoint, respectively [30]. In this study with more than 10 years of follow-up, since there was no difference in survival rate between group T with a 28-mm or 32-mm head and group M with a large-diameter head, the head size was not a significant prognostic factor for revision. Therefore, it can be said that regardless of the diameter of the head, M2a stemmed MoM THA itself performed poorly.

Previous reports have shown that blood metal concentrations are not useful in diagnosing the development and progression of ALTRs, while others have found them to be useful [31,32,33]. As for the occurrence of ALTRs with the M2a-Magnum model, the safe value is below 4.1 ppb for cobalt and 4.2 ppb for chromium [34]. Martin et al. also reported that the possibility of pseudotumor development was elevated in cases where metal ion levels increased to 3.0 ppb or higher in the early postoperative period [35]. However, a previous study reported that the metal ion concentration did not increase for 2 years after surgery using the M2a-Magnum model [36]. In this study, we examined the relationship between early postoperative blood metal ion concentrations and revision, including the progression of ALTRs. A whole blood cobalt level at 2 years postoperatively exceeding 1.5 μg/L was suggested to increase the possibility of revision and ALTR development. However, no strong association was found, and the accuracy of this measurement as a predictor was low.

Bone metabolism markers may be useful as surrogate markers for osteoporosis of peri-implant bone, osteolysis, and aseptic loosening of the implant. Savarino et al. reported that blood procollagen I C-terminal extension peptide was significantly decreased and NTX was significantly increased in patients with aseptic loosening [37]. Von Schewelov et al. also showed that urinary NTX was significantly higher in cases of osteolysis [38]. In this study, we measured serum BAP and NTX up to 2 years postoperatively, although we did not find any associations with the occurrence of ALTRs, revision, or elevated whole blood metal ion levels. Bone metabolism markers in the early postoperative period could not be used as surrogate markers to predict these occurrences.

This study has some limitations. Namely, there was a significant difference in the patient age among the three groups, the number of surgical cases was not equally allocated, there was a difference in the timing of surgery in group M, the majority of patients in group P were not symptomatic and did not undergo MRI, and metal ion concentrations at the last observation were not measured in patients who did not undergo revision.

In conclusion, this study of THA procedures performed at the same institution by the same surgeon and with the same stem found that the survival rates of M2a-Taper and M2a-Magnum were much lower than those of M2a-MoP at more than 10 years postoperatively. In the M2a-Taper and M2a-Magnum groups, there was no significant difference in survival rate depending on the implant type or head size. Therefore, the outcomes of stemmed MoM THA are not favorable, and this surgical procedure cannot be recommended. In addition, since there was no strong relationship between the whole blood metal ion levels in the early postoperative period and revision, the metal ion concentration did not emerge as a valid predictor for revision. Based on the results of this study, the use of stemmed MoM THA should be considered with caution, and further detailed studies with a larger number of patients are warranted. Since there are no valid predictors of revision, it is important to monitor the patient’s symptoms and use regular imaging to perform revision at the appropriate time.