Introduction

Total hip arthroplasty (THA) is commonly used to treat severe arthritis, trauma and congenital diseases of the hip [1–3]. Over 200,000 THAs are performed in the United States every year [4], and the demand for primary THAs is expected to grow by 174% to 574,000 by 2030 [4]. Metal-on-polyethylene (MOP) bearings have a long history of use in THA [5]; however, their survival has been limited, with only a few lasting longer than two decades. Periprosthetic osteolysis caused by wear debris released from the surface of the polyethylene bearings is the major problem in hip arthroplasty [6]. The International Congress of Bone and Joint Surgeons, held in 1995, was prompted by concerns that the alternative metal-on-metal (MOM) bearing should be reconsidered for use in clinical practice [7].

There has been a significant expansion in the worldwide use of MOM bearings in the past decade [8]. Wear is inevitable following MOM-THA [9], although the volumetric wear rates and osteolytic potential of MOM bearings have been shown to be lower than those of MOP bearings in laboratory experiments, and probably also in vivo. However, MOM bearings produce minute particles and evidence suggests that they produce orders of magnitude more metal particles than MOP bearings [8]. Elevated metal ion concentrations have been reported in serum, urine and erythrocytes, though the local and systemic effects of these are unknown [9–12]. There has, therefore, been concern regarding the increased use of MOM-THA as an alternative to contemporary MOP-THA, and the choice remains controversial [8, 10, 13].

This meta-analysis aimed to address that clinical choice based on the results of published research. We evaluated and compared metal ion concentrations, complications, reoperation rates, clinical outcomes and radiographic outcomes of MOM-THA and MOP-THA. This is the first analysis to compile and evaluate all the available data on MOM implants compared with MOP implants for THA. The inclusion of only prospective randomized trials enhances the level of evidence and the robustness of estimates compared with previous literature reviews or other single trials [14].

Methods

Search strategy

We conducted a meta-analysis of all English and non-English articles identified from MEDLINE (1966 to December 2010), Embase (1980 to December 2010), the Cochrane Central Register of Controlled Trials, PreMEDLINE and HealthSTAR. Additional studies were identified by contacting experts and searching reference lists and abstracts presented at the American Society for Bone and Joint Surgeons Research from 1995 to 2010. We used Medical Subject Headings (MeSH) terms and free words, including metal (metal on metal, metal bearings, metal implant), polyethylene (metal on polyethylene, polyethylene bearings, polyethylene implant) and hip arthroplasty (THA, total hip replacement). We also sought information about unpublished and ongoing studies from the authors of the included studies and from experts in the field.

Selection criteria and quality assessment

The present meta-analysis followed the PRISMA guidelines [15, 16]. Each publication was independently reviewed by two investigators who were blinded to the journal, author, institution at which the study was performed and date of publication. Eligible studies compared MOM-THA and MOP-THA and provided sufficient numerical information on at least one of the following pre-specified end points: reoperation for any cause, all-cause mortality, local and general complications, radiographic outcomes and metal ions (including cobalt, chromium and titanium concentrations). We also investigated function and health-related quality of life if these had been assessed using valid scoring systems or questionnaires. Two of the authors independently assessed each published study for the study design quality using a 21-point scale [17]. We used Cohen’s κ coefficient to measure agreement beyond chance between reviewers [18]. Disagreements were resolved by discussion with a third investigator.
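The agreement statistic mentioned above can be illustrated with a short sketch. The function name `cohens_kappa` and the two rating lists are hypothetical and serve only to show the calculation; they are not the reviewers' actual quality assessments.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa: agreement between two raters beyond chance."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the two raters agree
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of one quality item (e.g. "blinding adequate?")
reviewer_1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
reviewer_2 = ["yes", "yes", "no", "no", "no", "no", "yes", "yes"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the reviewers agree on seven of eight items, but half of that agreement would be expected by chance, so κ is lower than the raw agreement rate.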

Data extraction

Two investigators independently extracted data from the studies using a structured form. The following information was sought from each report: year of publication, enrolment period, country and region, number of patients, study design, mean age, percentage male, loss to follow-up and materials design. The reviewers also extracted and electronically recorded event rates with numerators and denominators for different end points, as well as the means and standard deviations for functional scores and quality of life assessments. The reviewers resolved disagreements by discussion with a third investigator.

Data analysis and statistical methods

We analysed binary end points (e.g., reoperations, complications and mortality) by calculating relative risks (RR) and 95% confidence intervals (CI). Weighted mean differences (WMD) and pooled standardised mean differences (SMD) were calculated for differences in functional scores and quality of life instruments. Means and standard deviations of metal ion concentrations are reported in micrograms per litre. The method of Hozo et al. [19] was used to convert data reported as medians and ranges, and the recommendations of the Cochrane Methods Group were followed for data reported as medians and 25th and 75th percentiles [20]. Data were extracted from graphs for two trials that failed to report exact metal ion concentrations. WMD were calculated for differences in metal ion concentrations. If data were duplicated in more than one study, the data from the most recent study were used. For the meta-analysis, both a fixed-effects model (weighted with inverse variance) and a random-effects model were considered. Heterogeneity between studies was assessed using the Cochran Q statistic. When the p value for the Cochran Q statistic was < 0.10, the assumption of homogeneity was deemed invalid and a random-effects model was reported. The analysis was carried out using Review Manager Version 5.
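The model-selection rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Review Manager implementation: the function names (`chi2_sf`, `pool_mean_differences`), the use of the DerSimonian-Laird estimator for the random-effects model, and the two study estimates are all introduced for the example and are not taken from the included trials.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be at least 1")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term = total = 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: normal tail plus a finite series in odd powers of sqrt(x)
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.erfc(chi / math.sqrt(2.0))
    return tail + math.sqrt(2.0 / math.pi) * math.exp(-x / 2.0) * total


def pool_mean_differences(estimates, std_errors, q_alpha=0.10):
    """Inverse-variance pooling; random effects if Cochran Q has p < q_alpha."""
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    # Cochran Q: weighted squared deviations from the fixed-effect estimate
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    p_q = chi2_sf(q, df) if df > 0 else 1.0

    if p_q < q_alpha:
        # Assumed estimator: DerSimonian-Laird between-study variance
        c = sum_w - sum(w ** 2 for w in weights) / sum_w
        tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
        weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
        sum_w = sum(weights)
        estimate = sum(w * y for w, y in zip(weights, estimates)) / sum_w
        model = "random"
    else:
        estimate, model = fixed, "fixed"

    se = math.sqrt(1.0 / sum_w)
    ci = (estimate - 1.96 * se, estimate + 1.96 * se)
    return {"model": model, "estimate": estimate, "ci": ci, "q": q, "p_q": p_q}


# Hypothetical mean differences and standard errors from two trials
result = pool_mean_differences([0.60, 0.70], [0.10, 0.10])
print(result["model"], round(result["estimate"], 2))
```

With only two or three trials per outcome, the Q test has little power to detect heterogeneity, which is one reason a threshold of 0.10 rather than 0.05 is commonly used.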

Results

Search results

The search strategy retrieved 1,075 unique citations. Of these, 1,032 citations were excluded after the first or second screening based on titles or abstracts, and 43 articles remained for full-text review. Two randomised trials were reported in duplicate [21–24]. The related publications were assessed for overlapping and unique information relevant to this analysis. Eight studies [21–28] enrolling a total of 669 patients were included in the final meta-analysis (Fig. 1).

Fig. 1 Flow chart demonstrating trial inclusion criteria

Study characteristics and quality

The characteristics of the eight selected studies are summarized in Table 1. Seven were randomized controlled trials and one was a prospective randomized trial. All the studies described balanced patient baseline characteristics, attempted a minimum follow-up of more than 24 months and specified postoperative care. Most studies reported metal ion concentrations (n = 5), including serum metal ions (cobalt and chromium; n = 4), urine metal ions (cobalt, chromium and titanium; n = 2) and erythrocyte metal ions (cobalt, chromium and titanium; n = 2). Zijlstra et al. [23, 24], however, only provided data for serum cobalt ion levels over a 2- to 10-year follow-up period. Data for serum metal ions were therefore derived from only two studies [25, 26]. Four studies reported complications (including all-cause mortality). All the studies provided functional scores and quality of life assessments; all included Harris hip scores (HHS, n = 8), and two included the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC, n = 2). Most studies used radiographic evaluation (n = 7) and four of them provided the data according to the technique described by DeLee and Charnley [29].

Table 1 Main characteristics of the included studies

The reviewers achieved excellent agreement, and the assessment of the study quality was excellent (intraclass correlation, 0.93; 95% CI, 0.39–0.99). The κ values for the various components of the study design (such as randomization and blinding of patients, clinicians, and those assessing outcomes; conduct of the statistical analysis; and follow-up) ranged from 0.79 to 1.0.

Metal ion concentrations

Cobalt

Serum cobalt concentrations were used for outcome assessment in two randomised trials accounting for 159 patients. Our results showed that patients who received MOM implants had significantly elevated serum cobalt concentrations at the 2-year follow-up, compared with preoperative levels (WMD 0.67, 95% CI 0.48–0.86, p < 0.0001). There was also a significant difference in postoperative serum cobalt concentrations between MOM-THA and MOP-THA (WMD 0.64, 95% CI 0.49–0.79, p < 0.0001) (Table 2).

Table 2 Analysis of cobalt, chromium and titanium concentrations

Urine and erythrocyte cobalt concentrations in the MOM group increased significantly from the preoperative to the postoperative evaluation, while there was no difference between preoperative and postoperative evaluations in the MOP group. Our results also revealed significant inter-group differences in postoperative cobalt concentrations (MOM-THA vs. MOP-THA, Table 2).

Chromium

Serum, urine and erythrocyte chromium concentrations increased significantly in the MOM group during the 2-year follow-up period. The WMD of serum chromium concentrations was 0.59 (95% CI 0.44–0.74, p < 0.0001). There was no difference between the preoperative and postoperative evaluations in the MOP group (WMD 0.01, 95% CI −0.03 to 0.05, p = 0.66). Serum, urine and erythrocyte chromium concentrations in the MOM group were significantly higher than in the MOP group at the 2-year evaluation (WMD 0.58, 95% CI 0.34–0.82, p < 0.0001) (Table 2).

Titanium

There were no significant differences between preoperative and postoperative erythrocyte titanium concentrations in the two groups. There was no difference in erythrocyte titanium concentrations between the MOM-THA and MOP-THA groups at the 2-year evaluation (WMD 0.05, 95% CI −0.21 to 0.32, p = 0.70).

There were significant increases in urine titanium concentrations from preoperative to postoperative levels in both MOM and MOP patients. During the 2-year follow-up period, there was no difference in urine titanium concentrations between the MOM-THA and MOP-THA groups (WMD 0.04, 95% CI −0.08 to 0.16, p = 0.56) (Table 2).

Complications and reoperation rates

Complication rates were reported in four studies. No significant differences in the rates of total complications, dislocations, trochanteric bursitis, wound infection, thigh pain, or all-cause mortality were found between MOM-THA and MOP-THA (Table 3). Six studies provided data on reoperation rates. Overall, there was no significant difference in reoperation rates between patients undergoing MOM-THA and MOP-THA (RR 0.86, 95% CI 0.22–3.40, p = 0.83).
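The width of the confidence interval for the reoperation RR reflects how few events were observed. A minimal sketch of the standard log-scale calculation for a single two-arm comparison is given below; the function name `relative_risk` and the event counts are hypothetical and are not data from the included trials.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of arm A vs. arm B with a log-scale confidence interval."""
    if min(events_a, events_b) <= 0:
        raise ValueError("zero-event arms need a continuity correction")
    if events_a > total_a or events_b > total_b:
        raise ValueError("events cannot exceed the number of patients")
    rr = (events_a / total_a) / (events_b / total_b)
    # Standard error of log(RR) for two independent binomial samples
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: 4 reoperations in 100 hips vs. 5 in 100 hips
rr, lower, upper = relative_risk(4, 100, 5, 100)
print(f"RR {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Even with 100 hips per arm, a handful of events gives an interval that extends well on both sides of 1, so a non-significant RR of this kind cannot exclude a clinically relevant difference in either direction.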

Table 3 Complications, all-cause mortality, reoperation rate, implant survival, clinical outcomes and radiographic evaluation for MOP-THA vs. MOM-THA

Function and health-related quality of life

All the studies provided HHS scores, but two randomised trials were reported in duplicate [21–24], and only the data from the more recent of these were used in the present study [22, 24]. The HHS was therefore used for outcome assessment in six randomised trials accounting for 623 patients, with follow-up intervals ranging from 24 to 120 months. Overall, the HHS did not differ significantly between patients undergoing MOM-THA and MOP-THA (Table 3). The WMD in favour of MOM-THA was 1.73 (95% CI −0.04 to 3.50, p = 0.06) (Fig. 2).

Fig. 2 Forest plot of the Harris hip score. The size of the data marker corresponds to the weight of the study. The diamond and vertical broken line represent the summary estimate. The result favours the MOM group, but the difference is not significant. A fixed-effect model was used for the meta-analysis
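As a consistency check on the HHS summary estimate reported above (WMD 1.73, 95% CI −0.04 to 3.50, p = 0.06), the standard error and two-sided p value can be recovered from the confidence interval. The sketch below assumes a normal approximation and a symmetric 95% interval; the function name `p_from_ci` is introduced only for this illustration.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Recover the standard error, z score and two-sided p value from a CI."""
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided p value under the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values reported for the Harris hip score comparison
se, z, p = p_from_ci(1.73, -0.04, 3.50)
print(f"SE {se:.2f}, z {z:.2f}, p {p:.3f}")
```

The recovered p value rounds to the reported 0.06, which is consistent with a lower confidence limit lying just below zero.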

The WOMAC is a disease-specific, self-administered outcome measure designed specifically for patients with osteoarthritis of the knee or hip. It specifically addresses pain, stiffness, and physical function. The WMD at final follow-up after 24 months was 1.67 (95% CI −4.58 to 7.92, p = 0.60) with no differences between MOM-THA and MOP-THA. Pain scores were also evaluated by SMD, and the results in MOM-THA and MOP-THA patients were similar (Table 3).

Radiographic evaluation

The DeLee and Charnley evaluation was available from three trials and no differences were found between MOM-THA and MOP-THA. The RR of Zone 1 was 1.40 (95% CI 0.54–3.61, p = 0.49), Zone 2 was 1.61 (95% CI 0.88–2.94, p = 0.12) and Zone 3 was 1.25 (95% CI 0.30–5.11, p = 0.76) (Table 3).

Discussion

This meta-analysis aimed to provide additional insight into the options for THA, focusing on the role of MOM implants, in light of the significant body of evidence suggesting that patients treated with MOM implants have higher metal ion concentrations than those treated with MOP implants. Our results demonstrated significantly elevated erythrocyte, serum and urine metal ion levels (cobalt and chromium) among patients who received MOM-THA. However, no significant differences in total complication or reoperation rates were found between MOM-THA and MOP-THA. Clinical function scores and radiographic evaluation results were also similar in the two groups. This analysis found insufficient evidence to identify any clinical advantage of MOM-THA, compared with MOP-THA.

The study procedure took internal and external validity into consideration. To avoid selection bias, Embase, the Cochrane Library, Cochrane Central Register of Controlled Trials, PreMEDLINE, HealthSTAR and CBMdisc, as well as MEDLINE, were all searched for relevant articles. To minimize bias in the selection of studies and in data extraction, articles were independently selected on the basis of the inclusion criteria by reviewers who were blinded to the journal, author, institution and date of publication. The quality of the studies was assessed using a “21-point scale” scoring system to ensure their high quality.

At the 2010 Congress of EFORT (European Federation of National Associations of Orthopaedics and Traumatology), there was a dramatic shift in the preferences of surgeons regarding the use of MOM implants. The importance of this subject is increasing as a result of the recent recall of chrome cobalt acetabular hard-bearing implants, because of fixation failure, as well as the clinical appearance of pseudotumours and aseptic lymphocytic vasculitis-associated lesions (ALVAL), especially in females.

The results of this meta-analysis showed that serum, urine and erythrocyte cobalt concentrations increased significantly from preoperative to postoperative evaluations in patients who received MOM implants, while there were no differences between preoperative and postoperative evaluations in the MOP group. There were also significant differences in postoperative serum cobalt concentrations between the MOM-THA and MOP-THA groups. Our analysis also showed that serum, urine and erythrocyte chromium concentrations increased significantly in the MOM group during the 2-year follow-up period, and there were significant differences in serum, urine and erythrocyte chromium concentrations between the MOM and MOP groups at the 2-year evaluation.

Elevated levels of metal ions have also been examined in several studies of MOM-hip resurfacing (MOM-HR) [30–34]. Systemic distribution of metal particles from THA or HR to remote sites such as the lymph nodes, bone marrow, placenta, kidney and spleen has been demonstrated [35–38], and freed metal ions can be measured in whole blood, serum, plasma, urine and semen [39–43]. The levels of the metal ions have been shown not to correlate with age, functional results, or gender [44–46]. However, Moroni et al. [12] recently found that chromium ion concentrations in the MOM-HR group at 5 years were greater in females compared with males, suggesting that gender may be a confounding factor.

Reference values for healthy controls have been described and different countries provide guidelines or acceptable limits of environmental and industrial exposures [47]. In non-occupationally exposed subjects, urinary cobalt is usually below 2 μg/g creatinine and serum/plasma cobalt below 0.5 μg/L. In persons not occupationally exposed to cobalt, the concentrations of chromium ions in serum and in urine do not usually exceed 0.5 μg/L and 5 μg/g creatinine, respectively [48, 49]. Elevated metal ion levels after MOM-THA have been well corroborated and concentrations that exceed the thresholds established in industry have frequently been recorded. However, it is difficult to define a safe level of chronic metal ion exposure for patients with a MOM-THA [7, 47], and no consensus currently exists regarding biomonitoring of metal ion levels following MOM-THA or HR [7, 47].

Cobalt and chromium ions have been shown to cause DNA damage [50], mutagenic changes [51], delayed-type IV T-cell hypersensitivity [52–55] and dose-dependent cell necrosis [56], but the implications of these changes are unclear. Moreover, the biological effects of metallic particles on bone-related cells, such as osteoblasts and osteoclasts, have not been fully elucidated [57].

Elevated cobalt and chromium levels may have detrimental short- to long-term effects on patients as a result of their local or systemic effects [6, 7, 55, 58]. Local soft tissue changes are seen at the implant site [59–61], and the incidence of these soft tissue changes appears to be increasing. Recent evidence from patients who have undergone MOM-THA or HR has shown an association between raised levels of cobalt and chromium ions and metal allergy [62, 63], pseudotumours [64] or ALVAL [65]. Moreover, Fujishiro et al. reviewed 612 capsular and interface tissues obtained from 130 patients at revision THA and found that perivascular and diffuse lymphocytic inflammation were common in tissues around failed non-MOM implants. However, they also found that the extent of inflammation in some tissues around failed MOM implants was positively correlated with metal debris [66].

Because most cobalt and chromium ions are eliminated by the kidney, nephrotoxicity caused by these ions has become a major focus of study [11, 67], though most studies found no association between metal levels and renal markers (serum creatinine or creatinine clearance) during the short to medium term [11, 67]. Malignant tumours around MOM bearings are extremely rare. Several epidemiological studies have investigated the long-term risk of cancer [68–70]; however, most of these studies were underpowered and the follow-up periods were short, and no risks have been identified to date [47]. Further longer-term, large-scale controlled trials are needed to monitor THA (or HR)-induced low-intensity (but long-term) trace-element exposure to rule out the potential of metal-induced cancers and nephrotoxicity [11, 47, 67, 68].

Some studies identified both female sex and femoral component head as predictors of reoperation in MOM-THA or HR, while other multivariate analyses suggested that female sex might be indirectly related [9, 71–73]. A previous large study compared MOM-THA with MOP-THA on the basis of hip registry data from the Müller Institute (Berne, Switzerland), which included over 58,000 hips from 45 centres throughout Europe. The investigators identified all reoperations because of aseptic loosening and matched them by age, gender, diagnosis, hospital, type of system and date of surgery to a group of patients with no aseptic loosening. They found that the risk in the MOP-THA group was higher than in the MOM-THA group but that the difference was not significant [74].

There has been increasing dispute recently regarding the survival of MOM bearings, with no definite conclusions regarding the relationship between metal ion levels and the risk of reoperation. Langton et al. reported a possible relationship between elevated chromium ions and increased femoral neck fracture and reoperation rates [9, 75], though no differences between MOM-THA and MOP-THA were identified in our study. The low number of reoperation events means that estimates from meta-analyses should be interpreted cautiously. Nevertheless, elevated cobalt and chromium ion concentrations, metal-induced pseudotumours and the high reoperation rate mean that MOM-HR is not recommended in women younger than 40 years [76, 77].

The complications recorded in the current meta-analysis included dislocation, trochanteric bursitis, wound infection, thigh pain and all-cause mortality, many of which might not have been related to material differences. Dislocation is likely to be associated with surgical approach, inclination of the cup position, fixation technique and the experience of the surgeon [9, 72, 78–84]. Wound infection would be influenced by the timing of prophylactic administration of antibiotics, wound class, operative procedure and patient risk index [85–89]. Thigh pain was related to size of the femoral head and fitting of cement stems [9, 90]. The pooled analysis found no significant differences in total complication rates between MOM-THA and MOP-THA, indicating that the different bearing surfaces (MOM or MOP) had no significant influence on the incidence of total complications.

Many published studies found that age, physical status, physical activity and arthritis of other joints might influence clinical scores [7–9, 71, 73]. No immediate relationship between cup-liner material and clinical scores was identified [8]. Our study found similar results for both treatment groups after examining assessments made using different hip-function scoring systems, and neither HHS nor WOMAC differed between patients undergoing MOM-THA and MOP-THA. This suggests that the different bearing surfaces (MOM or MOP) had no significant influence on clinical scores. Another potential bearing system uses ceramic-on-ceramic, the advantages of which include extreme hardness and scratch resistance, improved lubrication creating a low coefficient of friction resulting in excellent wear resistance, and decreased and less bioactive particulate debris compared with polyethylene or metal bearings. Ceramic bearings do, however, also have disadvantages, such as fracture of the ceramic, accelerated wear, rattling and high cost [91–95].

Our study had several limitations. First, the main weakness of this study was the low number of randomised trials analysed. Although most studies reported metal ion concentrations (n = 5), only two studies provided data for each metal ion concentration (as serum metal ions, urine metal ions or erythrocyte metal ions). The data available from the eight selected studies were therefore more limited than the number of trials suggests (complications in four studies, ion concentrations in only two) and the results should be interpreted with caution. Moreover, the meta-analysis was not performed when heterogeneity was significant. The low number of studies means that estimates from the analyses were imprecise and did not allow any meaningful conclusions to be drawn. Publication bias might also have distorted the results. Second, most of the selected studies were underpowered and the follow-up period was short: shorter than the accepted objective criteria periods for survival rate and most metal-induced diseases. Because of insufficient evidence, the implications of these changes are unclear.

In summary, this analysis found insufficient evidence to identify any clinical advantage of MOM-THA, compared with MOP-THA. Cobalt and chromium ion concentrations were elevated following MOM-THA, but there was no significant difference in total complication rate (including all-cause mortality) between the two groups in the short- to mid-term follow-up period. MOM bearings in THA should be used with caution.