Introduction

The use of metal-on-metal (MoM) bearing surfaces gained popularity over a decade ago. These articulations were thought to minimize wear and subsequent osteolysis, allow for larger head sizes, and decrease dislocation rates. MoM bearing surfaces also led to the reintroduction of hip resurfacing arthroplasty (HRA), a more bone-conserving procedure, which in the past had performed poorly with metal-on-polyethylene bearing surfaces [8, 35]. At its height, the use of MoM total hip arthroplasty (THA) and HRA accounted for approximately 35% of all hip arthroplasties with nearly 31% of HRA being implanted in women [36]. A number of studies showed low rates of dislocation and high rates of function with both MoM THA and HRA. However, other publications have shown unacceptably high rates of revision and complications (eg, adverse local tissue reaction [ALTR] and osteolysis) [1, 26, 30]. Furthermore, studies have indicated that women experience a higher rate of complications compared with men, particularly after HRA, although many have speculated that this is the result of differences in component sizing. Despite a vast literature describing HRA clinical outcomes, outcomes stratified by gender are limited [2, 4, 11, 12, 14, 16, 36, 38].

Underreporting of gender differences in orthopaedic outcomes is not unique to MoM arthroplasty. Several orthopaedic leaders have stressed the importance of reporting gender differences in orthopaedic studies [22, 24]. Thus, to more fully evaluate the rate of complications after MoM arthroplasty, we sought to perform a systematic review of the literature. A systematic review provides the ideal method to aggregate all available evidence in a rigorous manner, minimizing bias as well as issues surrounding insufficient sample size. It further provides insight into potential directions for new research on the topic.

The specific aims of our systematic review were to compare the rates of (1) ALTR; (2) dislocation; (3) aseptic loosening; and (4) revision between men and women undergoing primary MoM HRA.

Search Strategy and Criteria

Our original aim was to perform a systematic review of all papers evaluating MoM hip arthroplasty (including both THA and HRA). Thus, a systematic review of peer-reviewed English language literature was conducted using the MEDLINE and EMBASE databases on July 10, 2014 (PROSPERO registration number CRD42014012906). Inclusion criteria were level I to III articles that reported clinical outcomes after primary MoM hip arthroplasty (THA or HRA) with minimum 2-year followup. Articles needed to specifically evaluate the complications in question (ALTR, dislocation, aseptic loosening, and revision). To be included in the final analysis, outcomes needed to be reported as discrete numbers as opposed to relative risks or odds ratios. Furthermore, outcomes as well as demographic variables needed to be reported specifically by gender for both the treatment and control groups. Exclusion criteria included review articles (such as systematic reviews and meta-analyses), level IV to V evidence, no gender reporting, less than 2-year followup, and previously reported data (ie, duplicate data). In the event of a duplicate subject publication, the article with the greater number of subjects was included. The specific Boolean search term used was: ‘(((“hip resurfacing” OR “hip arthroplasty”) AND (“metal-on-metal” OR “metal on metal”)) AND English [Language])’.

An initial search yielded 971 potential articles on MoM hip arthroplasty (THA and HRA) for inclusion (Fig. 1). Articles were screened by two independent reviewers (BDH, BJE) using the aforementioned inclusion and exclusion criteria. After our screening, the bibliographies of all included studies were screened for potential articles, which yielded four additional articles for inclusion. In total, 13 articles met inclusion criteria for our systematic review. Of the available articles, 10 discussed HRA outcomes, whereas three outlined MoM THA outcomes. Given the paucity of data evaluating MoM THA [3, 25, 28], the systematic review was narrowed to only include HRA studies. At the outset of the systematic review, the authors’ hope was that pooled data evaluating clinical outcomes scores could be reviewed. However, only a single study [27] included such data, and the decision was therefore made to exclude this analysis from the review as well.

Fig. 1 Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) flowchart demonstrates the search strategy for our systematic review of gender differences in MoM HRA.

Data Extraction

Two independent reviewers (BDH, BJE) collected data from each included article using data abstraction forms. Abstracted data included study publication year, authors, country of publication, enrollment dates, level of evidence, component type, number of patients, number of men and women, age, body mass index, surgical approach, functional outcomes scores (eg, Harris hip score), followup duration, and complications (ALTR, dislocation, aseptic loosening, and revisions) (Table 1). Study methodological quality was assessed using the Modified Coleman Methodology Score [7], which rates a study’s methodology on a scale from 0 to 100. Scores of 85 to 100 are considered excellent, 70 to 84 good, 55 to 69 fair, and less than 55 poor. Overall the quality of research was poor with an average Modified Coleman Methodology Score of 41.7 ± 5.9 (Table 2).

Table 1 Included studies with demographic data and complication rates (articles sorted chronologically)
Table 2 Modified Coleman Methodology Scores (articles sorted chronologically by year)

Statistical Methods

Weighted demographic data and complication rates were calculated for men and women (Tables 1, 3). To determine the effect of gender on the rates of complications, unadjusted female-to-male odds ratios (ORs) and 95% confidence intervals (CIs) were calculated. The reported p values refer to a one-sided (likelihood ratio) test for difference in complication rates between genders. Probability values of < 0.05 were considered significant. A Mantel-Haenszel statistical method was used to evaluate heterogeneity among the included studies as well as the presence of any publication bias. For those subanalyses with possible bias, a random effects model was used. All statistical tests were performed using Review Manager (RevMan, Version 5.3; Copenhagen, Denmark: The Nordic Cochrane Centre, The Cochrane Collaboration; 2011).
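
As a rough illustration of how an unadjusted odds ratio and its 95% CI can be computed from discrete counts, the sketch below applies the standard Wald (log-scale) interval to the aggregate revision counts reported in the Results (724 of 13,935 women and 669 of 30,778 men, with the number of women obtained as 44,713 minus 30,778). It is not the study-weighted Mantel-Haenszel pooling performed in RevMan, so the crude estimate is expected to be close to, but not identical with, the reported pooled OR. The function name `odds_ratio_wald` and its parameters are hypothetical and introduced only for this example.

```python
import math

def odds_ratio_wald(events_a, total_a, events_b, total_b, z=1.96):
    """Crude odds ratio (group A vs group B) with a Wald 95% CI.

    Illustrative only: pools raw counts without study weighting,
    unlike the Mantel-Haenszel method used in the review.
    """
    a = events_a                # group A with the event
    b = total_a - events_a      # group A without the event
    c = events_b                # group B with the event
    d = total_b - events_b      # group B without the event
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Aggregate revision counts taken from the Results section
total_patients = 44713
men_total = 30778
women_total = total_patients - men_total   # 13,935 women
women_revisions = 724
men_revisions = 669

crude_or, ci_low, ci_high = odds_ratio_wald(
    women_revisions, women_total, men_revisions, men_total
)
print(f"Crude OR (women vs men): {crude_or:.2f} [{ci_low:.2f}-{ci_high:.2f}]")
```

Because the crude calculation ignores between-study differences, it should be read only as a plausibility check on the pooled estimate, not as a substitute for it.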

Table 3 Demographic information for HRA studies (articles sorted chronologically)

Note on Language Use

In general, the term sex refers to biological phenomena, and gender refers to phenomena in which there might be an element of cultural overlay. In this study, it is likely that some endpoints were related to sex (such as the frequency of ALTR), whereas others almost certainly are related to gender (such as the frequency of revision surgery for any reason), and some are uncertain in that they may have both social and biological components. For simplicity, we use the term gender as well as gender-related terms (such as “woman” rather than “female”) throughout the article.

Results

Adverse Local Tissue Reaction

When comparing rates of ALTR between genders, our systematic review showed a higher rate of ALTR in women (OR, 5.70 [2.71–11.98]; p < 0.001) (Fig. 2). A heterogeneity analysis indicated moderate heterogeneity in our ALTR data (I² = 46%, p = 0.13), and thus a random effects model was used. In total, four articles reported gender differences in ALTR rates. This included a total of 9296 patients (n = 6389; 68.7% men) with 66 women (2.3%) and 97 men (1.5%) experiencing an ALTR.
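
For readers less familiar with the heterogeneity statistic quoted here, the following is a brief sketch of its conventional (Higgins) definition, assuming this is the form reported by the software. The symbols Q (Cochran's heterogeneity statistic) and k (the number of pooled studies) do not appear in the text and are introduced only for illustration; the back-calculated Q is an approximation derived from the rounded percentage, not a value reported by the included studies.

```latex
% Conventional definition of the I^2 statistic
I^2 = \max\!\left(0,\; \frac{Q - (k - 1)}{Q}\right) \times 100\%

% Illustrative back-calculation for the ALTR analysis (k = 4 studies, I^2 = 46%)
Q \approx \frac{k - 1}{1 - I^2} = \frac{3}{1 - 0.46} \approx 5.56
```

Read this way, an I² of 0% means the observed Q did not exceed its degrees of freedom, whereas larger values indicate that a greater share of the variability across studies is attributable to heterogeneity rather than chance.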

Fig. 2 Gender differences in ALTR after MoM HRA are shown.

Dislocation

Although dislocation was a rare event in our cohort, women had a higher rate of dislocation compared with men (OR, 3.04 [1.2–7.5]; p = 0.02) (Fig. 3). A heterogeneity analysis indicated that our dislocation data may be affected by mild heterogeneity (I² = 37%, p = 0.20). Dislocation rates were reported by gender in four included articles. These papers represented a total of 6565 patients (n = 4480; 68.2% men), including 10 dislocations (0.5%) occurring in women and nine dislocations (0.2%) occurring in men.

Fig. 3 Gender differences in dislocation after MoM HRA are shown.

Aseptic Loosening

Women had higher rates of aseptic loosening compared with men (OR, 3.18 [2.21–4.58]; p < 0.001) (Fig. 4). A heterogeneity analysis indicated that our aseptic loosening data were uniform (I² = 0%, p = 0.97). Five articles reported rates of aseptic loosening by gender. This included 11,247 patients of whom 7802 were men (69.4%). Aseptic loosening was present in 68 women (2.0%) and 54 men (0.7%).

Fig. 4 Gender differences in aseptic loosening after MoM HRA are shown.

Revision

Women demonstrated a higher rate of revision compared with men in our systematic review (OR, 2.50 [2.25–2.78]; p < 0.001) (Fig. 5). A heterogeneity analysis indicated that our revision data were uniform (I² = 8%, p = 0.36). Nine articles reported rates of revision by gender, representing 44,713 patients including 30,778 men (68.8%). Revisions were observed in 724 women (5.2%) and in 669 men (2.2%).

Fig. 5 Gender differences in revision after MoM HRA are shown.

Discussion

Hip arthroplasty has revolutionized the treatment of end-stage degenerative processes affecting the hip. Because this intervention has expanded its indications to higher-demand patient populations, including younger and more active individuals, there is an increasing need to develop implants with improved longevity and durability. In comparison to conventional metal-on-polyethylene bearing surfaces, MoM bearing surfaces have demonstrated diminished volumetric wear while allowing larger femoral head sizes and thereby decreased dislocation risk [8, 35]. This led to early adoption and high rates of use [31, 32, 36]. Unfortunately, data emerged demonstrating early failure and high rates of complications in certain devices [26, 30]. These complications appeared to be more common in women, although results were mixed [4, 9, 11, 14, 18]. This could in part be explained by the lack of gender-specific outcomes in the orthopaedic literature, a reporting bias called into question in recent years [22, 24]. No prior hip arthroplasty reviews have reported their data stratified by gender. Therefore, to further probe the role of gender in complication rates after MoM hip arthroplasty, we performed a systematic review to compare the rate of complications (eg, ALTR, dislocation, aseptic loosening, and revision) after primary MoM HRA. Our findings underscore the importance of reporting gender-specific outcomes.

As with all systematic reviews, our findings are limited by the data available. In an attempt to provide a higher quality review, we included only studies of levels I through III evidence. Furthermore, given our specific aims of probing the role of gender in complication rates after MoM hip arthroplasty, we were limited to the studies that reported clinical outcomes stratified by gender, which, unfortunately, were few. Although it is not ideal to include retrospective literature in a pooled analysis such as this, only two of the included studies were prospective, and we thus decided to include level III studies in our analysis. Although our original intention was to include a review of both MoM THA and HRA, as a result of the limited research evaluating gender differences in MoM THA outcomes, we were unable to include these studies in our analysis. This too was the case with functional outcomes after either MoM THA or HRA, because only a single study reported functional outcomes by gender. Despite the attempt to include high-quality literature, the overall quality, as evidenced by our Modified Coleman Methodology Score, remained poor. Finally, perhaps the biggest limitation of our review provides the greatest opportunity for future research, namely the causative factors behind our findings of increased complications in women. Although our data have demonstrated higher rates of complications in women after MoM HRA, the cause of this finding remains elusive; unfortunately, a causative relationship cannot be explored with the data available. Suggested causes for higher rates of failure in women have included an increased incidence of metal allergy in women, gender differences in ligamentous laxity and bone quality, and anatomical differences between the male and female hips, the latter having a higher prevalence of developmental dysplasia. The most commonly implicated etiology, however, is femoral head and acetabular component sizing (which could lead to suboptimal lubrication regimes and/or edge loading from suboptimal contact geometries) [11–13, 15, 29, 33, 34]. Of note, we could not control for component design or size, because the included studies did not provide this information in a manner in which it could be explored. Furthermore, our heterogeneity analysis indicates that there was variable clinical heterogeneity among the studies included and the individual outcomes analyzed.

When specifically evaluating rates of ALTR between genders, our data demonstrated an increased rate of complication in women; however, a paucity of data specifically evaluating this finding exists [4, 5, 11, 23]. With the available evidence, conclusions are limited as to the cause of our observed findings. However, authors have speculated the increased rate of ALTR may be related to a number of factors including differences in femoral head-neck anatomy, acetabular anatomy (eg, hip dysplasia), femoral head size, age, component malpositioning, and component design [4, 5, 11, 12, 37]. Increased preoperative femoral head-neck ratio (HNR), particularly a ratio greater than 1.3 (a finding more common in women), has been shown to be a risk factor for subsequent ALTR and failure. This was thought to be the result of downsizing of the femoral heads at the time of surgery, leading to greater rates of impingement and edge loading [12]. Acetabular anatomy has also been implicated as a cause of failure in HRA; Glyn-Jones et al. demonstrated higher rates of failure in patients with hip dysplasia, a more common diagnosis in women [11]. Furthermore, malpositioned acetabular components (particularly cups with increased abduction angles) and resultant edge loading are thought to be associated with ALTR [5]. Perhaps the most consistently implicated factor in the studies included in this particular analysis was component size. Smaller femoral heads have been shown in numerous series to be related to high rates of ALTR, a finding corroborated by the Australian Arthroplasty Registry [4, 5, 11, 12, 31]. However, although it is tempting to attribute the increased rates of ALTR solely to smaller head sizes in women, the Canadian Arthroplasty Registry demonstrated an independent effect of gender on increased rates of ALTR regardless of head size [4].

Although an uncommon complication after primary MoM HRA, dislocation appeared to occur more frequently in women in the few studies reporting this outcome by gender [2, 5, 14, 19]. Authors of the included studies provided little discussion of gender differences in dislocation rates; however, much has been made of the smaller femoral head sizes in women as it relates to other complications (eg, revision and ALTR) [4, 5, 11, 12, 31]. Similarly, as previously mentioned, authors have demonstrated the influence of HNR as well as acetabular geometry on MoM HRA impingement, altered kinematics, and subsequent failure. With the given data we are unable to determine if gender is an independent risk factor for dislocation after HRA because we could not control for head size in our analysis. In the THA literature, however, it has clearly been shown that larger heads have lower rates of dislocation [20, 21]. Another potential cause of dislocation may be related to ALTR, a more common complication seen in women. Damage to the soft tissue envelope of the hip, and particularly the abductor musculature, has been implicated as a potential cause of increased dislocation in the setting of ALTR [6, 10]. Moving forward, if the relationship between gender and dislocation is real, future studies should explore causative factors for a potential increased rate of dislocations in women.

Low rates of aseptic loosening were observed among the studies reporting outcomes by gender, much lower than rates of loosening seen in registry data, which are as high as 33% [2, 4, 5, 14, 27, 31]. Nevertheless, even with the low rates of aseptic loosening in our sample, women appeared to have higher rates of loosening with the data available. At this point, a rationale for the higher rates of aseptic loosening observed in women remains speculative. Factors implicated by the included studies were an increased rate of metal allergy among women, smaller femoral head sizes used in women, and a higher rate of cup malposition (possibly related to higher rates of hip dysplasia) [2, 5]. Current registry data do not parse out the rates of aseptic loosening by gender; thus, correlations are not possible at this time. However, the Australian registry does appear to show a correlation between smaller femoral head sizes (≤ 50 mm) and higher rates of aseptic loosening [31]. Unfortunately, that analysis did not control for gender. In future iterations of registry data, it would be interesting to see whether this relationship persists independent of gender or whether there is truly a relationship with gender.

Among the included studies, revision was the most commonly reported complication with nearly all of the included studies reporting revision rates by gender. With the included data, women had a higher rate of revision. This finding has been corroborated by the most recent Australian registry data from 2013 that demonstrated a threefold increase in the rate of revision among women at 1 year after HRA implantation [31]. However, further analysis of the Australian data demonstrates that this relationship appears to be related to femoral component head size, with equivalent rates of revision regardless of gender in femoral head sizes greater than or equal to 50 mm [31]. Smaller femoral head size was frequently cited as a likely explanation for the increased rate of failure among the included studies. However, similar to previously discussed complications, femoral as well as acetabular geometry, metal allergy, and component malpositioning were all cited as possible explanations for the increased rate of revision in women. An additional factor worth investigating in further studies is the mode of failure stratified by gender, because this was not uniformly reported in the current studies. This would provide insights into possible causes of failure and into surgical technique as well as implant improvements for the future.

Although mixed results have been reported with regard to gender differences in complications after MoM HRA, these findings are skewed by the general lack of gender-specific outcome reporting. Our systematic review of levels I through III studies demonstrated increased rates of ALTR, dislocation, aseptic loosening, and revision in women after MoM HRA. Although these findings shed light on the differential complication rate between genders, further study is necessary to elucidate the root causes of these findings. Moving forward, we would encourage researchers to investigate gender as a possible risk factor for complications and furthermore to report demographic data for all patients enrolled in studies (particularly those who experience complications).