Introduction

Due in part to the widespread success of the procedure [1, 2], procedural volume for total hip arthroplasty (THA) is projected to surge in the coming decades [3,4,5,6]. Yet, while a majority of patients are satisfied following the operation [7, 8], there remains a subset of this patient population that suffers from various post-operative complications [9, 10]. Therefore, as providers continually attempt to improve the quality of care in light of increasing caseloads [11,12,13], there has been an increasing need for novel methods of maintaining and improving THA outcomes. The implementation of robotic-assisted total hip arthroplasty (RA-THA) has been extensively explored in this domain.

While THA procedures utilizing robotic-arm assistance were developed in the 1980s [14, 15], consideration for the widespread use of RA-THA has only recently begun to emerge. Support for these procedures has mainly focused on the potential benefits that RA-THA provides in relation to implant placement and orientation [16, 17]. Specifically, current RA-THA modalities utilize preoperative planning software that aids in limiting unnecessary resection while guiding providers regarding proper alignment of the prosthetic joint components through tactile feedback [18, 19]. Yet, while numerous studies have compared radiologic outcomes and implant placement accuracy between RA-THA and manual THA (mTHA) cohorts [20,21,22,23,24,25,26,27,28,29,30], there is a need for additional information regarding differences in functional outcomes and complication rates between the two methods of joint replacement.

As the use of RA-THA continues to emerge, more information regarding how outcomes in RA-THA cohorts compare to those of mTHA patients is needed in order to determine whether robotic assistance can be a viable option for widespread use. Therefore, the purpose of our systematic review is to compare differences in early and mid-term outcomes between RA-THA and mTHA in relation to (1) functional outcomes and (2) complication incidence.

Methods

Search strategy

The PubMed, Embase, and Cochrane Library databases were comprehensively searched for all articles that compared functional outcomes between RA-THA and mTHA cohorts published between October 1994 and May 2021. In combination with “AND” or “OR” Boolean operators, the following keywords were utilized: “robotic”; “robotic-assisted”; “THA”; “total hip arthroplasty”; “outcomes”; “patient-reported outcome measures (PROMs)”; “complications”.

Articles were included if they met the following inclusion criteria: (1) full-text English manuscript was available, (2) controlled prospective or retrospective studies, (3) studies that utilized FDA approved robotic systems for RA-THA procedures, (4) studies reporting on functional outcomes or complications following RA-THA and mTHA. Additionally, exclusion criteria applied to studies consisted of the following: (1) case series or case reports without mTHA controls, (2) studies that did not utilize a robotic-arm, (3) cadaveric studies, (4) studies that did not report on functional outcomes or complications.

Two researchers independently conducted the query. If there was a disagreement regarding article inclusion, the senior author was consulted to determine if the article should be included.

Study selection

The initial search yielded a total of 526 potentially relevant articles. Following duplication removal, a total of 398 articles remained. After a thorough review of the abstracts, and application of the predetermined inclusion and exclusion criteria, 47 articles were further considered. The full-text manuscripts of these articles were then examined, resulting in 18 articles being further considered. No additional articles were identified following a thorough review of the reference lists of these studies (Fig. 1). Therefore, our final analysis included 18 studies which reported on a total of 2811 patients (RA-THA: n = 1194 (42.48%); mTHA: n = 1617 (57.52%)) (Table 1).
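The selection counts above can be tallied as a quick arithmetic check. The short Python sketch below uses only the numbers reported in this paragraph; the variable names are illustrative and are not taken from the original analysis.

```python
# Counts reported in the study-selection paragraph.
identified = 526        # records returned by the initial search
after_dedup = 398       # records remaining after duplicate removal
after_abstract = 47     # records kept after abstract screening
included = 18           # studies kept after full-text review

# Records dropped at each screening step.
duplicates_removed = identified - after_dedup
excluded_on_abstract = after_dedup - after_abstract
excluded_on_full_text = after_abstract - included

# Patient totals and cohort shares (percent, two decimals).
ra_tha, m_tha = 1194, 1617
total_patients = ra_tha + m_tha
ra_share = round(100 * ra_tha / total_patients, 2)
m_share = round(100 * m_tha / total_patients, 2)

print(duplicates_removed, excluded_on_abstract, excluded_on_full_text)
print(total_patients, ra_share, m_share)
```

The derived exclusion counts at each step are not stated in the text; they follow only from subtracting the reported totals.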

Fig. 1 PRISMA diagram depicting article selection process

Table 1 Studies included in our analysis

Implemented systems

Two RA-THA modalities were implemented among included studies. ROBODOC (Curexo Technology, Fremont, CA) is a fully-active system which autonomously prepares the femoral canal [18]. Conversely, the MAKO system (Stryker Orthopaedics, Mahwah, NJ) is a semi-active system since the surgeon ultimately has control of the resection through tactile and auditory feedback provided by the system [19]. The ROBODOC and MAKO systems were implemented by eight and ten included studies, respectively.

Statistical analysis

When three or more studies evaluated certain PROMs and complications, a pooled analysis utilizing Mantel–Haenszel (M–H) models was conducted utilizing data from final follow-up. Per previous methodology, random-effects models were implemented for pooled analyses whose I2 values were > 50% (high heterogeneity) [31]. Conversely, for analyses with either low (I2 = 0–25%) or moderate (I2 = 25–49%) heterogeneity, a fixed-effects model was implemented. When possible, pooled analyses were additionally conducted while segregating by the type of implemented robotic system. Significance was determined for p-values < 0.05.
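The model-selection rule described in this section can be sketched in code. The following is a minimal Python illustration and not the software used for this review: it assumes inverse-variance weighting of per-study mean differences, Cochran's Q as the basis for I2, and the DerSimonian-Laird estimator for between-study variance, none of which are specified in the text. The function name and the example inputs are hypothetical.

```python
import math

def pool_mean_differences(mds, ses, i2_threshold=50.0):
    """Pool per-study mean differences (mds) given their standard errors (ses).

    Returns (pooled MD, 95% CI lower, 95% CI upper, I2 in percent, model name).
    A random-effects model is used only when I2 exceeds i2_threshold.
    """
    weights = [1.0 / se ** 2 for se in ses]
    fixed = sum(w * m for w, m in zip(weights, mds)) / sum(weights)

    # Cochran's Q and the I2 statistic derived from it.
    q = sum(w * (m - fixed) ** 2 for w, m in zip(weights, mds))
    df = len(mds) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    model = "fixed"
    if i2 > i2_threshold:
        # DerSimonian-Laird between-study variance (an assumption of this sketch).
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        weights = [1.0 / (se ** 2 + tau2) for se in ses]
        model = "random"

    pooled = sum(w * m for w, m in zip(weights, mds)) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, i2, model

# Hypothetical per-study values, for illustration only.
example_mds = [4.0, 1.5, 9.0]
example_ses = [1.2, 1.0, 2.5]
print(pool_mean_differences(example_mds, example_ses))
```

The text does not state how an I2 of exactly 50% was handled; this sketch treats it as fixed-effects because the stated criterion for random effects is I2 > 50%.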

Results

Functional outcomes

Harris Hip Scores

The most commonly reported patient-reported outcome (PRO) was the Harris Hip Score (HHS), with a total of ten studies comparing scores between RA-THA and mTHA cohorts [14, 20, 21, 26, 32,33,34,35,36,37] (Table 2). Pooled analysis demonstrated no significant difference between RA-THA and mTHA (Mean Difference (MD): 5.05, 95% Confidence Interval (CI) − 0.88 to 10.98; p = 0.10) (Fig. 2). No significant differences were similarly demonstrated when evaluating the MAKO (MD: 6.76, 95% CI − 1.23 to 14.75; p = 0.10) and ROBODOC (MD: 1.73, 95% CI − 1.29 to 4.75; p = 0.26) systems independently.
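As a consistency check, a reported p-value can be approximately recovered from a mean difference and its 95% CI. The Python sketch below assumes a symmetric CI built from a normal approximation (critical value 1.96), which is an assumption of this illustration and not a stated detail of the analysis; the function name is hypothetical.

```python
import math

def p_from_md_ci(md, lower, upper, z_crit=1.96):
    """Approximate two-sided p-value from a mean difference and its 95% CI."""
    se = (upper - lower) / (2 * z_crit)   # CI half-width divided by 1.96
    z = md / se
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))

# Values reported above for the overall and ROBODOC-only HHS analyses.
p_overall = p_from_md_ci(5.05, -0.88, 10.98)
p_robodoc = p_from_md_ci(1.73, -1.29, 4.75)
print(p_overall, p_robodoc)
```

Under these assumptions, both recovered values fall close to the reported p = 0.10 and p = 0.26, respectively.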

Table 2 Mean Harris Hip Score differences between robotic total hip arthroplasty (THA) cohorts and manual THA cohorts

Fig. 2 Pooled analysis comparing Harris Hip Scores (HHS) between RA-THA and mTHA cohorts. M–H Mantel–Haenszel, Higher Scores Better Outcomes

WOMAC

A total of four studies reported patient scores on the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) [20, 21, 26, 36]. Pooled analysis by fixed-effects model demonstrated significantly lower scores among the RA-THA cohorts (MD: − 3.57; 95% CI − 5.62 to − 1.52; p = 0.0006) (Fig. 3). There was not enough data to segregate by implemented robotic system.

Fig. 3 Pooled analysis comparing Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) between RA-THA and mTHA cohorts. M–H Mantel–Haenszel

Forgotten Joint Score

The Forgotten Joint Score (FJS) was implemented by four analyses [35, 37,38,39] (Table 3). All included analyses utilized the MAKO system. There was no significant difference between RA-THA and mTHA patients by random-effects model (MD: 8.73, 95% CI − 4.79 to 22.25; p = 0.21) (Fig. 4).

Table 3 Forgotten Joint Scores (FJS) for included RA-THA and mTHA cohorts*

Fig. 4 Pooled analysis comparing Forgotten Joint Scores (FJS) between RA-THA and mTHA cohorts. M–H Mantel–Haenszel

Pain

A total of four studies evaluated differences in pain between patients undergoing RA-THA and mTHA utilizing a visual analog scale (VAS) [20, 35, 37, 38]. A fixed-effects model demonstrated that there were no differences in VAS pain between THA techniques (MD: − 0.19, 95% CI − 0.65 to 0.27; p = 0.41) (Fig. 5). This was similarly demonstrated when isolating cohorts evaluating the MAKO semi-active system (MD: − 0.18, 95% CI − 0.64 to 0.28; p = 0.44).

Fig. 5 Pooled analysis comparing pain scores between RA-THA and mTHA cohorts. M–H Mantel–Haenszel, Higher Scores Poorer Outcomes

When utilizing alternative pain measurements, Bargar et al. reported pain values significantly higher for the RA-THA patients [20]. Specifically, Health Status Questionnaire (HSQ) pain scores (83.75 ± 20.40 vs. 72.65 ± 16.31; p = 0.019) and modified HHS (mHHS) pain values (41.81 ± 5.05 vs. 39.09 ± 7.37; p = 0.025) were higher for their RA-THA cohort.

Merle d’Aubigné

The Merle d’Aubigné (MDA) hip score was utilized by three studies that evaluated the ROBODOC fully active system [32, 40, 41] (Table 4). No significant difference was seen between THA cohorts by fixed-effects model (MD: 0.24, 95% CI − 0.05 to 0.52; p = 0.10) (Fig. 6).

Table 4 The Merle d’Aubigné (MDA) hip score differences between robotic total hip arthroplasty (THA) cohorts and manual THA cohorts*

Fig. 6 Pooled analysis comparing Merle d’Aubigné (MDA) hip scores between RA-THA and mTHA cohorts. M–H Mantel–Haenszel

Short-Form

A Short-Form survey (SF) was implemented by three different analyses [14, 21, 35]. Similar to other reported PROs, there was no consensus regarding whether manual THA or RA-THA yielded superior outcomes. For the ROBODOC system, Bargar et al. [14] reported no significant difference in Short-Form 36 (SF-36) scores between the robotic THA and the manual THA groups up to 24-month follow-up.

This was similarly demonstrated by Bukowski et al. while utilizing the MAKO system at 1 year post-operatively for SF-12 values [21]. Furthermore, although Domb et al. found significantly higher SF-12 physical scores among their RA-THA cohort at minimum 5-year follow-up (50.30 ± 8.83 vs. mTHA: 45.92 ± 9.44; p = 0.002) [35], no differences were found between cohorts for SF-12 mental scores (p = 0.17).

UCLA

Two studies reported on the University of California Los Angeles (UCLA) physical activity scale [20, 21]. As with the previously discussed outcomes, there was no agreement between these analyses. While Bargar et al. [20] found no difference between RA-THA (6.09 ± 1.89) and mTHA (5.71 ± 1.45; p = 0.417) cohorts, Bukowski et al. [21] contrarily demonstrated that the RA-THA cohort achieved higher mean postoperative UCLA (6.3 ± 1.8 vs. 5.8 ± 1.7; p = 0.033).

Additional PROs

Various additional PROs were reported across the included studies. As a whole, there was mixed evidence regarding the differences between cohort scores. Results of these remaining PROs are available in Table 5.

Table 5 Comparisons between RA-THA and mTHA cohorts for additionally utilized patient-reported outcomes (PROs)

Gait

While gait pattern and walking speed are universal parameters of assessing the functional capacity [42, 43], only two studies reported on gait-related outcomes [29, 40].

In their study of ROBODOC-assisted procedures, Nishihara et al. [40] found no difference between the RA-THA (mean: 14 days, range 7–31 days) and mTHA cohorts (mean: 16 days, range 7–46 days) in the amount of time since surgery required for the patient to achieve independent ambulation of at least 500 m without using a cane (p = 0.0552). However, the number of patients who were able to walk more than 6 blocks without using a cane within 13 days of the procedure was significantly greater for the RA-THA cohort (41 vs. 28; p < 0.05).

Utilizing the MAKO system, Peng et al. [29] analyzed the range of motion, walking speed, and gait mechanics in patients who received robotic and manual THA and compared each to the native contralateral hip mechanics. In the RA-THA cohort, they found no difference in peak range of motion in the frontal or axial planes (p > 0.05). However, net sagittal plane range of motion was significantly reduced (mean (± SD) difference: − 3.0 ± 4.9°; p = 0.043). Contrarily, no significant difference in net range of motion was found for the mTHA cohort in any direction (p > 0.05). The authors additionally demonstrated no significant difference between the two groups in terms of walking speed (RA-THA = 1.91 ± 0.49, range 0.5–2.9; mTHA = 1.76 ± 0.34, range 1.0–2.6 mph; p = 0.262). Finally, there was no difference in the degree of gait asymmetry between the RA-THA and mTHA cohorts.

Complications

Revision rate

Revision rates between RA-THA and mTHA cohorts were comparable for a majority of studies. By pooled analysis, there was no significant difference between all cohorts (Odds Ratio (OR): 0.73, 95% CI 0.26–2.03; p = 0.54) (Fig. 7). This was similarly demonstrated for studies only evaluating MAKO systems (OR: 0.56, 95% CI 0.23–1.38; p = 0.21). All reported revision rates are available in Table 6.

Fig. 7 Pooled analysis comparing revision rates between RA-THA and mTHA cohorts. M–H Mantel–Haenszel

Table 6 Revision rate differences between robotic total hip arthroplasty (THA) cohorts and manual THA cohorts

Dislocation rate

No significant differences in dislocation rates were found between cohorts when including studies evaluating both ROBODOC and MAKO systems (Odds Ratio (OR): 1.81, 95% CI 0.71–4.58; p = 0.21) (Fig. 8; Table 7). This was similarly demonstrated when evaluating MAKO studies independently (OR: 0.66, 95% CI 0.15–3.02; p = 0.60). However, when evaluating only ROBODOC studies, there was a significantly higher rate of dislocations among RA-THA cohorts (OR: 2.87, 95% CI 1.07–7.07; p = 0.04).
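For dichotomous outcomes such as dislocation, a Mantel–Haenszel pooled odds ratio and an approximate p-value from a ratio CI can be sketched as follows. This is a minimal Python illustration under stated assumptions: the per-study event counts in the example are hypothetical (the actual counts are in Table 7), the function names are not from the original analysis, and the p-value check assumes a CI that is symmetric on the log scale with a critical value of 1.96.

```python
import math

def mantel_haenszel_or(tables):
    """Pooled odds ratio from (events_ra, n_ra, events_m, n_m) per study."""
    num = den = 0.0
    for events_ra, n_ra, events_m, n_m in tables:
        a, b = events_ra, n_ra - events_ra   # RA-THA: events, non-events
        c, d = events_m, n_m - events_m      # mTHA: events, non-events
        n = n_ra + n_m
        num += a * d / n
        den += b * c / n
    # Undefined (division by zero) if no mTHA events occur in any study.
    return num / den

def p_from_ratio_ci(ratio, lower, upper, z_crit=1.96):
    """Approximate two-sided p-value from a ratio estimate and its 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical 2x2 counts, for illustration only.
hypothetical_tables = [(2, 100, 1, 100), (3, 80, 2, 90)]
print(mantel_haenszel_or(hypothetical_tables))

# MAKO-only dislocation estimate reported above: OR 0.66, 95% CI 0.15-3.02.
p_mako = p_from_ratio_ci(0.66, 0.15, 3.02)
print(p_mako)
```

Under these assumptions, the recovered value for the MAKO-only analysis is close to the reported p = 0.60. Studies with zero events in one arm typically need a continuity correction, which this sketch omits.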

Fig. 8 Pooled analysis comparing dislocation rates between RA-THA and mTHA cohorts. M–H Mantel–Haenszel

Table 7 Dislocation rate differences between robotic total hip arthroplasty (THA) cohorts and manual THA cohorts

Infection

There is a significant scarcity of evidence in the literature comparing the incidence of infection after RA-THA versus mTHA. Within 5-year follow-up, Domb et al. reported two superficial infections in their RA-THA cohort while no infections were found in mTHA patients [35]. Similarly, Perets et al. [37] reported a higher incidence of superficial infections in their RA-THA cohort (n = 6) compared to mTHA controls (n = 2). However, this difference was not significant (p = 0.15).

Conversely, while one mTHA patient suffered from infection in the analysis by Kong et al., no infection was reported in their RA-THA cohort [34]. Similarly, Honl et al. [32] reported two revision surgeries due to deep infection within their mTHA cohort while none of the patients in the RA-THA cohort developed an infection requiring reoperation. This was similarly demonstrated by Kamara et al. [30] who reported two reoperations for infection in their mTHA cohort; one for a superficial soft tissue infection that was resistant to local wound care and the other a deep hematogenous infection over a year post operatively.

Nerve injury

Reports of nerve injury were similarly uncommon in the comparative studies included in this review, since most of the included studies did not experience any cases of nerve injury in either cohort. Honl et al. [32] reported an incidence of partial lesion of the peroneal division of the sciatic nerve in four cases (7%) in the RA-THA cohort. A nonsignificantly lower rate of nerve injury was identified in their mTHA cohort, with only one case of partial femoral nerve injury identified (p = 0.31). Additionally, Domb et al. reported one case of sciatic nerve injury among mTHA patients as well as three cases of thigh numbness [35]. Furthermore, Perets et al. reported two cases of lateral femoral cutaneous nerve (LFCN) injury and one case of incisional numbness in their mTHA cohort [37]. No complications related to nerve injury were reported for RA-THA patients in either of these studies.

Intraoperative femoral fractures

Bargar et al. [14] reported three intraoperative femoral fractures (7%) in their mTHA group. No fractures were seen in their RA-THA cohort. Similar results were observed by Hananouchi et al. [41], who demonstrated nonsignificant differences in fracture incidence between the RA-THA (0, 0%) and mTHA (2, 7%) cohorts (p = 0.21). This was additionally found by Lim et al. [26] (RA-THA: 0% vs. mTHA: 8%; p > 0.05) and Kamara et al. [30] (RA-THA: 0 (0%) vs. mTHA: 3 (2%); p > 0.05). Conversely, Nishihara et al. [40] reported a significantly lower intraoperative fracture rate in their RA-THA cohort (n = 0, 0%) compared to their mTHA cohort (n = 5, 7%; p < 0.05).

Periprosthetic fractures

In their 14-year follow-up, Bargar et al. [20] reported one periprosthetic fracture in their RA-THA cohort that occurred 2 years postoperatively and one periprosthetic fracture 3 years postoperatively in their mTHA cohort. The only other periprosthetic fracture reported among included studies occurred in the mTHA cohort of Nakamura et al. [27].

Heterotopic ossification (HO)

Three studies evaluated the incidence of heterotopic ossification (HO) following THA, with all studies agreeing there was no difference found between RA-THA and mTHA cohorts.

Within a follow-up period of 24 months, Honl et al. [32] reported a 10% prevalence of Brooker [44] grade 2 or 3 HO in both the RA-THA and mTHA cohorts (p = 0.31). Similarly, Nakamura et al. [27] did not find any significant difference in the incidence of HO between their RA-THA and mTHA cohorts within their 67-month follow-up period. Specifically, a rate of HO for Grade 1 (11% vs. 11%), 2 (7% vs. 1%), and 3 (0% vs. 3%) was observed for the RA-THA and mTHA cohorts, respectively (p = 0.1). These patients similarly demonstrated nonsignificant differences in HO rates at longer follow-up (p = 0.2) [45].

Persistent pain

Nakamura et al. [27] found one case (1%) of persistent thigh pain and two cases (3%) of persistent knee pain at the ROBODOC navigation pin insertion site for the RA-THA cohort. However, these resolved at 1 year and 1 month, respectively. In their mTHA cohort, persistent thigh pain was reported in four cases (5.6%) with spontaneous resolution at 1 year.

Additionally, Nishihara et al. [40] reported that four (5.1%) patients in their RA-THA cohort had persistent thigh pain, with two cases resolving within the 3-month postoperative period. On the other hand, the mTHA cohort had 11 (14%) patients who reported persistent thigh pain at 1 month postoperatively (p = 0.0573), a number that decreased to three (4%) patients at 3 months (p = 0.6494).

Discussion

As the implementation of RA-THA continues to be considered among adult reconstructive surgeons, information regarding how robotic assistance influences surgical outcomes has become increasingly important. While RA-THA may yield superior implant placement, evidence regarding its impact on functional outcomes and complication rates should be considered when evaluating the practicality of RA-THA use. Our systematic review and meta-analysis demonstrated that, as a whole, no significant differences were found between mTHA and RA-THA groups in terms of functional outcomes and gait. Specifically, no significant differences were demonstrated for a majority of pooled analyses, including when segregating by robotic system. For outcomes without sufficient data for a pooled analysis, no consistent significant differences were reported among included studies. Additionally, reported complication rates between both cohorts were comparable.

The findings of the present study suggest that functional outcomes may not necessarily be the differentiating factor when considering RA-THA implementation. Rather, consideration of complications such as revision occurrence and dislocation may yield a stronger understanding of RA-THA viability given the economic burden and patient dissatisfaction associated with these adverse events [19, 46]. While there was mixed evidence regarding the rates of these complications between mTHA and RA-THA cohorts, the study by Bargar et al. may provide important insight into true RA-THA outcomes given that their follow-up period (14 years) was the longest among included studies [20]. Specifically, the authors reported no significant difference between cohorts in terms of survivability/revision and dislocation. Since revision may not occur until years after the index procedure [47], the comparable rates demonstrated by this longer term follow-up may more adequately demonstrate the efficacy of RA-THA compared to other shorter-term analyses. However, further studies are needed to better understand this relationship.

Given comparable functional outcomes between cohorts, as well as the low rate of complications reported across studies, evaluating other metrics may shed more light on which THA technique should be considered superior. For example, multiple previous meta-analyses have demonstrated superior radiographic outcomes, including higher rates within the Lewinnek and Callanan safe zones, among RA-THA cohorts [47, 48]. Superior implant placement may mitigate the risk of dislocation and, subsequently, the need for early revision THA [16, 17, 49, 50]. Similarly, although reported complications were comparable between cohorts, operative time was found to be significantly longer among patients undergoing RA-THA for a large majority of studies [31, 39]. As prolonged operative times in total joint arthroplasty (TJA) have been associated with an increased risk of adverse outcomes [51,52,53,54,55,56], these findings suggest that RA-THA may carry a higher perioperative complication risk, especially during the learning curve associated with RA-THA implementation. Therefore, while operative time should be considered when evaluating the safety of RA-THA procedures, surgeon proficiency, as well as other confounding factors, should additionally be evaluated.

Our study has some limitations. Although it remains unclear how much conflicts of interest (COIs) influenced included studies, there has been a rising concern across the literature regarding how COIs among authors comparing RA-THA and manual THA may bias reported outcomes. Notably, DeFrance et al. found that articles reporting positive findings for RA-THA were more likely to contain authors with COIs related to RA-THA manufacturers [56]. Given the nature of the reported data and the heterogeneity regarding which robotic system was utilized by our included studies, we were unable to conduct a pooled analysis of certain data points. Similarly, instruments utilized to measure PROMs following THA have varied over the past few decades, and thus, there was a large variation in which tools were utilized in each study [57]. Therefore, it remains unclear if certain instruments (e.g. joint-specific vs. joint-agnostic) better serve to detect subtle differences between THA cohorts. As for the included studies themselves, a majority included only a limited number of patients. This is especially problematic for studies with longer follow-up intervals that had high rates of patients lost to follow-up [20]. Relatedly, the limited longer term follow-up seen in included studies limited our ability to compare early and late revision risk between cohorts. Furthermore, a large proportion of studies failed to control for differences between cohorts that may influence functional outcomes and complication risk, such as patient demographics, comorbidity burden, or baseline PROM values (Table 1).

While RA-THA implementation continues to be considered among orthopedic providers, the present analysis demonstrated a lack of significant improvement regarding functional outcomes following robotic assistance. Similarly, complication risk did not differ compared to manual THA patients. Future studies should continue to evaluate RA-THA in larger cohorts in order to further determine its efficacy and to ensure patient safety for those undergoing RA-THA procedures. These studies should continue to consider outside factors, such as surgeon experience and familiarity with new instrumentation, in these evaluations. The costs associated with RA-THA implementation, from installation of the robotic device to servicing of the computer software, should additionally be considered.