Introduction

Osteoarthritis affecting the hip and knee is a prevalent condition in the elderly, leading to a decline in the quality of life (QOL) attributed to joint pain and restricted range of motion [1]. Total hip arthroplasty (THA) and total knee arthroplasty (TKA), recognized interventions for end-stage osteoarthritis, have demonstrated efficacy in alleviating joint pain and improving activities of daily living with sustained positive long-term outcomes [2,3,4]. However, investigations into patient-reported outcome measures (PROMs) following total joint arthroplasty (TJA) reveal that approximately 11–17% of THA patients and 11–25% of TKA patients express dissatisfaction post-surgery [4,5,6,7]. Harris et al. delineated the disparity between patient and surgeon satisfaction after THA and TKA, proposing that the assessment of surgical success should rely on PROMs rather than solely on surgeons’ evaluations [7].

The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) score is a frequently employed tool for postoperative assessment using PROMs in TJA. Previous studies have consistently reported superior early postoperative outcomes with THA as assessed by the WOMAC score compared to TKA [2,3,4, 8, 9]. Despite the acknowledged influence of patient background factors such as age, gender, and body mass index (BMI) on TJA outcomes, assessments of post-TJA WOMAC scores adjusted for patient background remain notably absent in the existing literature.

This study aims to compare the 2-year postoperative outcomes of patients who underwent THA or TKA using WOMAC scores adjusted for patient background. The primary objective is to assess the impact of adjustments for patient background on the outcomes of both THA and TKA.

Material and methods

This study received approval from the Institutional Review Board of our institution. All patients were informed about the study’s purpose and provided verbal informed consent. The procedures adhered to the principles outlined in the Declaration of Helsinki and its subsequent amendments. The trial was registered as a retrospective cohort study with the University Hospital Medical Information Network (UMIN) under registration number UMIN000040542.

Participants

Between April 1, 2013, and August 31, 2018, a total of 412 and 691 patients underwent THA and TKA, respectively, at our institution. This study included patients diagnosed with osteoarthritis or osteonecrosis who underwent primary THA or TKA and had follow-up evaluations for at least 2 years postoperatively. Exclusions comprised patients undergoing simultaneous or staged bilateral THA or TKA, those with joint reoperation, infection, inflammatory disease, periprosthetic fracture, neurodegenerative disease, or a history of lumbar spine surgery during the follow-up period, and those lost to follow-up. Consequently, 259 THA patients and 326 TKA patients were included. THA was performed by one senior surgeon, while TKA was performed by two senior surgeons. Patient age, gender, BMI, and diagnosis were recorded in the preoperative evaluations.

WOMAC score

The WOMAC score served as the clinical assessment tool and was administered preoperatively and at 3 months, 1 year, and 2 years postoperatively for all patients. The WOMAC comprises 24 items evaluating pain (5 items), stiffness (2 items), and physical function (17 items). Each item was graded on a scale from 0 to 4 points, giving a total score ranging from 0 (the optimal outcome) to 96 (the worst possible outcome) [10]. Only WOMAC scores completed by patients themselves, in person at the hospital during each assessment period, were included in the analysis.
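The scoring scheme described above can be illustrated with a minimal Python sketch. It assumes the 24 responses are stored in questionnaire order (items 1–5 pain, 6–7 stiffness, 8–24 physical function), which matches the item numbers cited in the Results. The function name womac_scores and the data layout are hypothetical and are not part of the study's actual workflow.

```python
from typing import Dict, List

# Assumed item layout (0-based indices): items 1-5 pain, 6-7 stiffness,
# 8-24 physical function, as described in the text.
SUBSCALES = {
    "pain": range(0, 5),
    "stiffness": range(5, 7),
    "physical_function": range(7, 24),
}


def womac_scores(items: List[int]) -> Dict[str, int]:
    """Sum 24 item grades (0-4 each) into subscale and total scores.

    Lower scores are better: the total ranges from 0 (best) to 96 (worst).
    """
    if len(items) != 24:
        raise ValueError("WOMAC requires exactly 24 item responses")
    if any(not 0 <= v <= 4 for v in items):
        raise ValueError("each item must be graded from 0 to 4")
    scores = {name: sum(items[i] for i in idx) for name, idx in SUBSCALES.items()}
    scores["total"] = sum(items)
    return scores
```

Under these assumptions, a patient who grades every item 4 obtains the worst possible scores: 20 for pain, 8 for stiffness, 68 for physical function, and 96 in total.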

At each designated time point, both the overall WOMAC score and the scores for each subscale (pain, stiffness, and physical function) were compared between the THA and TKA patient groups. Additionally, the extent of improvement in the total and subscale WOMAC scores was compared between the two groups for each specific period. Furthermore, we identified which of the 24 WOMAC items exhibited significant differences between THA and TKA patients within the initial postoperative year (p < 0.05). To mitigate bias between the two groups, comparisons were conducted using the same methods with propensity score-matched data. Propensity scores were matched for age, gender, BMI, and diagnosis (osteoarthritis or osteonecrosis), factors known to influence PROMs [11,12,13,14,15].
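The between-group comparisons described above rely on the Mann–Whitney U test named in the Statistical analysis section. The following Python sketch shows how such a comparison could be computed with the standard library alone. It uses the normal approximation with a tie correction and no continuity correction, which is an assumption: the study's analyses were run in SPSS, whose exact settings are not reported here, so p-values from this sketch need not match the published ones. The function name mann_whitney_u is hypothetical.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided p-value).

    U for x counts the (x, y) pairs in which the x value is larger,
    with ties counted as one half. The p-value uses the normal
    approximation with tie correction (an assumed setting).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must contain at least one score")
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this block of tied values
        avg_rank = (i + 1 + j) / 2.0   # mean of ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j

    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0  # all values identical: no evidence of a difference
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2.0))
```

Because lower WOMAC scores are better, a small U for the THA group relative to n1 × n2 / 2 would indicate that THA scores tend to be lower (better) than TKA scores. The same function can be applied to change scores (for example, the 3-month score minus the preoperative score) to compare the degree of improvement.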

Surgical technique

All 259 non-matched THA patients were operated on using the modified Watson-Jones approach, and all received a cementless proximal fixed taper wedge stem and a cementless cup. Similarly, all 185 propensity score-matched THA patients were operated on using the modified Watson-Jones approach.

For TKA, a medial parapatellar approach incorporating a modified gap-balancing technique was applied uniformly across the 326 non-matched patients. Regarding implant design, among the 185 propensity score-matched patients, 65 received the posterior stabilized mobile bearing design (PS mobile), while 120 received the medial pivot fixed bearing design (MP).

Statistical analysis

Statistical analysis was performed using SPSS (IBM SPSS Statistics for Windows, version 26.0; IBM Corp., Armonk, NY). The Mann–Whitney U test was used to detect differences in WOMAC scores and their improvement between the 2 groups. The level of significance was set at a p-value of < 0.05. For comparisons between the two groups, both propensity score-matched and unmatched data were evaluated to reduce bias between the two groups. Propensity scores were matched for age, sex, BMI, and diagnosis (osteoarthritis or osteonecrosis), with a caliper of 0.25 × the standard deviation of the propensity score, giving an allowable difference in propensity score between matched patients of 0–0.046.
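The matching step can be sketched as follows. This is an illustrative example, not the procedure run in SPSS: it assumes that a propensity score has already been estimated for every patient (for example, by logistic regression on age, sex, BMI, and diagnosis), that matching is 1:1 without replacement, and that a greedy nearest-neighbour rule is used. The matching algorithm itself is not stated in this paper, so these choices, the function name caliper_match, and the patient identifiers are all hypothetical. Only the caliper of 0.25 × the standard deviation of the propensity score is taken from the text.

```python
import statistics
from typing import Dict, List, Tuple


def caliper_match(
    ps_tha: Dict[str, float],
    ps_tka: Dict[str, float],
    caliper_sd: float = 0.25,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbour matching on precomputed propensity scores.

    A THA patient is paired with the closest unused TKA patient only if
    their scores differ by no more than caliper_sd x the pooled SD.
    """
    pooled = list(ps_tha.values()) + list(ps_tka.values())
    caliper = caliper_sd * statistics.stdev(pooled)

    available = dict(ps_tka)  # TKA patients not yet matched
    pairs: List[Tuple[str, str]] = []
    # Process THA patients in ascending score order (an arbitrary, assumed order).
    for tha_id, score in sorted(ps_tha.items(), key=lambda kv: kv[1]):
        if not available:
            break
        tka_id = min(available, key=lambda k: abs(available[k] - score))
        if abs(available[tka_id] - score) <= caliper:
            pairs.append((tha_id, tka_id))
            del available[tka_id]  # without replacement
    return pairs
```

Patients who have no counterpart within the caliper remain unmatched and are dropped from the matched analysis, which is why a matched cohort is smaller than the original groups. Greedy matching depends on processing order; optimal matching algorithms would avoid this at the cost of added complexity.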

Results

Patient characteristics

The characteristics of the patients are summarized in Table 1. TKA patients were older and had a higher BMI than THA patients (age p < 0.001, BMI p < 0.001), while gender and diagnosis did not differ significantly between the 2 groups. After adjusting for patient characteristics by propensity score matching, 185 THA patients and 185 TKA patients were compared. The 2 matched groups showed no significant differences in age, gender, BMI, or diagnosis (Table 1).

Table 1 Non-matched data and propensity score-matched data in THA patients and TKA patients

Total WOMAC score

The total preoperative WOMAC score did not differ significantly between THA and TKA patients in either the non-matched or the propensity score-matched analysis; however, THA patients showed significantly better results than TKA patients at 3 months, 1 year, and 2 years postoperatively (non-matched: 3 months p < 0.001, 1 year p < 0.001, 2 years p < 0.001; propensity score-matched: 3 months p = 0.003, 1 year p = 0.037, 2 years p = 0.012) (Table 2). Regarding the degree of improvement in the total WOMAC score, both non-matched and propensity score-matched data showed significantly greater improvement in THA patients from the preoperative period to 3 months postoperatively; however, after 3 months postoperatively, there was no significant difference in improvement between the THA and TKA patients (non-matched: preoperatively to 3 months p < 0.001; propensity score-matched: preoperatively to 3 months p = 0.008) (Table 3).

Table 2 WOMAC score total and each subscale (Pain, Stiffness, and Physical Function) outcome in the non-matched data and propensity score-matched data in THA patients and TKA patients
Table 3 WOMAC score total and each subscale (Pain, Stiffness, and Physical Function) change in the non-matched data and propensity score-matched data in THA patients and TKA patients

WOMAC scores for each subscale

In terms of WOMAC scores for each subscale, THA patients had significantly superior physical function scores preoperatively and 2 years postoperatively and pain and stiffness scores at 3 months, 1 year, and 2 years postoperatively in both non-matched and propensity score-matched data. However, in terms of physical function scores at 3 months and 1 year postoperatively, THA patients were significantly superior in the non-matched data, while there was no significant difference between the 2 groups in the propensity score-matched data (Pain; non-matched: 3 months p < 0.001, 1 year p < 0.001, 2 years p < 0.001; propensity score-matched: 3 months p < 0.001, 1 year p < 0.001, 2 years p < 0.001. Stiffness; non-matched: 3 months p < 0.001, 1 year p < 0.001, 2 years p < 0.001; propensity score-matched: 3 months p < 0.001, 1 year p < 0.001, 2 years p < 0.001. Physical function; non-matched: preoperatively p = 0.018, 3 months p = 0.002, 1 year p = 0.002, 2 years p < 0.001; propensity score-matched: preoperatively p = 0.029, 2 years p = 0.048) (Table 2).

Regarding the degree of improvement in the WOMAC scores for each subscale, THA patients showed significantly greater improvement than TKA patients from the preoperative period to 3 months postoperatively in pain, stiffness, and physical function scores in non-matched data and in pain and physical function scores in propensity score-matched data. TKA patients showed significantly greater improvement than THA patients from 3 months to 1 year postoperatively in pain score in both non-matched and propensity score-matched data (Pain; non-matched: preoperatively to 3 months p < 0.001, 3 months to 1 year p < 0.001; propensity score-matched: preoperatively to 3 months p = 0.008, 3 months to 1 year p = 0.006. Stiffness; non-matched: preoperatively to 3 months p = 0.019. Physical function; non-matched: preoperatively to 3 months p < 0.001; propensity score-matched: preoperatively to 3 months p = 0.001) (Table 3).

WOMAC scores for 24 items up to 1 year postoperatively

WOMAC scores were compared for each of the 24 items in the propensity score-matched data up to 1 year postoperatively. TKA was significantly superior to THA in only 2 items (12 ‘Bending to the floor’, 16 ‘Putting on socks’) at 3 months postoperatively, while THA was significantly superior in the other items that differed significantly. In particular, 3/5 items of pain (2 ‘Stair climbing’, 3 ‘At night’, 4 ‘At rest’) and 2/2 items of stiffness (6 ‘In the morning’, 7 ‘Occurring during the day’) were significantly better for THA at both 3 months and 1 year postoperatively. This suggests that the difference in satisfaction between THA and TKA patients may be particularly pronounced in pain and stiffness (Table 4).

Table 4 Identification of items with significant differences among WOMAC score for 24 items within 1 year postoperatively in non-matched data and propensity score-matched data of THA patients and TKA patients

The degree of improvement in the WOMAC scores for each of the 24 items up to 1 year postoperatively was compared using propensity score-matched data. THA was significantly superior to TKA in many items from the preoperative period to 3 months postoperatively; TKA was not superior to THA in any of the items. Comparison of the degree of improvement from 3 months to 1 year postoperatively revealed that TKA was superior to THA in 3 items of pain (2 ‘Stair climbing’, 3 ‘At night’, 4 ‘At rest’) and in 2 items of physical function (8 ‘Stair climbing’, 11 ‘Standing’) (Table 5). However, as shown in Tables 2 and 4, TKA was not superior to THA at 1 year postoperatively in the WOMAC pain and physical function scores or in the scores for any of the 24 WOMAC items.

Table 5 Identification of items with significant differences among WOMAC score 24 items change within 1 year postoperatively in non-matched data and propensity score-matched data of THA patients and TKA patients

Discussion

Numerous studies have undertaken comparisons of postoperative PROMs between patients undergoing THA and TKA. Several reports have consistently highlighted significantly superior WOMAC scores for THA patients compared to TKA patients at the 1–2 years postoperative interval [2,3,4, 8, 9].

The influence of age, gender, and BMI on patient outcomes following TJA has also been addressed in previous literature. Older age, female gender, and obesity have been associated with inferior postoperative outcomes and satisfaction [11,12,13,14,15].

In this study, non-matched data revealed that TKA patients were older and had a higher BMI than their THA counterparts. In propensity score-matched comparisons between THA and TKA patients up to 2 years postoperatively, THA patients consistently exhibited significantly superior WOMAC total, pain, and stiffness scores. However, WOMAC physical function scores did not differ significantly between THA and TKA patients at 3 months and 1 year postoperatively. Because this difference was significant in the non-matched data, this observation implies that older age and higher BMI may contribute to the poorer WOMAC scores following TKA. In addition, the improvement in pain scores for TKA was substantial from 3 months to 1 year postoperatively, whereas for THA the improvement was concentrated in the first 3 months. As the improvement in pain scores for TKA is comparatively slower, THA can be expected to demonstrate superior satisfaction results in early postoperative PROMs.

Further evaluation of 24 WOMAC score items using propensity score-matched data indicated that THA demonstrated a significantly superior physical function score in only 6 out of 17 items at 3 months and 1 out of 17 items at 1 year postoperatively. Importantly, there was no significant difference between the two groups in the majority of physical function score items.

The strength of this study is that it is, to our knowledge, the first to compare WOMAC scores between TKA and THA patients using propensity score matching. Propensity score matching serves to standardize patient backgrounds to the greatest extent feasible.

This study has several limitations. Firstly, when comparing WOMAC scores between TKA and THA groups, the presence of a ceiling effect must be acknowledged. Clement et al. noted ceiling effects of 24–26%, 21.2–27%, and 6.4–8% in WOMAC scores for pain, stiffness, and physical function, respectively, at 1 year after TJA [16]. Marx et al. similarly reported ceiling effects for both THA and TKA [17], indicating a potential influence on the current study findings. Secondly, as this study focused on unilateral TJA, the results may not align with those from simultaneous bilateral TJA. Despite some reports suggesting similar outcomes between simultaneous bilateral TKA and unilateral staged TJA, others indicate an increased risk of postoperative pain, worsened physical function, and complications with simultaneous bilateral TJA [18,19,20]. Thirdly, patient mental, psychological, and lifestyle factors (such as marital status) were not considered, and only the WOMAC was used in this study [21, 22], although studies indicate that insufficient social support can exacerbate postoperative pain and hinder physical function [22]. Fourthly, variations in surgical technique and implant design were not accounted for. While PS mobile is used in TKA for strong internal or external deformities, previous studies have not found significant differences in PROMs or implant survival according to technique or implant design [23,24,25,26,27,28]. Although selection bias may exist in implant and approach choices, it is unlikely to substantially affect the study results. Fifthly, the results are derived from a single institution, necessitating a multicenter study for validation. Sixthly, the clinical significance of differences in postoperative pain and physical function between THA and TKA patients is challenging to assess because the minimal clinically important differences (MCIDs) reported by different studies are not standardized [29, 30]. Lastly, the small number of cases in this study may limit its power to detect significant differences in total WOMAC score and subscale scores. Future studies should employ larger sample sizes for improved validity.

Conclusions

A comparison of patient satisfaction between THA and TKA over a 2-year postoperative period demonstrated that THA significantly outperformed TKA in WOMAC total, pain, and stiffness scores, even after adjusting for patient background. However, WOMAC physical function scores at 3 months and 1 year postoperatively differed significantly in the non-matched data but not in the propensity score-matched data. This implies that the difference in patient satisfaction between THA and TKA is partially influenced by patient background factors, such as older age and higher BMI. Therefore, physicians should be mindful of the potential impact of patient background on disparities in PROMs for TJA.