Introduction

Total knee arthroplasty (TKA) has proven to be a successful and cost-effective surgical procedure for pain relief and functional restoration in patients with advanced osteoarthritis of the knee. However, despite these well-documented successes of TKA, up to 20% of patients are dissatisfied with their replaced knees, which brings into question the validity of current outcome scoring systems for assessing functional outcomes as perceived by patients [3, 24, 32, 34]. Recent studies have shown that the views of surgeons and their patients on the outcome of medical and surgical interventions do not always agree, especially with respect to the assessment of pain and function [1, 22, 36]. Likewise, functional outcome scores do not necessarily correlate with patient satisfaction in joint replacement [8]. Consequently, in addition to improvements in surgeon-driven objective outcome scales, patient satisfaction has attracted much attention as a key parameter for assessing overall TKA outcome [12, 20, 40].

A number of variables have been implicated in dissatisfaction, including female gender [3, 12], younger age [3], older age [15, 16], rheumatoid arthritis [3, 8], worse pre-operative pain [29], and, recently, a pessimistic personality trait [38]. Patient expectations [31, 32] and mental health scores [6, 17, 18, 19] have been correlated with satisfaction, as have post-operative pain and function [3, 31, 32], but few of these effects have been reproduced with any consistency. Sample sizes have been small, or data have been collected retrospectively. However, patient expectation, pain relief, and functional outcome appear to be the most significant predictors of satisfaction in the literature. This was confirmed by a recent cohort study of 22,798 patients from the National Joint Registry for England, which reported that the most important determinants of satisfaction were the patient’s perception of the success of their operation and post-operative function [4], whereas pre-operative variables had a minimal influence upon post-operative satisfaction. However, these studies examined patient satisfaction at a short follow-up of 6 months [4] or 1 year [37] after surgery. Some patients’ perception of pain and function may continue to improve after this time point, and hence their level of satisfaction may vary with the follow-up period [6]. Few studies have examined early post-operative outcome scores or the pre- to post-operative change in scores as a predictive tool for patient satisfaction at 2 years after surgery. If found to be predictive, early post-operative outcome scores may offer orthopaedic surgeons an additional tool to identify patients who are at a higher risk of dissatisfaction at 2 years, enabling them to intervene earlier for this group of patients to ensure good patient satisfaction, be it through counselling patients, managing their expectations or reinforcing compliance with physical rehabilitation.

The purpose of this study was to determine whether early post-operative outcome scores such as the Knee Society Score (KSS), Oxford Knee Score (OKS), and Short-Form 36 (SF-36) at 6 months were predictive of patient satisfaction 2 years after TKA.

Methods

Patient cohort and study design

Following approval from our centralised institutional review board (CIRB: 2018/2386), we conducted a review of prospectively collected registry data of 4359 TKAs that were performed between 2006 and 2010 at a single institution. The indication for surgery in all patients was osteoarthritis that was severe enough to warrant TKA after an adequate trial of non-operative therapy. Patients diagnosed with a primary or secondary infection and patients who suffered post-operative periprosthetic fractures were excluded from this study. The inclusion and exclusion process is summarised in a flowchart (Fig. 1).

Fig. 1 Flow diagram outlining the patient selection process

Outcome measures

An independent health care professional performed the pre-operative and post-operative assessment of all patients. All patients had their range of motion (ROM), Knee Society Score (KSS), Oxford Knee Score (OKS), and Short-Form 36 (SF-36) assessed pre-operatively.

The KSS is a surgeon-driven objective scale consisting of two separate scores: one for walking, stair climbing, and use of walking aids (functional score), and another for pain, range of motion, and stability (knee score).

The OKS is a patient-reported outcome measure consisting of 12 questions asking patients to describe their knee pain and function during the previous 4 weeks. Each question is scored on a Likert scale from 1 to 5, and the overall score is determined by adding up the responses. The total score can range from 12 to 60, where 60 is the worst possible score indicating severe symptoms and poor joint function, and 12 is the best score suggesting no adverse symptoms and excellent joint function. This method for reverse scoring was based on the initial system proposed by the Oxford group. To convert the old reversed score into the new score commonly used in modern literature, the following formula can be applied: (Old Score − 60) × (−1) = New Score.
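The conversion formula above can be illustrated with a short sketch. The function name is hypothetical and chosen only for this example; the range check assumes the original 12 to 60 scoring described in the text.

```python
def oks_old_to_new(old_score):
    """Convert an original-format OKS (12 = best, 60 = worst)
    to the newer format (48 = best, 0 = worst).

    Implements the formula given in the text:
    (Old Score - 60) x (-1) = New Score.
    """
    if not 12 <= old_score <= 60:
        raise ValueError("original-format OKS must lie between 12 and 60")
    return (old_score - 60) * -1


# The best and worst original scores map to the ends of the new scale,
# and a half-point value such as 21.5 maps to 38.5.
print(oks_old_to_new(12), oks_old_to_new(60), oks_old_to_new(21.5))
```

Because the conversion is a simple reflection, a rule expressed as "original score of X or less" becomes "new score of 60 − X or more".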

The SF-36 is a generic health-related quality-of-life questionnaire consisting of eight subscales: Physical Functioning, Social Functioning, Role-Physical, Bodily Pain, Mental Health, Role-Emotional, Vitality, and General Health. Summary scores were developed to aggregate the most highly correlated subscales and simplify analyses without substantial loss of information. In this study, the medical outcome study approach proposed by Ware et al. [39] was used to derive two higher order summary scores: the Physical Component Score (PCS) and the Mental Component Score (MCS). These two summary scores have been found to account for between 80 and 85% of the reliable variance of the standard eight subscales. They have good validity in discriminating amongst clinically meaningful groups, as well as high internal consistency and test–retest reliability estimates when used in a general population [39].

All outcome scores were evaluated again at 6 months and 2 years post-operatively, together with an assessment of the patient’s fulfilment of expectations and overall satisfaction with the outcome of surgery. Satisfaction was rated using a 6-level Likert scale, with higher scores indicating poorer results, similar to the scale used by Klit et al. [25]. We dichotomised the scores into satisfied and dissatisfied (Table 1).

Table 1 Evaluation of patient satisfaction and expectation fulfilment

Statistical analysis

Logistic regression was used to generate receiver-operating characteristic (ROC) curves to assess the ability of each scoring system to predict satisfaction at 2 years as a primary outcome. The area under the ROC curve (AUC) was interpreted as the probability of correctly identifying whether or not patients were satisfied at 2 years, based on their post-operative score at 6 months. The AUC ranges from 0.5 (indicating a useless test with no accuracy in discriminating whether a patient is satisfied or not) to 1.0 (indicating a test with perfect accuracy in identifying all satisfied patients). A higher AUC hence indicates better discriminatory performance of the scoring system. The AUC range can be stratified into the following: 0.5–0.6 (no accuracy), 0.6–0.7 (poor accuracy), 0.7–0.8 (moderate accuracy), 0.8–0.9 (good accuracy), and 0.9–1.0 (excellent accuracy). The ROC curve analysis was also used to identify a cut-off point for the early post-operative scores that identified whether or not a patient was satisfied 2 years after surgery. The cut-off point on the ROC curve is equivalent to the point at which the post-operative score has maximal sensitivity and specificity in predicting patient satisfaction at 2 years. Statistical analysis was performed using Statistical Package for Social Sciences version 20.0 (SPSS Inc., Chicago, IL, USA). We defined statistical significance at the 5% (p ≤ 0.05) level.
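The AUC and cut-off concepts described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SPSS procedure used in the study. It ranks patients by the raw score, which for a single predictor gives the same ROC curve as a logistic regression on that score. It assumes that "maximal sensitivity and specificity" means maximising the Youden index (sensitivity + specificity − 1), that candidate cut-offs are midpoints between adjacent observed scores, and that the overlapping band boundaries are inclusive at the lower end. All function names and the example scores are hypothetical.

```python
from itertools import product


def auc(satisfied_scores, dissatisfied_scores):
    """AUC as the probability that a satisfied patient has a higher score
    than a dissatisfied patient, with ties counted as one half."""
    wins = 0.0
    for s, d in product(satisfied_scores, dissatisfied_scores):
        if s > d:
            wins += 1.0
        elif s == d:
            wins += 0.5
    return wins / (len(satisfied_scores) * len(dissatisfied_scores))


def accuracy_band(auc_value):
    """Map an AUC to the bands listed in the text.
    Assumption: each band includes its lower boundary."""
    if auc_value >= 0.9:
        return "excellent"
    if auc_value >= 0.8:
        return "good"
    if auc_value >= 0.7:
        return "moderate"
    if auc_value >= 0.6:
        return "poor"
    return "none"


def youden_cutoff(satisfied_scores, dissatisfied_scores):
    """Return (cutoff, sensitivity, specificity) maximising the Youden index.
    A patient is predicted satisfied when score >= cutoff (higher = better)."""
    values = sorted(set(satisfied_scores) | set(dissatisfied_scores))
    # Assumption: candidate cut-offs are midpoints between adjacent scores.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for c in candidates:
        sens = sum(s >= c for s in satisfied_scores) / len(satisfied_scores)
        spec = sum(d < c for d in dissatisfied_scores) / len(dissatisfied_scores)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Made-up scores for illustration only (higher = better); not study data.
satisfied = [40, 42, 38, 45, 30, 44]
dissatisfied = [25, 33, 28, 41]
print(round(auc(satisfied, dissatisfied), 3))
print(youden_cutoff(satisfied, dissatisfied))
```

For a scale on which lower values are better, such as the original-format OKS, the scores would need to be negated or converted before applying this sketch.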

Results

Patient demographics are described in Table 2. At 2 years after surgery, 91.1% of patients were satisfied and 8.9% were dissatisfied. An ROC curve analysis was used to determine a cut-off point for the early post-operative scores or change in scores associated with satisfaction 2 years after surgery (Table 3). For the absolute post-operative OKS at 6 months, we obtained an area under the ROC curve (AUC) of 0.762 (95% CI 0.736–0.788) (Fig. 2), indicating that the OKS at 6 months had a moderate accuracy in predicting whether a patient would be satisfied or dissatisfied at 2 years. A threshold of 21.5 points or less (corresponding to a score of 38.5 points or more on the new system) was identified, and hence, patients whose OKS fell within that threshold could be predicted to be satisfied at 2 years, with 65.7% sensitivity and 74.3% specificity.

Table 2 Patient demographics
Table 3 Results of receiver-operating characteristic (ROC) curve analysis: the probability of correctly identifying whether or not patients were satisfied at 2 years, based on their post-operative score at 6 months
Fig. 2 Receiver-operating characteristic (ROC) curve on the predictive value of the 6-month post-operative OKS for satisfaction at 2 years

For the absolute post-operative KSS knee score at 6 months, we obtained an AUC of 0.704 (95% CI 0.674–0.734) (Fig. 3), indicating that the knee score at 6 months also had a moderate accuracy in predicting whether a patient would be satisfied or dissatisfied at 2 years, although the accuracy was lower than that of the absolute post-operative OKS. A threshold of 80.5 points or more was identified; hence patients whose KSS knee score fell within that threshold could be predicted to be satisfied at 2 years, with 57.9% sensitivity and 72.7% specificity.
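As a rough illustration of how the reported cut-offs might be applied and checked, the sketch below encodes the two thresholds as separate rules and computes sensitivity and specificity for any set of predictions against observed outcomes. The function and constant names are hypothetical, the two rules are deliberately not combined (the study does not describe a combined rule), and no study data are reproduced.

```python
OKS_ORIGINAL_CUTOFF = 21.5  # original-format OKS: lower is better
KSS_KNEE_CUTOFF = 80.5      # KSS knee score: higher is better


def predicted_satisfied_by_oks(oks_original):
    """Predict satisfaction at 2 years from the 6-month original-format OKS."""
    return oks_original <= OKS_ORIGINAL_CUTOFF


def predicted_satisfied_by_kss_knee(kss_knee):
    """Predict satisfaction at 2 years from the 6-month KSS knee score."""
    return kss_knee >= KSS_KNEE_CUTOFF


def sensitivity_specificity(predictions, outcomes):
    """Return (sensitivity, specificity); True means satisfied in both lists."""
    pairs = list(zip(predictions, outcomes))
    tp = sum(1 for p, o in pairs if p and o)
    fn = sum(1 for p, o in pairs if not p and o)
    tn = sum(1 for p, o in pairs if not p and not o)
    fp = sum(1 for p, o in pairs if p and not o)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one satisfied and one dissatisfied patient")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up 6-month scores and 2-year outcomes, for illustration only.
oks_scores = [18, 30, 20, 25]
satisfied_at_2_years = [True, False, False, True]
predictions = [predicted_satisfied_by_oks(s) for s in oks_scores]
print(sensitivity_specificity(predictions, satisfied_at_2_years))
```

Here sensitivity is the proportion of satisfied patients correctly predicted as satisfied, and specificity is the proportion of dissatisfied patients correctly predicted as dissatisfied.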

Fig. 3 Receiver-operating characteristic (ROC) curve on the predictive value of the 6-month post-operative KSS knee score for satisfaction at 2 years

The OKS performed significantly better than the KSS knee score (p = 0.03) and the other scoring systems (p < 0.001).

When analysing the pre-operative to post-operative change in scores using the ROC analysis, we found that the AUC was < 0.7 for all scales, indicating poor accuracy in predicting satisfaction at 2 years (Table 4).

Table 4 Results of receiver-operating characteristic (ROC) curve analysis: the probability of correctly identifying whether or not patients were satisfied at 2 years, based on their change in scores pre- to post-operatively

Discussion

For a TKA to be considered successful, a patient must experience pain relief, functional recovery, and satisfaction with surgery [10]. The most important finding of this study was that a post-operative OKS of ≤ 21.5 (or ≥ 38.5 points on the new scale) and a KSS knee score of ≥ 80.5 at 6 months predicted patient satisfaction at 2 years after TKA with moderate accuracy.

Several tools have been developed to measure outcomes, compare performance, and provide a platform for quality improvement in orthopaedic surgery. The Knee Society Score (KSS) has been widely accepted as a surgeon-driven objective measure of knee status, whereas the Oxford Knee Score (OKS) and Short-Form 36 (SF-36) have been frequently used as patient-derived disease-specific and generic measures, respectively. Recently, alongside improvements in physician-driven and patient-derived outcome scales, patient satisfaction has drawn much attention as a key parameter for evaluating the success of joint replacement, since functional outcomes do not necessarily correlate with patient satisfaction [8]. This study is the first of its kind to evaluate the prognostic value of post-operative outcome scores (KSS, OKS, and SF-36) at 6 months in predicting patient satisfaction at 2 years after TKA.

The ROC analysis used in our study has been used extensively in cardiothoracic surgery to predict mortality, major adverse events, and prolonged length of stay [21, 26]. Recent studies in orthopaedic surgery have also attempted to use ROC curves to evaluate whether pre-operative scores predict satisfaction after unicompartmental knee arthroplasty (UKA) [2] and TKA, but to no avail [9, 23]. Judge et al., in their database study of 1784 knees, obtained a Spearman’s rank correlation coefficient of 0.04 between the pre-operative OKS and satisfaction at 6 months, as well as an AUC of 0.56 [23]. The authors reasoned that these scoring systems were designed primarily to assess periodically the post-operative clinical improvement and quality of life following arthroplasty and hence cannot be used on their own pre-operatively to predict patient satisfaction as an outcome. Similarly, Clement et al. studied 2392 TKAs and found that the pre-operative OKS was a poor predictor of satisfaction at 1 year, with an AUC of 0.59 [9]. Notwithstanding, these scores have been shown to have prognostic value for other outcomes. Studies have shown that pre-operative pain scores were significantly associated with post-operative pain scores [5, 7], and worse post-operative pain and function at 6 months will likely persist beyond 2 years [14]. However, our study is the first to show that early post-operative scores, specifically the OKS (AUC 0.762, 95% CI 0.736–0.788) and KSS knee score (AUC 0.704, 95% CI 0.674–0.734) at 6 months, have a moderate accuracy in predicting satisfaction at 2 years. Threshold values were also identified using the ROC analysis. A threshold value of 21.5 points or less (or ≥ 38.5 points on the new scale) was identified for the OKS, whilst a threshold value of 80.5 points or more was identified for the KSS knee score, both with reasonable sensitivity and specificity.

Early post-operative scores not only have prognostic value in predicting satisfaction at 2 years, but also offer orthopaedic surgeons an additional tool to identify patients who are at a higher risk of dissatisfaction at 2 years. This enables surgeons to intervene earlier for this group of patients to ensure good patient satisfaction, be it through counselling patients, managing their expectations or reinforcing compliance with physical rehabilitation.

When the relative strengths of prediction were compared between scores with different contributions from patient and surgeon, the KSS knee score also had a moderate accuracy in prediction, although it was not as accurate as the OKS. The KSS function score and SF-36, although of some prognostic value, did not perform as well as the other scoring systems. Even when comparing the AUC for change in scores, the change in OKS was found to be most predictive. Previous studies have shown that patient-derived outcome scales represent patient satisfaction better than physician-driven outcome scales [28, 35], which likely explains why the OKS was most predictive.

When interpreting results obtained using outcome scales, it is commonly assumed that patient satisfaction is more closely related to the amount of change than to the absolute outcome after TKA. As such, our study also sought to determine whether absolute post-operative scores or pre- to post-operative changes were more predictive of patient satisfaction at 2 years. A study by Kwon et al. found that patient satisfaction at 1 year was better correlated with absolute post-operative scores at 1 year than with pre-operative to post-operative changes [27], whereas the predictive value of absolute post-operative scores and the pre- to post-operative change with regard to post-operative satisfaction at 2 years has not been compared to date. Our study reported similar findings to those of Kwon et al., as we found that the early post-operative scores at 6 months had better predictive value for satisfaction at 2 years than the pre- to post-operative change in scores. We originally hypothesised that the amount of post-operative change would correlate with patient satisfaction better than absolute outcome measures. It would appear reasonable to expect that patients with substantial pain and a poor functional status pre-operatively are more likely to achieve a post-operative improvement and that this would be reflected in patient satisfaction [3]. However, contrary to our hypothesis, we discovered that absolute outcome levels predicted patient satisfaction at 2 years better than degrees of change. In fact, early post-operative scores predicted patient satisfaction at 2 years better than the pre- to post-operative change for all scales. Thus, our findings suggest that patients discount the extent of their disability before surgery and that the improvement achieved does not drive patient satisfaction. In other words, patients appear to revise their previous goals and redefine treatment success.

A recent study reported on such a response shift in patients after TKA [33]. The finding that the absolute post-operative score better predicts patient satisfaction has clinical implications concerning the timing of TKA during the course of knee osteoarthritis. Traditionally, TKA is delayed until pain and functional limitations are intolerable, whereas it has been documented that worse pre-operative pain and function are associated with poorer post-operative outcomes [13, 14]. Our findings suggest that delayed surgical intervention is likely to adversely affect patient satisfaction. Other authors also support this notion by advocating earlier surgical intervention in patients with advanced osteoarthritis [11, 14, 30]. This point should be considered when offering treatment options to patients with advanced osteoarthritis.

In interpreting the findings of this study, several limitations should be acknowledged. First, we did not analyse the effect of other factors that could possibly influence patient satisfaction, such as age, gender, BMI, comorbidities, mental health, and fulfilment of expectations [3, 37], upon the identified threshold values. However, whilst the inclusion of these variables in our analysis may improve the sensitivity and specificity, it would result in multiple thresholds that are impractical for clinical use. Furthermore, the precise interplay of these factors has not been clearly elucidated, and further work is required to identify the key risk factors for poor outcome following TKA. Such a multidimensional assessment tool will need to consider a broad range of patient-reported outcomes encompassing satisfaction, pain, function, and health-related quality of life. In this way, clinicians may be able to determine objectively which patients are suitable for TKA and then target pre-operatively those with potentially reversible problems, such as depression, which could then be addressed to improve outcome. Second, the statistical association found in this study does not imply that these thresholds will have an effect clinically; hence, further prospective validation studies will need to be conducted to show that these thresholds in outcome scores are indeed clinically significant.

Conclusion

This study shows that early post-operative scores, specifically the OKS and KSS knee score, can predict patient satisfaction at 2 years after TKA. The threshold values may offer orthopaedic surgeons an additional tool to identify patients who are at a higher risk of dissatisfaction at 2 years, enabling them to intervene earlier for this group of patients to ensure good patient satisfaction.