Introduction

The lifetime risk of developing symptomatic knee osteoarthritis is estimated to be 45 % [1]. According to the National Joint Registry, 70,000 primary total knee replacement (TKR) procedures were performed in the UK in 2010, of which at least 1 % were simultaneous bilateral procedures [2]. Subsequent replacement of the contralateral knee following unilateral TKR is performed in approximately 37 % of patients [3].

The threshold for performing TKR is not clearly defined, and the importance of pre-operative function and radiological severity in determining outcome is unclear [4–8]. The final decision to proceed with surgery is made by the patient following discussion with the responsible clinician, weighing expectations against lifestyle factors and any relative contraindications [9]. It has been shown that patient expectations are strongly influenced by the experience of their first TKR [10, 11]. When a staged second side TKR is considered, it is likely that the patient’s experience (positive or negative) with their first TKR will influence their decision on further surgery. Furthermore, a previous positive or negative experience may have an impact on the threshold at which the surgeon offers or the patient accepts the contralateral surgery.

Although staged bilateral TKR is common, few data are available on outcomes following the procedure. Recently, a series of 70 staged bilateral TKR cases managed with a range of different implant designs has been published [11]. Significant changes in expectations and satisfaction were reported between the first and second knee, but the small numbers limited comparison of outcomes [11]. No series has compared the outcomes for both knees in staged cases to those in patients who opt to undergo simultaneous bilateral total knee arthroplasty.

The aim of this study was to compare outcomes of bilateral total knee replacements performed simultaneously to those performed at a staged interval in a single centre, single implant series. We also aimed to detect any difference in outcome between the first and second knee in staged total knee replacement patients, and to determine whether the duration of the staging interval had any impact on outcome for patients ultimately undergoing bilateral total knee replacement.

Methods

All patients who underwent bilateral TKR in our unit from January 2005 to January 2011 were identified retrospectively. Every TKR was performed with the uncemented Low Contact Stress (LCS) mobile bearing implant (DePuy Orthopaedics, Warsaw, Indiana). Surgery was performed under the care of six consultant surgeons. Simultaneous bilateral surgery was the preference of one consultant surgeon for patients with appropriately severe symptoms in both knees. The final decision to proceed with simultaneous surgery was, however, the patient’s own. Planned staged surgery was performed following consultation with the patient at an interval of two to six months. Patients began walking with crutches or other walking aids the day after the procedure according to a standard protocol and were discharged from hospital when safe from a medical and mobility point of view.

The Oxford Knee Score for each knee was collected pre-operatively and at each annual review thereafter to record changes in the level of function and pain. The standard 12-point questionnaire was used with scores for each item recorded by the patient from 0 to 4 and summated to derive a total score from 0 to 48, where 48 represents the least symptoms and best outcome [12, 13]. All questionnaires were completed by the patient and recorded by a specialist arthroplasty nurse in a dedicated clinic. All responses were prospectively entered into our dedicated arthroplasty database.

The annual OKS, collected at one year post-operatively and annually thereafter, was used when comparing post-operative outcomes between patients and knees. The pre-operative and post-operative scores were compared for each knee in each patient, and the mean improvement for each knee was calculated.
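The scoring and improvement calculation described above can be sketched in a few lines. This is a minimal illustration only: the function names and the example scores are hypothetical and are not taken from the study database.

```python
from statistics import mean
from typing import Sequence


def oxford_knee_score(items: Sequence[int]) -> int:
    """Sum the 12 item scores (each 0 to 4) into a total from 0 to 48.

    48 represents the least symptoms and best outcome.
    """
    if len(items) != 12:
        raise ValueError("the OKS questionnaire has 12 items")
    if any(not 0 <= score <= 4 for score in items):
        raise ValueError("each item is scored from 0 to 4")
    return sum(items)


def improvement(pre_op: int, post_op: int) -> int:
    """Change in OKS for one knee; positive values mean improvement."""
    return post_op - pre_op


# Hypothetical (pre-operative, post-operative) totals for three knees.
example_knees = [(15, 39), (14, 38), (17, 37)]
mean_improvement = mean(improvement(pre, post) for pre, post in example_knees)
```

Because the improvement is a simple difference, a knee with a higher pre-operative score has less room to improve before reaching the ceiling of 48.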

Following the guidance of the National Research Ethics Service, this study was considered ‘service evaluation’ and did not require research and ethics committee approval.

Statistical analysis

Prior to our database search, a power calculation was performed to determine the minimum sample sizes for statistical analysis. This calculation assumed a minimal detectable change (MDC) of 4 points in OKS and a standard deviation (SD) of 10, in accordance with recent literature [14, 15]. According to this calculation, a minimum of 52 cases was required per group and the sample size was selected from our local outcomes database accordingly.
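For readers who wish to explore how a sample size of this order arises, the sketch below applies the standard normal-approximation formula. The text specifies only the difference of 4 points and the SD of 10; the two-sided significance level of 0.05, the power of 80 % and the choice between a paired and an independent-groups design are assumptions made here for illustration, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def _z_total(alpha: float, power: float) -> float:
    """Sum of the normal quantiles for a two-sided test at the given power."""
    nd = NormalDist()
    return nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)


def n_paired(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Pairs needed to detect a mean difference `delta` (assumed paired design).

    `sd` is taken as the SD of the within-pair differences (an assumption).
    """
    return math.ceil((_z_total(alpha, power) * sd / delta) ** 2)


def n_per_group(delta: float, sd: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Cases per group for two independent groups with a common SD."""
    return math.ceil(2 * (_z_total(alpha, power) * sd / delta) ** 2)


pairs_needed = n_paired(delta=4, sd=10)      # 50 under these assumptions
cases_needed = n_per_group(delta=4, sd=10)   # 99 under these assumptions
```

Under these assumed settings the paired approximation gives 50, close to but not identical to the 52 reported; a calculation based on the t-distribution would typically give a slightly larger number than the normal approximation.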

The Wilcoxon signed-rank test was used to compare the paired outcome scores for each knee in individual patients. Outcome scores between independent groups, such as the simultaneous and staged groups, were compared using the Mann-Whitney U test. Fisher’s exact test was used to compare categorical data (gender, side). Statistical significance was defined as a p-value less than or equal to 0.05.
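As an illustration of the categorical comparison, the sketch below implements a two-sided Fisher’s exact test for a 2 × 2 table using only the Python standard library. It is not the software used in this study, which is not specified, and the function name is hypothetical. The Wilcoxon and Mann-Whitney tests are normally run in a statistics package and are not reproduced here. The gender counts are those given in the Results (166 of 250 staged and 49 of 78 simultaneous patients were female).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    total = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = prob(x)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for rounding error
            total += p
    return min(1.0, total)


# Rows: staged, simultaneous. Columns: female, male.
p_gender = fisher_exact_two_sided(166, 250 - 166, 49, 78 - 49)
```

The resulting `p_gender` can be compared with the p-value reported for gender in the Results.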

Results

Post-operative scores, collected at least one year post-operatively, were available for all 656 TKRs (328 patients) and pre-operative scores were available in 476 TKRs (238 patients). Of the 656 TKRs, 500 were staged procedures (250 patients) and 156 were performed simultaneously (78 patients). Of the 476 TKRs with both pre-operative and post-operative scores available for both knees, 344 TKRs were staged (172 patients) and 132 (66 patients) were simultaneous procedures.

Within the staged TKR patients, 66.4 % (166 of 250) were female and for the simultaneous group, 62.8 % were female (49 of 78) (p = 0.587). The mean age for the simultaneous TKR patients was 65 (34 to 81), and patients underwent their first staged TKR at mean 66 years (42 to 92) (p = 0.407).

Overall, the mean post-operative scores in simultaneous cases were marginally better than those for staged procedures (39.1 vs. 37.6; p = 0.045). There was no statistical difference in the mean post-operative scores in all simultaneous cases in comparison to the scores attained in the first knee for staged cases (39.1 vs. 37.9; p = 0.172).

Although overall sample means appeared similar for the first and second staged TKR (37.9 vs. 37.3), a statistical analysis of these scores for each individual patient found a trend to a worse outcome in the second knee in comparison to the first (p = 0.003). The mean second staged TKR outcome was also inferior to that for simultaneous cases (37.3 vs. 39.1; p = 0.025). Overall, of the 250 staged patients, 164 (65.6 %) attained the same post-operative OKS in their second TKR as in the first, 57 (22.8 %) had a worse OKS and 29 (11.6 %) a better OKS. For those with a worse outcome in the second TKR, the mean difference between knees was 4.7 points (range 1 to 18) and for those with a better outcome the mean difference was 3.8 points (range 1 to 16). Post-operative scores for both knees were compared using scores taken in the same year to help control for any contributing effect from a change in general health or psychological effects, including changing expectations. Of note, however, there was no significant change in the OKS recorded in individual patients from the one-year post-operative review to the subsequent reviews out to five years where this was available (p = 0.109).

The interval between procedures for the staged TKR groups was mean 23 months (range 2 to 74) as shown in Fig. 1. The interval between staged TKR did not influence the mean post-operative score for the second TKR (p = 0.125) or any difference in the final outcome scores between the first and second TKR (p = 0.601). Interval also had no effect on the degree of improvement in OKS for the second TKR (p = 0.251) or the relative amount of improvement in the first TKR in comparison to the second (p = 0.500).

Fig. 1 Distribution of staged TKR group by time interval between first and second TKR

Age at the time of surgery did not affect the post-operative score obtained in either the first (p = 0.949) or second staged TKR (p = 0.096) or difference in OKS between staged knees (p = 0.624). However, although age did not affect the improvement in the first knee (p = 0.839), greater age was associated with an inferior amount of OKS improvement in the second knee (p = 0.015). The main driver of this effect was a better mean pre-operative score in patients aged over 70 years (19.1 vs. 15.6; p < 0.001).

Mean pre-operative scores were similar for simultaneous cases to the first knee in staged cases (15.1 vs. 13.7; p = 0.137). Mean pre-operative OKS for the second staged TKR was 17.1, significantly higher in comparison to simultaneous cases (p = 0.047) and the first staged TKR (p < 0.001). A summary of the mean pre-operative and post-operative OKS for the first staged, second staged and simultaneous cases respectively is shown in Fig. 2.

Fig. 2 Pre-operative and post-operative mean Oxford Knee Scores for 1st staged, 2nd staged and simultaneous bilateral total knee replacement cases

Mean improvement was similar for a combination of all simultaneous cases to the first staged TKR (24.3 vs. 24.0; p = 0.883) but significantly less in the second staged TKR (mean 20.2; p < 0.001). Mean improvement for both knees in the simultaneous group was better in comparison to the staged group overall (24.3 vs. 22.1; p = 0.008) reflecting slightly inferior post-operative scores and higher pre-operative scores in the second staged TKR.

When improvement in the first staged TKR was compared to the second TKR for only those with a difference in pre-operative scores of 5 or less between knees, the difference in improvement between first and second knee was eliminated (mean 22.1 vs. 22.5; p = 0.980) with comparable post-operative scores (37.1 vs. 36.6; p = 0.125). For those with a pre-operative score difference of 6 or more points between knees, the improvement in the first knee was greatly superior to the second knee (27.6 vs. 15.7; p < 0.0001) with post-operative scores comparable for both knees (39.0 vs. 38.6; p = 0.224).

Discussion

In our series, the mean OKS was statistically better in the simultaneous group than the staged group but the difference was less than the reported minimal important difference (MID) for the OKS between groups [14]. For suitable patients, in our unit, consideration for simultaneous versus staged surgery reflects a combination of discussions regarding these options with the patient, surgeon preference and anaesthetist advice. It is possible that the simultaneous series could represent a healthier or more motivated patient group although age and pre-operative scores were comparable to those of patients undergoing their first staged TKR. There is some evidence that patients who receive either simultaneous or staged bilateral TKR may in general be a healthier population in comparison to those who undergo only single side TKR [16].

For staged procedures, outcomes in the second staged TKR matched the first in most cases, although 22.8 % of patients experienced a worse outcome in their second knee. Statistical analysis suggested a trend to worse outcomes in the second staged TKR. One might expect that scored tasks such as stair descent, getting in and out of a car, and shopping might be improved to a superior level with both knees replaced, but this was not our finding. The slightly inferior post-operative scores for the second staged TKR cannot be explained by a deterioration in the overall health of the patient between procedures, as the comparison was made using the respective OKS for both knees completed by the patient either during the same clinic visit, or in another clinic visit during the same year, according to the month of primary surgery. One study has shown that the improvement seen in some components of the OKS, such as kneeling ability, may deteriorate after year one [15]. Although it was not possible for us to track changes in specific components of the OKS in this way, this effect would have resulted in worse reported OKS in the first knee in comparison to the second at intervals over a year, and our data did not support this.

There is debate over the minimal clinically important difference or change in OKS and therefore whether small differences in score are perceptible by patients or of clinical relevance [14, 15]. In our series, 22.8 % of patients reported a worse OKS in their second staged TKR by mean 4.7 points (range 1 to 18) and 11.6 % reported a better OKS by mean 3.8 points (range 1 to 16). Any difference in the OKS reported by an individual patient between their respective knees demonstrates the perception of a difference in outcome, and we feel this should be considered clinically important in the setting of bilateral knee arthroplasty. Recent literature reports the minimal detectable change (MDC) in OKS for individuals to be 4 points [14].

‘Improvement’ was generally inferior in the second TKR primarily due to better pre-operative scores with no better or slightly inferior post-operative scores. The better pre-operative score in the second knee suggests a lower threshold for surgery on the second side and may be driven by patient expectations of a similar outcome to their first TKR. This may also reflect a lower threshold for the clinician to offer second side surgery. Evidence supports score ‘improvement’ as a key determinant of patient satisfaction after arthroplasty and our findings must therefore be an important consideration for the decision-making and consent process [17].

The interval between staged TKR procedures did not affect outcome. In our series, there was no trend to differing post-operative scores or improvement in scores for intervals ranging from two to 74 months. Age had no effect on post-operative scores in either simultaneous or staged procedures. Age was, however, found to influence the ‘improvement’ in a second staged TKR, independent of the interval between procedures. This was found to be the result of better pre-operative scores in those aged over 70 and may reflect surgeons tending to offer second side surgery at a better pre-operative OKS threshold in older patients. It could also be the result of different expectations between knee replacements in younger patients, who may not wish to proceed with surgery until they reach a worse level of function because of greater functional demands and evidence of an association with earlier failure [18, 19]. Recent evidence has reported that young patients are more likely to reduce their expectations after their first total knee replacement [11].

Strengths of our study include simultaneous and staged cohorts that both exceeded the sample size required by the statistical power calculation. Every patient in our single centre series received the same design of implant and a consistent post-operative rehabilitation protocol. In a study such as this, comparing the outcome between an individual patient’s first and second knee, the patient acts as their own control, negating the need for a control population in these cases. For the independent simultaneous and staged groups, we were unable to compare patient comorbidities, but age, gender and initial pre-operative OKS were comparable. Furthermore, one recent study has reported that the number of patient comorbidities did not affect expectation, satisfaction or improvement in the OKS after staged bilateral TKR [11]. Where comparison of outcome between knees was performed, we attempted to control for changes in general health or expectations over time by comparing scores reported within the same year, with clinics booked according to the month of primary surgery. Consistent with other studies, however, we found no significant change in OKS after the first post-operative score recorded at one year [11, 15].

In summary, surgeons may wish to use our findings to help inform patients considering bilateral total knee arthroplasty. For staged cases, the post-operative score was not affected by the pre-operative function, age or the interval between procedures. As individual patients attain a relatively comparable post-operative score in both their knees, independent of other variables, the ‘improvement’ in knee function is determined by the pre-operative OKS.