Introduction

In recent years, there has been a paradigm shift from disease-centred care to patient-centred care [14]. As part of this shift, the outcomes of clinical intervention reported by patients are becoming more important [9]. Patient-reported outcome measures (PROMs) are measurements which come directly from the patient without interpretation of the patient’s response by a clinician or anyone else [9]. PROMs are indicators of a disease’s impact on the patient, necessary for determining the efficacy of treatment and useful in the interpretation of clinical outcomes and treatment decision-making [9]. Nonetheless, National Joint Registries are focused on the survivorship of prostheses and do not use PROMs [8]. Within the new paradigm, PROMs are likely to become part of joint registries [33].

The Oxford Knee Score (OKS) is a disease-specific PROM which is used to evaluate knee function before and after total knee arthroplasty (TKA) [7, 22]. The OKS was found to be the most appropriate disease-specific questionnaire in patients with osteoarthritis of the knee [12]. The OKS showed good reliability during further validation using Rasch analysis, which means the OKS is consistent per item and overall score [6, 24]. Finally, the OKS is widely used and has been validated in many countries [11, 13, 16, 20, 29, 31, 36, 43].

In several studies, subscales were extracted from OKS data to analyse pain and function separately using face validity [1, 17]. Using factor analysis, subscales for pain and function were created with high internal consistency [18]. These OKS pain component and function component subscales can be used to assess pain or function in research and clinical practice [19].

Performance measures were found to be necessary in the evaluation of the patient undergoing total knee arthroplasty, because change in self-reported function was mainly influenced by the change in pain [35]. However, performance measures are time-consuming and not easily implemented in daily practice.

PROMs will become more important for the evaluation of the quality of orthopaedic care. Therefore, it is important to know whether the OKS, which addresses pain and function, equally correlates with pain and performance-based functioning. Because self-reported function is mainly influenced by change in pain, it was hypothesized that the OKS correlates more with pain than with performance.

Materials and methods

Prospective data on patients who received a cementless mobile-bearing TKA as part of a randomized clinical trial, Netherlands National Trial Registry 3033, were used [41]. In this trial, postoperative outcome of a titanium-nitride-coated cementless mobile-bearing rotating platform (RP) total knee prosthesis (TKP), the Ceramic Coated Implant (CCI®, currently available as ACS® Basic, Implantcast GmbH, Buxtehude, Germany), was compared with an uncoated cementless mobile-bearing RP TKP, the Low Contact Stress (LCS® Complete, DePuy, Warsaw, IN, USA). Data were collected preoperatively and at 6 months, 1 year and 5 years postoperatively. Patients were eligible for this trial if they had an indication for TKA, were aged 40 years or older and were able to give informed consent. Patients with an indication for revision TKA, persisting pain after previous TKA on the contralateral side or no informed consent were excluded. The study was reported according to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement.

The study flowchart is shown in Fig. 1. From July 2006 to June 2007, patients planned for TKA were asked to participate in this study. Preoperative and follow-up visits were performed in an outpatient setting by the same observer, at the same location and using the same measuring instruments.

Fig. 1 Study flowchart. Number of patients assessed for eligibility, participation and exclusion because of missing data. TKA total knee arthroplasty, DKT DynaPort Knee Test

The OKS is a self-report questionnaire which consists of 12 items on pain and function [7]. Each item was scored on a Likert scale from 0 to 4, giving a summary score ranging from 0 (worst) to 48 (best) as recommended [28]. The OKS has a reliability coefficient of 4.6–6.45. The OKS subscales were created as suggested [18]. The OKS pain component scale (OKS PCS) consists of items 1, 4, 5, 6, 8, 9 and 10, and the OKS function component scale (OKS FCS) consists of items 2, 3, 7, 11 and 12 [18]. The test–retest reliability intra-class correlation coefficient (ICC) was 0.93 (95 % CI 0.92–0.95) for the OKS, 0.91 (95 % CI 0.88–0.94) for the OKS PCS and 0.92 (95 % CI 0.90–0.95) for the OKS FCS [19]. To avoid incomplete questionnaires, the OKS was completed in the presence of the same investigator in an outpatient setting at follow-up rather than sent as a postal questionnaire [42].
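The scoring described above can be illustrated with a minimal sketch. It assumes the responses are held in a dictionary keyed by item number and reports the subscales as raw item sums; any rescaling of the subscales applied in [18] is not reproduced here. The names score_oks, PCS_ITEMS and FCS_ITEMS are hypothetical and serve illustration only.

```python
# Item numbers per subscale as listed in the text [18].
PCS_ITEMS = (1, 4, 5, 6, 8, 9, 10)   # pain component scale
FCS_ITEMS = (2, 3, 7, 11, 12)        # function component scale


def score_oks(responses):
    """Return the OKS summary score and raw subscale sums.

    responses: dict mapping item number (1-12) to a score from 0 to 4.
    Assumption: subscales are reported as plain sums of their items.
    """
    if set(responses) != set(range(1, 13)):
        raise ValueError("expected exactly the items 1 to 12")
    if any(not 0 <= value <= 4 for value in responses.values()):
        raise ValueError("each item must be scored from 0 to 4")
    return {
        "OKS": sum(responses.values()),                     # 0 (worst) to 48 (best)
        "OKS_PCS_raw": sum(responses[i] for i in PCS_ITEMS),  # 0 to 28
        "OKS_FCS_raw": sum(responses[i] for i in FCS_ITEMS),  # 0 to 20
    }


# Example: a patient answering 4 on every item reaches the best possible scores.
best = score_oks({item: 4 for item in range(1, 13)})
```

Rejecting incomplete questionnaires instead of imputing missing items mirrors the supervised completion described above; it is a design choice of this sketch, not a rule taken from the OKS scoring guidance.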

Visual analogue scale score for pain (VAS) was scored on a scale from 0 to 100 mm, where 0 is no pain and 100 the worst pain ever experienced [15]. The VAS has a test–retest reliability coefficient r of 0.84, p < 0.0001 [34].

The Knee Society Score (KSS) is divided into two subscales: Knee, which consists of a total of 50 points for pain, 25 points for range of motion and 25 points for stability with deductions for flexion contracture, extension lag and malalignment; and Function, which consists of 50 points for walking and 50 points for stair climbing with deductions for the use of a walking aid [21]. No test–retest reliability is known for the KSS.

The DynaPort® Knee Test (DKT, McRoberts, The Hague, the Netherlands) is a joint-specific performance-based measure which is useful for research in patients undergoing TKA [25]. This test was found to be reliable and valid [25, 26]. The test–retest reliability ICC was 0.81 (95 % CI 0.69–0.93) [10]. The DKT was used as described before [10]. The patient was equipped with neoprene straps containing accelerometers around the chest, left thigh and both shanks [10]. The data recorder was worn on the waist and contained another two accelerometers [10]. With these accelerometers, a standardized set of 29 ADL-related activities was recorded under supervision of the investigator. After the DKT, data were transferred from the recorder to a PC software environment (Dynascope, McRoberts) [10]. The beginning and end of each activity were marked by hand by the supervising investigator [10]. Analysis of these marked recordings was performed in Dynascope by McRoberts. For each activity, 30 movement features were calculated, consisting of accelerations, angles, durations and other variables such as step number, step frequency, relative speed and asymmetry [10]. Weighted averages of these activity scores were clustered and resulted in four average scores for locomotion, rising and descending, transfers, and lifting and moving objects [10]. These four cluster scores were averaged to calculate an overall DynaPort® Knee Score (DKS) [10]. The DKS ranges from 0 to 100 [10]. The DKT was performed preoperatively and at 6 months and 1 year postoperatively in an outpatient setting in a secluded part of the hospital.
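The final aggregation step described above can be sketched as follows. This is only an illustration of averaging the four cluster scores; it assumes an unweighted mean and cluster scores on the same 0 to 100 range as the overall score. The weighting of movement features inside each cluster is performed by the manufacturer's software and is not reproduced. The function and key names are hypothetical.

```python
from statistics import mean

# The four clusters named in the text; the key spellings are illustrative.
CLUSTERS = ("locomotion", "rise_and_descend", "transfers", "lift_and_move")


def dynaport_knee_score(cluster_scores):
    """Average the four cluster scores into an overall score.

    cluster_scores: dict with one value per cluster.
    Assumptions: unweighted mean; every cluster score lies within 0-100.
    """
    missing = [name for name in CLUSTERS if name not in cluster_scores]
    if missing:
        raise ValueError(f"missing cluster scores: {missing}")
    values = [cluster_scores[name] for name in CLUSTERS]
    if any(not 0 <= value <= 100 for value in values):
        raise ValueError("cluster scores are assumed to lie within 0-100")
    return mean(values)


# Example with made-up cluster scores (not study data).
example_dks = dynaport_knee_score(
    {"locomotion": 40, "rise_and_descend": 50, "transfers": 60, "lift_and_move": 70}
)
```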

Outcome measures were the OKS and subscales OKS PCS and OKS FCS, VAS, the KSS and the DKS. Potential confounders were age, sex, preoperative body mass index (BMI), type of total knee prosthesis and previous TKA on the contralateral side. Patients with missing data were excluded.

The study was approved by the medical ethical committee of the VU University Medical Center Amsterdam and registered under ID number 2005/194.

Statistical analysis

Statistical analysis was performed using SPSS 22.0 for Windows (SPSS Inc, Chicago, IL, USA). Internal consistency of the subscales OKS PCS and OKS FCS was measured at each time point using Cronbach’s alpha. A Cronbach’s alpha of 0.7–0.9 was considered acceptable [2, 37]. Pearson’s correlation coefficients (r) were calculated to assess the presence and strength of the relation between outcome measures. The correlation coefficient r, whether positive or negative, was considered weak if smaller than 0.35, moderate from 0.36 to 0.68, high from 0.69 to 0.89 and very high if higher than 0.9 [38]. The 95 % confidence interval of Pearson’s correlation was based on bootstrap interval analysis using 1000 samples [27]. To assess changes in outcome measured at multiple time points, a general linear model repeated-measures procedure was used [5]. Multivariate analysis was used to assess potential confounders, including coating of the TKP, age, sex, preoperative BMI and preoperative TKA on the contralateral side. The Mann–Whitney U test was used to assess the influence of contralateral TKA at 5-year follow-up. The number of patients needed in the clinical trial from which the data were taken was based on the clinically relevant reduction in pain [41]. For the OKS as outcome measure, a post hoc sample size calculation was performed with a preoperative mean of 27.3 points, a standard deviation of 6.9 points, a minimal clinically important difference of five points [4], statistical power of 0.9 and α-level of 0.05. This resulted in a sample size of 23 patients. All statistical tests were considered significant at the 0.05 threshold, and all p values were two-sided.
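For readers without access to SPSS, the internal-consistency and correlation steps above can be approximated with a short sketch using only the Python standard library. It is not the SPSS procedure used in the study: the percentile method for the bootstrap interval, the fixed random seed and all function names are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, one per questionnaire item, each holding one
    score per patient (same patient order in every list).
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(scores) for scores in zip(*items)]   # total score per patient
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bootstrap_ci(x, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for r (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]     # resample patients with replacement
        xs, ys = [x[i] for i in idx], [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue                                   # r is undefined for a constant resample
        estimates.append(pearson_r(xs, ys))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * n_boot)]
    upper = estimates[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper
```

Resampling whole patients (paired x and y values together) keeps the dependence between the two scores intact, which is what the interval is meant to describe.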

Results

Baseline characteristics are shown in Table 1. Postoperative contralateral TKA was performed in 34 patients (38 %) within 5 years after the first TKA. At 5-year follow-up, a total of 47 (54 %) patients had bilateral TKA. Clinical outcome measures at baseline are shown in Table 2.

Table 1 Baseline characteristics
Table 2 Clinical outcome measures

The OKS, the OKS PCS and OKS FCS improved over time (p < 0.001) (Table 2). The internal consistency of the items of the OKS PCS and OKS FCS is shown in Table 3. Multivariate analysis showed no influence of coating of the TKP, age, sex, preoperative BMI and preoperative TKA on the contralateral side on the postoperative OKS.

Table 3 Internal consistency of the items of the OKS PCS and OKS FCS

The overall DKS improved over time (p < 0.001), as did the cluster scores Locomotion (p < 0.001), Rise and Descend (p < 0.001), Lift and Move (p < 0.001) and Transfers (p < 0.001) (Table 2). Multivariate analysis showed a negative influence of female sex (p < 0.001), increasing age (p < 0.005) and higher preoperative BMI (p < 0.003) on the postoperative DKS.

The VAS, the KSS score and subscores Knee and Function improved over time (p < 0.001) (Table 2). Multivariate analysis showed no influence of coating of the TKP, age, sex, preoperative BMI and preoperative TKA on the contralateral side on the postoperative VAS or KSS.

The correlation coefficients r of the OKS and its subscales with the DKS, VAS and KSS from the preoperative visit to 5 years postoperatively are shown in Fig. 2 and Table 4. Overall, postoperatively, the DKS showed moderate correlation with the OKS, the OKS PCS and the OKS FCS. The VAS score for pain had high correlation with the OKS and the OKS PCS and moderate correlation with the OKS FCS. The KSS had high correlation with the OKS, the OKS PCS and the OKS FCS, with one exception: at 5 years, the KSS had moderate correlation with the OKS PCS.
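The labels weak, moderate, high and very high used above follow the bands given in the statistical analysis [38]. A minimal sketch of that mapping is shown below. Because the published bands leave the values 0.35 and 0.9 themselves unassigned, this sketch rounds r to two decimals and assigns 0.35 to "weak" and 0.90 to "very high"; both choices, and the function name, are assumptions.

```python
def correlation_strength(r):
    """Map a correlation coefficient to the descriptive bands from [38].

    The sign is ignored, as in the text. Assumptions: r is rounded to two
    decimals; 0.35 counts as weak and 0.90 as very high.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = round(abs(r), 2)
    if size <= 0.35:
        return "weak"
    if size <= 0.68:
        return "moderate"
    if size <= 0.89:
        return "high"
    return "very high"
```

For example, correlation_strength(-0.5) returns "moderate" because only the magnitude of r is classified.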

Fig. 2 Correlation coefficient r of the Oxford Knee Score and subscales with the DynaPort® Knee Score, the visual analogue scale score and the Knee Society Score. The OKS FCS showed the highest correlation with the DKS at each time point. The OKS and OKS PCS had similar high correlation with the postoperative VAS. The KSS showed similar correlation with the OKS, the OKS PCS and OKS FCS at each time point. No correlation between scores was found to be weak or very high. Upper and lower bounds of the 95 % CI are indicated with crossed bars for the OKS, right wing bars for the OKS PCS and left wing bars for the OKS FCS. OKS Oxford Knee Score, OKS PCS Oxford Knee Score pain component scale, OKS FCS Oxford Knee Score function component scale, DKS DynaPort Knee Score, KSS Knee Society Score, P preoperative, V2 visit at 6-month follow-up, V3 visit at 1-year follow-up, V4 visit at 5-year follow-up

Table 4 Correlations of the Oxford Knee Score and subscales with the DynaPort Knee Score, the visual analogue scale score and the Knee Society Score

The correlation of the DKS with the VAS for pain and KSS is shown in Table 5. Preoperatively, the DKS had a weak correlation with the VAS for pain and moderate correlation with the KSS. At 6 months, the DKS had a moderate correlation with the VAS for pain and KSS. At 1 year, the DKS had a weak correlation with the VAS for pain, but a high correlation with the KSS.

Table 5 Correlations of the DynaPort® Knee Score with the visual analogue scale score and the Knee Society Score

In this cohort, limited knee flexion requiring mobilization under anaesthesia was found in six patients. In four patients, the tibial plateau was revised because of persisting pain suspected to indicate aseptic loosening. One patient had a two-stage revision for a suspected deep infection after complicated arthroscopic synovectomy.

Discussion

The most important finding of this study was a high correlation of the postoperative OKS with the VAS and only moderate correlation with performance-based functioning. The postoperative OKS PCS showed high correlations with the VAS, and the postoperative OKS FCS showed moderate correlation with the DKS for performance-based functioning. This is consistent with a previous study that showed high correlation of the OKS with the VAS and with other studies that showed that self-reported physical functioning is more influenced by pain than performance-based physical functioning [32, 35, 39]. Also, lower OKS scores were found in patients with anterior knee pain after TKA [3]. Ideally, self-report questionnaires are complemented with performance-based measures to assess postoperative function [35]. Furthermore, this study showed that postoperative performance-based functioning is influenced by age, sex and preoperative BMI, while the OKS and its subscales are not. This might explain why the correlation of the OKS and its subscales with the postoperative DKS was lower than their correlation with the VAS and KSS.

At 5-year follow-up, the VAS improved and Cronbach’s α for internal consistency of the items within the OKS PCS reached 0.9. If α is too high, some items may be redundant, testing the same construct in a different way [37]. As part of the total score, the items of the OKS PCS might be over-represented, emphasizing pain within the overall OKS. Cronbach’s α for internal consistency of the items within the OKS FCS remained the same during 5-year follow-up, which suggests that these items assess different dimensions of function during follow-up.

Several other potential problems of the OKS were described [42]. Its use as a postal questionnaire may be limited as completion of the questionnaire cannot be guaranteed [42]. In our study, patients filled out the form in an outpatient setting under supervision of the researcher to guarantee completion of the questionnaire as suggested [42]. In a previous study, a lower discriminating performance of the OKS after TKA was found [22]. Also, Cronbach’s α for internal consistency of all items of the OKS was found to be 0.92 at 5–8 years of follow-up [42]. This suggests redundancies within the scale. In this study, there was no high α preoperatively, at 6 months and 1 year, but there was a high α for the OKS PCS at 5-year follow-up. This suggests that in time, items of the OKS do not discriminate as well as in the preoperative and early postoperative phase.

The DKS had a weak-to-moderate correlation with the VAS. This suggests that performance-based functioning is little influenced by patient-reported pain. The postoperative DKS had a moderate-to-high correlation with the KSS and only moderate correlation with the OKS. The postoperative correlation coefficients of the KSS and the OKS FCS with the DKS were similar. This suggests that the KSS is less influenced by patient-reported pain than the OKS.

In a comparative study on the parapatellar versus the subvastus approach in TKA, the DKS was comparable with the preoperative score after 6 weeks [40]. Estimated from a presented graph, the DKS increased from 33 preoperatively to 47 points 3 months postoperatively [40]. In this study, the preoperative score was 46 (SD 17.4), and the first postoperative DKS was 52 (SD 14.9) points at 6 months. This difference in scores might be due to differences in patient characteristics and cohort size. In an observational study on unicompartmental knee prostheses, a preoperative DKS of 35.8 (SD 15.5) improved to 48.8 (SD 15.5) at 6 months, 50.5 (SD 14.8) at 1 year and 52.3 (SD 16.6) at 2 years [23]. Compared with that cohort, our cohort had a rather good preoperative DKS but improved less. This might be due to differences in age, BMI and sex.

The biggest limitation of the DKT is that it is not easy to interpret. Improvement suggests better function, but no study has shown what the minimal clinically important difference of the DKT actually is. It has been suggested that one half standard deviation is the threshold of discrimination for change [30]. In this study, the minimal clinically important difference for the DKS would therefore be 8.7, which suggests that the DKS clinically improves 1 year postoperatively. Using this threshold of discrimination for change, the cluster scores Locomotion, Lift and Move, and Rise and Descend improve clinically 1 year postoperatively; however, Transfers does not clinically improve 1 year after TKA.
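The half-standard-deviation rule above amounts to simple arithmetic, sketched here with the preoperative values reported in the text. The function names are hypothetical, and treating a change exactly equal to the threshold as relevant is an assumption of this sketch.

```python
def half_sd_threshold(baseline_sd):
    """Threshold of discrimination for change: half the baseline SD [30]."""
    if baseline_sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return 0.5 * baseline_sd


def exceeds_threshold(change, baseline_sd):
    """True if the absolute change reaches the half-SD threshold.

    Assumption: a change exactly equal to the threshold counts as relevant.
    """
    return abs(change) >= half_sd_threshold(baseline_sd)


# Preoperative SD of the DKS reported in the text: 17.4 points.
dks_threshold = half_sd_threshold(17.4)

# Mean DKS at 6 months (52) minus the preoperative mean (46), both from the text.
change_at_6_months = 52 - 46
relevant_at_6_months = exceeds_threshold(change_at_6_months, 17.4)
```

With these inputs the threshold evaluates to 8.7 points, and the mean 6-month change of 6 points stays below it.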

There are several limitations to this study. First, only patients with complete data were included. Patients unwilling or unable to attend a visit or unable to perform a DKT might have had poor results after TKA and are now excluded. However, the focus of this study was on the correlation between scores, and therefore missing data were not acceptable. Second, the DKT was not performed at 5-year follow-up.

When the OKS is used as a PROM to evaluate the quality of care of patients undergoing TKA, it should be taken into account that the OKS correlates more with pain than with performance-based functioning. For a more complete evaluation, additional PROMs which correlate better with performance-based functioning should be used.

Conclusion

The OKS and the OKS PCS showed high correlations with the postoperative VAS. Only the OKS FCS showed moderate-to-high correlations with the DKS. This suggests that the OKS is influenced foremost by pain and to a lesser degree by performance. Furthermore, during follow-up, the internal consistency of the items within the OKS PCS reaches a value that implies redundancy and therefore over-emphasizes pain within the OKS.