Introduction

Total hip replacement (THR) is an effective treatment for people with end-stage hip osteoarthritis (OA) that improves quality of life by reducing pain, joint deformity and loss of function. In Australia, nearly 40,000 people underwent THR in 2011, and world-wide demand is expected to double by 2020 due to population ageing and obesity [1, 2]. While a majority of individuals can expect improvements in pain and function, some remain dissatisfied after THR, despite procedurally excellent outcomes [3]. Ongoing moderate to severe pain has been reported in up to 13 % of patients and moderate to severe activity limitation in up to 30 % of patients at 2 years or more following THR [4, 5]. A number of baseline risk factors for continuing pain and disability after THR have been reported [3].

A few recent studies have suggested that those with less severe radiographic change are less likely to respond well to THR [6–8]. However, these studies have methodological limitations including high rates of loss to follow-up, the use of generic health questionnaires rather than joint-specific outcome measures and crude measures of overall radiographic OA severity. We recently investigated whether radiographic knee OA was a determinant of pain and function following total knee replacement (TKR) and found a definite inverse relationship between pre-operative radiographic severity of OA and intermediate-term outcomes after knee replacement [9]. Using the same study design and time frames, the purpose of this study was to evaluate the prognostic value of overall radiographic severity, as well as individual pre-operative radiographic characteristics, on the pain and disability experienced by people 1 and 2 years after THR.

Methods

Ethics approval

This study was approved by the human research ethics committee of St. Vincent’s Hospital Melbourne (SVHM), and informed consent was obtained from participants.

Study institution and patients

All patients with OA admitted to SVHM, Australia, who underwent elective primary THR between 1 January 2006 and 31 December 2007, were considered eligible for enrolment into the study. Patients attended a multidisciplinary pre-admission clinic within 8 weeks of surgery, which served as the baseline for our study.

Data collection

Baseline data were prospectively collected and included patient demographics (age, sex, body mass index; BMI), the surgeon’s diagnoses and American Society of Anaesthesiologists (ASA) score [10]. Outcomes included surgery and prosthesis-related variables. The Harris hip score (HHS) [11] and the Short Form Health Survey (SF-12) [12] were completed at the baseline visit and at 1 and 2 years post-operatively. Post-operative questionnaires were mailed to patients to complete and bring with them to their scheduled follow-up appointments. Additional mail-outs were sent to non-responders, followed by a phone call 4 weeks later for any incomplete data or missing surveys.

Radiographs

Radiographs taken within 6 months of surgery were assessed by a single observer (PD), who was blinded to outcome scores. Data recorded from the pre-operative anterior-posterior (AP) radiographs of the pelvis included Kellgren and Lawrence (K-L) grading (0–4) [13], the severity of joint space narrowing (JSN) (0–3) and osteophyte formation (0–3) using the Osteoarthritis Research Society International (OARSI) atlas [14], and the degree of bone attrition using a previously described method (Dieppe et al. 2005) [15]. Radiographs showing advanced OA (K-L grades 3 and 4) were further sub-divided by including data from the individual scores of JSN and bone attrition [16]. In this modified K-L (mK-L) grading system, a K-L grade 3 radiograph with mild JSN (JSN = 1) was graded 3a, and one with more severe JSN (JSN = 2) was graded 3b. A K-L grade 4 radiograph (complete loss of joint space, JSN = 3) was graded 4a if there was no bone attrition and 4b if there was any subchondral bone attrition. In addition to radiographic OA severity, individual patterns of disease were recorded, including the presence of protrusio acetabulae, chondrocalcinosis, hypertrophic versus atrophic disease and supero-lateral versus medial-concentric disease. Intra-observer error was assessed by reading 40 randomly selected films twice, in random order, 1 week apart. Differences were assessed using the kappa statistic [17].
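The mK-L sub-grading rule described above can be sketched in code. This is a minimal illustration rather than the study's own procedure: the function names are hypothetical, the bracketed JSN values in the text are read as JSN scores of 1 and 2, and K-L grade 3 films with any other JSN score are rejected because the text does not say how they were handled.

```python
def modified_kl_grade(kl, jsn, attrition):
    """Return the modified K-L (mK-L) grade as a string.

    kl        -- Kellgren and Lawrence grade, 0 to 4
    jsn       -- OARSI joint space narrowing score, 0 to 3
    attrition -- True if any subchondral bone attrition is present
    """
    if kl not in (0, 1, 2, 3, 4):
        raise ValueError("K-L grade must be an integer from 0 to 4")
    if kl <= 2:
        # Grades below 3 are not sub-divided.
        return str(kl)
    if kl == 3:
        # Assumption: only JSN 1 (3a) and JSN 2 (3b) are described in the text.
        if jsn == 1:
            return "3a"
        if jsn == 2:
            return "3b"
        raise ValueError("K-L 3 with this JSN score is not covered by the rule")
    # K-L 4 (complete loss of joint space) is split on bone attrition only.
    return "4b" if attrition else "4a"


def analysis_category(mkl):
    """Collapse grades 2 and 3a into one category, as done for the analysis."""
    return "<=3a" if mkl in ("2", "3a") else mkl
```

For example, `analysis_category(modified_kl_grade(3, 1, False))` gives `"<=3a"`, while a grade 4 film with attrition stays in the reference category `"4b"`.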

Surgery

Procedures were performed by a team of surgeons using cemented and cementless implants. Individual surgeons did not alter their manufacturer or implant types during the study time frame.

Main independent variables

The main predictor variable was radiographic OA severity using the mK-L, with grades 2 and 3a collapsed into a single category (K-L ≤3a) due to small numbers (n = 30). In addition, individual radiographic features (defined above) were included in initial univariate analyses.

Outcome variables

The outcome variables were the HHS pain and function scores at 1 and 2 years. We evaluated the relationship between the main independent variable (radiographic OA severity), and pain and function scores, adjusting for clinically relevant covariates and individual radiographic features that were associated (p < 0.1) with pain and function in our univariate analyses (supplementary Tables A–B).

Covariates

Multivariable regression analyses were adjusted for gender, baseline age, BMI and ASA score [10]. Other covariates included baseline pain and function and SF-12 mental (MCS) and physical (PCS) function scores. Surgical variables included were the surgical approach, femoral head size and whether cemented or cementless implants were used.

Statistical analysis

Summary statistics (mean, standard deviation [±SD] and percentage [%]) are presented for demographic and clinical characteristics of the study cohort. Separate multivariable linear regression models were created to evaluate the relationship between the mK-L grade and the pain and function subscales of the HHS, measured on a continuous scale, at 1 and 2 years. We also dichotomised both pain and function outcomes based on whether or not patients achieved the minimum important difference (MID) in pain and function scores at 1 and 2 years compared to baseline. We estimated the MID as half the standard deviation of the change in pain and function scores [18]. Adjusted logistic regression was used to determine the odds ratio (OR) of achieving the MID in pain or function at 1 and 2 years, for each mK-L grade, using K-L 4b as the reference category. Statistical significance was defined as p ≤ 0.05. Analyses were performed using SPSS for Windows version 18.0 (SPSS Inc., Chicago, Illinois).
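The MID estimate and the dichotomisation step can be illustrated with a short sketch. The analyses in this study were run in SPSS; the Python below is only an assumed equivalent, the function names are hypothetical, the scores are made-up illustrative values, and the use of the sample (n − 1) standard deviation is an assumption.

```python
from statistics import stdev


def minimum_important_difference(changes):
    """MID taken as half the sample SD of the change scores."""
    return 0.5 * stdev(changes)


def achieved_mid(baseline, follow_up, mid):
    """Flag each patient whose improvement from baseline is at least the MID."""
    return [(post - pre) >= mid for pre, post in zip(baseline, follow_up)]


# Hypothetical scores for three patients (not study data).
baseline = [20, 20, 20]
follow_up = [20, 30, 40]
changes = [post - pre for pre, post in zip(baseline, follow_up)]

mid = minimum_important_difference(changes)
responders = achieved_mid(baseline, follow_up, mid)
```

The resulting true/false flags are the binary outcome that a logistic regression model would then take as its dependent variable.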

Results

Study cohort and follow-up

A total of 411 primary THRs were performed for OA in 387 patients during the study period. No simultaneous bilateral THRs were performed, and in those patients who underwent staged bilateral joint replacement, only the second procedure was included in the analysis. Five radiographs were rejected because of poor quality (n = 2) or because no film was available within 6 months of surgery (n = 3), leaving 382 THRs for inclusion. Six patients did not return the questionnaires at 1 year because of death (n = 3), subsequent revision hip replacement (n = 2) or loss to follow-up (n = 1), and a further 12 patients did not return them at 2 years because of death (n = 7), subsequent revision hip replacement (n = 1), cognitive decline (n = 2), refusal (n = 1) or being overseas (n = 1). Therefore, follow-up pain and function data were available for 374 of 382 (97.9 %) patients at 1 year and 364 of 382 (95.3 %) patients at 2 years following THR.

The mean age was 68.9 (SD ±9.3) years, 232 (60.7 %) were female, and the mean BMI was 29.9 (±5.5) kg/m². The change in pain score from baseline was consistent at 1 year (27.1, ±9.6) and 2 years (27.1, ±9.5). The MID in pain score was five points (half of a SD of 9.6 at 1 year). When pain was dichotomised into two groups based on the MID, 360 of 374 (96.2 %) patients at 1 year and 349 of 364 (95.9 %) patients at 2 years achieved the MID in pain. The change in function scores from baseline was 16.2 (±10.9) at 1 year and 15.9 (±11.8) at 2 years. The MID in function score was six points (half of a SD of 10.9 at 1 year and 11.8 at 2 years). When function was dichotomised into two groups based on the MID, 304 of 374 (81.2 %) patients at 1 year and 285 of 364 (78.3 %) patients at 2 years achieved the MID in function. A further breakdown of the demographic and clinical characteristics of the cohort is provided in supplementary Table C.

Radiographic findings

The intra-rater reliability scores demonstrated substantial reproducibility (supplementary Table D). The majority of patients with K-L grade 3 had significant joint space narrowing (category 3b), while more than half of those with K-L grade 4 OA also had evidence of bone attrition (category 4b) (Fig. 1).
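For readers unfamiliar with the kappa statistic used for the intra-rater comparison, the sketch below computes an unweighted Cohen's kappa from two readings of the same films. It assumes the unweighted form (the text does not say whether a weighted kappa was used), the function name is hypothetical, and the grades are made-up illustrative values, not study data.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two readings of the same items."""
    if len(first) != len(second) or not first:
        raise ValueError("readings must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of films given the same grade on both readings.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal grade frequencies.
    counts_1, counts_2 = Counter(first), Counter(second)
    expected = sum(counts_1[g] * counts_2[g] for g in counts_1) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when only one grade is used")
    return (observed - expected) / (1 - expected)


# Hypothetical mK-L grades from two readings of four films.
reading_1 = ["3b", "3b", "4a", "4b"]
reading_2 = ["3b", "4a", "4a", "4b"]
kappa = cohen_kappa(reading_1, reading_2)
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance.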

Fig. 1 Frequency bar graphs for modified Kellgren and Lawrence (mK-L) grade of radiographic hip OA severity

Predictors of pain outcome

Relative to baseline, pain scores improved at 1 and 2 years for each mK-L grade (supplementary Table E). Independent determinants of pre-operative pain scores included baseline function and SF-12 PCS and MCS scores. Determinants of pain post-operatively included baseline SF-12 PCS and MCS, mK-L grade and medial-concentric disease (Table 1). Multivariable logistic regression analysis (Table 2) demonstrated significantly lower odds of a clinically meaningful improvement in pain for patients with less severe baseline radiographic changes (mK-L grades ≤3a) at both 1 and 2 years, when compared to mK-L grade 4b.

Table 1 Multivariable-adjusted association of individual radiographic features with hip pain score
Table 2 Multivariable-adjusted association of modified K-L with the MID in pain and function

Predictors of functional outcome

Relative to baseline, function scores improved at 1 and 2 years for each mK-L grade (supplementary Table F). Poorer baseline function scores were associated with worse radiographic OA severity (Table 3). Independent determinants of pre-operative function scores included older age, female gender, BMI, ASA score, baseline mK-L grade, baseline pain, and SF-12 PCS and MCS. Post-operatively, older age, higher baseline BMI and ASA score, a femoral head >28 mm (1 year only), baseline function, SF-12 PCS and MCS, surgery through a posterior approach and mK-L grade were all significant predictors of function scores. Multivariable logistic regression analysis (Table 2) demonstrated significantly lower odds of achieving the MID in function scores for patients with less severe radiographic changes, for all mK-L grades <4b, at both 1 and 2 years. Advancing age was also associated with lower odds of clinically meaningful improvement in function at 2 years.

Table 3 Multivariable-adjusted association of modified K-L with hip function score

Discussion

In this study, we investigated the role of pre-operative radiographic severity on pain and function, in a consecutive cohort of patients with OA undergoing primary total hip replacement. Overall, fewer patients achieved a clinically meaningful improvement in function (78–81 %) compared to pain (96 %) at 1 and 2 years post THR. Our main finding is that individuals with less severe radiographic changes prior to surgery are less likely to experience a meaningful improvement in pain and function 1 and 2 years post-operatively, when compared to those with more severe changes. Furthermore, the association between radiographic OA severity and function was stronger than the association with pain.

Aside from radiographic severity, determinants of outcome in our study included baseline age, body mass index, co-morbidity status, physical and mental health status as well as surgical parameters including surgical approach and femoral head size. While many of these findings are consistent with existing literature [3], aside from age, these factors were not associated with a clinically meaningful improvement in outcome in our study. Indeed, when the MID was used to determine response to THR, the only determinant of pain outcome was radiographic OA severity, and for function, advancing age was also a determinant of outcome.

There have been few other studies investigating the influence of pre-operative radiographic OA severity on outcomes in THR. A case-control study by Cushnaghan et al. [6] reported that those with the most radiographic changes prior to THR had the greatest improvement in physical function. Valdes et al. [8] reported that greater joint space width was associated with an increased risk of worse pain post-THR but did not predict function. We could find only one other study that investigated the influence of pre-operative radiographic OA on pain and function based on achieving a clinically meaningful improvement in outcome. Keurentjes et al. [7] reported that the odds of achieving a minimum clinically important difference (MCID) in physical function at 2 to 5 years following THR were significantly higher in those with severe radiographic OA.

Methodological weaknesses in these studies include exclusion of baseline scores from the analyses [8], low ascertainment rates of follow-up data [6, 7] and the use of generic quality of life instruments to measure outcome, which are notably less responsive to pain and function than disease-specific outcome measures [19]. Despite these limitations, our findings are consistent with those of Cushnaghan et al., Keurentjes et al. and Valdes et al. [6–8] and support our conclusion that there is an inverse relationship between pre-operative radiographic OA severity and pain and function after hip replacement.

A notable finding of this study is the contrast in radiographic severity of patients presenting for THR when compared to those undergoing total knee replacement (TKR) [9]. In a prior similar study of 478 patients undergoing TKR during the same time frame, 43 % presented with grade 4 radiographic changes, whereas 60 % of our THR cohort presented with grade 4 changes. Furthermore, for knee replacement, the association of radiographic OA severity with pain was stronger than its association with function [9], whereas for our THR cohort, the association was stronger for function than for pain.

While better outcomes for hip replacement over knee replacement have been previously reported in the literature [20], there are no prior studies indicating whether patients undergoing hip replacement present with later stage radiographic OA severity than those undergoing knee replacement. Our two study cohorts are from the same time frame and therefore, it is unlikely that these differences are due to differences in health services systems and processes, such as waiting times for surgery. Rather, it seems that worse radiographic severity is tolerated in patients prior to seeking treatment for hip OA than for knee OA or that the pattern of decline is more rapid in the former group. While we can only speculate as to why this might be, pain in an arthritic knee, a load-bearing joint, is noticeably worse during weight-bearing activities such as stair climbing than at rest [21], whereas such differences are not reported in patients with hip OA. This may drive individuals with knee OA to seek treatment earlier than those with hip OA. Rapid progression of OA is also a phenomenon reported in those undergoing hip but not knee replacement. It has been noted that a number of individual and radiographic characteristics, including higher Kellgren-Lawrence grade at the time of hospital referral, confer greater risk of rapid progression of hip OA [22].

Our study has both strengths and limitations. This is a large, prospective study with very few patients lost to follow-up. Radiographs were read by a single observer with good to excellent reproducibility of his findings. A potential weakness is the fact that this is a single-site study based in a tertiary referral centre; therefore, results may have limited generalizability. While the HHS is more responsive to pain than generic health questionnaires [19], the minimum clinically important difference that defines the minimal change perceived by patients to be important has not been established. We therefore used the generally accepted clinically significant benchmark of 50 % of the standard deviation of the change in scores at 1 year [23]. This approach to calculating the MID has been recommended for other hip scoring systems [24] and was recently used to determine the MID for the HHS in a randomized controlled trial of younger people undergoing THR [25].

In conclusion, we have shown that there is an inverse relationship between the severity of pre-operative radiographic changes and pain and function at 1 and 2 years post-operatively in people undergoing primary THR for OA. We suggest that this has important clinical implications for patient selection and requires explanation through further research.