Abstract
Patient-reported outcome measures are a critical tool in evaluating the efficacy of orthopedic procedures. The aim of this study was to evaluate the reliability, validity, responsiveness and minimally important change of the German version of the Hip dysfunction and osteoarthritis outcome score (HOOS). The German HOOS was investigated in 251 consecutive patients before and 6 months after total hip arthroplasty. All patients completed the HOOS, the Oxford Hip Score (OHS), the Short-Form 36 (SF-36) and numeric scales for pain and disability. Test–retest reliability, internal consistency, floor and ceiling effects, construct validity and minimal important change were analyzed. The German HOOS demonstrated excellent test–retest reliability with intraclass correlation coefficient values > 0.7. Cronbach's alpha values demonstrated strong internal consistency. As hypothesized, HOOS subscales correlated strongly with the corresponding OHS and SF-36 domains. All subscales showed excellent responsiveness (effect size/standardized response mean > 0.8) between preoperative assessment and postoperative follow-up. The HOOS and all subdomains showed changes larger than the minimal detectable change, which indicates true changes. The German version of the HOOS demonstrated good psychometric properties. It proved to be a valid, reliable and responsive instrument for use in patients with hip osteoarthritis undergoing total hip replacement.
Introduction
Patient-reported outcome measures (PROMs) can provide reliable and valid measures of a patient's degree of pain, impairment, disability, and quality of life. They are a critical tool in evaluating the efficacy of orthopedic procedures and are increasingly used in clinical trials to assess the outcomes of health care. Total hip arthroplasty (THA) in osteoarthritis has been shown to significantly improve patients' health-related quality of life [1]. The Hip dysfunction and osteoarthritis outcome score (HOOS) was developed as an extension of the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) questionnaire, which has been used worldwide as a hip osteoarthritis (OA) specific questionnaire [2, 3]. The HOOS contains five subscales. A normalized score can be calculated for each domain, where 100 indicates no symptoms and 0 indicates severe symptoms. The original version of the HOOS was shown to be valid, reliable, and responsive in hip OA patients and is considered useful for the evaluation of patient-relevant outcomes after THA [2, 3]. To date, it has been translated, validated and published in Chinese, Dutch, French, German, Japanese, Korean, Thai, and Turkish [4,5,6,7,8,9,10,11]. The German cross-cultural adaptation and evaluation was performed with a small number of Swiss-German participants (n = 51) and did not assess responsiveness [7]. In prospective outcome studies, the responsiveness of an outcome measure, that is, its ability to detect change when a change has occurred, is an essential characteristic of the validity of the measure [12,13,14].
The aim of this study was to estimate responsiveness and minimally important change and to reassess reliability and validity of the German HOOS in a large sample of patients with OA undergoing THA. To determine potential differences between languages, we translated the English HOOS into German and compared it with the Swiss-German version by Blasimann et al. [7].
Materials and methods
Translation
Forward and backward translation of the HOOS was performed according to international guidelines of the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) [15]. Two bilingual translators whose native language was German independently translated the English version forward into German. Two native English speakers performed backward translation of the German questionnaire into English. The final version was tested on 15 patients with OA of the hip to ascertain acceptance and comprehension.
Patients and validation procedure
From November 2014 to January 2016, a total of 251 patients with OA undergoing THA, 154 (61%) women and 97 (39%) men, with a mean age of 68 years (range 36–89 years), were consecutively recruited at a single institution. Eligibility criteria included adult patients undergoing primary THA. Patients were asked to complete the German HOOS, the German Oxford Hip Score (OHS), the German Short-Form 36 Health Survey (SF-36), and a numeric rating scale for pain and disability (NRS). HOOS, OHS, SF-36, and NRS were completed 3–14 days before surgery (t1) and again on the morning before surgery (t2) for reliability testing. Six months after surgery (t3), all participants were asked to complete the HOOS one last time.
Instruments
The HOOS contains five subscales: symptoms (Sym), pain, activity of daily living (ADL), sports/recreation (S/R), and quality of life (QoL) [3]. The pain domain consists of the five questions of the original WOMAC pain domain plus five additional questions. The Sym domain includes the two questions of the WOMAC stiffness domain plus three additional questions. The ADL domain contains the WOMAC function questions (17 questions). The S/R domain (four questions) and the hip-related QoL domain (four questions about global difficulty, lack of confidence in the hip, lifestyle change and awareness of the hip problem) are newly generated domains, which aim to evaluate the consequences of hip OA on more demanding activities and on QoL [2, 16].
Patients score each question on a five-point Likert scale scored from 0 to 4, with 0 representing the worst stage.
The OHS is a 12-item instrument to evaluate pain and function related to the hip [12]. Each item is scored on a five-point Likert scale from 0 to 4, with 0 representing the worst stage. The measure generates a single overall score ranging from 0 to 48 (summed items), where 48 represents the best health state. A German version of the OHS has been translated and validated for OA and THA patients. It was shown to be a reliable instrument for use in patients with OA of the hip undergoing THA [17].
The SF-36 is a widely used generic patient-reported instrument to measure health-related quality of life. It consists of eight domains: physical functioning (pf), role physical (rp), role emotional (re), social functioning (sf), mental health (mh), energy/vitality (e/v), bodily pain (bp) and general health perception (gh). It has been translated into German and validated [18].
The NRS was used to determine pain and disability of the hip. On a 0–10 scale, 10 represents the most severe pain or disability.
Statistical analysis
The HOOS and OHS scores were entered into a Microsoft Excel spreadsheet (Microsoft Corporation, Redmond WA) and analyzed using SPSS v24 (SPSS Inc. Chicago, Illinois). A p value < 0.05 was considered to indicate statistical significance.
Reliability
Reproducibility
Reproducibility, in the sense of test–retest reliability, was assessed by calculating the intraclass correlation coefficient (ICC; two-way random effects model, absolute agreement definition) between the HOOS completed at the first visit 3–14 days before surgery (t1) and the HOOS completed a second time before surgery (t2). An ICC value of 0.7 or above was considered good [19, 20].
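The ICC described above can be derived from a two-way analysis of variance of the patients-by-occasions score table. The following is a minimal sketch, assuming the single-measures form of the two-way random effects, absolute agreement ICC (often written ICC(2,1)); the text does not state whether single or average measures were used. The function name `icc_absolute_agreement` and the example scores are illustrative and not taken from the study.

```python
def icc_absolute_agreement(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measures.

    scores: list of rows, one row per patient, one column per occasion
            (for example [t1, t2]).
    """
    n = len(scores)        # number of patients
    k = len(scores[0])     # number of occasions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores: identical at both occasions, then shifted by 5 points
identical = [[10, 10], [20, 20], [30, 30]]
shifted = [[10, 15], [20, 25], [30, 35]]
icc_identical = icc_absolute_agreement(identical)
icc_shifted = icc_absolute_agreement(shifted)
```

In this sketch a constant shift between the two occasions lowers the coefficient even though patients keep their rank order, which is the property that distinguishes absolute agreement from consistency definitions of the ICC.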
Internal consistency
Reliability also includes internal consistency [20]. Internal consistency is the extent to which items within a scale are homogeneous, thus measuring the same construct [21, 22]. Cronbach's alpha (α) coefficient was calculated to assess the internal consistency of the HOOS items. Values of α of 0.7, 0.8 and 0.9 are considered to represent a fair, good and excellent degree of internal consistency, respectively [23].
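As an illustration of the coefficient described above, the following minimal sketch computes Cronbach's alpha with the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score) and applies the cut-offs quoted in the text. The function names, the label for values below 0.7 and the example responses are assumptions made for illustration; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: list of rows, one row per patient, one column per item."""
    k = len(responses[0])  # number of items in the scale
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def interpret_alpha(alpha):
    """Cut-offs quoted in the text: 0.7 fair, 0.8 good, 0.9 excellent."""
    if alpha >= 0.9:
        return "excellent"
    if alpha >= 0.8:
        return "good"
    if alpha >= 0.7:
        return "fair"
    return "below 0.7"  # label chosen here; the text names no category


# Hypothetical responses in which all three items move together perfectly
responses = [[0, 0, 0], [2, 2, 2], [4, 4, 4]]
alpha = cronbach_alpha(responses)
```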
Floor and ceiling effects
Floor and ceiling effects were considered to exist if more than 15% of responses reached the lowest or highest possible score [24].
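The 15% criterion can be checked with a few lines of code. This is a minimal sketch, assuming normalized subscale scores on a 0 to 100 scale; the function name `floor_ceiling` and the example scores are hypothetical.

```python
def floor_ceiling(scores, lowest=0.0, highest=100.0, threshold=0.15):
    """Share of patients at the lowest and highest possible score."""
    n = len(scores)
    floor_share = sum(1 for s in scores if s == lowest) / n
    ceiling_share = sum(1 for s in scores if s == highest) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        # An effect is flagged only when the share exceeds the threshold
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical subscale scores of ten patients
example = floor_ceiling([0, 0, 0, 25, 50, 50, 75, 75, 100, 50])
```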
Validity
Construct validity
Construct validity describes the extent to which a score relates to other scores [24]. As no gold standard exists, the HOOS subscales were compared with the OHS, the SF-36 and the NRS for pain and disability using non-parametric correlation coefficients (Spearman's rho). Correlation coefficients < 0.4 were considered low, 0.4–0.59 moderate and 0.6–0.79 high. For convergent validity, high correlations were hypothesized between the HOOS pain subscale and the OHS pain domain, the SF-36 bp domain and the NRS. For the HOOS ADL and S/R subscales, high correlations were expected with the OHS function domain and the SF-36 pf domain. Low correlations were expected between subscales covering different content.
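A minimal sketch of the correlation analysis follows: Spearman's rho is computed as the Pearson correlation of average ranks, and the coefficient is then labeled with the bands given in the text. The function names and the example data are hypothetical. The text does not label coefficients of 0.8 or above; treating them as "high" and using the absolute value for negative coefficients are assumptions of this sketch.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the ranks i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def classify_correlation(rho):
    """Bands from the text: < 0.4 low, 0.4-0.59 moderate, 0.6-0.79 high."""
    size = abs(rho)
    if size < 0.4:
        return "low"
    if size < 0.6:
        return "moderate"
    return "high"  # assumption: values of 0.8 or above are also called high


# Hypothetical paired scores that rise together monotonically
rho = spearman_rho([1, 2, 3, 4, 5], [2, 4, 5, 9, 20])
```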
Responsiveness
Responsiveness is the extent to which a questionnaire is able to detect changes over time or due to an intervention such as surgery [25]. All patients completed the HOOS before surgery (t1) and 5–6 months after surgery (t3). To test responsiveness, the effect size (ES) and the standardized response mean (SRM) were calculated. The ES was calculated as the difference between the means before and after the intervention divided by the standard deviation (SD) of the same measure before treatment [26]. The SRM was calculated as the difference between the means before and after treatment divided by the SD of the change. For both ES and SRM, values of 0.2, 0.5 and 0.8 were regarded as small, moderate and large effects, respectively [20, 26].
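The two responsiveness indices defined above differ only in their denominator, which the following minimal sketch makes explicit. It assumes paired pre- and postoperative scores for the same patients and takes the difference as postoperative minus preoperative, so that improvement on a higher-is-better scale is positive. The function names and the example scores are hypothetical.

```python
from statistics import mean, stdev


def effect_size(pre, post):
    """ES: mean change divided by the SD of the baseline scores."""
    return (mean(post) - mean(pre)) / stdev(pre)


def standardized_response_mean(pre, post):
    """SRM: mean change divided by the SD of the individual changes."""
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(changes)


def classify_effect(value):
    """Cut-offs from the text: 0.2 small, 0.5 moderate, 0.8 large."""
    size = abs(value)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "below 0.2"  # label chosen here; the text names no category


# Hypothetical scores of three patients before and after surgery
pre = [20, 30, 40]
post = [50, 70, 60]
es = effect_size(pre, post)
srm = standardized_response_mean(pre, post)
```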
Minimal important change (MIC)
Minimal important change is the smallest change in a treatment outcome that a patient or physician would identify as important. The MIC describes a threshold above which a change in outcome is experienced as relevant by the patient, and thus avoids relying on statistical significance alone [27]. One distribution-based approach to estimating the MIC is the minimal detectable change (MDC), defined as the minimum amount of change that exceeds the measurement error. If the change in a score is higher than the MDC, it can be considered a true change [27]. The MDC is calculated from the standard error of measurement (SEM), which is related to the internal consistency/reliability of the score (Cronbach's alpha): SEM = SD × √(1 − Cronbach's alpha). To allow comparisons with other studies, the MDC was calculated at the 90% confidence level: MDC90 = 1.65 × SEM × √2 [28, 29].
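The two formulas translate directly into code. This is a minimal sketch of the calculation as stated in the text; the function names and the input values (an SD of 10 points and an alpha of 0.91) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def standard_error_of_measurement(sd, cronbach_alpha):
    """SEM = SD * sqrt(1 - Cronbach's alpha)."""
    return sd * math.sqrt(1 - cronbach_alpha)


def minimal_detectable_change_90(sem):
    """MDC90 = 1.65 * SEM * sqrt(2)."""
    return 1.65 * sem * math.sqrt(2)


# Hypothetical subscale: SD of 10 points, alpha of 0.91
sem = standard_error_of_measurement(10.0, 0.91)  # about 3.0 points
mdc90 = minimal_detectable_change_90(sem)        # about 7.0 points
```

The factor √2 reflects that a change score combines the measurement error of two assessments, and 1.65 is the multiplier used in the text for the 90% confidence level.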
Results
Our translation of the HOOS showed no significant differences compared to the Swiss-German questionnaire. This was determined by an independent expert group for cross-cultural adaptation of questionnaires and German researchers who compared each question with regard to linguistic and content-related differences.
Reliability
Reproducibility
All five dimensions of the HOOS demonstrated excellent test–retest reliability with ICC values of 0.79 for Sym, 0.87 for S/R, 0.85 for pain, 0.89 for ADL, 0.86 for QoL and 0.88 for the HOOS total. The mean indexes for the baseline and the reliability assessments were 30.6 [Standard deviation (SD) 15.1] and 29.4 (SD 14.3), respectively (Table 1).
Internal consistency
Cronbach's alpha (α) values of 0.69 for Sym, 0.82 for S/R, 0.91 for pain, 0.96 for ADL, 0.82 for QoL and 0.97 for the HOOS total demonstrated strong internal consistency (Table 1).
Floor and ceiling effects
No floor or ceiling effects were observed for the HOOS subdomains except S/R which showed a floor effect (Table 2).
Validity
Construct validity
To examine construct validity, Spearman's correlation coefficients between HOOS, SF-36, OHS and NRS were calculated and are shown in Table 3. Convergent validity of the HOOS ADL and S/R subscales was shown by strong correlations (> 0.6) with the OHS function and SF-36 pf domains. As hypothesized, the HOOS pain subscale correlated strongly with the OHS pain domain, the SF-36 bp domain and the NRS. All these findings were statistically significant (p < 0.05).
Responsiveness
Table 4 shows the responsiveness of the HOOS. All subscales demonstrated excellent responsiveness (ES/SRM > 0.8) between preoperative assessment (t2) and postoperative follow-up (t3), indicating that a very large degree of change was detected following surgery. The pain subscale showed the highest effect size (2.86), representing the best responsiveness, whereas the S/R domain showed the lowest (2.07), which still represents a large effect.
The SEM was 2.81, 2.61, 2.81, 3.70, 3.52 and 2.64 for the German HOOS domains Sym, pain, ADL, S/R, QoL and the HOOS total, respectively. The MDC90 (90% confidence level) was 6.55, 6.09, 6.55, 8.63, 8.22 and 6.16 for the domains Sym, pain, ADL, S/R, QoL and the HOOS total, respectively. The mean differences between preoperative and postoperative assessment are shown in Table 4 and ranged between 20.23 for the pain subscale and 27.30 for QoL. All subdomains and the index showed changes larger than the MDC, which indicates true changes [25].
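As an arithmetic check, the reported MDC90 values can be recomputed from the reported SEM values using the formula given in the methods (MDC90 = 1.65 × SEM × √2). The sketch below is illustrative only: the variable names are hypothetical, and deviations in the second decimal are to be expected because the published SEM values are rounded. Only the two mean changes stated in the text (pain and QoL) are compared with the MDC90.

```python
import math


def mdc90(sem):
    """MDC90 = 1.65 * SEM * sqrt(2), as stated in the methods."""
    return 1.65 * sem * math.sqrt(2)


# (SEM, MDC90) pairs as reported in the text
reported = {
    "Sym": (2.81, 6.55),
    "pain": (2.61, 6.09),
    "ADL": (2.81, 6.55),
    "S/R": (3.70, 8.63),
    "QoL": (3.52, 8.22),
    "total": (2.64, 6.16),
}

recomputed = {name: mdc90(sem) for name, (sem, _) in reported.items()}
largest_gap = max(abs(recomputed[name] - reported[name][1]) for name in reported)

# Mean pre- to postoperative changes stated in the text
mean_change = {"pain": 20.23, "QoL": 27.30}
exceeds_mdc = {name: mean_change[name] > reported[name][1] for name in mean_change}
```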
Discussion
The HOOS is an internationally used PROM which has been translated, validated and published in Chinese, Dutch, French, German, Japanese, Korean, Thai and Turkish [4,5,6,7,8,9,10,11]. Responsiveness has been assessed for the Chinese, French, Japanese, Korean and the original HOOS [3, 4, 6, 8, 9]. Evaluation of responsiveness with minimally detectable change has not been published in any language.
Test–retest reliability of the German HOOS showed excellent results, with ICC values ranging from 0.79 to 0.89 for the different subscales. These results are comparable to those of the other HOOS versions, where ICC values ranged from 0.75 to 0.98 [4,5,6,7,8,9,10,11].
Strong internal consistency was demonstrated for the German version, with Cronbach's alpha values of 0.97 for the HOOS total, 0.82 for S/R, 0.91 for pain and 0.96 for the ADL domain (Table 1). This is comparable to other language versions of the HOOS, where Cronbach's alpha values range between 0.70 and 0.97 [4,5,6,7,8,9,10,11].
No floor or ceiling effects were observed for the HOOS subdomains except S/R, which showed a floor effect (Table 2). This is in line with the Dutch HOOS, whereas the Chinese, French, Korean, Thai, and Swiss versions showed no floor or ceiling effects at all [4, 6, 7, 9, 10]. For the Japanese HOOS, floor and ceiling effects have not been evaluated [8].
Construct validity was determined by comparing the German HOOS with the German SF-36 and OHS (Table 3). A comparison between HOOS and SF-36/12 has been published for all translated versions, whereas a comparison between HOOS and OHS has been published only for the Japanese questionnaire [8].
The HOOS ADL and S/R subscales showed strong correlations (> 0.6) with the OHS function and SF-36 pf domains. As hypothesized, the HOOS pain subscale correlated strongly with the OHS pain domain, the SF-36 bp domain and the NRS. The Japanese version also showed strong correlation between HOOS and OHS, but lower correlations between HOOS and SF-36 subdomains: there, the HOOS pain subscale and SF-36 bp demonstrated only a moderate correlation (0.53), which could be explained by cultural differences between the study populations. Divergent validity was shown by low correlations between HOOS domains and the SF-36 re and physical subscales, respectively.
Table 4 illustrates the responsiveness of the HOOS. All subscales showed excellent responsiveness (ES/SRM > 0.8) between preoperative assessment (t2) and postoperative follow-up (t3). The pain subscale showed the highest effect size (2.86), representing the best responsiveness, whereas the S/R domain showed the lowest (2.07), which still represents a large effect. Responsiveness has been evaluated for the Chinese, French and Japanese versions, which also showed excellent results [4, 6, 8].
Minimal important change, the smallest change in a treatment outcome that is identified as important, has not been described for the HOOS so far. The aim was to ascertain the smallest amounts of change in the HOOS domain scales that are likely to be clinically meaningful and beyond measurement error for OA of the hip. The SEM, the MDC90 and the mean difference between preoperative and postoperative assessment are shown in Table 4. All subdomains and the index showed changes larger than the MDC, which indicates true changes.
Our evaluation of the HOOS showed results similar to those of the validated HOOS versions in other languages. Cultural differences, smaller numbers of patients, and different hip pathologies and surgeries may be the reason for differing results in some aspects of the other HOOS versions. Responsiveness showed excellent results. To our knowledge, we are the first to determine the MIC of the HOOS. Our study estimated distribution-based MDC values for the HOOS to be between 6.1 and 8.6 score points. If a patient improves or deteriorates beyond the MDC90 value, we can be fairly certain that this is not due to random variation in the score [30]. We corresponded with the developer of the HOOS to compare our German HOOS with the Swiss-German HOOS by Blasimann et al. [7]. Our translation of the HOOS showed no significant differences, indicating no need for another German questionnaire. This was confirmed by an independent expert group for cross-cultural adaptation of questionnaires and by German researchers.
In conclusion, the German HOOS demonstrated good psychometric properties. Our study showed that the German questionnaire is a valid and reliable instrument for patients with OA undergoing THA. It can be used as a tool for evaluating the efficacy of surgical procedures and in clinical trials to assess the outcomes of health care.
References
Laupacis A, Bourne R, Rorabeck C, Feeny D, Wong C, Tugwell P, Leslie K, Bullas R (1993) The effect of elective total hip replacement on health-related quality of life. J Bone Joint Surg Am 75(11):1619–1626
Klassbo M, Larsson E, Mannevik E (2003) Hip disability and osteoarthritis outcome score. An extension of the Western Ontario and McMaster Universities Osteoarthritis Index. Scand J Rheumatol 32(1):46–51
Nilsdotter AK, Lohmander LS, Klassbo M, Roos EM (2003) Hip disability and osteoarthritis outcome score (HOOS)—validity and responsiveness in total hip replacement. BMC Musculoskelet Disord 4:10. doi:10.1186/1471-2474-4-10
Wei X, Wang Z, Yang C, Wu B, Liu X, Yi H, Chen Z, Wang F, Bai Y, Li J, Zhu X, Li M (2012) Development of a simplified Chinese version of the Hip Disability and osteoarthritis outcome score (HOOS): cross-cultural adaptation and psychometric evaluation. Osteoarthr Cartil 20(12):1563–1567. doi:10.1016/j.joca.2012.08.018
de Groot IB, Reijman M, Terwee CB, Bierma-Zeinstra S, Favejee MM, Roos E, Verhaar JA (2009) Validation of the Dutch version of the Hip disability and osteoarthritis outcome score. Osteoarthr Cartil 17(1):132. doi:10.1016/j.joca.2008.05.014
Ornetti P, Parratte S, Gossec L, Tavernier C, Argenson JN, Roos EM, Guillemin F, Maillefert JF (2010) Cross-cultural adaptation and validation of the French version of the Hip disability and osteoarthritis outcome score (HOOS) in hip osteoarthritis patients. Osteoarthr Cartil 18(4):522–529. doi:10.1016/j.joca.2009.12.007
Blasimann A, Dauphinee SW, Staal JB (2014) Translation, cross-cultural adaptation, and psychometric properties of the German version of the hip disability and osteoarthritis outcome score. J Orthop Sports Phys Ther 44(12):989–997. doi:10.2519/jospt.2014.4994
Satoh M, Masuhara K, Goldhahn S, Kawaguchi T (2013) Cross-cultural adaptation and validation reliability, validity of the Japanese version of the Hip disability and osteoarthritis outcome score (HOOS) in patients with hip osteoarthritis. Osteoarthr Cartil 21(4):570–573. doi:10.1016/j.joca.2013.01.015
Lee YK, Chung CY, Koo KH, Lee KM, Lee DJ, Lee SC, Park MS (2011) Transcultural adaptation and testing of psychometric properties of the Korean version of the Hip disability and osteoarthritis outcome score (HOOS). Osteoarthr Cartil 19(7):853–857. doi:10.1016/j.joca.2011.02.012
Trathitiphan W, Paholpak P, Sirichativapee W, Wisanuyotin T, Laupattarakasem P, Sukhonthamarn K, Jeeravipoolvarn P, Kosuwon W (2016) Cross-cultural adaptation and validation of the reliability of the Thai version of the Hip disability and osteoarthritis outcome score (HOOS). Rheumatol Int 36(10):1455–1458. doi:10.1007/s00296-016-3505-4
Yilmaz O, Gul ED, Bodur H (2014) Cross-cultural adaptation and validation of the Turkish version of the Hip disability and osteoarthritis outcome score-physical function short-form (HOOS-PS). Rheumatol Int 34(1):43–49. doi:10.1007/s00296-013-2854-5
Dawson J, Fitzpatrick R, Carr A, Murray D (1996) Questionnaire on the perceptions of patients about total hip replacement. J Bone Joint Surg Br 78(2):185–190
Beaton DE (2000) Understanding the relevance of measured change through studies of responsiveness. Spine 25(24):3192–3199
Hays RD, Hadorn D (1992) Responsiveness to change: an aspect of validity, not a separate dimension. Qual Life Res 1(1):73–75
Wild D, Grove A, Martin M, Eremenco S, McElroy S, Verjee-Lorenz A, Erikson P, Translation ITFf, Cultural A (2005) Principles of good practice for the translation and cultural adaptation process for patient-reported outcomes (PRO) measures: report of the ISPOR task force for translation and cultural adaptation. Value Health 8(2):94–104. doi:10.1111/j.1524-4733.2005.04054.x
Roos EM, Klassbo M, Lohmander LS (1999) WOMAC osteoarthritis index. Reliability, validity, and responsiveness in patients with arthroscopically assessed osteoarthritis. Western Ontario and MacMaster Universities. Scand J Rheumatol 28(4):210–215
Naal FD, Sieverding M, Impellizzeri FM, von Knoch F, Mannion AF, Leunig M (2009) Reliability and validity of the cross-culturally adapted German Oxford hip score. Clin Orthop Relat Res 467(4):952–957. doi:10.1007/s11999-008-0457-3
Bullinger M (1995) German translation and psychometric testing of the SF-36 Health Survey: preliminary results from the IQOLA Project. International Quality of Life Assessment. Soc Sci Med 41(10):1359–1366
Weir JP (2005) Quantifying test–retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 19(1):231–240. doi:10.1519/15184.1
Streiner D, Norman G (2008) Health measurement scales: a practical guide to their development and use. Oxford University Press, New York
Scholtes VA, Terwee CB, Poolman RW (2011) What makes a measurement instrument valid and reliable? Injury 42(3):236–240. doi:10.1016/j.injury.2010.11.042
Lohr KN, Aaronson NK, Alonso J, Burnam MA, Patrick DL, Perrin EB, Roberts JS (1996) Evaluating quality-of-life and health status instruments: development of scientific review criteria. Clin Ther 18(5):979–992
Bland JM, Altman DG (1997) Statistics notes: Cronbach’s alpha. BMJ 314(7080):572. doi:10.1136/bmj.314.7080.572
Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, Bouter LM, de Vet HC (2007) Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 60(1):34–42. doi:10.1016/j.jclinepi.2006.03.012
Wright JG, Young NL (1997) A comparison of different indices of responsiveness. J Clin Epidemiol 50(3):239–246
Cohen J (1988) Statistical power analysis for the behavioral sciences, 2nd edn. Academic Press, New York
Jaeschke R, Singer J, Guyatt GH (1989) Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials 10(4):407–415
McHorney CA, Tarlov AR (1995) Individual-patient monitoring in clinical practice: are available health status surveys adequate? Qual Life Res 4(4):293–307
Wright A, Hannon J, Hegedus EJ, Kavchak AE (2012) Clinimetrics corner: a closer look at the minimal clinically important difference (MCID). J Man Manip Ther 20(3):160–166. doi:10.1179/2042618612Y.0000000001
de Vet HC, Terwee CB, Ostelo RW, Beckerman H, Knol DL, Bouter LM (2006) Minimal changes in health status questionnaires: distinction between minimally detectable change and minimally important change. Health Qual Life Outcomes 4:54. doi:10.1186/1477-7525-4-54
Ethics declarations
Conflict of interest
All authors declare that they have no conflict of interest.
Ethical approval
This article does contain studies with human participants. The study was approved by the Ethics Commission of the Faculty of Medicine of Cologne University (ref 15-252) and performed in accordance with the Declaration of Helsinki. Written informed consent from all participants was obtained.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Cite this article
Arbab, D., van Ochten, J.H.M., Schnurr, C. et al. Assessment of reliability, validity, responsiveness and minimally important change of the German Hip dysfunction and osteoarthritis outcome score (HOOS) in patients with osteoarthritis of the hip. Rheumatol Int 37, 2005–2011 (2017). https://doi.org/10.1007/s00296-017-3834-y
DOI: https://doi.org/10.1007/s00296-017-3834-y