Introduction

Osteoarthritis (OA) affects up to 27 million adults in the USA [1], with symptomatic hip OA exhibiting a prevalence of 7.4–9 % [1, 2]. Although the etiology of OA is multifactorial and complex, increasing age is considered the primary risk factor [3]. As the global population over the age of 60 is expected to increase by 20–33 % in the next 20 years [4], the economic burden associated with functional loss and disability from OA is expected to concomitantly increase [5, 6]. Hip OA is considered a major contributor to the development of disability, and the presence of hip OA has been recently associated with increased mortality [7].

Early in the disease process, the fluctuating clinical symptom state of hip OA often limits patient ability to complete everyday functional tasks [8]. Given these limitations, patients will often seek out treatment, including physiotherapy, in hopes of decreasing pain and disability associated with hip OA. Patient-reported outcomes are widely used to help guide medical decision making and evaluate treatment effectiveness, and have been advocated for use by rehabilitation professionals for years [9]. Monitoring change in the self-reported health outcome is the process in which a standardized attempt is made to observe an often complex clinical picture.

When interpreting functional outcome measures, it is important to recognize the multidimensional and complex nature of function. Multiple dimensions, including sensory, affective, evaluative, cognitive, behavioral, pain, physical, and social constructs, can influence one’s function [10]. Interest in acquiring the patient’s viewpoint has been driven by the rising prevalence of chronic disease and its management, in which interventions aim to treat symptoms and improve function rather than decrease mortality. Self-reported patient outcomes aim to quantify what the patient has experienced as a result of treatment rather than physiologic change in impairment or the provider’s perspective of health status [11]. At present, no single self-reported outcome is considered appropriate to address all constructs of hip OA.

Previously, the patient acceptable symptom state (PASS) was designed to capture the patients’ satisfaction with their current state and has been used to gather outcome data in patients following treatment for impairments associated with rheumatoid arthritis, ankylosing spondylitis, and osteoarthritis [12–15]. The PASS is a single question that allows a patient to report their level of satisfaction with their current state of being. The PASS differs from other measures such as the global rating of change score (GRCS), which requires the patient to remember their previous health status and evaluate their current status against that prior level. The concept behind the PASS and purpose of these previous studies was to identify thresholds in commonly used self-report outcome measures beyond which patients consider themselves to be feeling well. These studies suggested the PASS has the potential to provide more meaningful information about the proportion of patients achieving an improvement beyond the level accepted as the minimal clinically important improvement (MCII) and achieve a state they consider satisfactory [14].

Recognized limitations of the MCII values include reliance on a single point estimate based on the mean, dependency upon baseline status, and multiple methodologies resulting in a wide range of reported values [15–18]. Given the limitations of the MCII, the Outcome Measures in Rheumatology Clinical Trials proposed that OA research trials utilize a complementary measure to the MCII to assess how good the individual feels rather than how much better [19]. The PASS is anchored to the personal experience of the individual and therefore is thought to be a more robust measure of the patient’s overall satisfaction and adaptation [14, 20]. In addition, the utilization of PASS dichotomizes those who successfully responded to an episode of care from those who did not [21]. Patient satisfaction can be considered the ultimate end point from the patient’s perspective and can also be thought of as giving an end point to the assessment of the quality of health care.

An identified target among commonly used outcome measures would be a welcome addition to clinicians when creating long-term goals and planning for discharge. The primary aim of this study was to determine cut-points for commonly used hip OA-related outcome measures signaling patients’ satisfaction with their current symptom state.

Methods

Participants

The study consisted of subjects with a clinical diagnosis of hip OA who were part of a larger, randomized controlled trial designed to investigate the long-term effectiveness of 3 different physiotherapy programs, as compared to usual care, in subjects with OA of the hip or knee [22]. The current study focuses only on those 70 subjects with OA of the hip, who were randomized into a physiotherapy treatment group (23 in the exercise therapy group, 25 in the manual therapy group, and 22 in the exercise and manual therapy group). Subjects assigned to the usual care group were excluded from analysis given that PASS estimates are in reference to a patient’s satisfaction with active treatment.

The sample represented consecutive subjects fulfilling the eligibility criteria from March 2008 to March 2009. All subjects agreed to enrollment in the study and provided their signed informed consent. The study was granted ethical approval by the Lower South Regional Ethics Committee of the New Zealand Ministry of Health. Inclusion and exclusion criteria are described in detail elsewhere [22]. Briefly, subjects were recruited from primary and secondary care sources: patients of family practice physicians and patients referred to the Department of Orthopaedic Surgery, Outpatient Clinic, Dunedin Hospital, Dunedin, New Zealand, for an orthopedic consultation for consideration of hip joint replacement surgery. Subjects were eligible for inclusion in the study if they met clinical criteria for diagnosis of OA of the hip according to American College of Rheumatology criteria and at their baseline assessment were able to walk 10 meters without an assistive device [22, 23]. Exclusion criteria were as follows: (1) previous hip joint replacement surgery of the affected joint; (2) any other surgical procedure of the lower limbs in the previous 6 months; (3) scheduled surgical operation within 3 months; (4) rheumatoid arthritis; (5) initiation of opioid analgesia or corticosteroid or analgesic injection intervention for hip pain within the previous 30 days; (6) uncontrolled hypertension or moderate to high risk for cardiac complications during exercise; (7) physical impairments unrelated to the hip preventing safe participation in exercise, manual therapy, walking, or stationary cycling including vision problems that affect mobility, body weight greater than 155 kg, neurogenic disorder, primary or significantly limiting back pain, advanced osteoporosis, or inability to walk 10 m without an assistive device; (8) an inability to comprehend and complete study assessments, or an inability to comply with instructions; and (9) stated inability to attend or complete the proposed course of intervention and follow-up schedule [22].

Examination procedures

Data were collected at baseline and 9-week assessment visits at the Centre for Physiotherapy Research, School of Physiotherapy, University of Otago, Dunedin, New Zealand. After the subjects signed an informed consent document, they completed a baseline questionnaire consisting of demographic information, answered various medical history questions, and underwent a standard physical examination. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC 3.1) assessed pain and disability related to OA.

Outcome measures

Patient acceptable symptom state (PASS)

The PASS at the 9-week and 1-year follow-up was used as the anchor in the study [15]. The PASS is a measure of patient opinion that asks subjects to rate their satisfaction with their current state of being and thus, their treatment. At the final visit, subjects’ opinions of their state were recorded by their answering ‘Yes’ or ‘No’ to the question ‘Taking into account all the activities you have during your daily life, your level of pain, and also your functional impairment, do you consider that your current state is satisfactory?’.

WOMAC 3.1 index

The WOMAC 3.1 index is a condition-specific instrument and has been shown to be valid and responsive for osteoarthritis conditions [24–27]. The WOMAC index consists of 24 questions, five affiliated with pain, two affiliated with stiffness, and 17 affiliated with physical function. The numeric rating scale version was used [28] and was rated on an eleven-point scale (0–10). Total scores can range from 0 to 50 for pain and from 0 to 170 for physical function, with higher scores reflecting more pain and poorer physical function.
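The subscale ranges quoted above follow directly from the item counts and the 0–10 response scale. The short sketch below illustrates that arithmetic; it assumes each subscale score is the simple sum of its item responses (which the stated ranges imply), and the function and variable names are hypothetical rather than part of any official WOMAC scoring software.

```python
# Illustrative sketch of WOMAC 3.1 numeric-rating-scale subscale scoring.
# Item counts and the 0-10 response range come from the text; the names
# below and the simple-sum scoring rule are assumptions for illustration.
WOMAC_ITEMS = {"pain": 5, "stiffness": 2, "physical_function": 17}
MAX_ITEM_SCORE = 10


def subscale_maximum(subscale):
    """Highest possible score for a subscale (items x 10)."""
    return WOMAC_ITEMS[subscale] * MAX_ITEM_SCORE


def subscale_score(subscale, responses):
    """Sum the item responses after checking count and range."""
    expected = WOMAC_ITEMS[subscale]
    if len(responses) != expected:
        raise ValueError(f"{subscale} needs {expected} responses")
    if any(r < 0 or r > MAX_ITEM_SCORE for r in responses):
        raise ValueError("each response must lie between 0 and 10")
    return sum(responses)


if __name__ == "__main__":
    for name in WOMAC_ITEMS:
        print(name, "range: 0 to", subscale_maximum(name))
```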

Global rating of change scale (GRCS)

The GRCS [29] is a measure of patient perception that asks subjects to rate the change in their symptoms. The question reads, ‘Please imagine how you would have described your OVERALL health status 9 weeks ago. How do you feel in general today as compared to 9 weeks earlier as far as your osteoarthritis of the left/right hip is concerned?’ The GRCS used in this study had 15 possible numerical values corresponding to verbal descriptions ranging from +7 ‘A very great deal better’ to −7 ‘A very great deal worse,’ as described by Jaeschke et al. [29]. The GRCS has been well validated and extensively used in research as an outcome measure and to compare outcome measures [30, 31].

Physical performance measures

Table 1 outlines the physical performance measures used in this study. The timed-up-and-go (TUG), 40-m self-paced walk test (SPWT), 30-s chair stand test (CST), and the 20-cm step test were investigated to determine PASS estimates for four commonly used physical performance measures in hip OA clinical trials [32–35].

Table 1 Description of selected physical performance measures

Intervention

Standardized interventions were provided at the School of Physiotherapy, University of Otago, under the supervision of licensed practicing physiotherapists (n = 5). These five therapists were previously trained to administer the intervention protocols in a standardized manner. Subjects underwent a 9-session physiotherapy program and were randomly allocated to receive either (a) manual therapy, (b) exercise therapy, or (c) both manual therapy and exercise therapy. Further details of the intervention protocols have been described elsewhere [22].

Responsiveness

The investigation of responsiveness depends on the research design being employed during a period when change is expected. Based on the previous results, it was recognized that both self-report and physical performance measures could determine whether change had occurred following our physiotherapy program. A baseline examination was performed for all subjects, and both self-report and physical performance outcome measures were repeated after the treatment period (at approximately 9 weeks). To determine patient satisfaction with treatment, the patient completed the PASS question the day of the post-treatment follow-up visit.

Statistical analysis

All statistical analyses were performed using IBM SPSS Statistics version 20 (Copyright IBM Corporation 1989, 2011). Descriptive statistics, as well as means and standard deviations for baseline, 9 week, and change scores were calculated for all outcome measures. Outliers were examined through use of histograms and normative analyses.

Cut-points were identified using an anchor-based approach. The PASS question was used as the external criterion, based on the patient’s subjective perception of their satisfaction with their current symptom state after receiving physiotherapy treatment. To determine cut-points for all outcome measures associated with the PASS, receiver operating characteristic (ROC) curves were used to discriminate between patients answering ‘yes’ to the PASS question and those answering ‘no.’ Using this approach, identified cut-points are based upon the concepts of sensitivity and specificity and the ability to dichotomize subjects as satisfied or unsatisfied. The identified cut-point was determined to be the magnitude of change associated with the uppermost left-hand corner of the curve, where both sensitivity and specificity are maximized (that is, where 1 − specificity is minimized) [30]. Area under the ROC curve (AUC) estimates and their associated 95 % confidence intervals (CI) and p values were also provided. AUC can be interpreted as the probability that a randomly chosen ‘satisfied’ patient will have a higher score than a randomly chosen ‘unsatisfied’ patient [36, 37]. An AUC between 0.7 and 0.8 is considered to be acceptable and 0.8–0.9 is considered to be excellent [37] (Figs. 1, 2).
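The anchor-based ROC procedure described above can be sketched in a few lines. This is a minimal illustration, not the SPSS analysis used in the study: it assumes the "uppermost left-hand corner" is operationalized as the cutoff with the smallest Euclidean distance to the point (sensitivity = 1, specificity = 1), and it computes the AUC as the pairwise probability described in the text. The scores, the PASS answers, and all function names are hypothetical. The `lower_is_better` flag reflects that for measures such as the WOMAC, lower scores indicate a better state.

```python
import math


def split_groups(scores, satisfied):
    """Separate scores by PASS answer (True = 'yes', False = 'no')."""
    pos = [s for s, y in zip(scores, satisfied) if y]
    neg = [s for s, y in zip(scores, satisfied) if not y]
    return pos, neg


def roc_points(scores, satisfied, lower_is_better=True):
    """Return (cutoff, sensitivity, specificity) for every observed score."""
    pos, neg = split_groups(scores, satisfied)
    points = []
    for cut in sorted(set(scores)):
        if lower_is_better:
            tp = sum(s <= cut for s in pos)  # satisfied, at or below cutoff
            tn = sum(s > cut for s in neg)   # unsatisfied, above cutoff
        else:
            tp = sum(s >= cut for s in pos)
            tn = sum(s < cut for s in neg)
        points.append((cut, tp / len(pos), tn / len(neg)))
    return points


def best_cutoff(points):
    """Cutoff closest to the upper-left corner of the ROC plot."""
    return min(points, key=lambda p: math.hypot(1 - p[1], 1 - p[2]))


def auc(scores, satisfied, lower_is_better=True):
    """Probability that a random satisfied patient scores 'better' than a
    random unsatisfied patient; ties count as one half."""
    pos, neg = split_groups(scores, satisfied)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p == n:
                wins += 0.5
            elif (p < n) == lower_is_better:
                wins += 1.0
    return wins / (len(pos) * len(neg))


# Hypothetical WOMAC-like scores and PASS answers, for illustration only.
scores = [10, 20, 30, 35, 50, 40, 60, 80, 90]
satisfied = [True, True, True, True, True, False, False, False, False]

cut, sens, spec = best_cutoff(roc_points(scores, satisfied))
print(f"cut-point <= {cut}: sensitivity {sens:.2f}, specificity {spec:.2f}")
print(f"AUC = {auc(scores, satisfied):.2f}")
```

Other corner criteria exist (for example, maximizing sensitivity + specificity − 1), and they can select a different cutoff on the same data, so the criterion should be stated alongside any reported cut-point.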

Fig. 1 a Receiver operating characteristic curve identifying PASS cut-points at 9 weeks. WOMAC Western Ontario and McMaster Universities Arthritis Index, PF physical function subscale, TUG timed-up-and-go, SPWT self-paced walk test. b Receiver operating characteristic curve identifying PASS cut-points at 9 weeks. GRCS global rating of change scale, SPWT self-paced walk test, m/s meters per second, 30-s CST 30-s chair stand test

Fig. 2 a Receiver operating characteristic curve identifying PASS cut-points at 1 year. WOMAC Western Ontario and McMaster Universities Arthritis Index, PF physical function subscale, TUG timed-up-and-go, SPWT self-paced walk test. b Receiver operating characteristic curve identifying PASS cut-points at 1 year. GRCS global rating of change scale, SPWT self-paced walk test, m/s meters per second, 30-s CST 30-s chair stand test

To assess the extent of subjects’ changes after interventions detected by the self-report and physical performance measures, the proportions of subjects with change scores exceeding the values of the PASS estimates were examined. We also reported the sensitivity, specificity, and positive likelihood ratios associated with each of the PASS scores allowing for the ability to determine how well-identified cut-points correctly classified subjects as satisfied or unsatisfied.
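For readers less familiar with these classification statistics, the sketch below shows how sensitivity, specificity, the positive likelihood ratio, and the percent correctly classified are derived from a 2 × 2 table of cut-point classification against the PASS answer. The counts are hypothetical and are not taken from this study, and the function name is an assumption made for illustration.

```python
def classification_summary(tp, fp, fn, tn):
    """Summarize how well a cut-point classifies subjects.

    tp: satisfied subjects the cut-point labels satisfied
    fn: satisfied subjects the cut-point labels unsatisfied
    fp: unsatisfied subjects the cut-point labels satisfied
    tn: unsatisfied subjects the cut-point labels unsatisfied
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ = sensitivity / (1 - specificity); undefined when specificity is 1.
    if specificity < 1:
        lr_positive = sensitivity / (1 - specificity)
    else:
        lr_positive = float("inf")
    correctly_classified = (tp + tn) / (tp + fp + fn + tn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "correctly_classified": correctly_classified,
    }


# Hypothetical counts for illustration only (not study data).
summary = classification_summary(tp=8, fp=3, fn=2, tn=7)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```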

Results

A total of 70 subjects with hip OA were randomized into a treatment group and completed the baseline examination. Sixty-five [24 males, 41 females] of the 70 subjects (93 %) completed the 9-week and 1-year follow-up examination. Subjects ranged in age from 41 to 85 years, with a mean ± SD age of 66.5 ± 9.4 years, and mean (SD) WOMAC physical function score was 74.98 (37.24) out of 170 at baseline. At the 9-week follow-up, 29 of the 65 subjects (45 %) were classified as ‘satisfied’ and 36 (55 %) were classified as ‘unsatisfied’ based on the PASS. Subjects classified as ‘unsatisfied’ (n = 36) did not differ from the entire sample in terms of age, gender, duration of symptoms, or mean TUG, 40-m SPWT, 30-s CST, and 20-cm step test scores (p > 0.05) at baseline. Significant differences in body mass index (BMI) were found between the two groups at baseline with those reporting ‘not satisfied’ exhibiting higher BMI scores (p = 0.05). The demographic characteristics of the study sample are summarized in Table 2. A total of 20 (30.8 %) subjects had replacement surgery of the index hip at 1-year follow-up.

Table 2 Descriptive statistics of the sample

Table 3 reports the PASS cut-points for the 7 outcome measures after 9 weeks and 1 year and gives their sensitivity, specificity, percent correctly classified, and AUC estimates with 95 % confidence intervals. At 9 weeks, reported PASS cut-points discriminated significantly between subjects classified as ‘satisfied’ and subjects classified as ‘unsatisfied’ for the WOMAC pain and function subscales (p < .001), the 30-s CST (p < .05), and the 20-cm step test (p < .05). As an example, subjects with hip OA considered their state satisfactory if their WOMAC physical function subscale score was less than or equal to 35 on a 170-point scale. A total of 17 of 65 (26 %) patients scored less than or equal to 35 on the WOMAC function subscale at 9 weeks. PASS cut-points were found to be nonsignificant for the GRCS, TUG, and 40-m SPWT (p > .05). The percent correctly classified for those outcome measures reaching statistical significance ranged from 65 to 74 %.

Table 3 Patient acceptable symptom state (PASS) cutoff points for self-report and physical performance outcome measures after 9 weeks and 1 year in patients with hip OA

At 1 year, reported PASS cut-points were found to be significant for all outcome measures (p < .05). The percent correctly classified ranged from 69 to 75 %. These early findings suggest that identified PASS cut-points are able to differentiate subjects who are currently ‘satisfied’ from those who are ‘unsatisfied.’ PASS cut-points remained relatively stable over time with little fluctuation between 9 weeks and 1 year. Interestingly, PASS cut-points for the WOMAC pain and WOMAC function subscales increased slightly over time (1 and 5 points, respectively), with higher scores representing greater pain and disability. At 1 year, 19 of 65 (29 %) patients scored less than or equal to 40 on the WOMAC function subscale, which was consistent with a positive response on the PASS. With the exception of the 20-cm step test, all physical performance measures moved in the direction of improved function (quicker speeds, increased number of chair stands).

Discussion

In this prospective study, we investigated PASS estimates in 7 commonly used outcome measures in patients with hip OA. The PASS question was used in an attempt to identify cutoff scores among commonly used hip OA-related outcome measures, on the hypothesis that such scores are more reflective of patient satisfaction than measures of change alone. The PASS is the value beyond which patients consider themselves well, whether at the end of an episode of care or in reference to the time required to achieve a maintained level of satisfaction [20]. We hypothesize that the PASS question and PASS cut-points can be considered a clinically relevant treatment target where patients may be unlikely to seek further treatment.

In terms of the longevity of PASS estimates, we found our cut-points to be relatively stable over time with little fluctuation. These results are consistent with previous authors who have reported similar findings on the stability of PASS estimates over time [12]. Both the WOMAC pain and physical function subscale cut-points rose slightly from 9 weeks to 1 year (1 and 5 points, respectively), which could potentially be attributed to an increased tolerance to pain and disability over time. However, the small size of the change may more likely be attributed to our small sample size or recall bias, a known limitation of the WOMAC physical function subscale [38, 39]. Interestingly, with the exception of the 20-cm step test, all of the physical performance measure cut-points demonstrated improved performance over time (faster times, increased number of chair stands). This is somewhat consistent with previous authors’ findings and hypotheses that patient expectations may change over time, especially with a positive response to treatment [13]. Patients suffering from previously failed treatments may have lower expectations and report higher levels of pain and disability as satisfactory, whereas expectations may rise with a more effective treatment, resulting in lower overall pain and disability scores to achieve a state of satisfaction. The change in PASS estimates from this study is relatively small, and more studies with larger sample sizes are needed before conclusions can be made.

Identified cut-points are difficult to compare to previous studies given the variation in patient population, treatment protocol, and outcome measures. We are aware of only one other study in a similar patient population, in which PASS estimates were established in patients with painful hip and knee OA receiving nonsteroidal anti-inflammatory drugs for 4 weeks [15]. Tubach et al. [15] reported PASS cut-points of 35.0 mm for the visual analog scale for pain (0–100 point scale) and 34.4 for the WOMAC function scale (0–100 point scale) in patients with hip OA after 4 weeks of pharmacological treatment. Our WOMAC physical function subscale cut-point of 35 on a 0–170-point scale converts to a score of 21 on a 100-point scale, substantially lower than that reported by Tubach et al. [15]. We are unable to directly compare our full results to this study given the different patient population and treatments used. This lower score could be attributed to a greater effect of physiotherapy treatment over pharmacological treatment, resulting in greater expectation for improved results from the patient’s perspective. The lower cutoff score could also be attributed to cultural differences (New Zealand vs. France): different cultures may adapt differently, and what is considered acceptable may differ as well. Specific to the Otago region of New Zealand, where this study took place, nearly 41 % [40] of the workforce is employed in an occupation that could potentially be labeled as labor-intensive (trade workers, plant and machine operators and assemblers, agriculture, and fishery workers), which could result in lower functional PASS cut-points to achieve a state of satisfaction simply because of work demands.
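The scale conversion used in this comparison is a simple linear rescaling. A minimal sketch follows, with a hypothetical helper name; the only inputs taken from the text are the cut-point of 35 and the 0–170 and 0–100 ranges.

```python
def rescale(score, source_max, target_max=100):
    """Linearly rescale a score from a 0..source_max scale to 0..target_max."""
    return score * target_max / source_max


# WOMAC physical function cut-point of 35 on the 0-170 scale.
converted = rescale(35, 170)
print(f"{converted:.1f}")  # 20.6, reported in the text as 21
```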

Previously, the MCII has been utilized to identify the minimal amount of improvement needed to be meaningful to the patient. It has been postulated that the MCII could be utilized globally when evaluating effectiveness of care when comparing pre- and post-intervention outcome measures. Recently, however, this concept has come under scrutiny [18, 41]. MCII estimates have been found to vary greatly based on the baseline demographics such as gender, chronicity of symptoms, age, and severity of symptoms [15, 17]. We have suggested the use of identified cut-points associated with the PASS on commonly used outcome measures as a useful complementary measure to the MCII in terms of planning long-term goals for treatment and plans for patient discharge.

Utilization of the identified cut-points requires further scrutiny before adaptation into clinical practice. Tubach et al. [14] found PASS estimates to vary greatly based on the chronicity of disease and with disease activity with patients seeming to accept higher levels of pain and functional impairment in chronic conditions [12]. In addition, Maksymowych and colleagues [13] recently determined the PASS to be a variable versus stable value influenced by age, duration of disease, and gender in patients with ankylosing spondylitis. However, in that study, the duration of follow-up was only 24 weeks, compared to the 1-year follow-up in this study. The stability of PASS estimates has also been scrutinized, which has been hypothesized to be associated with patient expectations with regard to treatment effectiveness in a pharmacological trial as described earlier [13]. However, the MCII has been reported as even more variable and dependent upon baseline characteristics highlighting the PASS as a more robust and stable measure of change [14, 42, 43]. Other factors influencing the PASS estimate could be attributed to varying compensation systems. For instance, someone paying out of pocket for treatment may accept a higher level of pain and disability given the cost of treatment, whereas someone covered by insurance for a certain number of visits may not report a state of satisfaction until the end of treatment after achieving a certain level of function. Again, further studies with more varied populations and compensation systems are needed to make these findings more generalizable.

Given that hip OA has a significant impact on function, yet is variable in presentation, cutoff scores that are robust should facilitate clinical decision making on course of treatment and may in the future predict function, disability, or mortality. A stratification based on severity of disease will likely provide more utility given the above-described confounding factors of severity and chronicity of pain. Multiple authors have reported on the variability of MCID estimates or PASS estimates based on the severity of symptoms at baseline [14, 15, 17]. The general consensus from these studies [14, 15, 17] is that the greater the severity of symptoms at baseline, the more change that is needed to report meaningful change. PASS estimates tend to be higher for those patients with higher baseline severity levels [14, 15]. This makes sense given low baseline functional status scores will require greater change to report improving functional status. However, these patients may also be satisfied at a functional level lower than someone with a higher functional level at baseline. This highlights an important point in that using a single point estimate based upon the average score of the group is not representative of the wide distribution of baseline severity levels. At the individual level, reported PASS estimates may misclassify people below the mean as not having experienced satisfactory change when in fact they have. Adjusting for baseline severity is one method of addressing this limitation but more studies are needed.

Given the arbitrary nature of achieving a ‘meaningful change’ score in an outcome measure, we propose the use of identified cutoff scores, such as those identified in this paper, that better reflect the patients’ perspective for the satisfaction of their current level of function. Such cutoff scores would be similar to studies that have identified age- and gender-matched normal walking speeds for healthy individuals in which a functional goal could be established in clinical practice [44]. Identified PASS estimates could be incorporated as end points in clinical practice whereby clinicians can determine a successful response to treatment when the patient has achieved a state they consider satisfactory.

Limitations

Our study had several limitations. The sample size for the purposes of ROC analysis was only 65, which may have affected the precision of our estimates as well as our ability to detect significant differences at 9 weeks, when only 4 of the 7 outcome measures reached significance. However, even with this small sample size, some significant differences were detected at 9 weeks, and at 1 year significant differences were found for all outcome measures. Regardless, all results should be interpreted with caution and as preliminary given such a small sample size. The small sample size also prevented us from stratifying the analysis to assess whether baseline levels of pain and function affected the PASS estimates. Given that the PASS will likely vary based upon the specific impairments and activity limitations relevant to a particular patient population, the extent to which these values can be applied to populations other than hip OA is unclear. However, this is the first study to report PASS estimates for commonly used outcome measures in patients with hip OA undergoing physiotherapy treatment.

The data in this study are not definitive, and more studies are needed to validate our findings in larger samples. Given the fluctuating clinical symptom state of hip OA, identified cut-points may be elevated or low based on day-to-day variations. Larger sample sizes would likely clarify these numbers and narrow confidence intervals associated with fluctuating symptoms.

Conclusion

The PASS is directly related to the personal experience of the patient and may be complementary to the MCII when identifying patient response to treatment. Previous authors have suggested that it is more important for a patient to feel good or satisfied (PASS) than to feel better or improved (MCII). Although the utilization of outcome measures in hip OA is common, the identification of meaningful change is variable. This is the first study to identify target PASS cut-points in commonly used outcome measures, in particular the WOMAC pain and function subscales, for patients undergoing physiotherapy treatment for impairments related to hip OA. Given the inherent and known weaknesses of the MCII, the authors recommend increased utilization of the PASS to assess patient response to treatment and to determine a plan for patient discharge. Further investigation in larger samples with stratification based on severity is warranted.