Written questionnaires referred to as health-related quality of life (HRQL) reports provide clinicians with an easily administered, cost-effective method of measuring treatment outcomes. Because decisions regarding therapy and policy are based on the outcomes of treatment, caregivers have the responsibility of defining and specifically measuring the effects of intervention and quality of care on an ongoing basis. Furthermore, constraints in resources force healthcare workers to prove treatments to be efficacious, cost-effective, and patient-focused with an increased emphasis being placed on productivity. In the field of hand surgery, a number of outcome tools have been developed. Determining which measures are practical for clinical use in specific situations will allow for comparisons between studies and treatments while aiding in decision making with the goal of improving quality of care.

Responsiveness is an instrument’s ability to accurately detect change when it has occurred [1]. It is not a fixed property and should be considered a contextualized attribute of a measure and described in relation only to a specific purpose and situation. In this study, we determined the responsiveness of three common outcome questionnaires: the Disabilities of the Arm, Shoulder, and Hand (DASH) questionnaire; the Michigan Hand Questionnaire (MHQ); and the Patient-Specific Functional Scale (PSFS), in four different surgical situations of the hand or wrist. Although these instruments assess slightly different domains, they all attempt to capture HRQL with acceptable rigor. Psychometric properties have been reported for all three instruments [2, 7, 8, 18, 19, 21, 22, 24]; however, many physicians are not sure what type of outcome information should be collected routinely and which tools are clinically useful [32]. Due to a lack of comparison studies, user preference is the only factor currently dictating which measure to use in clinical situations. Evaluating the responsiveness of available tools in common surgical hand or wrist situations is necessary to clarify which is the most clinically useful method of outcome data collection. The objective of this study was to determine and compare the responsiveness of each questionnaire during three time intervals in each of four surgical hand or wrist situations.

Materials and Methods

Subjects

Consecutive adult candidates for surgery to treat carpal tunnel syndrome, wrist pain, finger contracture, or tumor were invited to participate. All study procedures were approved by the Research Ethics Board at Sunnybrook Health Sciences Centre, and written informed consent was obtained from all study subjects.

Questionnaires

The DASH is a 30-item questionnaire used to measure disability for any disorder affecting the upper extremity by assessing severity of symptoms and difficulty in completing specific tasks [18]. Its validity, reliability, and responsiveness have been reported for a variety of upper extremity conditions [3, 12, 16, 26, 29, 36]. The score, which does not distinguish between the right and left extremities, is transformed to a scale of 0 to 100, where a higher score indicates more severe disability.

The MHQ is a hand-specific questionnaire for patients with chronic hand conditions [7]. Consisting of 57 items, it distinguishes between the left and right hands over six domains including overall hand function, activities of daily living, pain, work performance, aesthetics, and patient satisfaction with function. The MHQ has been used in carpal tunnel syndrome [20, 21], distal radius fracture [9], reconstruction [5, 6], and arthroplasty in rheumatoid arthritis [14, 25]. Each domain is scored from 0 to 100, where a lower score denotes more severe disability except for the pain domain where the opposite holds true. The final score is obtained by averaging the six scores after reversing the pain score.
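The overall MHQ score described above can be illustrated with a minimal sketch. It assumes the six domain scores have already been computed on the 0 to 100 scale; the domain keys, the function name `mhq_total`, and the example values are hypothetical and are not taken from the MHQ scoring manual or from the study data.

```python
from statistics import mean
from typing import Mapping

# Hypothetical keys for the six MHQ domains named in the text.
MHQ_DOMAINS = ("function", "adl", "pain", "work", "aesthetics", "satisfaction")


def mhq_total(domain_scores: Mapping[str, float]) -> float:
    """Average the six domain scores after reversing the pain domain.

    Each domain is assumed to be on a 0-100 scale. Pain is the only domain
    in which a higher score means more severe disability, so it is reversed
    (100 - pain) before averaging.
    """
    missing = [d for d in MHQ_DOMAINS if d not in domain_scores]
    if missing:
        raise ValueError(f"missing MHQ domains: {missing}")
    adjusted = [
        100.0 - domain_scores[d] if d == "pain" else domain_scores[d]
        for d in MHQ_DOMAINS
    ]
    return mean(adjusted)


if __name__ == "__main__":
    # Made-up domain scores for a single patient, for illustration only.
    example = {
        "function": 50, "adl": 60, "pain": 40,
        "work": 40, "aesthetics": 70, "satisfaction": 20,
    }
    print(mhq_total(example))
```

Reversing pain first keeps all six domains pointing in the same direction, so a lower total consistently denotes more severe disability.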

Finally, the PSFS was developed as a standardized method for eliciting and recording functional status limitations that are important to individuals [31]. Patients are asked to identify up to five important activities that they are unable to perform or are having difficulty with as a result of their condition. Items are followed over time to provide a comparison of activity levels at a given point with respect to a pre-disability state [19]. The PSFS can be utilized in any investigation of improvement based on activities chosen by the patient and is therefore suitable for numerous conditions involving disability due to injury, disease, and/or pain. It has been used to evaluate outcomes of treatment for neck pain, back pain, and disorders of both the upper and lower extremities [4, 10, 27, 30, 33–35].

Administration

Each subject completed the DASH, MHQ, and PSFS shortly before surgery and 3 and 6 months after surgery. Subjects with incomplete questionnaire profiles because of lack of attendance or survey spoilage were excluded. Patients were not excluded on the basis of clinical success or failure.

Analysis

Scores on all questionnaires were calculated at all time points and converted into percentages where applicable. For comparison purposes, DASH scores were reversed so that a lower score corresponds to more severe disability. Percentages were analyzed using repeated measures one-way analysis of variance and Newman–Keuls multiple comparison tests. Standardized response means (SRM), which measure responsiveness when scores from the same patients at two time points are compared [22], were calculated by dividing the mean change in score by the standard deviation of the change scores. According to Cohen [11], a SRM of 0.2 is considered small, 0.5 medium, and 0.8 large. All analyses were carried out using SAS Version 9.1 (SAS Institute, Cary, NC, USA).
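The SRM calculation and the Cohen benchmarks can be sketched as follows. This is an illustrative Python version, not the SAS code used in the study. The function names, the use of the sample (n − 1) standard deviation, the treatment of the benchmarks as lower bounds of each range, and the example scores are all assumptions made for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def reverse_dash(score: float) -> float:
    """Reverse a 0-100 DASH score so that lower means more severe disability."""
    return 100.0 - score


def standardized_response_mean(before: Sequence[float],
                               after: Sequence[float]) -> float:
    """Mean of the paired change scores divided by their standard deviation.

    Assumes the two sequences hold scores for the same patients in the same
    order. The sample (n - 1) standard deviation is used, which is an
    assumption; the text does not state which estimator was applied.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    if len(before) < 2:
        raise ValueError("at least two patients are required")
    changes = [a - b for b, a in zip(before, after)]
    sd = stdev(changes)
    if sd == 0:
        raise ValueError("SRM is undefined when all change scores are equal")
    return mean(changes) / sd


def cohen_label(srm: float) -> str:
    """Bin an SRM using the 0.2 / 0.5 / 0.8 benchmarks as lower bounds."""
    size = abs(srm)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


if __name__ == "__main__":
    # Made-up scores for four patients (higher = less disability).
    preop = [40.0, 50.0, 60.0, 45.0]
    month6 = [55.0, 60.0, 80.0, 50.0]
    srm = standardized_response_mean(preop, month6)
    print(round(srm, 2), cohen_label(srm))
```

Because the SRM divides by the variability of the change scores rather than by the baseline spread, a uniform response to treatment yields a high SRM even when the mean change is modest.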

Results

All Subjects

A total of 799 eligible patients were initially enrolled in the study. After excluding data from incomplete surveys and those lost to follow-up, 81 participants had completed the three questionnaires at all three time points. The mean length of time between the baseline and first follow-up and between the first and second follow-ups was 3.1 and 6.7 months, respectively. The sample consisted of 20 subjects (24.7%) receiving surgical treatment for carpal tunnel syndrome, 21 subjects (25.9%) for wrist pain, 34 subjects (42%) for finger contracture, and six subjects (7.4%) for tumor. The mean age of study participants was 49.9 ± 16.2 years.

Mean scores for all groups at all time points are displayed in Fig. 1 and Table 1. When all subjects were combined, the DASH, MHQ, and PSFS detected a significant postoperative improvement (p < 0.05) at both 3 and 6 months, whereas only the MHQ detected a significant improvement between 3 and 6 months (p < 0.05). Responsiveness during each interval for all groups is shown in Fig. 2 and Table 2.

Figure 1

DASH, MHQ, and PSFS mean preoperative, 3-month postoperative, and 6-month postoperative scores. a Whole sample (n = 81); b carpal tunnel group (n = 20); c wrist pain group (n = 21); d finger contracture group (n = 34); e tumor group (n = 6). *Three-month mean score significantly differs from respective preoperative mean score. Six-month mean score significantly differs from respective 3-month mean score. DASH scores are reversed so that a higher score reflects less disability. Error bars represent standard deviation.

Figure 2

Standardized response means (SRM) for all questionnaires and all groups from preoperative to 3-month, preoperative to 6-month, and 3- to 6-month periods. a Whole sample (n = 81); b carpal tunnel group (n = 20); c wrist pain group (n = 21); d finger contracture group (n = 34); e tumor group (n = 6).

Table 1 DASH, MHQ, and PSFS mean preoperative, 3-month postoperative and 6-month postoperative scores ±SD for each surgical situation.
Table 2 SRM for all questionnaires from preoperative to 3-month, preoperative to 6-month, and 3-month to 6-month periods.

Carpal Tunnel Syndrome

All instruments detected a significant postoperative improvement at both 3 and 6 months (p < 0.05). From 3 to 6 months, only the MHQ detected a significant improvement (p < 0.05). The MHQ exhibited the highest SRM (1.04) during the preoperative to 6-month period. The MHQ was also responsive during the preoperative to 3-month (0.58) and the 3- to 6-month periods (0.68). The DASH was responsive during the preoperative to 3-month and preoperative to 6-month periods (0.64, 0.77), and the PSFS was responsive during the preoperative to 6-month period (0.65).

Wrist Pain

The DASH and MHQ showed a significant 3- and 6-month postoperative improvement (p < 0.05). The MHQ exhibited the highest responsiveness of the three questionnaires, which was observed during the preoperative to 6-month period (0.87). The MHQ was also responsive during the preoperative to 3-month period (0.61). The DASH was responsive during the preoperative to 3-month (0.53) and preoperative to 6-month periods (0.61). The PSFS was not responsive for this group.

Finger Contracture

The MHQ and PSFS detected a significant postoperative improvement over 6 months (p < 0.05), and the PSFS detected a significant improvement from 3 to 6 months (p < 0.05). The PSFS and MHQ were responsive only during the preoperative to 6-month period (0.64 and 0.62, respectively). The DASH was not responsive for this group.

Tumor

No instrument detected a significant improvement during the intervals measured. Nonetheless, the DASH was responsive during the preoperative to 3-month period (0.55).

Discussion

In this study, we compared the responsiveness of three hand surgery outcome questionnaires and found more than one tool to be responsive. The MHQ was the most responsive for those with carpal tunnel syndrome and wrist pain. The DASH was the most responsive for those with tumor, and the PSFS was the most responsive for those with finger contracture. When all groups were combined, the MHQ demonstrated higher responsiveness than the DASH and PSFS. When considering the time periods during which responsiveness was measured, the highest responsiveness for the carpal tunnel, wrist pain, and finger contracture groups was observed during the preoperative to 6-month period, while responsiveness was greatest from the preoperative to 3-month period in tumor patients. These results support the notion that responsiveness, an element of validity, should be reported in the context of both follow-up period and diagnosis [1]. Our results indicate variable responsiveness among time periods within patient samples. This is of clinical importance because follow-up periods after different procedures may differ in length. If a long follow-up period is not feasible in practice, the use of a questionnaire that is responsive to the population during a more realistic period may be useful.

Responsiveness studies in the field of hand/wrist surgery are limited. At present, responsiveness of the MHQ has been studied in those with distal radius fracture [22], in those with carpal tunnel syndrome [21], and in populations with a variety of chronic hand conditions [8]. Kotsis and Chung [21] compared responsiveness (SRM) of the DASH and MHQ in 50 patients undergoing surgery for carpal tunnel syndrome. They concluded that the MHQ (components ranging from 0.5 to 1.1) and the DASH (0.7) were both responsive over a 6-month postoperative period. In the present study, the MHQ (1.04) and DASH (0.77) were both responsive in patients with carpal tunnel syndrome, reinforcing the results of Kotsis and Chung. Although the MHQ detected more severe preoperative disability in the present study (41.01) compared with the study by Kotsis and Chung (52.12), postoperative averages were comparable (61.83 versus 64.6, respectively), as were SRMs, indicating similar responsiveness. Likewise, the DASH detected more severe disability before surgery (56.25) compared with preoperative DASH scores obtained by Kotsis and Chung (38.1). After surgery, our subjects’ mean DASH score (38.3) was similar to the preoperative score of Kotsis and Chung’s sample (38.1) and higher than their postoperative score (20.6). Although the preoperative disability level differed between studies, both conclude that the DASH and MHQ are responsive for those undergoing carpal tunnel surgery.

Greenslade et al. [15] observed a SRM of 0.66 for the DASH during a 3-month postoperative period in those with carpal tunnel syndrome. We observed a SRM of 0.64 for the same period, providing further evidence that the DASH is responsive in this population. Gay et al. [13] also found the DASH to be sensitive to clinical change in carpal tunnel syndrome 12 weeks following surgery, leading to a recommendation that the DASH be used as the primary outcome tool when postoperative follow-up evaluation is at least 12 weeks. A recent analysis of outcome measures for carpal tunnel syndrome expressed potential for the DASH in this population but asserted that further validation is required [28]. Studies exploring the validity of the DASH for specific surgical hand situations will clarify the benefits of using this tool. MacDermid et al. [24] investigated the responsiveness of the DASH in distal radius fracture outcomes, reporting extremely high SRMs for both 0- to 3-month and 0- to 6-month periods. The authors attribute this finding to the acute nature of a distal radius fracture and a more uniform response to intervention and extreme clinical change. These results provide evidence that the DASH may be more responsive to change in patients with acute conditions than in patients with chronic conditions.

We found the DASH to be the only responsive questionnaire for the tumor group. Whereas the highest responsiveness for all other groups occurred during the preoperative to 6-month period, the tumor group experienced the highest responsiveness during the preoperative to 3-month period. One explanation for this finding is the short recovery period associated with tumor patients, where the most extreme change is seen within the first 3 months. Impaired function resulting from a tumor is often caused by structural changes, with disability subsiding almost immediately following removal. Another finding of interest is that the tumor group had the highest mean preoperative scores on all three measures, implying less severe preoperative impairment than the other groups and, therefore, little room for improvement. This may result in a ceiling effect, which can discount the use of a measure [24]. Similarly, absence of a significant change in scores from one time point to the next suggests that either the instrument is failing to address relevant issues or that no clinical change has occurred.

Moreover, 161 tumor patients were originally enrolled in the study; however, complete data were available for only six subjects. Patients with less severe cases are often less likely to attend follow-up visits, which increases their likelihood of exclusion from the study. As a result, our sample may misrepresent patients with hand/wrist tumor.

While our results demonstrate acceptable responsiveness of the DASH for three of the four groups, the MHQ exhibited higher responsiveness for those with carpal tunnel syndrome and wrist pain. While reasons for this remain elusive, one explanation could be that the MHQ is able to track symptom improvement versus functional improvement as separate scales, giving a more detailed picture of how and why the patient is or is not improving. The MHQ also has the advantage of addressing only hand issues, whereas the DASH was developed for all disorders of the upper extremity. Because it also provides information about each hand, an unaffected hand may be used as a within-subject control [21]. Responsiveness of the DASH in those with wrist pain, finger contracture, and hand/wrist tumor has not been reported elsewhere.

Data for those with finger contracture show that the PSFS was the most responsive of the three instruments for this group. Since finger contracture often involves one digit, impairment may be limited to very specific activities depending on which digit is affected. Domain-specific questionnaires designed for the entire upper limb such as the DASH and those focused on pain and broad hand functions such as the MHQ may not be as useful for this population, as our results suggest. Scores on the DASH and MHQ did not change significantly from one time point to the next, implying that the domains addressed by these two questionnaires may be irrelevant to the finger contracture group. Reasons for this may include low pain levels often associated with finger contracture and the fact that finger contracture patients may not experience difficulty performing the activities listed on the DASH and MHQ. This may also explain why finger contracture patients in our study scored high on both the MHQ and the DASH, indicating little room for postoperative improvement and the possibility of a ceiling effect. Herweijer et al. [17] conducted the only study to date involving outcome questionnaires in those with finger contracture and reported a significant improvement at 10 months detected by both the DASH and MHQ. The authors did not evaluate the PSFS, nor did they examine responsiveness. Because many patients with finger contracture experience very specific limitations, the PSFS may be an appropriate choice; however, further investigation is required to establish the psychometric properties of outcome measures in finger contracture patients.

In a comparison of nine patient-specific indices in those with musculoskeletal disorders, the PSFS was found to demonstrate content validity, generalizability, and feasibility [19]. Because some clinicians prefer to rely on individual patient concerns and improvement as indicators for problem identification or treatment monitoring as opposed to fixed-item questionnaires, the clinical application of the PSFS is appealing. The drawback of the PSFS is that allowing patients to generate their own items presents difficulty in comparing scores across patients and settings. Patient-specific tools such as the PSFS are difficult to statistically analyze because standardization is not possible, limiting the score’s ability to hold a common meaning among patients [23]. Despite this notion and the absence of reports of reliability, Pearson correlations and effect sizes have been used to calculate the performance of the PSFS [19].

There is no standard equation available to calculate an appropriate sample size for responsiveness; however, a sample size calculation for reliability is one way to approach this. For an alpha of 0.05, a beta of 0.2, and to account for a 10% dropout/survey spoilage rate, the estimated appropriate sample size was 51 subjects per group for a total of 204 subjects. Despite initial adequate recruitment, a large number of subjects discontinued. Patients did not show up for follow-up visits or declined follow-up questionnaire completion complaining it was onerous.

An additional calculation of responsiveness from data of all subjects who completed at least the preoperative and 3-month questionnaires was performed. Sample size increased to 153 subjects, including 46 with carpal tunnel syndrome, 33 with wrist pain, 61 with finger contracture, and 13 with tumor. Figure 3 and Table 3 display the responsiveness of the instruments for each group during the 3-month postoperative period. When compared with the same time period in Fig. 2, responsiveness of the DASH and MHQ remained in the medium range for those with carpal tunnel syndrome and wrist pain, while the PSFS remained in the low range. All three measures maintained low responsiveness for the finger contracture group. For those with tumor, responsiveness of the DASH remained in the medium range, whereas responsiveness of the MHQ and PSFS increased from the low range to the medium range (0.43 to 0.56 and 0.49 to 0.68, respectively). Although no formal assumptions can be made based on these comparisons, these additional data suggest that a larger sample size may provide more accurate results for those in the tumor group and results would not change for the other three groups.

Figure 3

Standardized response means (SRM) for all questionnaires for the preoperative to 3-month period including subjects who completed at least baseline and 3-month questionnaires. a Carpal tunnel group (n = 46); b wrist pain group (n = 33); c finger contracture group (n = 61); d tumor group (n = 13).

Table 3 SRM for all questionnaires for the preoperative to 3-month period (including subjects who completed at least baseline and 3-month questionnaires).

Finally, the weaknesses of measuring responsiveness should not be overlooked. Responsiveness is not a fixed property, but rather an element of validity. Statistics related to responsiveness should be explained in the specific context of the group being measured, the scores being contrasted, and the type of change being quantified [1]. Another factor of importance is the efficiency of treatment. MacDermid et al. [24] attributed a lower responsiveness over a 3- to 6-month interval compared with a 0- to 3-month interval to a difference in the magnitude of the treatment effect. Magnitude influences responsiveness statistics, leading to difficulties in comparing studies on the same condition. MacDermid cites an example of a questionnaire that directly measures a complication missed by other tools and therefore appears statistically insensitive to change when, in reality, it is measuring a negative treatment effect. Because positive, negative, and non-beneficial treatment effects exist and responsiveness statistics may not reflect this, selecting an appropriate outcome instrument to use in the field of hand surgery requires that responsiveness be considered in conjunction with face, construct, and criterion validity. Moreover, measuring mean change in a group of patients can have implications for estimating a measure’s ability to detect meaningful change in individuals. Accepting that a mean change indicates improvement in an individual may imply that several people, who may consider themselves unchanged, would be erroneously considered improved [1].

Developing one questionnaire for the evaluation of HRQL issues in the field of hand surgery is difficult, and the three questionnaires evaluated in our study have advantages and disadvantages. Our study has shown that the MHQ is responsive for those with carpal tunnel syndrome, wrist pain, and finger contracture; the DASH is responsive for those with carpal tunnel syndrome, wrist pain, and tumor; and the PSFS is responsive for those with carpal tunnel syndrome and finger contracture. Clinicians may find the DASH useful to assess function and symptoms combined in one scale. The MHQ is useful when independent scores from different domains are required or when comparison with an unaffected hand is needed. The PSFS is useful when the disorder is affecting a limited number of patient-specific activities. We have found that each instrument is responsive for at least one group, clarifying the applicability of each for outcome studies related to the field of hand and wrist surgery.