Introduction

Assessment of functional outcomes related to health is necessary for both researchers and clinicians as diagnosis alone is a poor predictor of clinical and functional outcomes [1]. Measurement of functional outcomes is becoming increasingly important for assessing improvement in health-related quality of life of patients or populations, justifying and obtaining reimbursement for healthcare services, providing an economic interpretation of the burden of chronic diseases, and demonstrating efficacy of interventions in clinical trials [2–5]. Many measures of the impact of musculoskeletal disorders (MSDs) and chronic health conditions on the performance of functional activities have been developed over the last two decades [6–8]. Although some measures have undergone extensive psychometric testing to determine their reliability and validity, few studies are available to guide researchers and clinicians about which measure(s) are best suited for use in a given setting, population, or stage of disease severity [3, 9, 10].

Some measures of upper extremity function that have been used in recent musculoskeletal research studies, such as the Quick Disabilities of the Arm, Shoulder, and Hand (QuickDASH) [11], the Work module of the DASH (DASH-W), and the Functional Status Scale (FSS) from the Boston Carpal Tunnel Questionnaire [12], were designed and validated for use in clinical populations. The QuickDASH is a region-specific measure that was designed to assess functional outcomes relevant to a range of conditions affecting the upper extremity (UE) [13]. The Work module of the DASH (DASH-W) is designed to assess difficulties with work performance due to UE disorders. The QuickDASH and DASH-W have been tested and widely used in orthopedic and rehabilitation settings among patients for many different UE musculoskeletal disorders (MSDs) [3, 14–18]. However, both workplace-based studies and comparative studies against other functional measures have been more limited, especially for the DASH-W [19–24]. The FSS was designed for use in clinical patients seeking treatment for symptoms of carpal tunnel syndrome (CTS), and has primarily been tested in patients undergoing surgical interventions for CTS [12, 25]. The Short-form 8 health survey (SF-8) is a generic health status measure that assesses multiple domains of health and well-being, and was developed for assessment of health-related quality of life in national health surveys [26]. The SF-8 has been tested in both general population samples and specific patient samples [26, 27].

Clinical populations comprise people who are treatment-seeking and likely represent a more symptomatic and severe range of disease than found in general or working populations. It is unclear how well measures designed for clinical populations perform in relatively healthy active working populations [19, 20] and whether the response scales are able to capture functional limitations corresponding to early stage disease [28]. Using functional measures in working populations may identify early stages of disease, allowing for interventions to improve work ability, prevent disability, and promote return to work following injury [19–21, 29]. Epidemiological studies have used a range of case definitions associated with MSD outcomes including symptoms alone, symptoms plus physical signs, or various functional outcomes [30–32]. Simultaneous comparison of the performance of multiple health measures against various UE MSD outcomes would inform research and clinical practice regarding which measure(s) may be used to identify mild levels of impairment in relatively healthy, active working populations.

The purpose of this study was to evaluate the performance of several standardized measures of functional work performance, activities of daily living, and overall health in relation to four upper extremity (UE) case definitions: (1) UE symptoms, (2) UE musculoskeletal disorders (MSD), (3) carpal tunnel syndrome (CTS), and (4) work limitations due to UE symptoms.

Materials and Methods

Participants

Study participants were subjects in the prospective Predictors of Carpal Tunnel Syndrome (PrediCTS) study. Subjects were enrolled in the PrediCTS study (2004–2006) as full-time, newly hired workers in construction, service, and clerical jobs. Detailed descriptions of subject recruitment for the PrediCTS study may be found in several prior publications [32–35]. The present analysis included subjects who completed a follow-up visit between March 2012 and August 2013, consisting of a self-reported questionnaire, physical examination of the upper extremity, and nerve conduction studies of the hands. This study was approved by the Institutional Review Board of the Washington University in St. Louis School of Medicine. All subjects provided written informed consent and were compensated.

Data Collection

Subjects completed self-reported questionnaires collecting information about demographics, employment history, physical and psychosocial job exposures, UE symptoms, and functional and work limitations related to UE symptoms. Trained research technicians performed a standardized physical examination for clinical signs of UE MSDs, including tenderness to palpation and standard provocative maneuvers. Research technicians also performed standardized bilateral nerve conduction testing of the median and ulnar nerves at the wrist to determine the presence of abnormal median neuropathy consistent with carpal tunnel syndrome. Specific methods are described in previous publications [36–38].

Functional Measures

As described below, standardized measures assessed three general domains of health and functioning: activities of daily living (ADL), work performance, and overall health. Assessments were administered to all participants, although the 1-year recall modified QuickDASH was only administered to subjects with UE symptoms.

One-Year Recall Modified QuickDASH

The QuickDASH is an 11-item, shortened version of the Disabilities of the Arm, Shoulder, and Hand (DASH) outcome measure. The QuickDASH is designed to measure physical functioning in people with UE disorders. Respondents are asked to rate their ability to perform various activities of daily living and the severity of their symptoms, on a scale from “1” to “5.” A minimum of 10 of the 11 QuickDASH items must be completed in order for a score to be calculated. Completed responses are summed and averaged. The average value is transformed to a 0–100 scale by subtracting one and multiplying by 25. Higher scores indicate greater disability [15]. The QuickDASH has been shown to be reliable and valid in various clinical populations [15, 39, 40]. The recall period for the QuickDASH was modified from the original timeframe of “during the past week” to “when your symptoms were the worst in the past year” to parallel the reference time frame used for other measures in the PrediCTS study. Due to the modified recall period, we refer to the QuickDASH in this study as the 1-year recall modified QuickDASH.
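The scoring rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's scoring code: the function name `quickdash_score` and the use of `None` to mark a skipped item are assumptions made for the example.

```python
from typing import Optional, Sequence


def quickdash_score(
    responses: Sequence[Optional[int]], min_items: int = 10
) -> Optional[float]:
    """Score 11 QuickDASH items (each 1-5, None if skipped) on a 0-100 scale.

    Returns None when fewer than `min_items` items were answered,
    mirroring the rule that no score is calculated in that case.
    """
    if len(responses) != 11:
        raise ValueError("the QuickDASH has 11 items")
    answered = [r for r in responses if r is not None]
    if any(r < 1 or r > 5 for r in answered):
        raise ValueError("item responses must be between 1 and 5")
    if len(answered) < min_items:
        return None
    # Average the completed items, then map the 1-5 range onto 0-100.
    mean_response = sum(answered) / len(answered)
    return (mean_response - 1) * 25
```

With this transformation, a respondent who answers "1" to every item scores 0 and one who answers "5" to every item scores 100.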

Modified Functional Status Scale

The Functional Status Scale (FSS) from the Boston Carpal Tunnel Questionnaire is an 8-item questionnaire originally developed by Levine et al. [12] to assess functional abilities in patients with CTS. The FSS has shown good reproducibility, internal consistency and responsiveness to change in surgical patients [12]. Each item of the FSS is rated on a scale from “1” “no difficulty” to “5” “unable.” The overall score for the FSS is calculated as the mean of the completed items, and ranges from 1 to 5 [12]. Higher scores indicate greater disability. Similar to the QuickDASH, the recall period for the FSS was modified from the original two-week recall period to 1 year, and is described as the modified FSS in this study.

One-Year Recall Modified DASH Work Module

The DASH Work module (DASH-W) is a 4-item scale assessing the impact of UE conditions on physical work ability. Workers are asked to rate their difficulty in performing work activities on a scale from “1” “no difficulty” to “5” “unable.” All 4 items must be completed in order to calculate a score. The DASH-W is scored by summing all four responses and dividing by 4 to get an average. Then, 1 is subtracted and the value is multiplied by 25 to get a final score ranging from 0 to 100. Higher scores indicate greater disability [41, 42]. We also used a 1-year recall period for the DASH-W, in order to be consistent with the other outcome measures, which is referred to as the 1-year recall modified DASH-W.

Short Form-8 (SF-8) Health Survey: Physical Component Score

The SF-8 Health Survey is an 8-item scale designed to assess self-perception of overall health and ability to perform daily activities. The SF-8 has shown acceptable reliability compared with the longer, more widely tested SF-36 health survey [26]. Items are scored on “1” to “5” scales with various verbal anchors. The SF-8 was scored to yield the physical component score (PCS-8) according to the developers’ recommendations [26]. If any items were missing, a score was not calculated. Higher scores on the PCS-8 indicate better health. In contrast to the other measures in this study, the PCS-8 used a 4-week recall period.

Case Definitions

Subjects were determined to meet or not meet each of the four UE case definitions described below for (1) UE symptoms, (2) UE MSD, (3) CTS, and (4) work limitations due to UE symptoms.

Upper Extremity (UE) Symptoms

Subjects reported symptoms in three regions of the UE that would commonly be used in epidemiological case definitions for UE MSDs, in response to the following question: “In the past YEAR, have you had any RECURRING (repeated) symptoms in your (Shoulders/upper arms, Elbow/forearms, or Hands/Wrists/fingers) more than 3 times or lasting more than one week?” [43].

Upper Extremity Musculoskeletal Disorders (UE MSD)

Among subjects reporting UE symptoms (#1 above), those who also had a corresponding positive physical sign for a MSD of the shoulder, elbow, or wrist met our epidemiological case definition of an UE MSD [31]. The case definitions considered for this study were rotator cuff tendonitis, biceps tendonitis, lateral or medial epicondylitis, radial tunnel syndrome, cubital tunnel syndrome, deQuervain’s tenosynovitis, wrist flexor or extensor tendonitis, or carpal tunnel syndrome (defined in the following paragraph) [31].

Carpal Tunnel Syndrome (CTS)

CTS cases were contained within the UE MSD category, but we also evaluated CTS cases separately because the FSS (from the Boston CTS Questionnaire) was designed to specifically evaluate function of patients with CTS, CTS is one of the most expensive work-related diagnoses, and we had a small but reasonable number of cases to allow this analysis. Subjects with typical median nerve symptoms and abnormal median neuropathy of the same hand met our case definition for CTS [44]. Median nerve symptoms included numbness, tingling, burning, or pain in at least one of the thumb, index, or middle fingers, reported on a hand diagram [45, 46]. Criteria for abnormal median neuropathy were defined as median distal motor latency >4.5 milliseconds (ms), median distal sensory latency >3.5 ms, or median-ulnar sensory latency difference >0.5 ms [47].
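As a rough illustration of how this case definition combines symptoms and nerve conduction findings within the same hand, consider the following sketch. The `HandFindings` structure and all function names are hypothetical; only the three latency thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class HandFindings:
    """Findings for one hand (hypothetical structure for illustration)."""
    median_symptoms: bool                # symptoms in thumb, index, or middle finger
    median_motor_latency_ms: float       # median distal motor latency
    median_sensory_latency_ms: float     # median distal sensory latency
    median_ulnar_sensory_diff_ms: float  # median-ulnar sensory latency difference


def abnormal_median_conduction(hand: HandFindings) -> bool:
    """Any one of the three thresholds counts as abnormal."""
    return (
        hand.median_motor_latency_ms > 4.5
        or hand.median_sensory_latency_ms > 3.5
        or hand.median_ulnar_sensory_diff_ms > 0.5
    )


def hand_meets_cts(hand: HandFindings) -> bool:
    """Symptoms and abnormal conduction must occur in the same hand."""
    return hand.median_symptoms and abnormal_median_conduction(hand)


def subject_meets_cts(left: HandFindings, right: HandFindings) -> bool:
    """A subject is a case if either hand meets the definition."""
    return hand_meets_cts(left) or hand_meets_cts(right)
```

Under this logic, a subject with symptoms only in the left hand and abnormal conduction only in the right hand would not be counted as a case.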

Work Limitations Due to UE Symptoms

Subjects who reported UE symptoms were asked to complete six additional items from the University of Michigan Upper Extremity Questionnaire (UEQ) [48, 49], to describe work limitations that resulted from having UE symptoms. We created a composite work limitations outcome from these questionnaire items which included self-reported limitations in work ability or productivity, or missing days from work, having job restrictions, or changing jobs or companies due to one’s symptoms. We have used a similar case definition for work limitations in our prior studies [32, 50].

Statistical Analysis

Descriptive statistics were calculated for the study population and the frequencies of each case definition. We assessed associations between the functional measures using Spearman rank correlation coefficients. We considered correlation coefficients of 0.7–0.9 to be strong, 0.4–0.7 moderate, and <0.4 weak [51]. We expected moderate correlations at best between the measures based on differences in the constructs each measure was designed to assess, as well as the strength of associations that have been shown in prior studies [3, 19, 23]. Because lower scores on the SF-8 indicate worse health, whereas higher scores on all of the other measures indicate greater disability, negative correlations were expected between the SF-8 (PCS-8) and the other measures.
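For readers unfamiliar with the rank correlation, a minimal standard-library sketch follows. It computes Spearman's coefficient as the Pearson correlation of average ranks and labels its magnitude using the thresholds above. The function names are illustrative, and two choices are assumptions rather than details from the study: a value exactly on a boundary is assigned to the higher category, and any magnitude of 0.7 or more (including values above 0.9) is labeled strong.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def correlation_strength(rho: float) -> str:
    """Label the magnitude of a coefficient, ignoring its sign."""
    size = abs(rho)
    if size >= 0.7:
        return "strong"
    if size >= 0.4:
        return "moderate"
    return "weak"
```

Because the label depends only on magnitude, a negative coefficient between the PCS-8 and a disability measure is classified in the same way as a positive coefficient of the same size.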

We determined if there were statistically significant differences in mean scores between cases and non-cases for each case definition on each of the measures using Student’s t tests. We also reported the effect sizes (Cohen’s d) to show the magnitude of the differences that were detected between cases and non-cases for each case definition on each functional measure.
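The effect size can be illustrated with a short sketch. It assumes the common pooled-standard-deviation form of Cohen's d; the text does not state which variant was used, and the function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(cases: Sequence[float], non_cases: Sequence[float]) -> float:
    """Cohen's d using the pooled sample standard deviation (assumed variant).

    Positive values mean cases scored higher than non-cases.
    """
    n1, n2 = len(cases), len(non_cases)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two scores")
    # Pool the two sample variances, weighting by degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(cases) + (n2 - 1) * variance(non_cases)
    ) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("effect size is undefined when both groups are constant")
    return (mean(cases) - mean(non_cases)) / sqrt(pooled_var)
```

For measures on which higher scores indicate greater disability, a positive d means cases reported more disability; for the PCS-8, where higher scores indicate better health, worse health among cases would appear as a negative d.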

Finally, we compared the sensitivity and specificity of each measure across each of the case definitions for UE symptoms, UE MSD, CTS, and work limitations, to evaluate how well each measure identified functional impairments across a range of UE case definitions. We selected a cut-point for each measure using normative population scores from the scientific literature. We selected a cut-point of 8.81 points for the 1-year recall modified DASH-W from the U.S. normative population mean value for the standard 1-week recall version of the measure [52]. We selected the U.S. population mean value of 50 as the cut-point for the SF-8 physical component score [26]. As the FSS was designed for a clinical population, there has been no population mean score determined. Therefore, subjects whose score was more than 0.5 standard deviations (SD) above the mean were considered as having functional limitations. Our PrediCTS cohort at 6 months had an average FSS score of 1.14 (SD 0.38) [32]. We used a score of [Mean + 0.5(SD)], or 1.3 points, as the cut-point for the FSS. This value shows slightly less impairment than the post-surgical average score for CTS patients as previously reported by Levine and colleagues (FSS score = 1.9) [12]. Sensitivity and specificity could not be calculated for the QuickDASH since it was only completed by symptomatic workers. Analyses were conducted using R version 3.1 and SAS Version 9.3 (Statistical Analysis System Institute, Cary, NC).
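The cut-point approach can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code: the function names are hypothetical, a score strictly beyond the cut-point is treated as a positive screen (the text does not say how ties were handled), and the `higher_is_worse` flag is an assumed device for the PCS-8, on which scores below the cut-point would indicate worse health.

```python
from typing import Sequence, Tuple


def fss_cut_point(mean_score: float, sd: float, k: float = 0.5) -> float:
    """Cut-point placed k standard deviations above a reference mean."""
    return mean_score + k * sd


def screens_positive(
    score: float, cut_point: float, higher_is_worse: bool = True
) -> bool:
    """Assumed rule: strictly beyond the cut-point counts as impaired."""
    return score > cut_point if higher_is_worse else score < cut_point


def sensitivity_specificity(
    scores: Sequence[float],
    is_case: Sequence[bool],
    cut_point: float,
    higher_is_worse: bool = True,
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of a measure for one case definition."""
    tp = fn = tn = fp = 0
    for score, case in zip(scores, is_case):
        positive = screens_positive(score, cut_point, higher_is_worse)
        if case and positive:
            tp += 1
        elif case:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one non-case")
    return tp / (tp + fn), tn / (tn + fp)
```

Calling `fss_cut_point(1.14, 0.38)` reproduces the arithmetic in the text: 1.14 + 0.5 × 0.38 = 1.33, which rounds to the 1.3-point cut-point.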

Results

From the original PrediCTS cohort, 573 subjects were included in the present analyses. The majority of subjects were male (62 %), with a mean age of 38.4 years (SD 10.8), and the largest proportion was employed in construction trades (31 %) (see Table 1). Among the full cohort, 40 % of subjects had UE symptoms and 25 % of the cohort had symptoms and signs, meeting an UE MSD case definition. The prevalence of work limitations due to UE symptoms and CTS were substantially lower, 9 and 4 %, respectively. Compared with a clinical population in which 100 % of subjects would be symptomatic and would be seeking treatment, there was a relatively low prevalence of disease in this actively working population, with only 12 % of the overall cohort reporting having sought treatment from a medical professional in the past year.

Table 1 Demographic characteristics of the study population and frequencies of the outcomes (n = 573)

Distributions of Functional Outcome Scores

Descriptive statistics including distributions of scores, means, and median scores for each measure are shown in Table 2. Subject responses represented the full range of possible scores on each measure; however, the relatively low median scores across all measures suggested a relatively moderate MSD disease spectrum in this cohort.

Table 2 Distributions of scores of each functional measure in the study population (n = 573)

Correlations Between Measures

Correlations among the measures ranged from weak to strong (−0.34 to 0.85) (Table 3). The 1-year recall modified QuickDASH was strongly correlated with the modified FSS (r = 0.85) and the 1-year recall modified DASH-W (r = 0.76). Correlations of the SF-8 PCS-8 with other measures were weak to moderate (−0.34 to −0.43).

Table 3 Spearman correlation coefficients between the functional measures (n = 573)

Performance of the Measures Against 4 UE Case Definitions

Results of t tests showed statistically significant differences on all functional measures between cases and non-cases for UE symptoms, UE MSD, CTS, and work limitations (Table 4). Cases reported higher levels of ADL limitations, work disability, and worse overall health than non-cases for each outcome. Effect sizes showed larger differences for all measures between cases and non-cases of CTS and work limitations. For CTS, the largest differences between cases and non-cases were shown with the modified FSS; for work limitations the largest differences between cases and non-cases were seen on the 1-year recall modified DASH-W.

Table 4 Differences between cases and non-cases of upper extremity symptoms, upper extremity musculoskeletal disorders, carpal tunnel syndrome, and work limitations due to upper extremity symptoms, on the functional measures (n = 573)

Applying one cut-point for each measure across all case definitions allowed us to compare the sensitivity and specificity of each measure for the four UE case definitions (Table 5). In general, sensitivity of all measures was low and specificity was high in relation to UE symptoms and UE MSD. Sensitivity was higher for classifying workers with CTS and work limitations for all measures. The 1-year recall modified DASH-W showed the highest sensitivity in relation to the work limitations case definition.

Table 5 Sensitivity and specificity of each functional measure for four upper extremity case definitions based on a common cut-point for each measure (n = 573)

Discussion

This study examined the utility of several measures of work ability, functional limitations, and overall health for a range of UE health outcomes in a working population. This study helps to fill important gaps in the literature as few previous studies have directly compared these functional measures for various musculoskeletal case definitions or in an actively working population. Workers with UE symptoms, UE MSD, CTS, and work limitations due to UE symptoms reported worse ADL function, more limited work performance, and worse overall health than non-cases. Measures generally showed higher sensitivity and lower specificity with increasing levels of impairment, suggesting that measures performed better with more defined states of disease in this generally healthy population.

Measures designed to assess similar constructs of health and function were moderately to strongly correlated with one another, such as the modified versions of the QuickDASH and FSS which both address functional performance of daily activities. The strong correlation observed between the 1-year recall modified QuickDASH and DASH-W (r = 0.76) is consistent with the findings of Fan et al., comparing the standard QuickDASH and DASH Work module (r = 0.63) in active workers with UE symptoms and clinical cases for UE MSD [19] and those of House et al. [22] comparing the full DASH and DASH-W in workers with hand–arm vibration syndrome (r = 0.64). In another study that compared the full Boston Carpal Tunnel Questionnaire (BCTQ), from which the FSS is taken, stronger correlations were observed between the BCTQ and the full DASH, than with measures of overall health such as the SF-36, from which the SF-8 was developed [25]. Our findings are consistent with those of Leite et al. [25], with strong correlations observed between the modified FSS and 1-year recall modified QuickDASH, but weaker correlations with the SF-8 physical component score.

Many previous studies have shown that clinical patients with UE MSD report problems with functional performance [11, 14, 53, 54]. Studies in non-clinical populations are limited, but the growing body of literature suggests that active workers with UE symptoms also experience difficulties in ADL and work performance [19–21, 23, 55]. Even in this relatively young, healthy working population in which few workers sought medical treatment (12 %), cases for all outcomes reported more difficulty performing ADL and work activities. In addition, workers with UE conditions also perceived themselves to have lower overall health, as measured by the SF-8. These findings provide support for the ability of all of the measures to discriminate statistically significant differences between cases and non-cases along a range of severity for UE conditions in workers.

The sensitivity and specificity of measures can vary among patients in different settings or different stages of disease severity. Our findings showed higher sensitivity of measures with case definitions that suggest greater levels of impairment, whereas the specificity was lower. These findings suggest that functional measures showed weaker ability to discriminate between workers at lower levels of disease severity. Measures that are more closely related to the outcome are more likely to be sensitive to discriminating cases from non-cases [9]. The DASH-W is an UE region-specific measure which was developed for clinical populations, and has performed well in relation to a variety of UE disorders [56]. In our study, the 1-year recall modified DASH-W showed the highest sensitivity in relation to our work limitations outcome. The FSS is a condition-specific measure designed for use with patients seeking treatment for CTS, and showed its highest sensitivity for CTS versus the other UE case definitions. Even a measure of overall physical health, the SF-8 physical scale, showed differences between cases/non-cases for each UE outcome in this study. Selection of appropriate measures should be guided by the outcome of interest and which measures relate best to the outcome.

As described in a review of functional measures for workers with UE MSDs, few measures have been developed specifically for identifying mild levels of impairment in relatively healthy working populations [28]. Salerno et al. [28] recommended three measures that were developed for research application as the most relevant measures for mild UE conditions: the Nordic Musculoskeletal Questionnaire (NMQ), the Neck and Upper Limb Instrument (NULI), and the UEQ which included items from which our work limitations outcome was derived. Although several measures including the DASH and FSS have been used in previous studies of workers, few studies have tested their performance among workers with mild to moderate UE conditions [19–21, 28]. Our findings provide new evidence supporting the use of these measures in a mildly impaired population, even though they were primarily designed for clinical application.

One limitation of our study was in the design of our questionnaire. The 1-year recall modified QuickDASH was only completed by subjects with symptoms, thus we could not calculate t tests between scores for cases and non-cases or sensitivity and specificity. All measures used a 1-year recall period except for the SF-8 which used the standard 4-week recall period. This difference in recall periods may have contributed to the weak correlations found between the SF-8 PCS-8 and other measures. Modifying the recall periods from those suggested by developers of the QuickDASH, DASH-W, and FSS may limit comparisons of our data with previous studies or with normative data. According to a recent study by Norquist et al. [57], recall periods for patient-reported outcome measures should depend upon the attributes of the disease or phenomenon of interest. Our study was longitudinal with the frequency of follow-up of approximately 1 year. Workers reported on symptoms that ranged from mild to severe and were episodic in nature. The recall periods chosen for the measures included in our questionnaires were selected to correspond with the one-year recall period for the Nordic-style symptom questions. Some authors also caution that lengthening the recall period of measures may cause subjects to underreport functional limitations due to symptoms that occurred as much as 1 year prior [58, 59]. Stepan et al. [54], however, showed that patients with orthopedic hand and elbow injuries were able to accurately recall their baseline functional status on the QuickDASH up to 2 years following an initial office visit.

An important strength of our study was the simultaneous comparison of multiple health measures across a range of UE disease severity. We assessed how well various measures were associated with common MSDs and functional work outcomes. Measures are often chosen without regard to how well they relate to the research question or outcome being studied. Previous studies of functional and disability measures have explored reliability and validity, but seldom provide guidance to researchers and clinicians as to which measure may be most applicable in a given setting or population. Our study population was an active working population rather than a clinical population. Most of the measures in this study were either tested in or designed primarily for use in clinical populations and few studies have examined their utility in working populations with a wider range of disease severity. Although all of the functional and work limitation measures were able to detect differences between the case and non-case groups of active workers across a range of UE health conditions, our results suggest that measures most closely related to the outcome of interest may perform better. The 1-year recall modified DASH-W showed the highest sensitivity and largest effect size for distinguishing workers with and without work limitations, and the FSS showed better performance for the CTS case definition versus the other UE case definitions. Additional longitudinal studies in active working populations are needed. Future work will look at the responsiveness of the measures to detect clinically meaningful change over time and the ability of different measures to predict future disability among active workers. Assessment of functional outcomes is important in both research and clinical practice; however, the performance of measures in the population and setting of interest should be considered.