Introduction

Workers in many sectors experience pain, numbness and tingling in the neck, shoulder, arm, wrist and/or hand. Such symptoms may be warning signs of current or impending musculoskeletal disorders, such as peripheral nerve entrapments (e.g. carpal tunnel syndrome, ulnar tunnel syndrome), peripheral enthesopathies (e.g. shoulder tendinitis, lateral epicondylitis, hand-wrist tendinitis) and many other non-specific musculoskeletal pain disorders [1]. Collectively, these conditions are referred to as upper extremity musculoskeletal disorders [1]. Workers may also experience more acute traumatic injuries of the upper extremity such as crushed fingers, tendon lacerations and burns. Data from the 2005 European Foundation for the Improvement of Living and Working Conditions Survey showed that 25% of the workforce reported work-related neck/shoulder pain and 15% reported work-related arm pain [2]. Together, upper extremity musculoskeletal disorders (MSDs) and traumatic injuries are a large burden to society and to workplaces because of lost productivity, reduced performance and lost-time claims among affected workers [3–5].

Upper extremity MSDs occur as a result of many factors. Physical, psychosocial and personal factors are a few of the many known occupational MSD risk factors [6, 7]. The multicausal occupational risk profile requires multiple solutions. Current practices in upper extremity MSD management are as diverse as the MSD risk factors. Practices include various workplace interventions (ergonomics training, workstation adjustments, work redesign), clinical interventions (physical therapy clinic at the worksite) and disability management programs (implemented by employers, insurers and jurisdictions). Despite the frequency and high cost of upper extremity MSDs and the range of initiatives implemented to prevent them, little is known about which occupational health and safety (OHS) interventions are the most effective.

Prior reviews have examined the effectiveness of interventions for reducing or preventing upper extremity musculoskeletal conditions, but the prior reviews [8–23] differ from this current review in many ways. Several focused on clinically-based interventions not specific to the workplace [9, 11–13, 16, 23]. Others had a narrower scope and were restricted to specific clinical disorders and populations (e.g. persons with carpal tunnel syndrome [14]) or focused on a specific industry/sector (e.g. nursing [21] or computer users [18, 19]). Many included a broader range of musculoskeletal outcomes and therefore are not specific to upper extremity musculoskeletal signs, symptoms, disorders, injuries, claims or lost time [17–22]. Prior reviews allowed a wide range of studies with varying methodological quality to contribute to the evidence synthesis. Unlike prior reviews, the current review does not include single group designs (i.e. no control or comparison group) [8, 17, 20] or “low” quality studies [8, 11, 12, 17]. This systematic review used a structured methodology for evaluating the literature and synthesizing evidence regarding workplace interventions focused on upper extremity MSDs [18, 24–27]. Specifically, the research answers the following question: “do OHS interventions have an effect on upper extremity musculoskeletal symptoms, signs, disorders, injuries, claims and lost time?” Further, we seek to identify which specific types of OHS interventions are effective.

Materials and Methods

OHS studies were reviewed using a systematic review process that was developed by The Cochrane Collaboration [28] and adapted by the review team. A review team of 14 researchers from the United States, Canada and Europe participated. Reviewers were identified based on their expertise in conducting epidemiologic or intervention studies related to upper extremity MSDs among workers, their experiences in conducting systematic reviews or their clinical expertise. Review team members had backgrounds in epidemiology, ergonomics, kinesiology, occupational medicine, physical therapy, safety engineering and information science.

The basic steps of the systematic review process are listed below.

  • Step 1—Formulate the research question and search terms with research team and stakeholders.

  • Step 2—Conduct the literature search and pool articles with those submitted by experts.

  • Step 3—Level 1 review: select articles for inclusion based on relevance to the review question using 6 screening criteria.

  • Step 4—Level 2 review: assess quality of relevant articles with scoring on 16 criteria.

  • Step 5—Level 3 review: extract data from relevant articles for summary tables.

  • Step 6—Conduct evidence synthesis and develop recommendations with research team and stakeholders.

The review team used a consensus process throughout each review step. For step 1, the review team and stakeholders reached consensus on the primary question: “do occupational health and safety interventions have an effect on upper extremity musculoskeletal symptoms, signs, disorders, injuries, claims and lost time?” To perform a well-defined literature search, the team agreed on definitions for workplace or work setting, occupational health and safety interventions, and upper extremity musculoskeletal disorders and injuries.

Workplace or work setting was defined as any location where a worker is performing his or her assigned work.

Occupational health and safety interventions were defined as any primary, secondary or tertiary OHS interventions designed to reduce or prevent musculoskeletal symptoms, signs, disorders, injuries, claims and lost time. Violence prevention programs where the primary goal was preventing injury resulting from violence were excluded. However, biomechanical interventions designed to reduce assaults and musculoskeletal injuries were included. Interventions that were not delivered in the workplace, except workplace mandated pre-placement screening programs, were excluded (e.g. physical therapy clinics, work-hardening programs). Pre-placement screening programs were included as an intervention, regardless of where they were completed, as long as they were mandated by the employer (e.g. nerve conduction testing or genetic testing). Studies designed to examine productivity were only included if upper extremity musculoskeletal symptoms, signs, disorders, injuries, claims and lost-time outcomes had been analyzed. Productivity studies were excluded if they did not include health outcome data.

Upper extremity musculoskeletal disorders and injuries were defined as musculoskeletal symptoms and signs or clinical diagnoses in the following body locations: neck, shoulder, upper arm, elbow, forearm, wrist and hand [29]. These included injuries to or disorders of the muscles, tendons, ligaments, joints, nerves, blood vessels or related soft tissue including sprains, strains and inflammation. Workers’ compensation claims data and employer reports were included despite the validity and reliability vulnerabilities of these data sources. These data sources are important to stakeholders who use them to assess intervention usefulness. Excluded body locations included the thoracic spine, lower extremity (including hip, knee, ankle and foot), lumbar spine and low back. Also excluded were studies that reported only total symptoms (i.e. total body symptom count). We excluded studies where changes in exposure to physical risk factors were the primary outcome without considering changes in MSDs and injuries. This exclusion avoided having to review the vast literature on exposure alone. Surgeries, cancers and pregnancy-related musculoskeletal symptoms, signs, disorders and diagnoses were excluded.

The review team considered published or in-press peer-reviewed scientific articles. There were no language restrictions. Book chapters, dissertations and conference proceedings were excluded.

Literature Search

The literature search was based on the research question and the above definitions. The search included the following databases: MEDLINE, EMBASE, CINAHL, PsycINFO and Business Source Premier. Search terms were identified for three broad areas: work setting terms, intervention terms, and health outcome terms (Table 1). Search categories were chosen to be exclusive within each area. The terms within the work setting and intervention categories were combined using a Boolean OR operator. The terms within the health outcome category were divided into three subcategories: upper extremity terms, injury/disease terms, and specific upper extremity injury/disease terms. The terms within each subcategory were combined using the Boolean OR operator. The upper extremity subcategory was combined with the injury/disease subcategory terms using a Boolean AND operator and the result was combined with the specific upper extremity injury/disease terms using a Boolean OR operator. The three main categories were then combined using a Boolean AND operator.
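The Boolean structure described above can be sketched in a few lines of Python. This is only an illustration of how the categories nest: the helper names (`any_of`, `all_of`, `build_query`) are hypothetical, and the example terms are placeholders rather than the actual terms listed in Table 1.

```python
def any_of(terms):
    """Combine terms with the Boolean OR operator."""
    return "(" + " OR ".join(terms) + ")"


def all_of(groups):
    """Combine already-built groups with the Boolean AND operator."""
    return "(" + " AND ".join(groups) + ")"


def build_query(work_setting, intervention, upper_extremity,
                injury_disease, specific_ue):
    """Nest the categories as described in the text.

    (work setting) AND (intervention) AND
    (((upper extremity) AND (injury/disease)) OR (specific UE injury/disease))
    """
    health_outcome = any_of([
        all_of([any_of(upper_extremity), any_of(injury_disease)]),
        any_of(specific_ue),
    ])
    return " AND ".join([any_of(work_setting), any_of(intervention),
                         health_outcome])


# Placeholder terms for illustration only (NOT the Table 1 search terms).
query = build_query(
    work_setting=["workplace", "worker"],
    intervention=["ergonomics", "training"],
    upper_extremity=["wrist", "shoulder"],
    injury_disease=["pain", "injury"],
    specific_ue=["carpal tunnel syndrome"],
)
print(query)
```

Keeping the three health outcome subcategories as separate arguments makes the one OR-of-an-AND step explicit, which is the part of the strategy most easily misread.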

Table 1 Search terms

Before the literature search, the review team identified 50 relevant articles to test the search’s face validity. The initial search did not include several specific index terms for the upper extremity and intervention categories, causing some articles to be missed. Once the index terms were added, 41 of the 50 articles were captured. Of the nine not captured, two were not indexed in any of the databases searched, three were excluded because the search was limited to human subjects and four were not indexed with any upper extremity and/or intervention terms. Consequently, the review team considered the search valid.

The review team contacted 42 content experts to solicit relevant articles that might not be identified by the search. Six experts responded suggesting five articles. Team members discovered two potentially relevant articles that were in press while the review was in progress [30, 31]. The newly found articles were added to the review process.

Level 1: Selection for Relevance

The inclusive search strategy captured many articles not relevant to our research question. A relevance step was designed to identify and exclude non-relevant articles as efficiently as possible. Article relevance (Level 1a) was based on responses to five questions (Table 2). Reviewers read only the article title and abstract and entered responses on commercially available review software (Systematic Review Software [SRS]) [32]. SRS allowed centralized article tracking and access.

Table 2 Level 1—screening questions and the response that led to exclusion. An exclusionary response to any one question would exclude the article from further review

If reviewers did not know how to answer a question, they were instructed to mark it as “unclear”. In such cases, the article would move forward to Level 1b where the full paper was reviewed to determine relevancy. In addition to Questions 1–5 in Table 2, the review team considered single group designs (i.e. no control or comparison group) and studies with only post-intervention measures (i.e. no pre-intervention measures) fatally flawed for evaluating intervention effectiveness. Therefore, additional questions were added when reviewing full articles passing Level 1a review (see Level 1b, Question 2, Table 2). Articles at Level 1a were reviewed by individual team members, while two team members reviewed each article at Level 1b.

Since a single reviewer conducted the Level 1a review, there was a possibility for error. Therefore, a quality control (QC) check was done with an independent reviewer (QC reviewer). The QC reviewer assessed a randomly chosen set of one per cent of the articles that were subjected to Level 1a review (n = 140). The quality control check contained 70 articles that were excluded at Level 1a and 70 articles that would continue onto subsequent review levels. QC reviewer responses were entered into SRS software so they could be directly compared to a team member’s responses.

The QC reviewer disagreed with the exclusion category selected by the original reviewer for 26 of the 140 articles. In 15 of 26 cases (58%), the QC reviewer excluded the article while the original reviewer included it. We did not consider over-inclusion a problem since the article would be reviewed at the next level for relevance by two team members. There were 11 articles where the QC reviewer included the article and the original reviewer excluded it. In all cases, the QC reviewer responded with “unclear” about some or all of the criteria. The QC reviewer was not part of the review process, and therefore not privy to decisions and approaches that were not captured in the reviewer guide. Therefore, we concluded that the quality of the Level 1a review was acceptable.
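The proportions in the quality control check can be reproduced with simple arithmetic. The sketch below uses only the counts stated above; the raw agreement figure it prints is derived from those counts and is not a value reported in the review, and the variable names are illustrative.

```python
# Counts taken from the text.
sample_size = 140                     # articles in the QC sample
disagreements = 26                    # QC reviewer disagreed with original reviewer
qc_excluded_reviewer_included = 15    # over-inclusion by the original reviewer

# Remaining disagreements: QC reviewer included, original reviewer excluded.
qc_included_reviewer_excluded = disagreements - qc_excluded_reviewer_included

# Share of disagreements that were over-inclusions (reported as 58%).
share_over_inclusion = qc_excluded_reviewer_included / disagreements

# Derived here, not reported in the review: raw agreement across the sample.
raw_agreement = (sample_size - disagreements) / sample_size

print(qc_included_reviewer_excluded)
print(f"{share_over_inclusion:.0%}")
print(f"{raw_agreement:.0%}")
```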

Level 2: Quality Assessment

Relevant articles were moved forward for methodological quality assessment at the Level 2 review. The team identified 16 methodological criteria questions for assessing quality (Table 3). Each article was independently reviewed by two team members.

Table 3 Level 2—Quality assessment questions and weights

To reduce bias, each reviewer was randomly paired with other team members and reviewer pairs were rotated. Reviewer pairs were required to reach consensus on all criteria. Where reviewer pairs disagreed, they were encouraged to resolve their disagreement through discussion. In cases where agreement could not be reached, a third reviewer was consulted. Team members did not review articles they had consulted on, authored or co-authored.

Methodological quality scores for each article were based on a weighted sum score of 16 quality criteria. The weighting values assigned to each of the 16 criteria ranged from “somewhat important” (1) to “very important” (3) (Table 3). Each article received a quality ranking score by dividing the weighted score by 41 and then multiplying by 100. The quality ranking was used to group articles into three categories: high (>85%), medium (50–85%) and low (less than 50%) quality. The categories were determined by team consensus with reference to the review methodology literature including the Cochrane Manual [28] and AHRQ Guidelines [33].
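The scoring rule above can be illustrated with a short Python sketch. It assumes each criterion is scored as simply met or not met, and the weight vector is hypothetical: it was chosen only so that 16 weights between 1 and 3 sum to the stated maximum of 41, and it does not reproduce the weights in Table 3.

```python
MAX_WEIGHTED_SCORE = 41  # maximum weighted sum stated in the text

# Hypothetical weights (NOT the Table 3 values): 16 criteria, each 1-3, sum 41.
HYPOTHETICAL_WEIGHTS = [3] * 10 + [2] * 5 + [1]


def weighted_score(weights, criteria_met):
    """Sum the weights of the criteria an article satisfied."""
    return sum(w for w, met in zip(weights, criteria_met) if met)


def quality_ranking(score):
    """Convert a weighted score to a percentage of the maximum."""
    if not 0 <= score <= MAX_WEIGHTED_SCORE:
        raise ValueError("weighted score must lie between 0 and 41")
    return score / MAX_WEIGHTED_SCORE * 100


def quality_category(ranking):
    """Apply the cut-offs: high (>85%), medium (50-85%), low (<50%)."""
    if ranking > 85:
        return "high"
    if ranking >= 50:
        return "medium"
    return "low"


# Example: an article meeting the first 12 of the 16 hypothetical criteria.
criteria_met = [True] * 12 + [False] * 4
score = weighted_score(HYPOTHETICAL_WEIGHTS, criteria_met)
ranking = quality_ranking(score)
print(score, round(ranking, 1), quality_category(ranking))
```

Under these assumed weights the example article scores 34 of 41 and falls in the medium category.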

The quality ranking represents the review team’s assessment of the internal, external, construct and the statistical conclusion validity of each study [34]. A lower overall quality ranking reflects greater uncertainty that the results were attributable to the intervention and not other on-going activities in the workplace or more broadly in society. Therefore, data extraction and evidence synthesis were only completed on the high and medium quality studies.

Level 3: Data Extraction

Data extraction and evidence synthesis were completed on studies that: (1) were ranked as high or medium in quality; (2) had a control or comparison group; and (3) had a direct statistical comparison of the intervention and the control group. The extracted data were used to create summary tables. These tables were used in evidence synthesis and recommendation development. Data extraction was performed independently by two reviewers and, again, reviewer pairs were rotated to reduce bias. Team members did not review articles they consulted on, authored or co-authored. Differences between reviewers were identified and resolved by discussion. The team developed standardized data extraction forms (Table 4) based on existing forms and data extraction procedures [18, 19, 35, 36]. During the data extraction, reviewers reconsidered the methodological quality rating scores recorded in Level 2 review. Any quality rating changes that the reviewer identified were proposed to the full team for consensus.

Table 4 Data extraction (DE) items

The heterogeneity of methods created a unique challenge for the review team. In particular, small sample sizes may have left studies underpowered and thus biased the evidence synthesis conclusions. In addition, some studies did not control for confounders/covariates in their analyses and may therefore have produced positive intervention effects where none exist. The review team identified studies that had small samples or that did not adjust for covariates/confounders in the final analysis, and conducted a sensitivity analysis of the evidence synthesis to determine the robustness of the results.

Evidence Synthesis

The high level of heterogeneity in study methods and outcomes required a synthesis approach most commonly associated with Slavin and known as “best evidence synthesis” [24]. The evidence synthesis approach was adapted from other Institute for Work & Health (IWH) prevention intervention reviews [18, 19, 35, 37]. The approach considers the quality of the articles, the quantity of articles evaluating the same intervention and the consistency of findings (Table 5) to classify the evidence as strong, moderate, limited, mixed or insufficient [24, 35, 37, 38].

Table 5 Best evidence synthesis guidelines

The synthesis first answered the general question posed: “do occupational health and safety interventions have an effect on upper extremity musculoskeletal symptoms, signs, disorders, injuries, claims and lost time among workers?” Then, in a series of post-hoc evaluations, the evidence was summarized for each specific intervention category. Where specific data values were not reported, the team abstracted data from figures. When multiple findings were reported, the team indicated whether appropriate multiple comparisons were considered. Finally, both significant and non-significant trends were considered and reported. Initially, the plan was to calculate effect sizes for each article to apply a uniform method to evaluate the strength of associations [39–41]. However, this approach was abandoned due to the heterogeneity in outcome measures, study methods and the lack of data necessary to calculate effect size in some studies.

The team adopted the following decision rules to summarize the evidence. An intervention with any positive and no negative results was classified as a positive effect intervention. An intervention with both positive and no effects was also classified as a positive effect intervention. An intervention with only no effects was classified as a no effect intervention. An intervention with any negative effects was classified as a negative effect intervention.
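These decision rules amount to a simple precedence order (negative, then positive, then no effect), which the following Python sketch makes explicit. The function name and the result labels are illustrative choices, not terminology from the review.

```python
VALID_RESULTS = {"positive", "none", "negative"}


def classify_intervention(results):
    """Classify an intervention from its individual outcome results.

    Each result is 'positive', 'none' (no effect) or 'negative'.
    Precedence follows the decision rules in the text:
    any negative -> negative; else any positive -> positive; else no effect.
    """
    found = set(results)
    if not found:
        raise ValueError("at least one result is required")
    unknown = found - VALID_RESULTS
    if unknown:
        raise ValueError(f"unrecognized results: {sorted(unknown)}")
    if "negative" in found:
        return "negative effect"
    if "positive" in found:
        return "positive effect"
    return "no effect"


# Example mirroring a study with one positive outcome and three null outcomes.
print(classify_intervention(["positive", "none", "none", "none"]))
print(classify_intervention(["none", "none"]))
```

Note that a mix of positive and null results is still counted as positive under these rules, so the classification is deliberately lenient toward interventions with partial benefit.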

Working with our stakeholders, we agreed upon the following terminology for messages (Table 5). A strong level of evidence results in “recommendations” for practice. A moderate level of evidence leads to “practice considerations” or practices to be considered for disability management and workplace application.

Results

Literature Search and Selection for Relevance

The literature search, using the terms in Table 1, identified 15,279 articles after the results from different databases were merged and duplicates were removed (Fig. 1).

Fig. 1 Flow chart of systematic review process

The Level 1a review excluded 14,564 articles. The remaining 715 articles proceeded to Level 1b review. The team excluded 610 articles at Level 1b, and six articles were not reviewed at Level 1b because no reviewer could be found for specific non-English language articles. This left 99 articles to be reviewed for methodological quality at Level 2. Eleven articles [43, 45, 46, 48–50, 52, 54, 56, 58, 60]* (Fig. 1) were grouped with other articles that described results from the same study, leaving 88 studies for Level 2 review. Eighty-seven studies were reviewed by two reviewers using the quality assessment questions in Table 3. One non-English study (Czech language) was not reviewed for methodological quality [61].
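The article counts in this paragraph can be checked with a short tally. The sketch below uses only the numbers given in the text; it assumes the six unreviewed non-English articles are subtracted alongside the Level 1b exclusions, which is one reading of the flow that reproduces the reported totals.

```python
identified = 15_279          # articles after merging databases and deduplication
excluded_level_1a = 14_564   # excluded on title and abstract
to_level_1b = identified - excluded_level_1a

excluded_level_1b = 610      # excluded on full-text review
not_reviewed_language = 6    # no reviewer available for the language (assumed
                             # to be removed at this stage)
articles_to_level_2 = to_level_1b - excluded_level_1b - not_reviewed_language

grouped_with_same_study = 11  # companion articles reporting the same study
studies_to_level_2 = articles_to_level_2 - grouped_with_same_study

not_assessed_czech = 1       # Czech-language study not quality assessed
studies_assessed = studies_to_level_2 - not_assessed_czech

print(to_level_1b, articles_to_level_2, studies_to_level_2, studies_assessed)
```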

Methodological Quality Assessment

The 87 studies that met our relevance criteria were assessed for methodological quality and assigned a quality ranking score (Table 6). The studies were placed into three quality categories: high (>85%), medium (50–85%) and low quality, not sufficient to move forward to data extraction (<50%).

Table 6 Methodological quality assessment (QA) (n = 87)

Fourteen studies were classified as high quality [53, 62–74]. Despite classification as high quality, most of these studies did not consistently document intervention effects on exposure parameters the interventions were intended to change (7 of 14), or examine for important differences between remaining and drop out participants (4 of 14). Loss to follow-up was greater than or equal to 35% in four of 14 studies.

Thirty-four studies were classified as medium quality [42, 44, 51, 57, 75–104]. These studies generally scored well on the following criteria: stating the research question (34/34); using comparison (control) group(s) (33/34); describing pre-intervention characteristics (31/34); describing the intervention process adequately to allow for replication (30/34); and describing upper extremity musculoskeletal outcomes at baseline and follow-up (34/34). However, few met the following criteria: reporting recruitment (or participation) rate (13/34); examining for important differences between the remaining and drop-out participants after the intervention (13/34); optimizing statistical analyses for the best results (12/34); and adjusting for pre-intervention differences (8/34).

Thirty-nine studies were classified as quality not sufficient to move forward to data extraction [47, 55, 59, 105–140]. All of these studies described upper extremity musculoskeletal outcomes at baseline and follow-up. Most had a length of follow-up three months or greater (35/39). Few had a comparison (control) group(s) (8/39). One used random allocation [110].

Data Extraction and Evidence Synthesis

One medium quality study did not have a control or comparison group [78]. Eleven medium quality studies had a control or comparison group, but did not include a direct statistical comparison between the intervention and control group [42, 44, 75, 76, 83, 91, 94, 95, 98, 102, 103]. Consequently, 36 studies were included in data extraction and evidence synthesis.
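The count of 36 studies follows from applying the three data extraction criteria to the quality assessment totals. The sketch below is a hedged reconstruction: the function name is hypothetical, and the per-study flags are inferred from the counts in the text (for example, all 14 high quality studies are assumed to meet the control and comparison criteria because all 14 were retained).

```python
from collections import Counter


def eligible_for_extraction(quality, has_control, has_direct_comparison):
    """Apply the three inclusion criteria for data extraction."""
    return bool(quality in {"high", "medium"}
                and has_control and has_direct_comparison)


# (quality, has_control, has_direct_comparison), built from counts in the text.
studies = (
    [("high", True, True)] * 14        # all high quality studies retained
    + [("medium", False, False)] * 1   # no control or comparison group
    + [("medium", True, False)] * 11   # no direct statistical comparison
    + [("medium", True, True)] * 22    # remaining medium quality studies
    + [("low", None, None)] * 39       # excluded on quality alone; flags unused
)

included = [s for s in studies if eligible_for_extraction(*s)]
by_quality = Counter(quality for quality, _, _ in included)

print(len(studies), len(included), dict(by_quality))
```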

There were 19 distinct intervention categories examined. A detailed description of each intervention is presented in Table 7. Fifteen studies examined the effectiveness of more than one intervention [31, 51, 63, 65–68, 71, 72, 84, 88, 90, 97, 99, 104]. Many intervention categories included only one study (n = 7). Additional data for each study can be found in a detailed report of this review [141].

Table 7 Description of interventions used in 36 studies for evidence synthesis, sorted by intervention

The study designs included 23 randomized trials, eight non-randomized trials and five cross-over designs. All high quality studies (n = 14) and 13 (of 22) medium quality studies were randomized trials. Nine studies were primary prevention trials and eight were secondary prevention trials. Fifteen studies were both primary and secondary prevention trials. Two studies were both secondary and tertiary prevention trials. Two studies were primary, secondary and tertiary prevention trials.

Study characteristics important when examining comparability and generalizability are shown in Table 8. Most studies were conducted in the USA (n = 15) and Europe (n = 15). While a variety of industries and job titles were represented, most study participants’ primary job duties involved office work (22 of 36 studies). Sample sizes tended to be small but varied from 10 [96, 100] to 602 [85]. Six studies [53, 70, 90, 92, 96, 100] had sample sizes of 20 or fewer. Loss to follow-up details were often lacking in study descriptions (n = 10). When reported, loss to follow-up tended to be small but varied from 0 to 52%. Length of observation also varied greatly, from one day [100] to 18 months [66].

Table 8 Characteristics of 36 studies

The level of statistical analysis varied across studies. Thirteen of the 14 high quality studies examined for covariates/confounders in the analysis (or in design by careful matching). Twelve of 22 medium quality studies examined for covariates/confounders. The variables considered in these analyses varied greatly with little consistency across the studies. Nine of the fourteen high quality studies included covariates/confounders in the final analysis [31, 65, 67, 69, 71, 72, 74] or controlled by design (matched design [70] and cross-over design [73]). Only four of 22 medium quality studies [79, 82, 84, 99] included covariates/confounders in the final analysis.

Outcomes were ascertained from employer records [e.g. injury, LWD (lost work days), WC (workers’ compensation)], worker self-report and clinical measures (includes clinical exams, clinical records or clinical diagnoses). Thirty studies examined only worker self-report outcomes [51, 53, 57, 64–69, 72–74, 77, 79–82, 84, 85, 87–90, 92, 93, 96, 99–101, 104]. Four studies examined both worker self-report and clinical outcomes [31, 63, 71, 97]. One study examined only clinical outcomes [70] and one examined only employer record outcomes [86].

A summary of the intervention effects is presented in Table 9. The review team did not find any negative or adverse effects. Overall, 36 studies provided mixed evidence that OHS interventions have an effect on upper extremity MSD outcomes. There were 20 interventions with positive effects and 32 with no effect. When only high quality studies were considered, there were nine interventions with positive effects and 13 with no effect. The evidence is summarized by intervention category.

Table 9 Intervention effects: 36 studies grouped by intervention categories

Exercise

Four studies evaluated exercise programs: two high quality studies [68, 73] found positive effects for the neck and no effect for the shoulder, while one high quality study [66] and one medium quality study [84] found no effect on neck and shoulder outcomes. The exercise interventions were similar; they involved initial training on exercises (by a physical therapist or Feldenkrais instructor), followed by an independent exercise program done either during work hours or at home. The four exercise programs included a variety of activities including strengthening, stretching, coordination, relaxation and/or stabilization exercises. Overall, these studies provide mixed evidence that exercise programs have an effect on upper extremity MSD outcomes.

Ergonomics Training and Exercise

Three studies evaluated ergonomics training combined with exercise programs: one high quality study [68] found no effects on neck and shoulder outcomes, one medium quality study [92] found positive (neck, shoulder, elbow outcomes) and no effects (wrist outcome), and one medium quality study [84] found no effect on neck/shoulder outcome. Overall, these studies provide mixed evidence that ergonomics training combined with an exercise program has an effect on upper extremity MSD outcomes.

Biofeedback Training

Three studies evaluated biofeedback training: two high quality studies [63, 74] found no effects on upper extremity outcomes and one medium quality study [96] found no effects on forearm/hands outcome. Together these studies provide moderate evidence that biofeedback training alone has no effect on upper extremity MSD outcomes.

Cognitive Behavioral Training

One high quality study [63] found no effect on upper extremity MSD outcomes using adult learning and cognitive behavioral techniques in small group discussions to advance workers’ capabilities for symptom and stress management and problem-solving. A single high quality study provides limited evidence that cognitive behavioral training has no effect on upper extremity MSD outcomes.

Job Stress Management Training

Two high quality studies [64, 66] reported no effect on upper extremity MSD outcomes. In both studies, the intervention was delivered in a group setting and the intensity varied in duration (from 70 to 90 min sessions over three to seven weeks). These studies provide moderate evidence that job stress management training alone has no effect on upper extremity MSD outcomes.

Workstation Adjustment

Three high quality studies [65, 67, 69] and one medium quality study [77] examined the effect of an array of workstation adjustments. The individual workstation adjustments were performed by a therapist or technician with the goal of reducing postural stresses. All studies found no effect of workstation adjustments on upper extremity MSD outcomes. These studies provide strong evidence that workstation adjustments alone have no effect on upper extremity MSD outcomes.

Ergonomics Training

Four medium quality studies examined ergonomics training: two studies [82, 101] found no effect, and two had positive effects [51, 93]. The four studies implemented different types of training programs ranging from a single session to multiple participatory training sessions. The training duration varied from a 10-min personal follow-up after receiving an information pamphlet to a 1-h ergonomics lecture. Together, these studies provide mixed evidence that ergonomics training has an effect on upper extremity MSD outcomes.

Ergonomics Training and Workstation Adjustment

One high quality study [53] found a positive effect on the elbow/forearm and no effect on the neck, shoulder and wrist/hand. This single high quality study provides limited evidence that ergonomics training plus workstation adjustments have a positive effect on upper extremity MSD outcomes.

Alternative Keyboards

One high quality study [70] and one medium quality study [97] examined the effect of alternative keyboards on upper extremity MSD outcomes. One study [70] found either positive (Phalen’s test time) or no effect (nerve conduction), for a keyboard with a new keyswitch force displacement. The other study [97] found positive effects for one fixed split keyboard and no effect for two other adjustable split keyboards when compared to a conventional keyboard.

Although positive effects were found in both studies, the Tittiranonda study [97] found no effect for two keyboards in independent comparisons with a placebo keyboard. Two alternative keyboards in two different studies showed positive effects and two keyboards from a single study showed no effect. As a result, the team felt these inconsistent results represented a mixed level of evidence for the effect of alternative keyboards on upper extremity MSD outcomes.

The alternative keyboards are biomechanically very different and the team felt that the review should also address findings from the individual studies. A single high quality study provides limited evidence that a keyboard with a new keyswitch force displacement has a positive effect on upper extremity MSD outcomes. A single medium quality study provides insufficient evidence to determine whether an adjustable split keyboard or a fixed split keyboard has an effect on upper extremity MSD outcomes.

Alternative Pointing Devices

Two studies examined the effect of alternative pointing devices on upper extremity MSD outcomes. One high quality study [71] found positive effects for a trackball compared to a conventional mouse. One high quality study [31] found no effect on upper extremity MSD outcomes for a vertical mouse compared to a conventional mouse. Together, these studies provide mixed evidence that alternative pointing devices have an effect on upper extremity MSD outcomes. While the findings suggest mixed evidence exists for alternative pointing devices on upper extremity outcomes, the team considers the devices (a trackball and a vertical mouse) very different input technologies. While both are designed to reduce wrist pronation, one study [71] found positive effects only for the left side of the body. Given the right-handed dominance of the study population and society in general, the team does not weigh these health effects as heavily as it would effects on the right side of the body.

Arm Supports

Three studies evaluated arm supports: two high quality studies [31, 71] found positive and no effect and one medium quality study [88] found no effect. Positive effects were found in both high quality studies for right upper extremity self-report outcomes. Given right-handed dominance, the team considers these health effects important. These studies provide moderate evidence that arm supports have a positive effect on upper extremity MSD outcomes.

New Chair

One high quality study [72] found a positive effect on upper extremity MSD outcomes with the introduction of a curved seat pan chair (new chair) and a flat seat pan chair (modified chair) in garment workers. This single high quality study provides limited evidence that both a new chair and a modified chair have a positive effect on upper extremity MSD outcomes.

Rest Breaks

Four medium quality studies evaluated the effects of rest breaks: one [99] found no effect with a 5-min break every 35 min; three [80, 81, 90] found positive or no effect, depending on the rest break pattern. For the positive findings, the break patterns were either a 5-min break every hour [80, 81] or a 30-s break every 20 min [90]. Taken together, there was limited evidence that rest breaks have a positive effect on upper extremity MSD outcomes.

Rest Breaks and Exercise

A single medium quality study [99] evaluated rest breaks combined with stretching exercises during the break. This study reported no effect on upper extremity outcomes. With a single medium quality study, there is insufficient evidence to determine whether rest breaks combined with exercise have an effect on upper extremity MSD outcomes.

Participatory Ergonomics

A single medium quality study [57] evaluated a participatory ergonomics program. This study reported no effect on upper extremity outcomes. With a single medium quality study, there is insufficient evidence to determine whether a participatory ergonomics program has an effect on upper extremity MSD outcomes.

Broad-Based MSK Injury Prevention Program (MIPP)

A single medium quality study [85] evaluated a broad-based MSK injury prevention program. This study found both positive (shoulder outcome) and no effects (neck outcome). With a single medium quality study, there is insufficient evidence to determine whether broad-based MSK injury prevention programs have an effect on upper extremity MSD outcomes.

Prevention Strategies and Physical Therapy

A single medium quality study [86] evaluated an occupational health management approach involving prevention strategies, plus physical therapy, compared to standard care (standard medical and physical therapy). This study found positive effects for upper extremity employer outcomes (i.e. lost work days and workers’ compensation outcomes). With a single medium quality study, there is insufficient evidence to determine whether the prevention strategies combined with physical therapy have an effect on upper extremity MSD outcomes.

Miscellaneous Work Redesign

Four medium quality studies evaluated the effects of some type of work redesign [79, 87, 89, 100]. Taken together, there was limited evidence that work redesign has no effect on upper extremity MSD outcomes. However, these four studies covered disparate work redesign interventions that occurred under a wide range of circumstances with no replication: redesign of video display terminal (VDT) workstations in semiconductor manufacturing [87], a change from line out to line production in car body sealing [79], raised bricklaying [89] and a mechanical assist for bricks/mortar transport [100]. The team felt that the review should also summarize evidence for individual studies. With only single medium quality studies, there is insufficient evidence to determine whether any individual work redesign intervention has an effect on upper extremity MSD outcomes.

Multi-Component Patient Handling

Multi-component patient handling includes three components: policy change, equipment purchase and training on equipment usage and patient handling. A single medium quality study [104] evaluated this intervention and found positive effects on shoulder outcomes for the “safe-lift policy” intervention (involving lifting and transfer equipment) and no effect for the “no strenuous lifting” intervention (involving new mechanical patient lifts). With a single medium quality study, there is insufficient evidence to determine whether either multi-component patient handling intervention had an effect on upper extremity MSD outcomes.

Sensitivity Analyses

Small sample sizes did not lead to null findings (33–50% had no-effect findings) and the lack of inclusion of covariates/confounders did not lead to positive findings (48–57% of the studies showed no effect). For more information, please refer to the detailed report of this review [141]. Overall, the team did not consider these two important methodological issues to have influenced our evidence synthesis.

Discussion

This systematic review sought to answer the question: “Do occupational health and safety interventions have an effect on upper extremity musculoskeletal symptoms, signs, disorders, injuries, claims and lost time?” From an initial pool of more than 15,000 articles, we identified 36 studies to include in our evidence synthesis. Across all interventions, the results suggest a mixed level of evidence for the effect of OHS interventions on upper extremity MSD outcomes. A mixed level of evidence means there were medium to high quality studies with inconsistent findings. Importantly, no evidence was found that any OHS intervention had a negative or harmful effect on upper extremity musculoskeletal health. The above conclusions do not change when considering only high quality studies or when methodological issues of small sample size or lack of adjustment in final analysis for covariates/confounders are considered. The mixed level of evidence finding may be due to the heterogeneity of intervention types grouped together where some interventions were effective and others not.

When examining specific intervention categories, the review team was able to make more precise statements about intervention effectiveness. We found a strong level of evidence for no effect of adjustments to computer workstations on upper extremity MSD outcomes. An OHS intervention approach that relies solely on adjustments to computer workstations is strongly discouraged. A moderate level of evidence was found for no effect of biofeedback training and job stress management training on upper extremity MSD outcomes. The implementation of either of these interventions to reduce upper extremity MSD outcomes is discouraged. Furthermore, the review team considers it of limited utility to conduct further studies focused solely on workstation adjustments, biofeedback training or job stress management.

A moderate level of evidence was found for a positive effect of arm supports on upper extremity MSD outcomes. The review team considers the use of arm supports a practical design strategy to reduce muscle loading in the upper extremity and potentially useful in a range of work environments.

A limited level of evidence was found for a positive effect of ergonomics training plus workstation adjustment, new chair and rest breaks on upper extremity MSD outcomes. The limited evidence supporting ergonomics training combined with workstation adjustment is noteworthy. When initiated as separate interventions, there was strong evidence that workstation adjustments alone had no effect on upper extremity MSD outcomes and mixed evidence for ergonomics training alone. Workstation adjustment combined with training therefore appears to be more effective than either intervention used independently.

A mixed level of evidence (medium and high quality studies with inconsistent findings) was found for: exercise programs, ergonomics training plus exercise, ergonomics training, alternative pointing devices and alternative keyboards. To advance the field and shift the level of evidence from mixed to positive, further high quality research on these interventions should be conducted. While mixed evidence exists for alternative pointing devices, the synthesis aggregates quite different pointing devices (a vertical mouse and a trackball). The review team is cautious in making any recommendations about specific alternative pointing devices.

Comparison with Other Systematic Reviews

We identified two recent systematic reviews that have examined a comparable research question [8, 18, 19]. Although one would hope that multiple systematic reviews would provide greater clarity on the effectiveness of interventions for upper extremity MSDs, we found some discordance. The discrepancies between the messages of these recent reviews and those of this review are methodological in origin.

In this review, we used similar methods to an earlier IWH prevention systematic review of workplace interventions in computer users [18, 19]. There was considerable overlap between these reviews with 16 of the 36 studies (44%) common across the two reviews. However, there were some differences in the final messages explained by: (1) additional articles published since the 2004 search, (2) the restriction to only computer users in the earlier review, (3) the focus of our review on upper extremity MSD outcomes, whereas Brewer (2006) included low back and upper and lower extremity outcomes, (4) inclusion of employer reports and workers’ compensation reports in this review, (5) evolution of quality assessment criteria and criteria weighting that led to several differences in quality assessment ranking [53, 97, 99, 130], and (6) inclusion in this review of a “limited evidence” synthesis category.

Another recent systematic review by Boocock (2007) summarized the evidence on the effectiveness of interventions for the prevention and management of neck and upper extremity musculoskeletal conditions [8]. They searched multiple databases from 1999 to 2004 and identified 31 studies. Our review searched multiple databases from inception to 2007 and identified 36 studies. Despite both reviews having similar inclusion criteria (related to population, intervention and outcomes), only six studies were common across the two reviews [51, 63, 67, 92, 97, 99]. Some of these differences can be explained by our broader search strategy (i.e. search terms used, time frame of search). However, much of this variation is the result of the inclusion of more heterogeneous study designs in the Boocock (2007) review. Almost 50% (15/31) of the studies included in their evidence synthesis were described as having no control group. These single group study designs were excluded in the selection for relevance phase of our review. In addition, our review excluded any study that had a control or comparison group but did not make a direct statistical comparison between the intervention and the control group. In the absence of a direct between-group statistical comparison, we could not make any inferences about the effect of the intervention. Furthermore, Boocock (2007) allowed a wider range of methodological quality (low, medium and high quality ratings) to contribute to their evidence synthesis. Another review has shown that studies with lower methodological quality were more likely to find positive effects [142].
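The overlap figures quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts are taken from the text, while the helper name `share` and the variable names are assumptions introduced here.

```python
def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)

# Counts reported in the text
studies_this_review = 36
studies_boocock = 31
common_with_earlier_iwh_review = 16  # shared with the computer-user review [18, 19]
common_with_boocock = 6              # [51, 63, 67, 92, 97, 99]
boocock_without_control = 15         # single group designs in Boocock (2007)

# Share of this review's studies that also appear in each earlier review
overlap_iwh = share(common_with_earlier_iwh_review, studies_this_review)
overlap_boocock = share(common_with_boocock, studies_this_review)

# Share of Boocock's studies that had no control group
uncontrolled_boocock = share(boocock_without_control, studies_boocock)

print(overlap_iwh, overlap_boocock, uncontrolled_boocock)
```

Seen this way, the overlap with the Boocock (2007) review is much smaller than the overlap with the earlier IWH review, which is consistent with the differences in study design eligibility described above.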

The Boocock [8] review combined more diverse interventions in defining intervention categories. The following are examples of the intervention classifications used: work environment/workstation adjustments (included new workplaces ± ergonomics training, workstation adjustment ± ergonomics training) and ergonomic equipment (included new chair, new tools, gloves). Our review team felt that these interventions were too different to combine and thus chose to split many of these intervention categories in our evidence synthesis. We found that combining heterogeneous interventions led to mixed levels of evidence and the loss of messages that emerge from more specific intervention categories.

Strengths and Limitations

The strengths of the review include the varied backgrounds and specializations of the review team, the broad and exhaustive literature search including non-English language studies and the quality control process used to assess the early phase of article exclusion. We also used a process of randomly pairing reviewers at each phase to improve independent assessment by at least two team members. Finally, the engagement with our stakeholder groups in all phases of the process makes the results more useful for practitioners.

Limitations include the exclusion of the gray literature. Because of time constraints, the review team was unable to clarify specific questions about a study with the study authors. For example, contacting authors for additional information related to the intervention description might lead to a better understanding of the characteristics of effective interventions. Although a quantitative synthesis (or meta-analysis) was considered in this review, it was not appropriate due to differences among comparison/control groups, the use of different outcome measures and insufficient data reported. Similarly, comparable systematic reviews [8, 18, 19] have not been able to use quantitative syntheses due to the heterogeneity of the included studies.

Implications for Further Research

As more research is being conducted and supported by employers, labor and government, we have summarized some issues to consider before embarking on new projects:

  • Researchers should use concurrent worksite control groups as opposed to study designs with simulated controls, statistical controls or cross-over designs. True concurrent controls contribute results that are more generalizable across industrial sectors.

  • Field studies should have adequate sample sizes to reduce the risk of mistakenly concluding an intervention has no effect, simply because the sample is too small.

  • If the sample size is limited, it is more valuable to test a single intervention against a control than to test three or more treatment arms.

  • For upper extremity MSDs, the review team recommends that studies be 4 to 12 months in duration so that sustained effects can be examined.

  • In addition to worker self-report outcomes, researchers should consider using workers’ compensation records, injury records or other regulated injury reporting systems, following standard approaches consistent with the reporting requirements placed on stakeholders.

  • Covariates and confounders should be measured and adjusted for using multivariate statistical models. This is especially true when the researchers are unable to randomize workers into either intervention or control groups.

  • Single interventions (e.g. training only, equipment only) tend to lead to no-effect outcomes. A common characteristic of interventions showing positive effects is the multi-component nature of the intervention (e.g. training combined with addressing issues in the environment).

  • Studies should be conducted in sectors other than the office sector. Of the articles that proceeded to evidence synthesis, studies in the office sector accounted for 61% (22 of 36 studies) of the evidence base.
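The sample-size recommendation in the list above can be made concrete with a standard power calculation. The sketch below is not taken from the review: it uses the usual normal-approximation formula for comparing two proportions, and the prevalence figures, the significance level, the power and the function name `workers_per_group` are all illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def workers_per_group(p_control, p_intervention, alpha=0.05, power=0.80):
    """Approximate number of workers needed in each group to detect a
    difference between two proportions with a two-sided z-test
    (normal approximation, equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = (p_control * (1 - p_control)
                + p_intervention * (1 - p_intervention))
    effect = p_control - p_intervention
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical scenario: 25% of control workers report symptoms at follow-up.
small_effect = workers_per_group(0.25, 0.20)  # intervention lowers this to 20%
large_effect = workers_per_group(0.25, 0.10)  # intervention lowers this to 10%

print(small_effect, large_effect)
```

Under these assumed inputs, detecting the smaller reduction requires roughly an order of magnitude more workers per group than detecting the larger one. A field study sized only for a large effect could therefore report no effect even when a modest benefit exists.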

The review team believes that the systematic review process should continue to develop in several ways when considering the OHS literature. First, non-English articles and gray literature may be valuable to the process. Second, contacting the authors when necessary may be useful to clarify findings in the published studies. Third, studies where between-group comparisons were not made should be re-analyzed to provide evidence that can be included in data synthesis. Finally, in an effort to calculate effect sizes, necessary data not provided in the articles should be obtained from researchers, when possible.

This review identifies knowledge gaps. We did not identify any studies that looked at the prevention of acute traumatic upper extremity injuries. Also, pre-placement screening and examinations (e.g. nerve conduction testing for carpal tunnel syndrome) were included in our definition of OHS interventions, regardless of whether the examination occurred at the workplace or off-site, as long as they were mandated by the workplace. Despite these programs being some of the most widely used OHS interventions, no studies evaluated them using a controlled study design with pre- and post-intervention measures. Therefore, we can find no scientific evidence of a reasonable methodological quality to either support or refute the effectiveness of pre-placement screening programs in reducing upper extremity MSDs. It is vital that we begin to generate the amount and quality of evidence required so decision-makers can make evidence-informed decisions about preventing and managing upper extremity MSDs.

Recommendations

The review team believes that policy recommendations should be based on strong levels of evidence. A strong level of evidence requires consistent findings from a number of high quality studies. Thus, we recommend that worksites NOT engage in health and safety activities that include only workstation adjustments. However, when combined with ergonomics training, there is limited evidence that workstation adjustments are beneficial for preventing and managing upper extremity MSDs.

The review team felt that moderate levels of evidence were sufficient to recommend practices to consider. One practice to consider is the use of arm supports, which may reduce upper extremity MSDs. In contrast, the research evidence does NOT support adopting biofeedback and job stress management as training programs to reduce upper extremity MSDs.