1 Introduction

According to the latest study by the European Agency for Safety and Health at Work (EU-OSHA), approximately three out of every five workers in the EU-28 have symptoms related to musculoskeletal disorders (MSKD). The most frequent disorders of that type among workers are pain in the back, neck and upper limbs, which are reported as the most serious problem by 60% of workers with a work-related health issue, and which often become chronic [1]. The same study indicates that the main causes of MSKD are: manipulation of loads, especially when the task requires adopting flexed and twisted postures; repetitive or sudden movements; strained and static postures; vibrations; inadequate lighting or temperatures; fast work cycles; and sustained seated or standing postures. It also calls for comprehensive and more effective procedures for the assessment of ergonomic risks, which all companies should adopt in order to identify and address the most relevant risk factors.

However, the assessment of ergonomic risks is often neglected or limited to the evaluation of critical safety risks (e.g. accidents and injuries). The disorders caused by other ergonomic issues, such as those related to postures, movements and manual handling, normally develop only after prolonged exposure, and the relationship between daily tasks and those risks is often not obvious. Another reason for this neglect, especially in small companies with limited resources, is the large time investment required to conduct an adequate, comprehensive assessment of ergonomic risks [2].

Methodologies such as OWAS [3], RULA [4] and REBA [5] are specifically aimed at that type of ergonomic assessment. They provide ergonomic engineers and technicians with systematic criteria to assess a workplace, based on the analysis of observed postures. But comprehensive evaluations, especially for complex or combined tasks, require the observation, selection and analysis of postures from long working periods, which is time-consuming and subject to bias.

Multiple tools have been developed to address these problems in the application of those methods, mostly based on automatic measurement of postures with wearable sensors [6, 7], depth cameras [8,9,10] or virtual reality scenarios [11, 12]. Although those solutions reduce the subjectivity of the assessments and the time evaluators need to conduct the analysis, they increase the burden of taking the measurements: workers have to be instrumented with devices that usually require prior calibration, and the instrumentation or the setup of the scenario may also interfere with the task being measured.

In this work we present a new approach for the automatic assessment of ergonomic risks based on Artificial Vision and Neural Network methodologies, which addresses the two main problems identified above: the subjectivity of the evaluator, and the time needed to perform the measurements. The efficacy of the tool has been assessed in terms of the time needed for the assessment and the performance in the detection of postures, for a specific use case based on the OWAS methodology.

2 Methods

A web application has been developed in PHP and JavaScript to assist the assessment of ergonomic risks, with semi-automated image analysis to detect postures of different parts of the body. The application can be used on any device with a web browser and Internet connection. It includes functionality to upload and pre-process videos in different standard formats (AVI, MPEG, MP4, etc.), and to visualize and export the results of the evaluations.

The application allows the user to define tasks composed of an arbitrary set of subtasks to be analyzed, associating each subtask with a different fragment of one or several videos, and choosing a specific number of frames to be extracted from those fragments for each subtask. Selected video fragments are sent to an AWS Elastic Computing service and automatically processed by a convolutional neural network (CNN) based on the Simple Pose deep learning network, a robust, fast and accurate CNN that has achieved very good results in recent benchmarks (AP of 73.7 on the COCO dataset) [13], with custom end layers to calculate the parameters that are used in ergonomic workplace assessments.
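The extraction of a fixed number of frames from a fragment can be pictured as picking evenly spaced frame indices. The following is a minimal sketch under that assumption (the text above does not state how frames are chosen); the function name `sample_frame_indices` and its parameters are hypothetical.

```python
def sample_frame_indices(start_frame: int, end_frame: int, n_frames: int) -> list[int]:
    """Return up to n_frames indices spread evenly over [start_frame, end_frame]."""
    if n_frames <= 0 or end_frame < start_frame:
        raise ValueError("need n_frames > 0 and end_frame >= start_frame")
    span = end_frame - start_frame
    # A fragment cannot yield more distinct frames than it contains.
    n_frames = min(n_frames, span + 1)
    if n_frames == 1:
        return [start_frame + span // 2]  # single frame: take the middle one
    step = span / (n_frames - 1)
    return [start_frame + round(i * step) for i in range(n_frames)]
```

For example, `sample_frame_indices(0, 100, 5)` selects the first and last frames of the fragment and three frames in between.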

This paper presents a use case for the evaluation of tasks based on the OWAS method [3]. The CNN, originally designed to return the coordinates of characteristic points of the face, trunk, arms and legs of the people detected in each frame of the video, has been complemented with additional calculations that classify the postures of the trunk (straight, bent, and/or twisted), arms (below or above elbows) and legs (standing, sitting, kneeling, walking, and different combinations of knee flexion and foot support).
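As an illustration of how joint coordinates can be turned into a posture class, the sketch below estimates trunk inclination from the shoulder and hip keypoints and flags the trunk as bent above a cut-off angle. It is not the calculation used in the application: the 20-degree threshold, the function names and the purely two-dimensional geometry are assumptions, and the result only approximates trunk flexion when the camera views the worker roughly from the side. Twisting is not covered.

```python
import math

Point = tuple[float, float]  # (x, y) in image pixels, with y growing downwards

BENT_THRESHOLD_DEG = 20.0  # assumed cut-off, not taken from the source


def midpoint(a: Point, b: Point) -> Point:
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def trunk_inclination_deg(l_shoulder: Point, r_shoulder: Point,
                          l_hip: Point, r_hip: Point) -> float:
    """Angle between the hip-to-shoulder line and the image vertical, in degrees."""
    sx, sy = midpoint(l_shoulder, r_shoulder)
    hx, hy = midpoint(l_hip, r_hip)
    dx = sx - hx
    dy = hy - sy  # sign flipped so that "up" in the image is positive
    return math.degrees(math.atan2(abs(dx), dy))


def trunk_is_bent(l_shoulder: Point, r_shoulder: Point,
                  l_hip: Point, r_hip: Point) -> bool:
    angle = trunk_inclination_deg(l_shoulder, r_shoulder, l_hip, r_hip)
    return angle > BENT_THRESHOLD_DEG
```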

Those parameters are automatically calculated and stored in a database. A visual tool can be later used to review the detected postures, modify them, and add the load level that is needed to calculate the ergonomic risk of each frame (see Fig. 1).

Fig. 1. Visualization and plot of results for OWAS
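The per-frame record that the evaluator reviews can be sketched as follows. This is a hypothetical data structure, not the application's database schema: the class and field names are assumptions, the digit ranges follow the usual OWAS coding (back 1 to 4, arms 1 to 3, legs 1 to 7, load 1 to 3), and the load classes use the commonly cited limits of 10 kg and 20 kg. The lookup from the resulting code to a risk category is defined by the method [3] and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


def load_class(weight_kg: float) -> int:
    """Map a handled weight to an OWAS load class (assumed limits: 10 and 20 kg)."""
    if weight_kg < 0:
        raise ValueError("weight must be non-negative")
    if weight_kg <= 10:
        return 1
    if weight_kg <= 20:
        return 2
    return 3


@dataclass
class FramePosture:
    back: int                   # 1..4, detected automatically
    arms: int                   # 1..3, detected automatically
    legs: int                   # 1..7, detected automatically
    load: Optional[int] = None  # 1..3, added later by the evaluator

    def __post_init__(self) -> None:
        limits = (("back", self.back, 4), ("arms", self.arms, 3), ("legs", self.legs, 7))
        for name, value, top in limits:
            if not 1 <= value <= top:
                raise ValueError(f"{name} must be between 1 and {top}")
        if self.load is not None and not 1 <= self.load <= 3:
            raise ValueError("load must be between 1 and 3")

    def owas_code(self) -> Optional[str]:
        """Four-digit posture code, or None until the load has been entered."""
        if self.load is None:
            return None
        return f"{self.back}{self.arms}{self.legs}{self.load}"
```

Keeping `load` optional mirrors the workflow described above: the postures are stored first, and the code needed for the risk calculation only becomes available once the evaluator has added the load level.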

To test the performance of the tool, it was applied to a sample of videos taken from the Carnegie Mellon University Motion Capture Database [14], specifically samples #02.06 (bend over, scoop up, rise, lift arm), #13.23 (sweep floor), and #15.06 (lean forward, reach for). The web application was run on a laptop with an Intel i5 2.3 GHz processor, 8 GB RAM and a LAN Internet connection. The time for the analysis and the number of successful posture detections were recorded, considering the confidence level of the joint coordinates computed by the Simple Pose CNN [13].

3 Results

The processing speed was between 15 and 20 frames per second, both in the pre-processing phase (trim and subsample) and in the analysis.

The CNN provided data on the postures of the trunk, arms and legs in each frame. The confidence level for body part detections was over 0.5 (50%) in the majority of cases. With a minimum confidence threshold of 25%, it was possible to assess the ergonomic risk based on the postures in 73% to 86% of the images. The best results were obtained for the legs and the worst for the arms, with a substantial decrease in performance in the sample video where the subject was kneeling down (#02.06, see Fig. 2).

Fig. 2. Distributions of the confidence levels, separated by body part and video sample
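The share of assessable images reported above can be computed from the per-frame confidence values with a simple filter. The sketch below assumes that a frame is usable only when every body part reaches the threshold; that rule, the function name and the dictionary layout are assumptions, and the values in the usage example are invented for illustration.

```python
def assessable_fraction(frames: list[dict[str, float]], threshold: float = 0.25) -> float:
    """Share of frames in which every body-part confidence reaches the threshold."""
    if not frames:
        return 0.0
    usable = sum(
        1 for conf in frames
        if conf and min(conf.values()) >= threshold
    )
    return usable / len(frames)


# Illustrative values only, not data from the study.
example = [
    {"trunk": 0.90, "arms": 0.30, "legs": 0.80},
    {"trunk": 0.60, "arms": 0.10, "legs": 0.90},  # arms below threshold
    {"trunk": 0.70, "arms": 0.55, "legs": 0.95},
    {"trunk": 0.40, "arms": 0.25, "legs": 0.60},
]
share = assessable_fraction(example)
```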

Visual inspection of the videos with superimposed wireframes of joint coordinates (Fig. 3) showed that the poorer results in sample #02.06 were due to the subject’s extreme crouched posture in approximately 20% of the frames. The following reasons for potential failure of posture detection were identified:

Fig. 3. Example of image with superimposed wireframe (sample video #13.23), from [14]

  • Confusion of laterality (flipped left and right sides of the body). This happened more frequently in sample #13.23, presumably because the subject's homogeneous dark clothing and partial face covering made body recognition more difficult. However, this did not affect the results of the OWAS assessment, which is insensitive to laterality.

  • Failure in body shape recognition. More frequent in unusual body postures (e.g. the crouched posture in the sample #02.06).

  • Occlusion of body parts. This also happened in #02.06, which was close to a side view, and resulted in the hidden arm being mistaken for the visible one.
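The first point in the list above can be illustrated with a small check: a posture measure that only counts how many arms are raised gives the same value when the left and right keypoints are swapped. The sketch is illustrative only. The keypoint names, the helper functions and the criterion for a raised arm (elbow higher than shoulder in the image) are assumptions, not the application's actual rules.

```python
Keypoints = dict[str, tuple[float, float]]  # name -> (x, y), y grows downwards


def swap_sides(keypoints: Keypoints) -> Keypoints:
    """Exchange every left_* keypoint with its right_* counterpart."""
    swapped: Keypoints = {}
    for name, xy in keypoints.items():
        if name.startswith("left_"):
            swapped["right_" + name[len("left_"):]] = xy
        elif name.startswith("right_"):
            swapped["left_" + name[len("right_"):]] = xy
        else:
            swapped[name] = xy
    return swapped


def raised_arm_count(keypoints: Keypoints) -> int:
    """Count arms whose elbow lies above its shoulder (smaller y means higher)."""
    return sum(
        1 for side in ("left", "right")
        if keypoints[f"{side}_elbow"][1] < keypoints[f"{side}_shoulder"][1]
    )


# Illustrative pose: left arm raised, right arm lowered.
pose = {
    "left_shoulder": (120.0, 100.0), "left_elbow": (130.0, 60.0),
    "right_shoulder": (80.0, 100.0), "right_elbow": (75.0, 150.0),
}
```

Because the count treats both sides identically, `raised_arm_count(pose)` and `raised_arm_count(swap_sides(pose))` agree, which is why a flipped detection does not change this kind of classification.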

4 Discussion

With the processing speed obtained in the tests, a one-minute video recorded at a standard frame rate of 30 frames per second can be fully processed in less than two minutes; a 30 min video, subsampled at 100 frames per minute (a total of 3,000 images), could be processed in 6 min.
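The relation between frame count, processing speed and processing time can be sketched as below. This is a rough single-pass estimate that ignores upload, trimming and any other overhead, so it should be read as a lower bound rather than a reproduction of the figures quoted above; the function name is hypothetical.

```python
def processing_seconds(n_frames: int, frames_per_second: float) -> float:
    """Time for one pass over n_frames at a given processing speed."""
    if frames_per_second <= 0:
        raise ValueError("processing speed must be positive")
    return n_frames / frames_per_second


full_clip_frames = 60 * 30    # one-minute video recorded at 30 frames per second
subsampled_frames = 30 * 100  # 30 min video subsampled at 100 frames per minute

# Speeds measured in the tests ranged from 15 to 20 frames per second.
for label, n in (("one-minute clip", full_clip_frames),
                 ("subsampled 30 min video", subsampled_frames)):
    slowest = processing_seconds(n, 15.0)
    fastest = processing_seconds(n, 20.0)
    print(f"{label}: {n} frames, {fastest:.0f} to {slowest:.0f} s per pass")
```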

The occasional failures in posture recognition still require the evaluator to review the results. Given the underlying reasons for the observed failures, some of them (e.g. flipped left/right sides, unusual postures) are expected to become less frequent with further training of the CNN, using more images from workplaces, and with further specialization of the network to give the results needed for the ergonomic assessment methods.

The approach presented for the OWAS use case can be extended to other methods such as RULA or REBA by modifying only the final calculations applied to the output of the neural network, so that they provide the postural parameters used by those methods. Thus, this tool is expected to facilitate more exhaustive and objective evaluations of ergonomic risks using postural assessment methods, and to help reduce the incidence of MSKD in workplaces, in less time and at lower cost.

5 Conclusion

This application saves time compared with traditional procedures based on visual inspection, especially for large numbers of images, a saving that is currently matched only by instrumented methods. The advantage of our approach is that it requires neither sensors worn by the worker, nor special cameras, nor calibration of the workspace, which further reduces the investment in time and material.

Even with its current performance, the time spent selecting images and labelling postures is shorter than with the traditional, manual approach, and the method also reduces the mental workload of evaluators and the dependency on their expertise. Thus, the same technician can evaluate a greater number of workplaces, or analyze the same workplace for different workers, as recommended by EU-OSHA [1].