Introduction

Spinal disorders are highly prevalent global health problems with immense societal and medical costs [19, 50]. Spinal disorders such as back pain, vertebral fractures, degenerative disk disease, spine deformity, and lumbar spinal stenosis, can affect spinal kinematics, trunk postures, and spine tissue loading (e.g., [6, 12, 14, 31, 39, 40, 47]). Older adults, who are at increased risk of spinal disorders, back injuries, and mobility deficits, constitute the fastest growing segment of the labor force [49]. Work force participation for individuals aged 65–74 is expected to be 30% by 2026, a near doubling of the rate observed in 1996 [18]. Therefore, it has become imperative to improve our understanding of spinal demands in older adults, in an attempt to mitigate their risk of back injury and pain.

Musculoskeletal models are computer-based tools that utilize anatomic details combined with engineering principles to non-invasively estimate internal tissue demands [29]. Model estimates of trunk tissue demands are associated with risk of injury [45]. However, prior to applying a musculoskeletal model to assess demands, it should first be validated for equivalent tasks. An established approach to model validation is to compare muscle activations predicted by the model with activations recorded directly using electromyography (EMG) [33]. However, while prior studies have used EMG to validate lumbar spine models (e.g., [2, 11, 13, 46]), no thoracolumbar spine model has been directly compared to EMG during dynamic activities. Furthermore, prior studies focused exclusively on young or middle-aged adults, and thus, model validation for dynamic tasks in older adults has not been reported.

Our lab developed a fully articulated thoracolumbar spine musculoskeletal model capable of estimating trunk muscle activations and musculoskeletal loading throughout the spine [15, 16]. We also developed methods for rapid generation of subject-specific musculoskeletal models by incorporating spine curvature and muscle morphology measurements obtained via medical imaging [17]. Our model has been validated for estimating spine tissue demands during static poses [15], but a validation during dynamic tasks is lacking. Thus, the objective of this study is to further evaluate the performance of our thoracolumbar musculoskeletal model by determining the association (pattern and magnitude) between model-predicted muscle activation and experimental EMG recordings of muscle activities during dynamic activities in older adults.

Methods and Materials

Participants

We recruited eleven healthy older adults (6 women, 5 men) from the local community through fliers and online postings. Participants’ average (SD) age, stature, body weight, and body mass index (BMI) were 65 (9) years, 168 (11) cm, 72.8 (19.6) kg, and 25.2 (4.0) kg/m², respectively. Exclusion criteria included patient-reported conditions that might alter spine biomechanics, such as a history of traumatic spine injury or spinal surgery; severe scoliosis requiring brace or surgical treatment; neuromuscular conditions such as Parkinson’s disease, hemiplegia, multiple sclerosis, or muscular dystrophy; a score ≥ 10 on the Short Blessed Test (suggesting possible impaired cognitive function or dementia); BMI > 30 kg/m²; and self-reported musculoskeletal injury affecting normal activity or movement. All participants ranged in age from 50 to 85 years and were able to perform activities of daily living (such as walking, standing, sitting, bending, or lifting) without assistance. The study was approved by the Institutional Review Board of Beth Israel Deaconess Medical Center, and written informed consent was provided by all participants prior to participation.

Experimental Procedures

Kinematic and Kinetic Measurements of Dynamic Tasks

Anthropometric data including age, height, and body weight were recorded from all participants. Participants wore compression shorts and a tailored tank top to expose their back and allow for direct placement of retroreflective markers and EMG sensors. To ensure proper placement of the markers, anatomical landmarks were palpated and marked. Ninety-seven passive markers (two marker sizes were used: 14 mm and 9.5 mm) were affixed with double-sided adhesive tape to the skin over anatomical landmarks [44] and to the top corners of a 30 × 30 × 30 cm crate, which was weighted to 10% of each participant’s body weight.

Participants first performed a standing calibration pose, followed by five different dynamic tasks. Four of the dynamic tasks involved lifting/lowering the crate in: (1) axial rotation, (2) two-handed (2 h) sagittal flexion/extension lifting, (3) one-handed (1 h) asymmetric flexion/extension lifting with the right hand only, and (4) lateral bending. The fifth task simulated opening a window as an activity of daily living, starting with arms extended anteriorly at waist height and raising them to head height with a 9 N dumbbell in each hand. This force was chosen based on an estimated mass of about 1.8 kg for a typical window sash. During axial rotation, participants rotated their trunk axially to lift the crate with both hands from a waist-high platform on their left side and then transferred the crate to an equivalent platform on their right side. For flexion/extension lifting (2 h and 1 h), participants flexed the trunk forward to lift the crate from the ground to waist height and then lowered the crate to the ground in the same manner. In the lateral lifting task, participants lifted/lowered the crate from a stool 30 cm above the ground on their right side to waist height using only their right hand. For each participant, all dynamic tasks were repeated three times at a self-selected pace (< 10 s per task), but only one trial for each task was selected for analysis, based on review of the quality of the kinematics and EMG recording. Participants rested at least one minute between tasks. In all tasks, participants began and ended in a neutral standing posture without the crate, and their feet remained in contact with two embedded force plates (AMTI OR6-7-1000, Watertown, MA USA). Three-dimensional full-body kinematics were collected at a 100 Hz sample rate with a 10-camera motion analysis system (Vicon Motion Systems, Oxford, UK).
The two embedded force plates measured the ground reaction forces of each lower limb and were recorded synchronously within the motion capture software at a sampling rate of 1 kHz.

Electromyography (EMG) Recordings and Processing

Surface EMG signals were recorded during all tasks using a wireless system (Delsys Trigno™, Delsys Inc., Natick, MA, USA) synchronized and recorded at 1925.93 Hz within the motion capture system. Prior to EMG sensor placement, the skin sites were prepared and cleaned according to established standards [48]. In total, eight sensors were placed bilaterally on the skin over four major muscle groups: two back muscle groups (longissimus erector spinae (LT) and iliocostalis erector spinae (IC)) and two abdominal muscle groups (external oblique (EO) and rectus abdominis (RA)), as previously described [1, 9]. Throughout the experimental protocol, all signals were inspected for quality and appropriate muscle function.

Prior to performing the dynamic tasks, participants performed a series of maximum voluntary isometric contractions (MVICs) to normalize their EMG signals. Participants performed three repetitions of four different seated trunk exertion tasks (i.e., trunk flexion, extension, and right and left lateral bending) designed to isolate the back and abdominal muscles. For most of the MVICs, the trunk was positioned in a neutral upright posture, with the exception of the extension task where the trunk was positioned in 20° of forward flexion [7, 35, 41]. Verbal encouragement was provided during all MVICs and subjects rested at least 30 seconds between exertions [10].

EMG signals were band-pass filtered (20–450 Hz, 6th-order Butterworth filter, bidirectional). The resulting filtered signals were then full-wave rectified and subsequently smoothed with root-mean-square envelopes from a moving window of 400 ms [9]. Processed EMG activations of each muscle group were then normalized with respect to each participant’s maximum EMG activity observed during either the MVICs or the dynamic tasks. The resulting normalized EMG (nEMG) was used for comparison with model-predicted muscle activations. All of the EMG data were processed using custom MATLAB scripts (The MathWorks Inc., Natick, MA).
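The rectification, smoothing, and normalization steps described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code: the function names are hypothetical, the band-pass filter is assumed to have been applied already, and the handling of the window at the signal edges (shrinking it rather than padding) is an assumption.

```python
import math

def rectify(signal):
    """Full-wave rectification: absolute value of each sample."""
    return [abs(v) for v in signal]

def moving_rms(signal, fs_hz, window_s=0.4):
    """Centered moving RMS envelope (400 ms window by default).

    Assumption: the window is truncated at the start and end of the signal.
    """
    half = max(1, int(round(window_s * fs_hz))) // 2
    n = len(signal)
    envelope = []
    for i in range(n):
        window = signal[max(0, i - half):min(n, i + half + 1)]
        envelope.append(math.sqrt(sum(v * v for v in window) / len(window)))
    return envelope

def reference_maximum(envelopes):
    """Largest envelope value across all supplied trials (MVICs and dynamic tasks)."""
    return max(max(env) for env in envelopes)

def normalize(envelope, reference):
    """Express an envelope as a fraction of the reference maximum (nEMG)."""
    return [v / reference for v in envelope]
```

In use, the envelopes of every MVIC and dynamic trial for one muscle would be passed to `reference_maximum`, and each task envelope would then be divided by that single value, so that nEMG is comparable across tasks.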

Acquisition and Analysis of CT Scans of Thoracolumbar Trunk

All participants underwent volumetric CT scans of the chest, abdomen, and pelvis using a multi-detector scanner (Aquilion Prime SP). Scans were acquired at a tube voltage of 120 kVp, a nominal in-plane voxel size of 0.5 mm, and a slice thickness of 0.5 mm. CT scans were analyzed to quantify spinal curvature and trunk muscle morphology using commercial software packages (SpineAnalyzer, Optasia Medical, Cheadle, UK; and Analyze, Biomedical Imaging Resource, Mayo Clinic, Rochester, MN) [17, 36]. Sagittal spine morphometry was extracted to accurately model the subject-specific spine geometry and curvature. Muscle size and centroid position relative to the spine were measured to represent the subject-specific musculature for each model more accurately [17, 36].

OpenSim Musculoskeletal Model Development and Model Muscle Activations

Creating and Solving Musculoskeletal Models

Our full-body thoracolumbar model is implemented in OpenSim version 4.3 [28]. The base model includes 620 musculotendon actuators, 78 rigid bodies, and 165 degrees of freedom [15, 21], with specific models for men and women. The thoracolumbar spine is modeled with 575 musculotendon actuators and 17 rigid bodies, with a total of 51 degrees of freedom. Crate inertia was added to the model by welding a rigid body to each hand with half of the inertial properties of the crate [4]. Each model was tailored to each participant according to gender, height, body weight, and marker positions from their neutral standing posture. In addition, custom MATLAB scripts were used to further refine the model to subject-specific spine (i.e., intervertebral joint angles and distances) and trunk muscle (i.e., muscle cross-sectional area and distances from joints) parameters obtained from the CT scans [17]. The maximum isometric force of trunk muscles was adjusted based on the measured cross-sectional area from CT scans, assuming a maximum muscle stress of 78 N/cm² based on our prior report for back muscles in older adults [20]. Subject-specific models were subsequently used for all of the simulations and analyses.
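As a simple illustration of the strength adjustment described above, the sketch below converts a cross-sectional area into a maximum isometric force using the stated muscle stress of 78 N/cm². The function name and the example area are hypothetical, and the sketch assumes force is the plain product of area and stress, without any further correction.

```python
MAX_MUSCLE_STRESS_N_PER_CM2 = 78.0  # value stated in the text

def max_isometric_force(csa_cm2, stress_n_per_cm2=MAX_MUSCLE_STRESS_N_PER_CM2):
    """Maximum isometric force (N) as cross-sectional area (cm^2) times muscle stress."""
    if csa_cm2 < 0:
        raise ValueError("cross-sectional area must be non-negative")
    return csa_cm2 * stress_n_per_cm2

# Hypothetical example: a muscle group with a 10 cm^2 cross-sectional area.
example_force_n = max_isometric_force(10.0)
```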

Participants’ kinematics were tracked via OpenSim inverse kinematics, which fit subject-specific musculoskeletal models to recorded marker positions. Coordinate coupling constraints were assigned to reduce spinal degrees of freedom from 51 to 6 during the inverse kinematics tracking [8, 11]. To estimate the kinetics responsible for the tracked kinematics, musculotendon actuator forces were calculated from a static optimization algorithm that minimized the sum of all activations cubed [26, 34]. All estimated muscle activations ranged from 0 (no activation) to 1.0 (fully activated to achieve maximum force).
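For readers less familiar with static optimization, the cost function described above can be written in a simplified form. The notation is an assumption introduced here for illustration: $a_i$ is the activation of actuator $i$ of $N$, $F_i^{\max}$ its maximum isometric force, $r_{i,j}$ its moment arm about coordinate $j$, and $\tau_j$ the net joint moment required at that coordinate. The sketch ignores force-length-velocity scaling and any other terms that the actual solver may include.

```latex
\min_{a_1,\dots,a_N} \; J = \sum_{i=1}^{N} a_i^{3}
\quad \text{subject to} \quad
\sum_{i=1}^{N} a_i \, F_i^{\max} \, r_{i,j} = \tau_j \;\; \text{for every coordinate } j,
\qquad 0 \le a_i \le 1 .
```

Because the cost grows with the cube of each activation, solutions of this kind tend to spread load across synergists and to assign little or no activation to muscles that do not contribute to the required moments.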

Extracting Model Muscle Activations

EMG sensor locations were mapped onto each subject-specific model in accordance with the nominal placement of the EMG sensor. For the LT and IC muscle groups, this was immediately lateral to the L1 and L2 vertebral levels, respectively. For the EO muscle group, the location was halfway between the anterior superior iliac spine (ASIS) and the distal border of the rib cage, at the L3 vertebral level. For RA, the placement was 1 cm above the umbilicus at approximately the L2 vertebral level. Then, only the musculotendon actuators in the target muscle group whose paths crossed the sensor location in the axial plane were identified. Muscle activations resulting from those actuators were then averaged to produce the model-estimated muscle activations for each frame in the time series. It should be noted that the model muscle activations were not filtered in the same way as the experimental nEMGs because the model was solved at a much lower frequency than the EMG recordings, and therefore, similar filtering would not be feasible.
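The selection and averaging of actuators can be sketched as follows. This is an illustrative simplification rather than the authors' implementation: each actuator path is reduced to the vertical coordinates of its two end points, an actuator is treated as crossing the sensor level when that level lies between them, and all names and values are hypothetical.

```python
def crosses_level(origin_z, insertion_z, sensor_z):
    """True if the sensor's vertical level lies between the actuator's end points."""
    return min(origin_z, insertion_z) <= sensor_z <= max(origin_z, insertion_z)

def select_actuators(actuator_ends, sensor_z):
    """Names of actuators whose (origin_z, insertion_z) span includes sensor_z."""
    return [name for name, (z0, z1) in actuator_ends.items()
            if crosses_level(z0, z1, sensor_z)]

def group_activation(activations, selected):
    """Frame-by-frame mean activation of the selected actuators.

    activations maps actuator name -> list of activations, one per frame.
    """
    if not selected:
        raise ValueError("no actuators cross the sensor level")
    series = [activations[name] for name in selected]
    n_frames = len(series[0])
    return [sum(s[f] for s in series) / len(series) for f in range(n_frames)]
```

A real model would query the full three-dimensional path of each actuator in the posed model; the end-point test here only conveys the idea of keeping fascicles that pass the sensor level.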

Statistical Analysis

Ensemble average plots of experimental nEMGs versus model-predicted muscle activations for back and abdominal muscles were generated for all dynamic tasks. In addition, to quantitatively validate our subject-specific thoracolumbar spine model, we focused on two outcomes: (1) the maximum absolute normalized cross-correlation (MANCC) coefficient, to quantify the temporal similarity (i.e., trend/pattern similarity) between model-predicted and normalized experimental muscle activities; and (2) the root-mean-square error (RMSE), to quantify the magnitude of the difference between model-predicted and normalized experimental muscle activities. Both outcomes were calculated for each combination of participant, dynamic task, and muscle group. MANCC values larger than 0.9, 0.7, and 0.4 were taken to indicate excellent, strong, and moderate pattern similarity, respectively.
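The two outcome measures can be illustrated with a short Python sketch. Several details are assumptions not stated in the text: both series are taken to be sampled on a common time base, the cross-correlation is normalized by the product of the signal energies without removing the mean, all lags are searched, and the label returned for values of 0.4 or below is a placeholder.

```python
import math

def mancc(x, y):
    """Maximum absolute normalized cross-correlation over all lags."""
    n = len(x)
    if n != len(y):
        raise ValueError("series must have equal length")
    denom = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if denom == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-(n - 1), n):
        total = sum(x[i] * y[i + lag] for i in range(n) if 0 <= i + lag < n)
        best = max(best, abs(total) / denom)
    return best

def rmse(x, y):
    """Root-mean-square error between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def similarity_label(value):
    """Map a MANCC value to the categories used in the text."""
    if value > 0.9:
        return "excellent"
    if value > 0.7:
        return "strong"
    if value > 0.4:
        return "moderate"
    return "below moderate"  # placeholder label, not defined in the text
```

Because the search over lags tolerates a time shift between the two signals, MANCC reflects similarity of shape, whereas RMSE is sensitive to differences in amplitude at each time point.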

Results

Ensemble Average Plots of Model-Predicted Muscle Activities Versus Experimental nEMG for All Muscle Groups and Tasks

Ensemble average plots of model-predicted muscle activations versus experimental nEMG indicate that the model predicts the general trend of the back muscles, but it tends to underpredict the activations of longissimus erector spinae muscles in the majority of the tasks (Fig. 1). The model tended to underpredict the muscle activations of trunk flexors, but given the low overall magnitude of the activations of these muscles, their relative differences tended to be small (Fig. 2).

Fig. 1

Ensemble average plots of experimental nEMG (red lines and shading) vs. model-predicted (blue lines and shading) muscle activity for all of the back muscles (left/right longissimus erector spinae and left/right iliocostalis erector spinae, shown in columns) during five dynamic tasks (i.e., axial rotation, 2-handed (2 h) sagittal lifting, 1-handed (1 h) sagittal lifting, lateral lifting, opening a window, shown in rows). Task times from each participant were scaled from 0 to 100%.

Fig. 2

Ensemble average plots of experimental nEMG (red lines and shading) vs. model-predicted (blue lines and shading) muscle activity for all of the abdominal muscles (left/right external oblique and left/right rectus abdominis, shown in columns) during five dynamic tasks (i.e., axial rotation, 2-handed (2 h) sagittal lifting, 1-handed (1 h) sagittal lifting, lateral lifting, opening a window, shown in rows). Task times from each participant were scaled from 0 to 100%.

Temporal Similarity Between Model-Predicted Muscle Activities and Experimental nEMG (MANCC Coefficients)

In general, we observed a high temporal similarity between model-predicted and experimental nEMG for back extensor muscles and moderate similarity for flexors (Figs. 1, 2, 3). Both the longissimus (0.95 ± 0.08) and the iliocostalis (0.92 ± 0.13) erector spinae muscle groups had high average median MANCC coefficients across all lifts, with the exception of the right iliocostalis during the lateral lift task (0.64 ± 0.17). The abdominal muscles (both external oblique and rectus abdominis) had a moderate level of temporal similarity during all of the tasks, except during the window opening task, which demonstrated excellent temporal similarity (median MANCCs ≥ 0.94).

Fig. 3

Heatmap of median values for maximum absolute normalized cross-correlations (MANCC) between modeled and normalized experimental muscle activities (LLT/RLT: left/right longissimus erector spinae, LIC/RIC: left/right iliocostalis erector spinae, LEO/REO: left/right external oblique, LRA/RRA: left/right rectus abdominis) during the five dynamic tasks (Ax Rot: axial rotation, Sag lift (2 h): 2-handed sagittal lifting, Sag Lift (1 h): 1-handed sagittal lifting, Lat Lift: lateral lifting, and Open a Window: opening a window).

Magnitude Difference Between Model-Predicted Muscle Activities and Experimental nEMG (RMSE Values)

On average, in the back muscles, the left and right longissimus (0.23 ± 0.12) erector spinae muscles had larger median RMSEs compared to the left and right iliocostalis (0.14 ± 0.09) erector spinae muscles (Fig. 4). The left longissimus muscle had the highest RMSE values (i.e., poorer performance) across all tasks with an average median RMSE of 0.26. All other back muscles had average median RMSEs ranging from 0.13 to 0.16 across the five tasks. In the abdominals, the external obliques had a lower (i.e., better matched) average median RMSE compared to rectus abdominis muscles (0.08 vs 0.11). Overall, abdominal muscles had lower RMSEs than the back muscles during most tasks, but the iliocostalis had the lowest RMSEs during the open-window task. It is worth noting that the abdominal muscles had far lower overall activation levels than the more agonistic back extensors (Figs. 1, 2) and this low level of activation would favorably impact the interpretation of their RMSE values.

Fig. 4

Boxplots of root-mean-square error (RMSE) between modeled and normalized experimental muscle activities (LLT/RLT: left/right longissimus erector spinae, LIC/RIC: left/right iliocostalis erector spinae, LEO/REO: left/right external oblique, LRA/RRA: left/right rectus abdominis) during five dynamic tasks (axial rotation (Ax Rot), 2-handed sagittal lifting (SagLift (2 h)), 1-handed sagittal lifting (SagLift (1 h)), lateral lifting (Lat lift), and opening a window (Open Window)). Boxplots indicate the RMSE values of all participants (black dots) as well as 25th percentile (lower limit of each box), median (open space within each box), and 75th percentile RMSE values (upper limit of each box).

Discussion

In this study, we evaluated the performance of a thoracolumbar spine model for predicting the activations of trunk muscles during five lifting tasks by quantifying the temporal similarity and magnitude difference between model-predicted muscle activations and experimentally measured EMGs. Our results indicated that the thoracolumbar model reasonably predicts the temporal trends of measured muscle activations for most of the observed trunk muscles and tasks. Moreover, both the temporal and magnitude results compare well with prior model-predicted estimates of trunk muscle activity.

Our temporal associations, expressed as MANCC, are similar to or better than previous evaluations of lumbar spine musculoskeletal models [2, 13, 30]. Specifically, Favier et al. [30] reported cross-correlation values ranging from 0.93 to 0.98 for back muscles during a 5 kg lift, a task comparable to the 2-handed and 1-handed sagittal lifts we examined, with values similar to those we observed. Our MANCC results of back muscles demonstrate that the temporal similarity between the thoracolumbar model and EMG recordings is generally strong and does not markedly vary by task. For the abdominal muscles, we had strong temporal similarity during the open-window task, but moderate temporal similarity for the other tasks. Prior studies have not explicitly reported cross-correlations for abdominal muscles, but our ensemble average plots of rectus abdominis for 2-handed and 1-handed sagittal lifting were qualitatively similar to those reported by Beaucage-Gauvreau et al. [13]. Abdominal muscle activity was quite low relative to the back muscle activity, a finding that could contribute to our moderate correlation results. Further, the applied modeling approach tends to underpredict the activations of abdominal muscles during lifting tasks because abdominal muscles function primarily as antagonists in these activities and static optimization penalizes antagonist co-contraction [3, 13].

The magnitude of the error between measured and model-predicted muscle activations varied by muscle group, but was comparable with previous model evaluations conducted in younger and/or middle-aged subjects [2, 13, 30, 43]. Due to low activation levels of abdominal muscles, their median RMSE values did not vary between the tasks (Fig. 2). These findings are comparable with those qualitatively reported by Beaucage-Gauvreau et al. [13]. For back muscles, left longissimus erector spinae had a higher RMSE (ranging from 0.25 to 0.31) compared to other back muscles (ranging from 0.04 to 0.22). Overall, the higher error for longissimus than for iliocostalis erector spinae muscles may partially be explained by the method of normalizing the EMG activations. In previous studies, the maximum muscle activity has typically been defined as the maximum activation recorded during each task. However, the maximum in our study was defined as the maximum muscle activity observed across all tasks and MVICs. This method of EMG normalization allows for a consistent and more physiologically relevant maximum across different tasks. The source of the maximal EMG signal depended on the muscle and activity, and for example, generally occurred during MVICs for iliocostalis, but during dynamic tasks for longissimus erector spinae. This variation, along with the fact that our model often underpredicted the EMG activations, suggests that muscles may not have been normalized to a true maximum in some cases, thereby increasing RMSEs. Our sample of older adults might have affected the accuracy of MVIC data as older adults generally have more recruitment variability compared to younger adults due to their reduced ability to generate smooth and accurate movement [23]. Older adults can also be more reluctant to exert a true maximal effort, perhaps due to their perceived risk of injury, and thus, voluntary exertions might not reflect their true capacity [38]. 
These factors may have impacted the accuracy and precision of experimental nEMG, and therefore, this model evaluation, particularly for the muscle activation magnitudes that were underpredicted by the model. In addition, maximal muscle stress was assumed to be equivalent among all participants and muscle groups. Studies that evaluate maximum muscle stress indicate large variations between individuals as well as differences between muscle groups [20, 32], which would directly alter model-estimated muscle activations. Additional individualization of the strength of each subject-specific model could further reduce the RMSE between measured and model-predicted activations [11, 20]. Combined, these factors make it difficult to determine the definitive sources of error between model and EMG magnitudes. Nonetheless, it is important to report the differences [33], in order to highlight the limitations and areas for improvement in current approaches.

A key goal of this study was to provide a validation of this musculoskeletal model for use in assessing dynamic tasks. Validation of musculoskeletal models using EMG is recommended to focus on temporal comparisons [33] specifically due to difficulties related to EMG normalization. Here, we show evidence of model validity in dynamic tasks as the model showed good temporal associations with EMG data. A model outcome of particular interest is spinal loading, as it is associated with conditions including low back pain [24], vertebral fractures [37, 51], and spinal stenosis [44]. We note that the model used here has been previously validated for estimating spine loading during static poses [15] using the same static optimization approach for model evaluation. The findings of temporal validity here, combined with previously established validity to estimate spine loads, provide support for the use of this model in evaluating spine loading outcomes during dynamic tasks. Direct assessment of dynamic spinal loading validity remains desirable, although the difficulty of obtaining comparison measurements remains an obstacle.

Study Limitations

Several limitations need to be acknowledged for the current work. First, our sample size (n = 11) was relatively small, but this is comparable to or larger than other studies in which EMG was used to validate a full-body spine model [2, 11, 13, 30, 46]. Second, we examined only lifting tasks, and more research is needed to confirm whether the current findings extend to other tasks. Third, our participants performed isometric MVICs in a seated position, while dynamic tasks were performed standing. Force production capability and/or maximal EMG signal may have been different in the seated versus the standing position, which could alter the nEMG. As previously noted, we attempted to address these concerns by normalizing to the maximal signal observed in any of the MVICs or dynamic activities. Fourth, our study did not assess how different approaches to personalized model creation might affect the results, as we only examined our standard approach. Several studies of gait suggest that muscle activity timing is not very sensitive to the personalization approach in modeling [5, 25]. Fifth, our model does not incorporate the loading contribution of passive structures including muscles, ligaments, and intervertebral disks, though it is unclear whether adding these elements to the model would improve the agreement between model-predicted muscle activations and experimentally measured EMG values. Finally, the model currently uses a static optimization approach to solve for muscle redundancies. While this approach has been validated in this model for evaluating spine loading in static poses [15], it has several limitations [22, 27] and can be influenced by how many actuators the muscles are partitioned into. This, along with the aforementioned normalization methods, makes direct comparisons of activation magnitude with EMG challenging [33].
Moreover, there are countless muscle recruitment patterns that could satisfy the kinetic demands, but musculoskeletal loads can be very different even between plausible solutions [42]. Future work should examine the sensitivity of spine model loading outcomes to alternative solution patterns.

Study Strengths

Our study had several scientific strengths and innovations. First, we demonstrated the validity of our model to predict trunk muscle activity during five unique lifting tasks, in terms of both the temporal similarity and the magnitude difference between model-predicted muscle activations and experimentally recorded EMGs. Most previous validations concentrated only on sagittal lifting tasks, but here we showed that the muscle activations from our model are also reflective of non-sagittal lifts. Second, prior studies only examined young to middle-aged adults, while we examined older adults. Given the growing number of older adults in the workforce today and high rate of spinal disorders in this group, including this population in model validation efforts is important. Third, using medical imaging (i.e., CT scan) data to create subject-specific thoracolumbar spine models is innovative, and our group has previously reported that incorporating subject-specific spine curvature and muscle morphology can significantly influence the estimates of spinal loading [17]. Fourth, in many previous studies, the maximum muscle activity has been defined as the maximum during each task, which only allows for within-task normalization. This approach limits the ability to examine overall model performance between tasks. Our method of EMG normalization allows for a more consistent and relevant maximum across different tasks, which we consider an important strength. Finally, there is no established or recommended approach for extracting muscle activation from the results of static optimization analysis for comparison to measured EMG, and indeed, most studies have not clearly described their methodology. In our current work, we used an average of musculotendon actuators in the target muscle groups that were near the nominal EMG electrode locations, which is to our knowledge a novel approach.
Further research is necessary to examine the sensitivity of model-predicted muscle activations to the methods used for extracting them from the results of static optimization analyses.

Conclusions

In this study, we performed a comprehensive EMG validation of muscle activity predicted by static optimization analyses of subject-specific thoracolumbar spine musculoskeletal models during measured dynamic activities in older adults. We determined the capability of these subject-specific thoracolumbar spine models to estimate the pattern and magnitude of muscle activities relative to recorded nEMG. Overall, we found that the model-predicted muscle activity estimates the EMG-measured temporal activity patterns of back muscles well during dynamic tasks, but shows only moderate temporal similarity for abdominal muscles. Our magnitude results compare well to equivalent evaluations of other spine models, but differences between model and EMG outcomes can be high, and the specific sources of error can be difficult to determine.

The current results provide confidence in the validity of this model for evaluating subject-specific dynamic lifting tasks, based on the temporal validity of estimated muscle recruitment throughout the thoracolumbar spine with EMG measurements. We propose that this, given the prior validation of this model for predicting static spinal loading outcomes [15], suggests that the model has similar validity for predicting dynamic spine loading as for static spine loading. Overall, this supports the use of this modeling process for predicting musculoskeletal loading outcomes during a variety of dynamic lifting activities to estimate risk of injury and identify biomechanical mechanisms contributing to spinal disorders.