Introduction

Women comprise nearly two-thirds of older adult fallers [1]. Falls account for about 1.9 million nonfatal injuries of older women each year, resulting in an injury rate more than 50% higher than that of men [2]. Given this disparity in the impact of falls on men and women, as well as apparent between-sex differences in how strength [3] and balance [4] are altered with age, it is reasonable to take a sex-specific approach to identify fall-risk factors.

Up to 63% of older adult falls are from an extrinsic perturbation [5,6,7,8]. Therefore, balance-reaction tests are relevant in evaluating fall risk. At-home fallers have been characterized by lower perturbation forces that elicit a posterior step [9]. In the same study, responses to anterior and lateral perturbations did not hold that relationship. Conversely, the inability to limit steps after a lateral waist pull [10, 11] or forward lean release [12] has been associated with future falls. For feet-in-place reactions to an oscillating platform, the resulting lateral sway was larger for future fallers [8]. From these studies, it was clear that some, but not all, balance-reaction tests could play a meaningful role in evaluating fall risk. Between-study variability may underlie inconsistencies in results. Differences include the method, direction, size, and precision of perturbations; instructions pertaining to step constraints; and the participant’s certainty of the fall direction. These factors likely affect the measure’s reliability, precision, and ecological validity. Furthermore, a sex-specific approach to fall prediction had not been considered, despite differences in fall injury rates, incidence, risk factors, and balance-reaction capabilities [2, 13,14,15,16].

We developed a protocol to reliably and precisely quantify anteroposterior single- and multiple-stepping thresholds [17]. Using the same data set as the present study, we determined that these thresholds could only partially be inferred from a combination of age, functional measures, and balance confidence [18]. We did not know, however, if the unique perspective on balance reactions was relevant to fall risk.

The purpose of this study was to determine if stepping thresholds are prospectively related to falls in ambulatory, community-dwelling older women. We hypothesized that such thresholds would be related to falls in the year after assessment. Additionally, we hypothesized that thresholds would persist as independent predictors of falls when combined with the other measures of standing balance, gait, strength, balance confidence, and fall history. This study represented an exploratory aim of the Mayo Clinic Study Assessing Fall Epidemiology and Risk (SAFER), the primary goal of which was to evaluate the relationships between balance/function assessments and the Fracture Risk Assessment Tool (FRAX) [19] score.

Materials and methods

Study participants

We recruited 125 community-dwelling women for this study, targeting approximately 25 women per 5-year age stratum from 65 to 85+ years (65–69 years: n = 27, 70–74 years: n = 26, 75–79 years: n = 26, 80–84 years: n = 25, 85+ years: n = 21). This sample size was chosen to provide 90% power for the SAFER primary analysis, correlating fall risk with FRAX scores (results not yet published). All women reported the ability to walk a city block without a gait aid. Participants were, on average, overweight and active, with a few chronic comorbidities (Table 1). They had no previous diagnosis of dementia and were cognitively intact. This study was approved by the Mayo Clinic Institutional Review Board, and all participants provided written informed consent.

Table 1 Baseline characteristics of the 125 ambulatory, community-dwelling women age ≥ 65 years

Assessment of stepping thresholds

Stepping thresholds were assessed as participants stood on a computer-controlled treadmill (Simbex, Lebanon, NH, USA, Fig. 1, video in Supplementary Material). All participants wore a safety harness (Maine Anti-Gravity Systems, Inc., Portland, ME, USA) attached to an overhead rail, as well as their own pair of well-cushioned shoes. Two progressively challenging series of perturbations were administered [17]. One series was designed to quantify anterior (ASST) and posterior (PSST) single-stepping thresholds (Fig. 1a). A subsequent series evaluated anterior (AMST) and posterior (PMST) multiple-stepping thresholds (Fig. 1b). We have previously described this protocol in detail [17, 18]. Briefly, participants were instructed to “try not to step” in the single-stepping-threshold test and “try to take only one step” in the subsequent multiple-stepping-threshold test. The perturbation direction was randomized, so that, at most, three perturbations in the same direction were consecutively delivered. After participants acknowledged that they were ready, the timing of the perturbation was delayed 3–10 s. Therefore, participants were expecting a perturbation, but could not predict the timing or direction of it. Aside from the failure to respond as instructed, responses were also considered failures if the participant reported assistive support from the harness or the investigator observed unambiguous harness support. The perturbation that represented the threshold of interest elicited four consecutive failed responses. Given this criterion, it was typically the case that participants stepped against instructions early in the assessment, doing so in response to relatively small perturbations. However, participants usually learned to withhold steps within the next three attempts. 
To best estimate the magnitude of the destabilizing perturbation, thresholds were expressed as the resulting torque at the base of an inverted pendulum (τ = |m · a · l|), where m is body mass, a is the perturbation acceleration, and l is the estimated pendulum height (0.586 × body height).
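The torque expression above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' analysis code: the function name, the SI units (kg, m/s², m), and the participant values are assumptions chosen for illustration, while the 0.586 height fraction comes from the text.

```python
# Fraction of standing height used as the pendulum height (from the text).
PENDULUM_HEIGHT_FRACTION = 0.586


def stepping_threshold_torque(mass_kg, accel_m_s2, height_m):
    """Return tau = |m * a * l| in N·m, with l = 0.586 * body height.

    SI units are an assumption; the text does not state the units used.
    """
    pendulum_height_m = PENDULUM_HEIGHT_FRACTION * height_m
    return abs(mass_kg * accel_m_s2 * pendulum_height_m)


# Hypothetical participant and perturbation (not study data).
# The sign of the acceleration does not matter because of the absolute value.
torque = stepping_threshold_torque(mass_kg=70.0, accel_m_s2=-2.0, height_m=1.60)
print(f"{torque:.1f} N·m")  # about 131.3 N·m
```

Because the result grows with both mass and height, two participants who withstand the same belt acceleration can register different thresholds, which is why the measures are later scaled to body size.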

Fig. 1 (a) An 82-year-old participant recovers from anterior and posterior disturbances without a step. Here, anterior and posterior refer to the direction of the fall, not the translation of the treadmill belt. Shown are the responses to the largest disturbance recovered without taking a step. (b) Shown are the same participant’s responses to the largest disturbance recovered with a single step. A video of these tests with a different participant is available as Online Supplementary Material

Other assessments

1. Balance Confidence was recorded using the Activities-specific Balance Confidence (ABC) questionnaire [23].

2. Fall History in the 12 months prior to enrollment was self-reported.

3. Gait Analysis: Participants walked at preferred speeds, with body-segment motion recorded over 3–6 strides (120 Hz, Motion Analysis Corporation, Santa Rosa, CA, USA). Measures included average gait speed, stride time, the percent of stride in double-support, and step width, as calculated using commercial (C-Motion Inc., Rockville, MD, USA) and custom (Mathworks, Natick, MA, USA) software.

4. Obstacle Crossing: Participants walked 2.5 m before crossing a 2.4 cm obstacle. The average peak lateral speed of the whole-body center of mass (COM) during the crossing step [24], including three left and right crossing-steps, was determined from motion recordings.

5. Standing Postural Sway: Participants stood on two force plates (Kistler Instrument Corp., Amherst, NY, USA; 600 Hz) for 30 s. Outcomes included the center of pressure root-mean-square error (RMSE) under eyes-closed (RMSE_EC) and eyes-opened (RMSE_EO) conditions, as well as the Romberg ratio of the two (RMSE_EC/EO) [8].

6. Unipedal Stance: Participants stood on one foot for up to 30 s. The maximum time of six attempts, three on each foot, was recorded [25].

7. Functional Reach: Participants reached as far forward as they could with their dominant hand. The average reach of the last three of five successful trials was determined [26].

8. Strength: Hip, knee, and ankle flexor and extensor isometric strength of the non-kicking limb was measured for three trials each (HUMAC NORM, Computer Sports Medicine, Inc., Stoughton, MA, USA). Grip strength of each hand was tested three times using a dynamometer (Aeverl Medical, Gainesville, GA, USA). The greatest force (hand grip) or torque (lower extremity) among all trials was selected.
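Several of the assessments above reduce repeated trials to a single outcome (a maximum, or a mean of selected trials). The sketch below shows one way those reductions could be written. The function names and trial values are hypothetical, and the 30 s cap on unipedal stance is applied in code as an assumption about how the "up to 30 s" limit was handled.

```python
from statistics import mean


def unipedal_stance_time(attempts_s, cap_s=30.0):
    """Longest stance time across all attempts, each capped at cap_s seconds."""
    return max(min(t, cap_s) for t in attempts_s)


def functional_reach(successful_trials_cm):
    """Mean of the last three of five successful reach trials."""
    if len(successful_trials_cm) != 5:
        raise ValueError("expected exactly five successful trials")
    return mean(successful_trials_cm[-3:])


def peak_strength(trials):
    """Greatest force or torque among all trials of one strength test."""
    return max(trials)


# Hypothetical trial data (not study data).
stance = unipedal_stance_time([4.2, 7.9, 12.5, 3.1, 9.0, 6.4])  # six attempts
reach = functional_reach([24.0, 26.0, 27.0, 28.0, 29.0])        # cm
grip = peak_strength([21.5, 23.0, 22.1, 20.8, 22.7, 21.9])      # both hands
print(stance, reach, grip)
```

Taking the maximum rewards a participant's best effort, whereas averaging the later reach trials discounts the first two as practice; the two choices therefore summarize different aspects of performance.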

Fall tracking

For 1 year, participants completed twice-monthly questionnaires [20]. A fall was defined as an event in which the participant lost their balance and landed (1) on the floor, ground, or lower level; (2) on an object (e.g., furniture); or (3) against a wall or railing. Participants completed 97.5% of fall mailers, a rate that benefited from careful tracking and follow-ups by phone [20].

Statistical analyses

To evaluate function relative to body size, all measures were scaled to unitless values (Table 2) [18, 27]. Logistic regression evaluated the univariate relationship of each measure with fall status (any vs none). Variables were selected for inclusion in a multivariable logistic regression model using penalized logistic regression analysis after imputing the data tenfold to include participants with missing data [28, 29]. A lasso penalty was chosen using cross-validation to select the penalty that resulted in the smallest misclassification error. The area under the receiver operating characteristic curve (AUC) was calculated to summarize the predictive accuracy of the logistic models. The significance level was set at α = 0.05 for all tests. Analyses were performed using SAS version 9.4 (SAS Institute, Cary, NC, USA) and R version 3.4.2 (Vienna, Austria; www.R-project.org).
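The AUC used to summarize each logistic model can be read as the probability that a randomly chosen faller receives a higher predicted risk than a randomly chosen non-faller, with ties counted as one half. The sketch below implements that pairwise definition in plain Python. The function name and the predicted risks are hypothetical illustrations, not study data, and this is not the SAS or R code used for the reported analyses.

```python
def pairwise_auc(risk_fallers, risk_nonfallers):
    """AUC as the fraction of faller/non-faller pairs ranked correctly.

    A pair counts 1 if the faller has the higher predicted risk,
    0.5 if the two risks are tied, and 0 otherwise.
    """
    concordant = 0.0
    for f in risk_fallers:
        for n in risk_nonfallers:
            if f > n:
                concordant += 1.0
            elif f == n:
                concordant += 0.5
    return concordant / (len(risk_fallers) * len(risk_nonfallers))


# Hypothetical predicted fall probabilities from a logistic model.
fallers = [0.70, 0.60, 0.55, 0.40]
nonfallers = [0.50, 0.45, 0.30]
print(round(pairwise_auc(fallers, nonfallers), 3))  # 10 of 12 pairs -> 0.833
```

On this scale, 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of fallers from non-fallers.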

Table 2 Univariate relationships between assessment measures (scaled to body size) and prospective fall status

Results

Some participants did not complete all assessments (Table 2). Study staff limited the participation of seven participants as a safety precaution. Reasons for these limitations included acute knee pain, ankle tendon injury history with preferred use of an ankle brace, a history of cardiac events with current use of a pacemaker or loop recorder, or medical shunts or pouches that could be affected by the safety harness. Additional participants (n = 17) did not complete all stepping-threshold assessments. Thirteen participants completed none or only part of the tests due to self-reported nervousness on the treadmill. Four participants chose to end their participation due to back pain or knee soreness. Six participants did not complete all strength tests due to discomfort, self-reported fatigue, or, in one case, user error in saving data.

Over the year after assessment, 74 participants (59%) fell at least once. Fall details are reported separately [20]. Of all measures, only PSSTs significantly discriminated future fallers from non-fallers (OR = 1.50, 95% CI 1.01–2.28, AUC = 0.62, Table 2). A one-standard-deviation decline in PSST was associated with a 50% increase in the odds of being a faller. In a multivariable model, PSSTs (OR = 1.45, 95% CI 0.98–2.21, p = 0.1) were paired with postural sway with the eyes closed (OR = 1.27, 95% CI 0.85–1.99, p = 0.3), although the predictive ability of this model did not increase (AUC = 0.62). So that results can be compared to other studies, non-normalized values are presented in the appendix. For non-normalized values, PSSTs and grip strength were significant discriminators of future fallers and non-fallers.
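An odds ratio multiplies the odds of falling, not the probability, so a 50% increase in odds is not a 50% increase in risk. The sketch below converts the reported OR of 1.50 per standard-deviation decline in PSST into an approximate change in probability. Using the cohort's 59% fall incidence as the baseline risk for a participant with an average PSST is a simplifying assumption made only for illustration, and the function name is hypothetical.

```python
def probability_after_decline(baseline_prob, odds_ratio, sd_decline=1.0):
    """Fall probability after a decline of sd_decline standard deviations.

    Converts the baseline probability to odds, multiplies the odds by
    odds_ratio once per standard deviation, and converts back.
    """
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio ** sd_decline
    return new_odds / (1.0 + new_odds)


# Reported values: 59% incidence and OR = 1.50 per SD decline in PSST.
# Treating 59% as the risk at an average PSST is an assumption.
p_one_sd = probability_after_decline(baseline_prob=0.59, odds_ratio=1.50)
print(f"{p_one_sd:.2f}")  # roughly 0.68
```

Under this assumption, a one-standard-deviation decline in PSST moves the estimated one-year fall probability from about 0.59 to about 0.68, an absolute change of roughly nine percentage points.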

Discussion

The purpose of this study was to determine if stepping thresholds were prospectively related to falls in older women. PSSTs, or the disturbance magnitude that consistently elicited a backward step, were predictive of subsequent falls. This measure paired with vision-occluded sway in a multivariable model of fall prediction.

These results partially agreed with previous studies on balance reactions and subsequent falls in older adults. After a waist pull, PSSTs, but not ASSTs, were predictive of falls [9]. Contrary to our results, AMSTs after a lean release predicted falls [12]. This discrepancy could be due to differences in study samples. The aforementioned study included both men and women, with women more likely to take multiple steps [12]. We encouraged treating sex as an independent factor, as there appeared to be interactions between sex, age, and fall circumstances [30]. In the lean-release study, 70% of falls were anterior in direction and 13% were posterior [12]. The lean-release threshold specifically predicted forward falls. In our cohort of women, 44% of falls were anterior in direction and 41% were posterior [20]. PSSTs did not significantly predict posterior fallers (n = 42, OR = 1.27, 95% CI 0.85–1.93, p = 0.2, AUC = 0.60).

Our multivariable model aligned with previous studies. In a study that included both men and women, falls were predicted by a combination of mediolateral standing sway with vision occluded and anteroposterior sway while on an oscillation platform [8]. Therefore, there was a trend that steadiness and balance-reaction measures were independent, yet valid indicators of fall risk. This trend was supported by evidence that the tendency to take multiple anterior steps after a lean release predicted falls when partnered with tests of vision, sensation, strength, reaction time, sway, and dynamic balance [12].

Unlike previous studies [31], strength did not predict falls. A key difference in our study was that we scaled strength to body size. This was a logical approach, as many perturbations outside the laboratory were likely proportional to body size. In other words, larger people move with greater momentum, and their collisions with fixed objects will result in larger perturbing forces. In addition, the impulses necessary to arrest a fall were also proportional to body size. Using non-adjusted values, grip strength was a significant fall predictor (p = 0.04, Appendix). However, a positive correlation between grip strength and body mass (r = 0.26, p = 0.01) suggested that scaling was warranted. Of note, the combination of low grip strength and obesity, as measured by waist circumference, was a strong risk factor for falling [32]. Regardless of scaling, lower extremity strength did not predict falls. In a previous study, adding peak isokinetic hip abductor torque to the tendency to take multiple steps after a lateral fall made fall prediction more specific, albeit less sensitive [10]. Perhaps isokinetic measures or tests of muscles acting in the frontal plane would have contributed to our fall-prediction model.

The association of posterior stepping thresholds with falls was likely weak (AUC < 0.63) due to the multifactorial nature of falling. Falls were influenced by intrinsic, extrinsic, and behavioral factors [33]. Therefore, a small battery of intrinsic assessments was not likely to sensitively and consistently identify fallers. The observation that a balance-reaction measure persisted, despite confounding influences, was a promising indicator that balance reactions should be part of a comprehensive fall-risk evaluation.

Our observed 59% fall incidence was higher than most previous reports [5,6,7,8], a result we attributed to frequent, twice-monthly questionnaires and persistent follow-up that resulted in 97.5% adherence in returning reports [20]. An alternative approach to the analysis of our study would be to use the number of falls as the dependent variable. With 158 recorded falls, and with 30% of participants falling more than once, this cohort of older women fell at a rate of 1.3 falls per person-year [20]. In this context, vision-occluded sway was the only measure to have a significant relationship with the number of falls (RR = 1.22, 95% CI 1.01–1.47, p = 0.043). Posterior stepping thresholds did not hold such a significant relationship (RR = 1.13, 95% CI 0.90–1.42, p = 0.30). This disparity in results, when considering the number of falls, may be due to the aforementioned intrinsic, extrinsic, and behavioral factors that underlie fall risk [33]. Consider that, after a single fall, an individual would likely modify their behavior or extrinsic factors so as to reduce the risk of a subsequent fall. These modifications could include avoiding hazardous areas or activities, removing hazards in their environment, reducing gait speed, or being less physically active [34]. Such modifications would alter the likelihood of experiencing an external perturbation. In turn, fall-recovery skill, as measured by posterior stepping thresholds, would be a less-relevant factor underlying fall risk. Conversely, the sway test may be more sensitive to impaired sensory function, an intrinsic risk factor. In the presence of this substantial risk factor, behavioral and extrinsic-factor modifications, then, may be less effective at limiting a subsequent loss of balance. Therefore, the sway-based measure and the stepping-threshold measures may reflect two distinct influences on the risk of falling, an important consideration when determining an individualized approach to preventing a fall.
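The incidence and fall rate quoted above can be checked with simple arithmetic from the reported counts. The sketch below assumes that all 125 participants contributed a full year of follow-up; the separate report [20] may have used exact person-time, so this is only an approximate check.

```python
# Reported counts from the text.
participants = 125
fallers = 74          # fell at least once during follow-up
recorded_falls = 158

# Assumption: every participant contributed one full person-year.
person_years = participants * 1.0

incidence = fallers / participants         # proportion who fell at least once
fall_rate = recorded_falls / person_years  # falls per person-year

print(f"incidence = {incidence:.1%}, rate = {fall_rate:.2f} falls/person-year")
```

Both values round to the figures given in the text (59% and 1.3 falls per person-year). The rate exceeds the incidence because participants who fell more than once contribute several falls each.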

We have demonstrated a significant relationship between a measure of performance (i.e., PSSTs) and subsequent falls. From these data alone, however, we do not know the underlying mechanisms that lead to worse stepping thresholds. Standing balance reactions consist of complex, multi-joint actions to prevent a step [35, 36]. Posterior responses share neural pathways with the startle response [37]. Therefore, posterior thresholds could reflect the ability to control the startle response in situations where the perturbation is anticipated, but the timing and direction are not known. Nearly all participants failed in response to perturbations smaller than their final threshold, but learned to recover successfully in the next three attempts. Our assessment, then, is not just a measure of motor performance, but is also an indicator of short-term motor learning. Further biomechanics and motor control studies are needed to elucidate the specific underlying features of the posterior stepping-threshold test that are related to fall risk.

An advantage of stepping thresholds as a fall-risk assessment is that they directly identify fall-recovery skill as an intervention target. Perturbation-based balance training has significantly reduced falls in older adults and individuals with Parkinson’s disease (RR = 0.54, 95% CI 0.34–0.85) [38]. Perhaps, for those with impaired balance-reaction performance, this approach would prevent behavioral adaptations, such as limiting physical activity, that would have negative health consequences. Additional study is needed to determine if specifically improving PSSTs, a task in which a non-stepping response is encouraged, will subsequently decrease the risk of falls. It could be that such an approach would benefit the robustness of standing or walking when perturbed, limiting reliance on the separate skill of reactive stepping [39].

A limitation of this protocol is its feasibility, given the risk of soreness or nervousness. Across all assessments, 4% of participants ended participation due to soreness, and 12% ended participation due to nervousness. These proportions were reduced to 1% and 4% when only accounting for the single-stepping-threshold protocol. We assume that the nervous response was associated with a fear of falling, a factor that is predictive of falls, yet can also result from experiencing a fall [40]. Therefore, the stepping-threshold test may not be applicable to a subset of older adults with substantial fear or very low fall self-efficacy, a group in which cognitive factors may have a more important influence on fall risk than motor factors [41]. Of the five participants who did not register a PSST due to nervousness, two subsequently fell. The psychological risk factors considered in our study only included balance confidence, ignoring other aspects such as depression [42], apathy [43], or chronic pain [44]. A second limitation may be insufficient power to detect significant relationships between our measures and subsequent falls. Numerous measures in this exploratory analysis had promising, non-significant (p < 0.35) relationships with fall status (Table 2). We cannot conclude that these measures have no utility in evaluating fall risk, yet our observed effect sizes may be of use in meta-analyses, or they can inform expectations for more rigorous, adequately powered studies.

Conclusions

PSSTs are prospectively related to falls in community-dwelling older women. Given their promising reliability [17], these thresholds may also serve as relevant pre- and post-test indicators of how rehabilitation alters the risk of falling. Subsequent work is needed to determine if this assessment is indicative of fall risk in other populations, such as older men.