Introduction

Healthy life expectancy is defined as the period during which a person can live without health problems restricting daily life. The difference between life expectancy and healthy life expectancy is an “unhealthy period” in which daily life is restricted in some way. In 2016, this period in Japan was reported to be 8.84 years for males and 12.35 years for females [1]. With the increase in Japan’s elderly population, this unhealthy period is growing, and more of the population needs long-term care [2]. The need for long-term care in Japan is primarily driven by locomotor (i.e., musculoskeletal) disorders resulting from fractures and falls [3]. Therefore, understanding the current state of locomotor disorders in the elderly and establishing a prevention and treatment strategy are urgent tasks. To accomplish this, the Japanese Orthopaedic Association (JOA) proposed the concept of locomotive syndrome (LS) in 2007.

LS is a condition wherein mobility functions (such as sit-to-stand or gait) decline because of locomotive organ impairment. The progression of this syndrome limits one’s independence to carry out the activities of daily living (ADL) [4]. Evaluation tools to assess the risk of LS include the stand-up test, two-step test, and 25-question geriatric locomotive function scale (GLFS-25). The risk levels for LS are classified by stages 0, 1, 2, and 3 [4, 5]. Stage 0 means no locomotive problems are present. Stages 1 and 2 represent “a beginning of the decline of mobility function” and “an indicator of underlying pathological changes in locomotive organs,” respectively [4]. Stage 3 was recently proposed to indicate an “advanced condition of LS stage 2 requiring surgeries for the degenerative disease.” The significance of this stage remains to be defined in the future [5].

The stand-up test quantifies lower limb strength by evaluating an individual’s ability to stand up from stools of four different heights (10, 20, 30, and 40 cm) using one or both legs. Performance scores are assigned as follows: 0 designates the inability to stand; 1, 2, 3, and 4 designate the ability to stand using both legs from heights of 40, 30, 20, and 10 cm, respectively; and 5, 6, 7, and 8 designate the ability to stand using one leg from heights of 40, 30, 20, and 10 cm, respectively. Combined scores of < 2, < 3, and < 5 are classified as LS stages 3, 2, and 1, respectively. This test demonstrates one's ability to move the center of gravity in the vertical direction, as represented by the weight-bearing index [6].
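The scoring and staging rules described above can be sketched in a few lines of Python. This is a minimal illustration based only on the thresholds stated in the text; the function names `stand_up_score` and `stand_up_stage` are hypothetical and are not part of any official JOA tool.

```python
# Stool heights in the order used for scoring: 40 cm is the easiest, 10 cm the hardest.
HEIGHTS_CM = (40, 30, 20, 10)


def stand_up_score(legs: int, height_cm: int) -> int:
    """Return the 1-8 score for the hardest condition a person can complete.

    legs is 2 (both legs) or 1 (one leg). A person who cannot stand at all
    is scored 0 and is not handled by this helper.
    """
    if legs not in (1, 2):
        raise ValueError("legs must be 1 or 2")
    if height_cm not in HEIGHTS_CM:
        raise ValueError("height must be 40, 30, 20, or 10 cm")
    base = 0 if legs == 2 else 4  # one-leg scores start after the four two-leg scores
    return base + HEIGHTS_CM.index(height_cm) + 1


def stand_up_stage(score: int) -> int:
    """Map a stand-up test score (0-8) to an LS stage using the stated cut-offs."""
    if not 0 <= score <= 8:
        raise ValueError("stand-up score must be between 0 and 8")
    if score < 2:
        return 3
    if score < 3:
        return 2
    if score < 5:
        return 1
    return 0
```

For example, a person who can stand from 20 cm with both legs but cannot stand from 40 cm on one leg has a score of 3, which corresponds to LS stage 1 under these thresholds.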

The two-step test score is determined by the following equation: two-step score = the maximum length of two steps (cm) ÷ height (cm). Scores of < 0.9, < 1.1, and < 1.3 are classified as LS stages 3, 2, and 1, respectively. This test is an indication of horizontal movement ability (such as walking speed and movement distance over time), the degree to which one may be bedridden, and one’s risk of falling [7]. The stand-up test and the two-step test represent objective physical functions such as lower limb strength, standing balance, flexibility, and walking ability [6, 7].
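The equation and cut-offs above can be illustrated with a short Python sketch. The function names are hypothetical, and the input values in the usage note are made-up numbers chosen for illustration, not study data.

```python
def two_step_score(two_step_length_cm: float, height_cm: float) -> float:
    """Two-step score = maximum length of two steps (cm) / height (cm)."""
    if height_cm <= 0:
        raise ValueError("height must be positive")
    return two_step_length_cm / height_cm


def two_step_stage(score: float) -> int:
    """Map a two-step score to an LS stage using the cut-offs stated in the text."""
    if score < 0.9:
        return 3
    if score < 1.1:
        return 2
    if score < 1.3:
        return 1
    return 0
```

For instance, a hypothetical person 160 cm tall with a maximum two-step length of 200 cm has a score of 200 ÷ 160 = 1.25, which falls below 1.3 and therefore corresponds to LS stage 1.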

In contrast to these tests, which assess physical function directly, the GLFS-25 consists of 25 self-assessment questions on motor function and ADL. It was developed by JOA-designated experts with reference to previous literature. The 25 questions are classified into statistically correlated subcategories based on actual answers as follows: body pain (Q 1–4), movement-related difficulty (Q 5–7), usual care (Q 8–11, 14), social activities (Q 18, 21–23), cognitive (Q 24, 25), and ADL (Q 12, 13, 15–17, 19, 20). The ADL subcategory has been reported to have a strong correlation with the other five subcategories [8] (Supplemental Table). The total score ranges from 0 to 100. GLFS-25 scores of ≥ 25, ≥ 16, and ≥ 7 are classified as LS stages 3, 2, and 1, respectively. This test is a subjective evaluation for elderly individuals aged ≥ 65 years [5, 8].
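The subcategory grouping and staging described above could be computed as in the following sketch. It assumes each of the 25 items is scored from 0 to 4 (consistent with the stated total range of 0 to 100) and uses the cut-offs as written in the text; the names `SUBCATEGORIES`, `subcategory_scores`, and `glfs25_stage` are hypothetical.

```python
# Question numbers per subcategory, as listed in the text.
SUBCATEGORIES = {
    "body pain": [1, 2, 3, 4],
    "movement-related difficulty": [5, 6, 7],
    "usual care": [8, 9, 10, 11, 14],
    "social activities": [18, 21, 22, 23],
    "cognitive": [24, 25],
    "ADL": [12, 13, 15, 16, 17, 19, 20],
}


def subcategory_scores(answers: dict[int, int]) -> dict[str, int]:
    """Sum item scores per subcategory; answers maps question number (1-25) to 0-4."""
    if sorted(answers) != list(range(1, 26)):
        raise ValueError("answers must cover questions 1 to 25")
    if any(not 0 <= value <= 4 for value in answers.values()):
        raise ValueError("each item score must be between 0 and 4")
    return {name: sum(answers[q] for q in items) for name, items in SUBCATEGORIES.items()}


def glfs25_stage(total: int) -> int:
    """Map a GLFS-25 total score (0-100) to an LS stage using the stated cut-offs."""
    if total >= 25:
        return 3
    if total >= 16:
        return 2
    if total >= 7:
        return 1
    return 0
```

The total score is simply the sum of the six subcategory scores, so a respondent answering 1 on every item would have a total of 25.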

After the LS criteria were published, the usefulness of the GLFS-25 and its correlation with other indicators were verified. Studies have reported a significant relationship between the GLFS-25 and functional parameters such as fall history, five-times sit-to-stand (FTSTS), and 6 m walking time [8, 9], as well as quality of life (QOL) parameters such as the EQ-5D and nursing care needs [8, 10]. Relationships among the GLFS-25 and the stand-up and two-step tests were, however, reported to be quite weak [11]. Thus, the significance of the GLFS-25 as being representative of physical function is still controversial. In addition, little information regarding the six subcategories of the GLFS-25 has been accumulated to date [8, 10].

In the present study, we aimed to clarify whether the GLFS-25 can reliably represent one’s LS stage and anticipate results from the stand-up and two-step tests. We also aimed to clarify the relationships between the subcategories of the GLFS-25 and physical functions.

Materials and Methods

We conducted a retrospective study to analyze data from a care prevention project, “A Course for locomotive syndrome prevention,” conducted in Mitaka City from June 2017 to March 2020. Inclusion criteria were as follows: be age 65 or older; possess sufficient ability to understand and answer questions; possess sufficient walking ability without assistance; and have sufficient physical function to travel to the project venue without assistance. Individuals with cerebral dysfunction, severe cardiovascular disease, or mental disease were excluded. Data were collected from 103 participants. The age of the participants ranged from 65 to 90 years (mean 77). There were 23 male and 80 female participants.

The GLFS-25 was administered using the method developed by Seichi et al. [8]. Limb muscle mass was measured by bioimpedance analysis using an InBody770 (InBody Japan co., Ltd., Tokyo, Japan). The skeletal muscle mass index (SMI) was calculated by dividing limb muscle mass by height squared (kg/m²). Body mass index (BMI) was calculated by dividing weight by height squared (kg/m²).
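As a small worked illustration of the two indices defined above, the following sketch computes both from mass in kilograms and height in meters. The function names and the example values in the usage note are hypothetical and are not taken from the study data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) / height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def skeletal_muscle_mass_index(limb_muscle_mass_kg: float, height_m: float) -> float:
    """SMI = limb (appendicular) muscle mass (kg) / height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return limb_muscle_mass_kg / height_m ** 2
```

For a hypothetical person 1.60 m tall who weighs 64 kg and has 16 kg of limb muscle mass, the BMI is 64 ÷ 2.56 = 25.0 kg/m² and the SMI is 16 ÷ 2.56 = 6.25 kg/m².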

In addition to the stand-up and two-step tests, we collected data for handgrip strength (HGS), the five-times sit-to-stand (FTSTS) test, the 4 m walking test (4 m-WT), and the single leg standing (SLS) test to evaluate physical functions. We performed the two-step test twice and used the higher value. We measured HGS twice on the dominant hand and used the mean value. For the FTSTS test, we measured the time required to stand up from a sitting position five times (at a chair height of 45 cm). Throughout the FTSTS test, participants were required to keep their arms folded across their chest and were prohibited from using a pushing-off motion [12]. For the 4 m-WT, we twice recorded the time it took participants to walk 4 m as quickly as they could and used the mean value. The SLS test was performed on the dominant leg with the eyes open. We recorded 120 s for those who could stand for longer than 120 s.
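The rules for reducing repeated trials to a single value per test can be summarized in a short sketch. The function name, dictionary keys, and example values are hypothetical; only the rules themselves (higher value for the two-step test, mean for HGS and the 4 m-WT, a 120 s ceiling for the SLS test) come from the description above.

```python
def summarize_physical_tests(
    two_step_trials: list[float],
    hgs_trials_kg: list[float],
    walk_trials_s: list[float],
    sls_seconds: float,
    sls_ceiling_s: float = 120.0,
) -> dict[str, float]:
    """Reduce repeated measurements to one value per test."""
    return {
        "two_step": max(two_step_trials),                      # higher of the two trials
        "hgs_kg": sum(hgs_trials_kg) / len(hgs_trials_kg),     # mean of the two trials
        "walk_4m_s": sum(walk_trials_s) / len(walk_trials_s),  # mean of the two trials
        "sls_s": min(sls_seconds, sls_ceiling_s),              # recorded as 120 s at most
    }
```

For example, a made-up participant with two-step scores of 1.20 and 1.28 would be assigned 1.28, and an SLS time of 150 s would be recorded as 120 s.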

We performed statistical analysis using JMP version 13 (SAS Institute Japan Ltd., Tokyo, Japan). We used the Mann–Whitney U and Kruskal–Wallis tests and considered a p-value of less than 0.05 to be significant. For comparisons between groups, p-values were adjusted by the Bonferroni method. We used Spearman’s rank correlation coefficient for correlation analysis and defined a significant correlation as a p-value of less than 0.05. Receiver operating characteristic (ROC) curve analysis was used to determine cutoff values of continuous variables. The study was conducted according to the principles of the Declaration of Helsinki. Informed consent was obtained from all participants included in the study. This study was approved by the research ethics committee of the authors’ institution.
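The analyses in this study were run in JMP. For readers who want to see what three of the named quantities compute, the following pure-Python sketch shows Spearman’s rank correlation coefficient, the area under the ROC curve (computed through its equivalence with the Mann–Whitney U statistic), and the Bonferroni adjustment. It is an illustration only, does not reproduce the JMP procedures or compute p-values, and all function names are hypothetical.

```python
from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


def roc_auc(scores_with: list[float], scores_without: list[float]) -> float:
    """AUC = probability that a case scores higher than a non-case (ties count half)."""
    wins = 0.0
    for p in scores_with:
        for q in scores_without:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_with) * len(scores_without))


def bonferroni(p_values: list[float]) -> list[float]:
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.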

Results

The GLFS-25 scores ranged from 0 to 55 (mean 7.8, median 5). The LS stage determined by the GLFS-25 was 0 in 59 participants, 1 in 27 participants, and 2/3 in 17 participants. The distribution of scores for each subcategory is shown in Fig. 1. Scores in all subcategories tended to be distributed in the lower range. There was no significant difference in GLFS-25 scores between male and female participants (p = 0.36). The GLFS-25 did not correlate with BMI (r = 0.05, p = 0.58) or SMI (r = 0.06, p = 0.54). A correlation trend between the GLFS-25 and age was observed but not significant (r = 0.18, p = 0.08). Analysis of the relationship between the six subcategories of the GLFS-25 and participants’ backgrounds revealed significant correlations between age and ADL (r = 0.24, p = 0.01) and between sex and usual care (p = 0.04). Other than these, there was no correlation between participants’ backgrounds and scores in any subcategories (Table 1).

Fig. 1 The distributions of scores for GLFS-25 and each domain

Table 1 Relationship between GLFS-25 and background of the subjects

The scores of the stand-up test ranged from 1 to 5 (mean 3.6). The LS stage determined by the results of the stand-up test was 0 in 23, 1 in 65, and 2/3 in 15 participants. Two-step test scores ranged from 0.48 to 1.76 with a mean of 1.28. The LS stage determined by the results of the two-step test was 0 in 57, 1 in 32, and 2/3 in 14 participants. The LS stage determined by these three tools combined was 0, 1, and 2/3 in 14, 58, and 31 participants, respectively. This demonstrates a large discrepancy in the distribution of LS stages between the GLFS-25 alone and the three tools combined, especially at stage 0 (Table 2). Although we confirmed weak relationships between the results of the GLFS-25 and stand-up test (r =  − 0.32, p = 0.0008), and the GLFS-25 and two-step test (r =  − 0.32, p = 0.0009) (Table 3), the results indicate that subjects with low GLFS-25 scores tended to be classified into an inappropriately low LS stage compared with the actual LS stage when evaluated together with physical function tests. In the post-hoc test, the GLFS-25 stages were statistically correlated with the stages of the combined three tools. However, several participants classified in LS stages 1 or 2/3 showed considerably low (i.e., better) GLFS-25 scores (Fig. 2A). ROC analysis was performed to determine whether the GLFS-25 could represent the LS stages determined by the three tools combined. The AUC was 0.78 and 0.81 for the determination of LS stages 0/1 and 1/2, respectively, suggesting moderate effectiveness of the GLFS-25 for the determination of LS stages (Fig. 2B, C). The sensitivities for stages 1 and 2 were, however, relatively low (at 0.55 and 0.62, respectively), reflecting the presence of a low GLFS-25 score group among subjects with LS (Table 4).

Table 2 Locomo grade distribution of the subjects
Table 3 Relationship between the results of Stand-up test/ Two-step test and GLFS-25
Fig. 2 A Analysis of the relationship between the GLFS-25 and the locomotive stage determined by the three tools. The GLFS-25 is statistically correlated with LS stage; however, several participants classified as LS stage 1 or 2/3 had lower GLFS-25 scores. B, C The effectiveness of the GLFS-25 in determining LS stage was analyzed by ROC. The AUC was 0.78 for LS stage 0/1 determination (B) and 0.81 for LS stage 1/2 determination (C), indicating that the GLFS-25 is moderately effective in determining LS stage

Table 4 Results of receiver operating characteristic (ROC) analysis
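To make the low sensitivity reported above more concrete, the following sketch shows how sensitivity and specificity are obtained for a GLFS-25 cut-off against a reference LS classification. It assumes that the combined stage is the most severe of the three tool-specific stages, which is our reading of "the three tools combined" and is labeled as an assumption in the code. The function names and the toy data in the usage note are hypothetical and are not study data.

```python
def combined_stage(glfs25_stage: int, stand_up_stage: int, two_step_stage: int) -> int:
    """Assumption: the combined LS stage is the most severe of the three stages."""
    return max(glfs25_stage, stand_up_stage, two_step_stage)


def sensitivity_specificity(
    scores: list[float], has_condition: list[bool], cutoff: float
) -> tuple[float, float]:
    """Classify score >= cutoff as positive and compare with the reference labels."""
    tp = fn = tn = fp = 0
    for score, condition in zip(scores, has_condition):
        predicted_positive = score >= cutoff
        if condition and predicted_positive:
            tp += 1
        elif condition:
            fn += 1  # has the condition but scored below the cut-off
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)
```

With made-up scores of 2, 5, 8, and 20 for four people who have LS by the reference classification and 3 and 1 for two who do not, a cut-off of 7 yields a sensitivity of 0.5 and a specificity of 1.0: the two people with LS but low questionnaire scores are missed, which is the pattern described for the low GLFS-25 score group.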

Next, we analyzed the correlations between the GLFS-25’s six subcategory scores and physical function assessments. The results of the stand-up test were correlated with movement-related difficulty, usual care, ADL, and social activities, and weakly correlated with cognition. The two-step test was correlated with ADL and social activities (Table 3). ADL correlated with all four physical tests (HGS, FTSTS, 4 m-WT, and SLS). Movement-related difficulty and usual care correlated with the FTSTS test, and social activity weakly correlated with the FTSTS test and 4 m-WT. However, body pain and cognitive were not correlated with any of the physical tests (Table 5).

Table 5 Relationship between physical functions and GLFS-25

Discussion

The median of the GLFS-25 in this study was 5, which was below the cut-off value for LS stage 1, and the mean of 7.8 was close to the cut-off value. The distribution of scores in the six subcategories was also in the area of lower scores, suggesting that most participants had sufficient functional activity even though they were over 65 years of age (Fig. 1). In the analysis of the distribution of GLFS-25 scores in elderly individuals over 65 years performed by Yamada et al., age-dependent deterioration of GLFS-25 scores was evident in participants over 80 years [9]. In our study, the correlation between GLFS-25 scores and age was not significant. We speculate that these results are due to the limited number of elderly participants with diminished activity and function in our study (Supplemental Figure). Conversely, in our analysis of the distribution of the GLFS-25’s six subcategory scores, ADL was significantly correlated with age in our participants. The ADL subcategory includes questions directly evaluating gait function, difficulty climbing stairs, possible duration of continuous moving, lifting, and activity in domestic affairs [8]. This suggests that the influence of aging appears evident in these activities in the younger portion of the elderly population, such as the majority of participants in our study. As to SMI/BMI, representative data of the physical status of the subjects, there exists no report discussing the relationship with the GLFS-25 [13]. In the present study, we attempted to analyze the significance of SMI/BMI and found no correlation with any of the six subcategories of the GLFS-25, suggesting that muscle mass and body shape did not have an impact on the items of the GLFS-25 in the present subjects.

In our study, we observed a weak relationship between the GLFS-25 scores and the stand-up and two-step tests, which differs from the study by Ogata et al., in which the relationship was negligible [11]. We speculate that the discrepancy is due to differences in the motor function status of the subjects. Ogata et al. collected data across a broad range of ages, with a mean age of 45 years. Their subjects had no apparent disorders related to motor function, and the median GLFS-25 score was 5 in males and 4 in females. Under such conditions, the results of the stand-up test, which represents vertical movement ability, and of the two-step test, which represents horizontal movement ability, are expected to be more homogeneous than in the present study, in which only individuals with cerebral dysfunction, severe cardiovascular disease, or mental disease were excluded and a wide variation in motor function status was expected. This difference might explain the discrepancy.

Because we observed a weak relationship between the GLFS-25 and the stand-up and two-step tests, we expected that the GLFS-25 could represent the physical function and overall LS assessment of this population. As shown in Table 4, the AUC determined by ROC analysis was acceptable, suggesting the usefulness of the cut-off values. However, the sensitivity was low (0.55 and 0.62). This indicates that although higher GLFS-25 scores were directly associated with a later LS stage, the participants with lower GLFS-25 scores included some with poorer physical function (Fig. 2A). We concluded that, when determining the LS stage of participants with low GLFS-25 scores, the stand-up and two-step tests, which directly verify actual physical ability, should not be omitted.

The results of our study indicate that the six subcategories of the GLFS-25 are clearly divided into two groups in terms of correlation with actual physical functions. The four subcategories of movement-related difficulty, usual care, ADL, and social activities are all correlated with vertical mobility (such as the stand-up and FTSTS tests) and horizontal mobility (such as the two-step test). In contrast, the remaining two subcategories did not show any relation to these assessments. Body pain did not correlate with any indices collected in our study. Cognitive weakly correlated with only the stand-up test. Yamada et al. examined the effect of two-year exercise habits on these subcategories and found that daily exercise habits improved movement-related difficulty and usual care in male participants, while body pain, social activities, and cognitive remained unchanged [14]. Wang et al. confirmed these relationships within their own test population and found that body pain and cognitive are independent of the other factors [15]. These results support our findings that body pain and cognition were independent of the other four subcategories. We speculate that such a poor relationship between physical function and body pain is at least partly due to the inclusion criterion whereby participants needed to possess sufficient physical function to visit the test center on their own in spite of the possible presence of body pain. Moreover, a poor relationship between these two subcategories and physical tests may be one of the reasons for the discrepancy in LS stage determination between the GLFS-25 alone and the three tools together.

The questions evaluating cognitive assessed anxiety about the subject’s ability to walk and risk of falling. The only physical function test that related to cognitive in our study was the stand-up test. We speculate that fear of falling and anxiety about losing one’s ability to walk would be most evident in participants whose standing ability had begun to decline. In this context, neither the FTSTS test nor the SLS test, which are expected to be related to cognitive, demonstrated any significant correlation. The stand-up test may have greater potential to produce fear or anxiety than the FTSTS or SLS test, but the accumulation of further data is required.

Among the physical functions examined in this study, upper limb function was evaluated by HGS. Grip strength was expected to correlate with usual care, which reflects upper extremity functions, but it showed only a weak correlation with ADL and no correlation with the other subcategories. This may be because the activities addressed by the GLFS-25 questions relating to usual care (changing clothes [Q 8, 9, and 14], using the toilet [Q 10], and washing the body in the bath [Q 11]) may not require strong grip power. In contrast, some ADL questions ask about lifting (Q 17) and load-bearing (Q 20), which may require strong grip power and may explain the observed weak correlation with HGS.

Potential limitations of this study are the small number of subjects and the small number of inactive elderly participants. In addition, the correlation between the GLFS-25 and age, which has been detected in many previous reports [9, 10, 16], was not observed in this study, suggesting that the subjects in this study may differ from the general elderly population. The data should be interpreted with caution, and further investigation with a larger number of subjects, including elderly individuals with lower activity levels, is warranted.

In conclusion

(1) Although GLFS-25 scores had statistically significant correlations with the stand-up and two-step tests, as well as the LS stages, the sensitivity of GLFS-25 scores to discriminate LS stage was not high enough. The GLFS-25 is useful for screening those in a later LS stage; however, evaluation of physical function is needed for precise LS staging in those with low GLFS-25 scores.

(2) Among the six subcategories of the GLFS-25, movement-related difficulty, usual care, ADL, and social activities correlated or tended to correlate with physical functions, whereas body pain and cognitive had no correlation, suggesting that the latter two subcategories may have distinct significance.