Introduction

Musculoskeletal disease is one of the most common disorders among elderly individuals, accounting for 7.5% of the total burden of disease in people aged 60 years or over (1). Among the musculoskeletal diseases, low back pain and osteoarthritis are the main contributors to disability-adjusted life years (1, 2). Sarcopenia, characterized by low muscle strength and low muscle quantity or quality (3), is gaining attention in patients with musculoskeletal disease (4–6), as it is associated with increased risks of falling and mortality among the general geriatric population (7, 8). One epidemiologic study showed that the prevalence of sarcopenia is 9.1% among patients with osteoarthritis (9). However, in clinical practice settings, sarcopenia among patients with musculoskeletal disease remains underdiagnosed due to scant research on diagnostic testing (6).

For case finding in community settings, the European Working Group on Sarcopenia in Older People (EWGSOP) recommends the SARC-F questionnaire before the reference standard (i.e., confirmatory testing) for sarcopenia is performed (3). Although the SARC-F is simple (i.e., it consists of only 5 items and gives a score of 0 to 10 points), easy to administer, inexpensive, and validated in many ethnic populations (3), its diagnostic accuracy has not yet been assessed among patients with musculoskeletal diseases. In addition, the sensitivity of its cut-off (i.e., ≥4 points) to classify sarcopenia is admittedly low (3, 6). To overcome this challenge, the combined use of the SARC-F and routine measurements predicting sarcopenia would foster effective case finding. We conceived the “SARC-F+EBM,” which is composed of the SARC-F, elderly (“E”), and body mass index (“BM”).

In this study, we aimed to (1) validate the diagnostic accuracy of the SARC-F and (2) derive the SARC-F+EBM for screening sarcopenia, using the “Screening for People Suffering Sarcopenia in Orthopedic cohort of Kobe study” (SPSS-OK), a large single-center cross-sectional study.

Methods

Design, setting, and participants

SPSS-OK was a cross-sectional study approved by a local institutional review board and an institutional research ethics committee. Informed consent was obtained in writing according to Japan’s ethical guidelines for medical and health research involving human subjects. The center involved in this study was located in the central part of Kobe City. Our target population comprised adult patients with musculoskeletal disease who were scheduled for spinal surgery or knee or hip replacement. Their diagnoses were established by the orthopedic surgeons in charge of the patients via physical examination and imaging. Only patients waiting for their first surgery were analyzed, to avoid potential interference of implanted artificial materials with bioimpedance analysis (BIA). Patients with neuromuscular disease were excluded, as they were not likely to undergo BIA measurement. This diagnostic accuracy study was planned before data collection and questionnaire administration. The patients were recruited consecutively from August 2016 to February 2019.

Reference standards

The reference standard was sarcopenia determined according to the definition of the Asian Working Group for Sarcopenia (AWGS) (10) or the European Working Group on Sarcopenia in Older People 2 (EWGSOP2) (3) (see Table 1 for detailed diagnostic criteria). AWGS criteria use a combination of low muscle quantity (determined by appendicular skeletal muscle mass index [ASMI], namely, appendicular skeletal muscle mass adjusted by height) and low muscle strength (determined by handgrip strength), or a combination of low muscle quantity and low physical performance (determined by gait speed). EWGSOP2 criteria use a combination of low muscle quantity and low muscle strength. All measurements for this study were obtained during the pre-surgery visit.

Table 1 Operationalized definitions of diagnostic criteria for sarcopenia.

ASMI was calculated by dividing appendicular skeletal muscle mass by height squared (m²). Appendicular skeletal muscle mass was measured using BIA with the MC-780A body composition analyzer (Tanita, Tokyo, Japan); the analysis was performed in patients in stable condition. The analyzer is a multi-frequency device that measures bioimpedance at three frequencies between 5 kHz and 250 kHz. Validation of the BIA method against dual-energy X-ray absorptiometry (DPX-L, GE Healthcare) showed a high correlation between the two methods for appendicular skeletal muscle mass (12). Handgrip strength was measured twice for each hand using a grip strength dynamometer (GRIP-D T.K.K. 5401; Takei Scientific Instruments Co., Ltd, Japan), and the value was calculated as the average of the measured values. Gait speed was calculated from walking time measured over a 10-m section of walkway. Patients were instructed to walk along a 15-m smooth, horizontal walkway consisting of a 10-m section for measurement and a 2.5-m section at each end for acceleration and deceleration. Walking time in the 10-m section was measured twice using a stopwatch and converted to m/s. The average of the two values was used for analysis. Well-trained, board-certified physiotherapists conducted all the measurements, and they were unaware of the index test results, as the index test was performed later.
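The derived measurements described above can be illustrated with a minimal Python sketch. The function names are hypothetical, and two details are assumptions rather than statements from the study: that the two timed walks are converted to speeds before averaging, and that handgrip strength is the mean of all recorded trials.

```python
from statistics import mean


def asmi(appendicular_muscle_mass_kg: float, height_m: float) -> float:
    """Appendicular skeletal muscle mass index (kg/m^2): mass / height^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return appendicular_muscle_mass_kg / height_m ** 2


def gait_speed(walk_times_s, distance_m: float = 10.0) -> float:
    """Mean gait speed (m/s) over timed trials of the measured section.

    Assumption: each trial is converted to m/s first, then the speeds
    are averaged.
    """
    return mean(distance_m / t for t in walk_times_s)


def handgrip_strength(trials_kg) -> float:
    """Assumption: the reported value is the mean of all recorded trials."""
    return mean(trials_kg)
```

For instance, `asmi(18.0, 1.5)` gives 8.0 kg/m², and two walks of 10.0 s and 12.5 s correspond to speeds of 1.0 and 0.8 m/s.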

Index test: SARC-F, SARC-F+E, and SARC-F+EBM

SARC-F was originally developed in English (12). It has already been translated into Japanese (Supplementary Table 1), and the conceptual equivalence of the Japanese version has been examined via back-translation and consultation with the author of the original version (John E. Morley). Each item is scored from 0 to 2 points, and the sum of the 5 items constitutes the total score (Supplementary Table 2). A total score ≥ 4 corresponds to the cut-off for sarcopenia (12). The questionnaire including the SARC-F was administered on the date of admission for surgical treatment. The responses to the SARC-F were not affected by the reference standard, as sarcopenia was classified after data collection. SARC-F+EBM was a combined scoring algorithm using SARC-F, age, and BMI. Age was determined at the date of the test for the reference standard. Age ≥ 75 years was considered old (“E”lderly) (13). Body weight was measured using a digital scale. Height was measured using a fixed stadiometer. A body mass index (“BM”I) ≤ 21 kg/m² was considered underweight (14, 15). For “E”lderly, 0 points were assigned for age < 75 years and 10 points for age ≥ 75 years. For “BM”I, 0 points were assigned to those who were not underweight and 10 points to those who were underweight. The resulting SARC-F+EBM score ranges from 0 to 30 points (Supplementary Table 2). We also considered SARC-F+E scoring, which combines SARC-F with old age (0 points for age < 75 years and 10 points for age ≥ 75 years) and ranges from 0 to 20 points (Supplementary Table 2).
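The scoring rules described above can be written as a short Python sketch. The function names and the input validation are illustrative assumptions; the point values and thresholds (age ≥ 75 years, BMI ≤ 21 kg/m²) are taken from the text.

```python
def sarc_f_total(item_scores) -> int:
    """Sum of the five SARC-F items, each scored 0, 1, or 2 (total 0-10)."""
    if len(item_scores) != 5 or any(s not in (0, 1, 2) for s in item_scores):
        raise ValueError("SARC-F needs five items scored 0, 1, or 2")
    return sum(item_scores)


def sarc_f_e(sarc_f: int, age_years: float) -> int:
    """SARC-F plus 10 points for age >= 75 years (total 0-20)."""
    return sarc_f + (10 if age_years >= 75 else 0)


def sarc_f_ebm(sarc_f: int, age_years: float,
               weight_kg: float, height_m: float) -> int:
    """SARC-F+E plus 10 points for BMI <= 21 kg/m^2 (total 0-30)."""
    bmi = weight_kg / height_m ** 2
    return sarc_f_e(sarc_f, age_years) + (10 if bmi <= 21 else 0)
```

As a usage example, an 80-year-old patient with a SARC-F total of 2, a weight of 50 kg, and a height of 1.60 m (BMI about 19.5 kg/m²) would score 2 + 10 + 10 = 22 points on the SARC-F+EBM.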

Table 2 Baseline characteristics overall and by sarcopenia status

Measurement of comorbidities

Data on baseline characteristics were collected to describe comorbidities of the study population. History of hypertension, cancer, chronic lung disease, heart disease, and stroke were determined by self-reported questionnaire. Heart disease was considered to be present if the patients answered “yes” to having either “heart attack,” “congestive heart failure,” or “angina.” Chronic kidney disease was defined as estimated glomerular filtration rate (eGFR) values ≤ 60 mL/min/1.73 m², which were calculated based on the 3-variable Japanese equation using age, serum creatinine level (SCr), and sex as follows (16): $\mathrm{eGFR} = 194 \times \mathrm{SCr}^{-1.094} \times \mathrm{age}^{-0.287}\ (\times\, 0.739 \text{ if female})$. Diabetes was defined as having glycosylated hemoglobin values of ≥6.5% according to the National Glycohemoglobin Standardization Program.
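As a minimal sketch, the 3-variable Japanese eGFR equation and the comorbidity definitions above could be computed as follows. The function names are hypothetical, and the creatinine unit (mg/dL) is an assumption, since the text does not state it.

```python
def egfr_japanese(serum_creatinine_mg_dl: float, age_years: float,
                  female: bool) -> float:
    """eGFR (mL/min/1.73 m^2) = 194 x SCr^-1.094 x age^-0.287 (x 0.739 if female).

    Assumption: serum creatinine is given in mg/dL.
    """
    value = 194.0 * serum_creatinine_mg_dl ** -1.094 * age_years ** -0.287
    return value * 0.739 if female else value


def has_chronic_kidney_disease(egfr: float) -> bool:
    """Study definition: eGFR <= 60 mL/min/1.73 m^2."""
    return egfr <= 60


def has_diabetes(hba1c_percent: float) -> bool:
    """Study definition: glycosylated hemoglobin >= 6.5% (NGSP)."""
    return hba1c_percent >= 6.5
```

For example, a 60-year-old man with a serum creatinine of 1.0 mg/dL has an eGFR just under 60 mL/min/1.73 m² by this equation and would therefore meet the study definition of chronic kidney disease.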

Statistical analyses

All statistical analyses were conducted using Stata/SE, version 15 (StataCorp, College Station, TX). Sociodemographic characteristics, comorbidities, underlying musculoskeletal disease, and outcomes (sarcopenia defined by the two criteria, ASMI, handgrip strength, and gait speed) were described for all patients and separately by sarcopenia status defined by AWGS. To examine the diagnostic test accuracy of the SARC-F questionnaire among the patients, the sensitivity and specificity of the SARC-F score were estimated, and a contingency table was provided. To compare the overall diagnostic accuracy of the SARC-F, SARC-F+E, and SARC-F+EBM, we fitted logistic regression models with sarcopenia as the dependent variable and each scoring method as the independent variable. Receiver operating characteristic curves were then derived, and their areas under the curve (AUCs) were estimated and compared using the DeLong method (17). A larger AUC indicates better overall diagnostic accuracy. Cut-off points for the SARC-F+E and the SARC-F+EBM scores were determined using Liu’s method (18). The sensitivities and specificities of the SARC-F, SARC-F+E, and SARC-F+EBM were compared using McNemar’s test among those with sarcopenia and those without sarcopenia, respectively (19). All the analyses were performed separately for sarcopenia determined by AWGS and EWGSOP2 criteria. All analyses were conducted by a well-trained epidemiologist (N.K.). Two-sided P < .05 was considered statistically significant.
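The quantities named above can be sketched in plain Python; this is an illustration, not the Stata code used in the study. It assumes that a score at or above the cut-off counts as a positive test, that Liu's method selects the cut-off maximizing the product of sensitivity and specificity, and that the AUC of a single-score logistic model equals the rank-based AUC of the raw score when the coefficient is positive. The DeLong comparison and McNemar's test are not implemented here, and the toy data are invented.

```python
from math import prod


def sens_spec(scores, diseased, cutoff):
    """Sensitivity and specificity when score >= cutoff is test-positive."""
    pos = [s for s, d in zip(scores, diseased) if d]
    neg = [s for s, d in zip(scores, diseased) if not d]
    sensitivity = sum(s >= cutoff for s in pos) / len(pos)
    specificity = sum(s < cutoff for s in neg) / len(neg)
    return sensitivity, specificity


def auc(scores, diseased):
    """Rank-based AUC: chance that a diseased subject outscores a
    non-diseased one, counting ties as one half."""
    pos = [s for s, d in zip(scores, diseased) if d]
    neg = [s for s, d in zip(scores, diseased) if not d]
    concordant = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))


def liu_cutoff(scores, diseased):
    """Observed cut-off maximizing sensitivity x specificity (Liu's method).
    Ties resolve to the lowest such cut-off."""
    return max(sorted(set(scores)),
               key=lambda c: prod(sens_spec(scores, diseased, c)))


# Invented toy data: three subjects with the condition, five without.
toy_scores = [12, 14, 3, 2, 11, 1, 4, 13]
toy_diseased = [True, True, True, False, False, False, False, False]
```

On the toy data, `liu_cutoff(toy_scores, toy_diseased)` selects a cut-off of 12, where sensitivity is 2/3 and specificity is 4/5, and the AUC is 11/15.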

Results

A total of 996 patients with musculoskeletal disease were invited. Figure 1 shows the flow of patients in this study, along with the outcome of sarcopenia defined by AWGS. After exclusion of 22 patients with incomplete responses to the SARC-F and 15 patients with missing values for sarcopenia testing, 959 patients were evaluated. The mean age was 69.4 years, and three quarters of the patients were female (Table 2). The median difference in dates between the reference and index tests was 31 days (interquartile range: 21–42 days).

Figure 1 Study flowchart

The test results of the SARC-F for patients with and without sarcopenia are shown in Supplementary Table 3. The prevalence of sarcopenia was 3.8% (36/959). When the conventional cut-off point (≥4) was applied, only 41.7% of the patients with sarcopenia met this condition (sensitivity 41.7%, 95% confidence interval [95% CI] 25.5% to 59.2%), and 68.5% of the patients without sarcopenia did not meet this condition (specificity 68.5%, 95% CI 65.4% to 71.5%). The ROC curve of the SARC-F to identify sarcopenia is presented in Figure 2, and the AUC was 0.557 (95% CI 0.452 to 0.662) (Table 3).
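To show how a confidence interval of this kind can be obtained, the following sketch computes an exact (Clopper-Pearson) binomial interval by bisection. Two points are assumptions: the count of 15 test-positive patients is back-calculated from the reported 41.7% of 36 and is not stated in the text, and the use of the exact method is inferred, not confirmed by the source.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target: float) -> float:
    """Bisection on [0, 1] for f(p) = target, where f decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion x/n."""
    lower = 0.0 if x == 0 else _solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else _solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


# Assumed counts: 15 of 36 patients with sarcopenia scored >= 4 on the SARC-F.
sensitivity_ci = clopper_pearson(15, 36)
```

With so few patients with sarcopenia, the interval is wide, which is why the sensitivity estimate should be read together with its confidence limits.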

Figure 2 Receiver operating characteristic curves for SARC-F, SARC-F+E, and SARC-F+EBM for identifying sarcopenia by AWGS criteria

The dashed line with hollow circle symbols indicates the receiver operating characteristic (ROC) curve for SARC-F scoring. The solid line with hollow circle symbols indicates the ROC curve for SARC-F+E scoring. The solid line with solid circle symbols indicates the ROC curve for SARC-F+EBM scoring.

Table 3 Overall diagnostic accuracy of the SARC-F, SARC-F+E, and SARC-F+EBM

The ROC curve of the SARC-F+EBM to identify sarcopenia is presented in Figure 2. The AUC of the SARC-F+EBM was 0.824 (95% CI 0.762–0.886), which is superior to that of the SARC-F (Table 3). The optimal cut-off point for sarcopenia was ≥12 points. The sensitivity of the SARC-F+EBM was 77.8% (95% CI 60.8% to 89.9%), which is superior to that of the SARC-F (P<0.001) (Table 4). The specificity of the SARC-F+EBM was 69.6% (95% CI 66.5% to 72.5%), which is similar to that of the SARC-F (P=0.565).

Similarly, the ROC curve of the SARC-F+E to identify sarcopenia is presented in Figure 2. The AUC of the SARC-F+E was 0.663 (95% CI 0.561–0.765), which is superior to that of the SARC-F (Table 3). The optimal cut-off point for sarcopenia was ≥9 points. The sensitivity of the SARC-F+E was 63.9% (95% CI 46.2% to 79.2%), which is superior to that of the SARC-F (P=0.046) (Table 4). The specificity of the SARC-F+E was 66.3% (95% CI 63.2% to 69.4%), which is similar to that of the SARC-F (P=0.302).

Table 4 Sensitivity and specificity of the SARC-F, SARC-F+E, and SARC-F+EBM

Similar results were obtained when sarcopenia defined by EWGSOP2 criteria was used as the reference standard (Supplementary Tables 4 to 6).

Discussion

We evaluated the diagnostic accuracy of the SARC-F questionnaire for sarcopenia screening among patients with musculoskeletal disease and found that its overall diagnostic accuracy based on the AUC was not acceptable. However, combined use of the SARC-F with “EBM” (Elderly plus BMI) resulted in excellent overall diagnostic accuracy and improved sensitivity. Given its simplicity, we propose that our “SARC-F+EBM” scoring be applied to sarcopenia screening in other ethnic or disease settings.

With regard to the diagnostic accuracy of the SARC-F with the conventional cut-off point (≥4), our study findings partially corroborate those of previous research. Low sensitivity has been shown in Chinese community-dwelling individuals (3.8% to 9.9%) (20) and in Brazilian community-dwelling individuals (33.0%) (21). In our study, regardless of the sarcopenia definition, the sensitivity was under 50%. This low sensitivity may be attributed to the face validity of the SARC-F items, as previous researchers pointed out (21): the items ask about muscle strength and physical performance, but not about muscle quantity. The low specificity in this study (68.4% to 68.5%) contrasts with the findings of previous research (20, 21) and may be due to our unique disease population. Generally, specificity becomes larger if participants without the target condition are healthier (22). Our population comprised patients with musculoskeletal disease requiring surgery, who may have problems in physical performance without having sarcopenia. Previous studies showed that indices of lower limb performance are affected by pain among patients with musculoskeletal disease (11). Therefore, participants without sarcopenia in our musculoskeletal disease setting likely had a high false-positive rate for the SARC-F, which contributed to the low specificity.

Our findings have several implications for researchers and physicians. First, our “SARC-F+EBM” scoring is applicable to other ethnic and disease settings and is expected to demonstrate high sensitivity. Age and BMI are inexpensive and easily obtainable information. In addition, “EBM” serves to some degree as a proxy for items asking about muscle quantity, as old age (age ≥ 75 years) corresponds to age-associated muscle loss and underweight (BMI ≤ 21 kg/m²) corresponds to undernutrition (3). Thus, incorporating such items in the SARC-F+EBM should foster effective screening for sarcopenia in different populations. Even if a weighing scale is not available, as in rural community settings, the “SARC-F+E” can be used.

Second, the consistency of the diagnostic accuracy estimates across the two sarcopenia criteria (AWGS and EWGSOP2) suggests that the observed diagnostic performance is unlikely to be affected by the use of gait speed in the definition of the reference standard. This is because only the AWGS criteria require gait speed measurement to diagnose sarcopenia.

Third, the finding that the specificities of the SARC-F and SARC-F+EBM are not high enough to rule in sarcopenia among patients with musculoskeletal disease may be applicable to other diseases that also impair physical performance. For example, patients with neurological diseases (such as Parkinson’s disease and hemiplegia due to stroke) or organ failure (such as heart failure and chronic obstructive lung disease) have difficulty in walking, climbing stairs, or rising from a chair because of the disease itself and thus may tend to show false-positive results in the SARC-F scoring. Thus, to enhance the SARC-F, researchers may consider adding extra items asking about muscle quantity or muscle strength rather than about physical performance. However, this notion does not hamper the use of the SARC-F+EBM in clinical or community settings. Rather, we showed that the SARC-F+EBM has improved sensitivity (77.8% to 84.2%) and excellent overall diagnostic accuracy (AUC 0.824 to 0.876). Thus, the SARC-F+EBM can be used as a sarcopenia screening tool (i.e., a tool that spares people without sarcopenia an unnecessary confirmatory workup for sarcopenia).

This study has several strengths. First, our study assessed the validity of the SARC-F to screen for sarcopenia among a population with musculoskeletal disease, which is rarely investigated. Second, recruitment of a relatively large number of patients at a single center enabled us to measure important variables (i.e., SARC-F as the index test and sarcopenia [defined by ASMI, handgrip strength, and gait speed] as the reference standard) with uniform measurement methods across all patients to effectively assess our research question. Third, our study reinforced the SARC-F by adding “EBM,” using the formal analyses required for a diagnostic accuracy study.

This study also has several limitations. First, BIA was performed instead of volumetric analysis using computed tomography or magnetic resonance imaging to quantify muscle mass. However, BIA is a major approach to sarcopenia screening in population study settings and is recommended as an option for measuring muscle mass by the EWGSOP2 and AWGS (3, 10). Second, the prevalence of sarcopenia was unexpectedly low. A selection bias (i.e., referral bias) may underlie the low prevalence, as the patients were referred because they were expected to be amenable to surgery without complications.

Conclusions and Implications

The SARC-F+EBM improved the sensitivity and overall diagnostic accuracy of the SARC-F for sarcopenia screening in a population with musculoskeletal disease. Thus, the SARC-F+EBM may enhance the SARC-F in the “find-cases” pathway (3). Further validation studies in different settings are warranted to confirm the sensitivity and specificity of the SARC-F+EBM obtained in this population.

Brief summary

The SARC-F combined with elderly status and body mass index (SARC-F+EBM) significantly improved the sensitivity and overall diagnostic accuracy of the SARC-F for screening sarcopenia in musculoskeletal disease settings.