Introduction

Gastric cancer is a major global health problem and the fourth leading cause of cancer-related death (1). Surgery remains the main treatment (2). As the population ages, the number of elderly patients undergoing gastric cancer surgery is gradually increasing. Their condition is affected by the tumor, aging, and multiple gastrointestinal symptoms, which makes frailty common among them (3). Frailty is defined as a state of decreased physiologic reserves arising from cumulative deficits in multiple homeostatic systems (4). Elderly gastric cancer patients with frailty have a higher risk of adverse postoperative outcomes, such as complications, prolonged length of stay, increased hospital costs, and hypoproteinemia (5–7). Therefore, an instrument that identifies frailty precisely before surgery and predicts adverse outcomes effectively is important for the perioperative management of elderly gastric cancer patients.

There are three widely used conceptual models of frailty: the biological model, the deficit accumulation model, and the integrative conceptual model. Among them, the biological model is one of the most classical and describes the internal physiological process of frailty. The frailty phenotype (FP) was developed from this model (8). FP is the most commonly used frailty instrument and comprises self-report items as well as objective physical measures (grip strength, walking speed). Although there is still no “gold standard” among frailty instruments, FP has been considered an operational definition of frailty by the American Geriatrics Society and is recognized as one of the optimal risk assessment tools for elderly gastric cancer patients undergoing surgery (9, 10). Another representative instrument is the frailty index (FI). It is derived from the deficit accumulation model, which assumes that frailty is a state caused by the accumulation of health deficits over a lifetime (11). However, the original FI is less attractive because of its large number of items (70 items). Thus, the 11-factor modified Frailty Index (mFI-11) was developed, which maps the 70 variables of the original FI to 11 preexisting variables in the National Surgical Quality Improvement Program (NSQIP) database (12). Later, mFI-11 was simplified to a 5-factor modified Frailty Index (mFI-5) (13). Both mFI-11 and mFI-5 can easily be calculated from medical records. Finally, the most frequently used instrument based on the integrative conceptual model is the Tilburg Frailty Indicator (TFI) (14). It broadens the concept of frailty and identifies it across three domains (physical, psychological, and social) through 15 simple self-report items.

Although FP is considered an operational definition of frailty, applying it requires specific equipment and expertise for measuring grip strength and gait speed, which are not always available in a busy surgical oncology setting (15). It is therefore necessary to explore whether TFI, mFI-11, and mFI-5, which have few items and avoid time-consuming physical measurement, can replace FP for identifying frailty and predicting adverse outcomes in elderly gastric cancer patients undergoing surgery. Some previous studies have compared the performance of these three instruments for diagnosing frailty and predicting adverse outcomes among institutionalized, community-dwelling, and hospitalized elderly people (16–18). However, among elderly gastric cancer patients undergoing surgery, the capacity of these instruments to detect frailty needs further exploration, and it remains unclear which frailty instrument has the best predictive accuracy for adverse postoperative outcomes.

Therefore, our study aimed to compare the diagnostic performance of TFI, mFI-11, and mFI-5 for FP-defined frailty, as well as to compare their predictive ability for adverse outcomes among elderly gastric cancer patients undergoing surgery.

Methods

Study design and participants

We conducted a prospective cohort study among elderly patients with gastric cancer at a tertiary hospital in Jiangsu Province, China from March 2021 to July 2021.

Patients were included if they: (1) were diagnosed with gastric cancer by endoscopy or pathology; (2) were scheduled to undergo radical surgery for the first time; (3) were aged ≥60 years; and (4) signed informed consent and agreed to participate in this study. Patients were excluded if they: (1) had a severe physical disability or cognitive impairment; (2) had received preoperative radiotherapy or chemotherapy; or (3) had a concurrent malignancy at another site.

The study was approved by the Ethics Committee of Nanjing Medical University, Jiangsu, China (Number: 2020–273) and registered with the Chinese Clinical Trial Registry (Number: ChiCTR2100047064).

Measurements

Baseline information

Data were collected on demographic characteristics (age, gender, BMI, marital status, and education), and disease-related information (cancer stage, type of surgery, and nutritional risk screening).

Frailty instruments

Enrolled patients were assessed for frailty within 24 hours after admission using the following four instruments.

The FP includes five components: exhaustion, weight loss, low physical activity, weak handgrip strength, and slowness (8). One point is given for each component present, so FP scores range from 0 to 5. Frailty is defined as a score of 3 to 5. Exhaustion and weight loss were self-reported; low physical activity was assessed using the International Physical Activity Questionnaire-Short Form (IPAQ-SF); handgrip strength was measured on the dominant hand using an electronic hand dynamometer (EH101, Guangdong Province, China); slowness was evaluated by measuring gait speed. More details and measurement criteria are available in the “Chinese expert consensus on frailty management” (19). Because FP is widely used and recognized, we chose it as the reference standard for the diagnosis of frailty.

The TFI comprises 15 self-report items covering physical, psychological, and social domains (14). The 8 items of the physical domain are poor physical health, unexplained weight loss, difficulty in walking, difficulty in maintaining balance, poor hearing, poor vision, lack of strength in hands, and physical tiredness. The 4 items of the psychological domain are problems with memory, feeling down, feeling nervous or anxious, and being unable to cope with problems. The 3 items of the social domain are living alone, lack of social relations, and lack of social support. TFI scores range from 0 to 15, and a score of 5 to 15 represents frailty. The TFI has been culturally adapted and validated in China (20).

The mFI-11 consists of the following 11 variables: diabetes; functional status; chronic obstructive pulmonary disease (COPD) or pneumonia; congestive heart failure (CHF); history of myocardial infarction; hypertension requiring medication; peripheral vascular disease or rest pain; impaired sensorium; history of either transient ischemic attack or cerebrovascular accident; history of cerebrovascular accident with neurologic deficit; and prior percutaneous coronary intervention, previous coronary surgery, or history of angina (12). The mFI-5 is a simplified version of mFI-11 and comprises the following 5 variables: diabetes; functional status; COPD or pneumonia; CHF; and hypertension requiring medication (13). The mFI-11 and mFI-5 are each calculated as the number of variables present divided by the total number of variables. Frailty is defined as an mFI-11 score ≥0.27 or an mFI-5 score ≥0.4, respectively. Both mFI-11 and mFI-5 have been used and validated among elderly surgical patients in China (21, 22).
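The scoring rule described above can be illustrated with a minimal Python sketch. The variable names and the example patient are hypothetical and are not taken from the study's data dictionary; only the item lists, the "present divided by total" rule, and the cutoffs of 0.27 and 0.4 come from the text.

```python
# Hypothetical field names for the 11 mFI-11 variables listed in the text.
MFI11_ITEMS = [
    "diabetes",
    "functional_dependence",
    "copd_or_pneumonia",
    "chf",
    "myocardial_infarction",
    "hypertension_on_medication",
    "pvd_or_rest_pain",
    "impaired_sensorium",
    "tia_or_cva",
    "cva_with_deficit",
    "pci_cardiac_surgery_or_angina",
]

# The mFI-5 keeps five of those variables.
MFI5_ITEMS = [
    "diabetes",
    "functional_dependence",
    "copd_or_pneumonia",
    "chf",
    "hypertension_on_medication",
]


def mfi_score(record, items):
    """Number of listed variables present divided by the number of variables."""
    present = sum(1 for item in items if record.get(item, False))
    return present / len(items)


def is_frail(score, cutoff):
    """Frailty is a score at or above the instrument's cutoff."""
    return score >= cutoff


# Made-up patient with three deficits, for illustration only.
patient = {"diabetes": True, "chf": True, "hypertension_on_medication": True}
mfi11 = mfi_score(patient, MFI11_ITEMS)  # 3 of 11 variables present
mfi5 = mfi_score(patient, MFI5_ITEMS)    # 3 of 5 variables present
print(is_frail(mfi11, 0.27), is_frail(mfi5, 0.4))
```

Under this rule, the mFI-11 cutoff of 0.27 corresponds to at least 3 of 11 variables (3/11 ≈ 0.273), and the mFI-5 cutoff of 0.4 corresponds to at least 2 of 5 variables.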

Outcome measures

The primary outcome was in-hospital postoperative complications, graded according to the Clavien-Dindo classification. Total complications were defined as complications of Clavien-Dindo grade ≥2, in line with previous studies (23).

The secondary outcomes were prolonged length of stay (PLOS), increased hospital costs, and postoperative hypoproteinemia. Based on previous studies, PLOS was defined as a length of stay (from date of admission to date of discharge) exceeding the 75th percentile of this cohort (24); increased hospital costs were defined as costs greater than the 75th percentile of the entire cohort (24); hypoproteinemia was defined as a serum albumin level <35 g/L at discharge (25).
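As an illustration of how these cohort-relative definitions can be turned into binary outcomes, the sketch below uses Python's standard library. The example values are made up, and the percentile interpolation method ("inclusive") is an assumption, since the text does not state which percentile definition was used.

```python
from statistics import quantiles


def p75(values):
    """75th percentile of the cohort (assumed 'inclusive' interpolation)."""
    # quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the 75th percentile.
    return quantiles(values, n=4, method="inclusive")[2]


def flag_above_p75(values):
    """True for each patient whose value exceeds the cohort's 75th percentile."""
    threshold = p75(values)
    return [v > threshold for v in values]


def has_hypoproteinemia(albumin_g_per_l):
    """Serum albumin below 35 g/L at discharge."""
    return albumin_g_per_l < 35.0


# Made-up lengths of stay in days, for illustration only.
los_days = [8, 9, 10, 10, 11, 12, 14, 21]
plos = flag_above_p75(los_days)
print(p75(los_days), plos)
```

The same `flag_above_p75` helper applies to hospital costs, because both outcomes are defined relative to the 75th percentile of the cohort.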

Statistical analysis

The sample characteristics were described by means (with standard deviations) for continuous variables and frequencies (with percentages) for categorical variables.

The sample size was estimated using PASS version 15.0 (PASS 15 Power Analysis and Sample Size Software, NCSS, LLC). One previous study showed that the prevalence of frailty in preoperative patients with gastric cancer was 17.71% (3). Allowing for a missing rate of 5%, the minimum sample size for our study was 254 patients. Receiver operating characteristic (ROC) curves were plotted to compare the diagnostic performance of TFI, mFI-11, and mFI-5 for frailty, using FP as the reference standard. ROC curves were also used to examine the performance of TFI, mFI-11, and mFI-5 for predicting adverse outcomes. An area under the curve (AUC) >0.70 was regarded as an indicator of good performance (26). The nonparametric method described by DeLong et al. was used to perform ROC contrasts between individual frailty instruments and to determine whether their AUCs differed significantly (27). Sensitivity and specificity were used to compare the clinical validity of each instrument in terms of diagnostic and predictive performance. Statistical analysis was performed using MedCalc for Windows version 19.0 (MedCalc Software, Ostend, Belgium). Statistical significance was defined as p<0.05.
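For readers unfamiliar with the AUC, the following minimal Python sketch computes it nonparametrically as the proportion of frail/non-frail pairs in which the frail patient has the higher instrument score, with ties counted as one half. The function name and the example data are hypothetical; the study itself used MedCalc, and this sketch does not implement the DeLong comparison of two AUCs.

```python
def auc(scores, labels):
    """Nonparametric AUC: chance that a positive case outscores a negative one.

    scores: instrument scores (e.g. TFI totals)
    labels: 1 if the reference standard (e.g. FP) says frail, else 0
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up scores and reference labels, for illustration only.
example_scores = [6, 3, 5, 2, 7, 4]
example_labels = [1, 0, 1, 0, 0, 1]
print(round(auc(example_scores, example_labels), 3))
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why values only slightly above 0.5 are read as poor performance even when they are statistically significant.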

Results

Participants characteristics

A total of 259 patients were ultimately enrolled in the study. The mean age of the participants was 69.1±5.4 years, and most were male (76.1%). The prevalence of frailty measured by the different instruments ranged from 8.5% (mFI-11) to 45.9% (TFI). The incidence of total complications, PLOS, increased hospital costs, and hypoproteinemia was 22.8%, 24.3%, 25.1%, and 40.2%, respectively. The detailed characteristics of the participants are described in Table 1.

Table 1 Characteristics of the participants (N = 259)

Diagnostic performance for FP-defined frailty

The AUCs of TFI, mFI-11, and mFI-5 against the FP for the diagnosis of frailty were 0.764 [95% confidence interval (CI): 0.707–0.814; P<0.001], 0.600 (95% CI: 0.538–0.660; P=0.033), and 0.600 (95% CI: 0.538–0.660; P=0.0311), respectively. On ROC contrasts, the AUC of TFI was greater than that of mFI-11 (P=0.0008) and mFI-5 (P=0.0012) in the detection of frailty defined by FP (Figure 1).

Figure 1

The ROC curves of TFI, mFI-11 and mFI-5 against the FP for diagnosis of frailty

Note. ROC contrast: TFI vs mFI-11, P=0.0008; TFI vs mFI-5, P=0.0012; mFI-11 vs mFI-5, P=0.9883. Abbreviations: FP, Frailty Phenotype; TFI, Tilburg Frailty Indicator; mFI-11, 11-factor modified Frailty Index; mFI-5, 5-factor modified Frailty Index; ROC, Receiver Operating Characteristic.

At their original cutoffs, the sensitivity of TFI (56.86%) was higher than that of mFI-11 (1.96%) and mFI-5 (3.92%), while the specificity of TFI (79.33%) was lower than that of mFI-11 (98.08%) and mFI-5 (98.08%). At the optimal cutoffs of mFI-11 (0.09) and mFI-5 (0.2), sensitivity and specificity were more balanced. The optimal cutoff of the TFI was identical to the original cutoff. Table 2 summarizes the diagnostic properties of TFI, mFI-11, and mFI-5 for FP-defined frailty.
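The trade-off between sensitivity and specificity at a given cutoff can be sketched in Python as follows. The text does not state how the optimal cutoffs were chosen; maximizing the Youden index (sensitivity + specificity − 1) is assumed here because it is a common default, and the example scores are made up.

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' is called frail.

    Assumes at least one frail and one non-frail case in labels.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores, labels):
    """Observed score that maximizes sensitivity + specificity - 1 (assumed criterion)."""
    best_cutoff, best_j = None, None
    for cutoff in sorted(set(scores)):
        se, sp = sens_spec(scores, labels, cutoff)
        j = se + sp - 1
        if best_j is None or j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff


# Made-up instrument scores and reference-standard labels, for illustration only.
demo_scores = [2, 3, 4, 5, 6, 7, 8, 3]
demo_labels = [0, 0, 0, 1, 1, 1, 1, 1]
print(youden_cutoff(demo_scores, demo_labels), sens_spec(demo_scores, demo_labels, 5))
```

Lowering a cutoff can only keep or raise sensitivity and keep or lower specificity, which is the pattern seen when the mFI-11 and mFI-5 cutoffs move from their original to their optimal values.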

Table 2 Diagnostic performance of TFI, mFI-11 and mFI-5 using the FP as reference standard (N = 259)

Predictive performance for adverse outcomes

TFI and mFI-11 both had statistically significant but inadequate predictive accuracy for adverse outcomes, including total complications (AUCs: 0.618; 0.621), PLOS (AUCs: 0.593; 0.639), increased hospital costs (AUCs: 0.594; 0.624), and hypoproteinemia (AUCs: 0.573; 0.600). For mFI-5, only the predictive ability for hypoproteinemia was statistically significant, and its accuracy was poor (AUC: 0.592).

On ROC contrasts, the predictive accuracy of mFI-11 for total complications and increased hospital costs was better than that of mFI-5. Additionally, there was no statistically significant difference between mFI-11 and TFI in the prediction of any of the four adverse outcomes (Figure 2). The detailed results of predictive performance for adverse outcomes are shown in Table 3.

Figure 2

The ROC curves of TFI, mFI-11, and mFI-5 for predicting adverse outcomes

Table 3 Predictive performance of TFI, mFI-11 and mFI-5 at original cutoffs for adverse outcomes (N=259)

Note. ROC contrasts: Total complication: TFI vs mFI-11, P=0.9553; TFI vs mFI-5, P=0.4102; mFI-11 vs mFI-5, P=0.0221. PLOS: TFI vs mFI-11, P=0.3103; TFI vs mFI-5, P=0.1186; mFI-11 vs mFI-5, P=0.0848. Increased hospital costs: TFI vs mFI-11, P=0.5407; TFI vs mFI-5, P=0.1720; mFI-11 vs mFI-5, P=0.0154. Hypoproteinemia: TFI vs mFI-11, P=0.5563; TFI vs mFI-5, P=0.6777; mFI-11 vs mFI-5, P=0.4869. Abbreviations: TFI, Tilburg Frailty Indicator; mFI-11, 11-factor modified Frailty Index; mFI-5, 5-factor modified Frailty Index; PLOS, Prolonged Length of Stay; ROC, Receiver Operating Characteristic.

Discussion

In our study, the prevalence of frailty measured by different instruments ranged from 8.5% to 45.9%. ROC analyses showed that TFI had better diagnostic accuracy than mFI-11 and mFI-5 in the detection of frailty defined by FP. In addition, TFI and mFI-11 had similar ability to predict the four adverse outcomes, whereas mFI-5 had statistically significant predictive ability only for hypoproteinemia among elderly patients undergoing gastric cancer surgery.

Our results revealed that the prevalence of frailty varied widely across instruments, which is consistent with previous studies among hospitalized and community-dwelling elderly adults (17, 18). Furthermore, the frailty prevalence measured by TFI was higher than that measured by the other two instruments. A possible reason is that mFI-11 and mFI-5 focus purely on the physiological level, whereas TFI also includes psychological and social domains in addition to the physical domain. In the hospital setting, most elderly gastric cancer patients have psychological and social problems, such as anxiety, depression, and low social support, and only TFI counts these toward the determination of frailty (28).

Our findings demonstrated that TFI had significantly better diagnostic performance than mFI-11 and mFI-5 for frailty defined by FP. This can be explained by the fact that, although TFI and FP are derived from different conceptual models, their physical domains overlap to some extent (8, 14). In contrast, both mFI-11 and mFI-5 focus mainly on medical history, which is closer to the concept of comorbidity than to physical status (12). Some patients with gastric cancer are classified as frail by TFI and FP because of poor physical performance caused mainly by the tumor and digestive tract symptoms, whereas mFI-11 and mFI-5 may miss this specific decline in functional status, so they cannot fully identify physical frailty (29). Hence, it was unsurprising that mFI-11 and mFI-5 had poor diagnostic capability for FP-defined frailty among elderly gastric cancer patients.

We found that the sensitivity and specificity of mFI-11, mFI-5, and TFI differed considerably at their original cutoffs. TFI had higher sensitivity than mFI-11 and mFI-5, which means a lower false-negative rate, so more frail elderly gastric cancer patients are identified. This may be crucial because it allows early warning and intervention for elderly gastric cancer patients in poor health (30). Conversely, both mFI-11 and mFI-5 had higher specificity than TFI, which means a lower false-positive rate, so more non-frail patients are correctly classified. This can largely avoid adverse psychological impacts on patients and unnecessary medical interventions. In practice, the choice between a highly sensitive and a highly specific instrument for detecting frailty should depend on the context (31). In our setting, medical resources during hospitalization were relatively abundant. To improve the safety and reliability of perioperative management for elderly gastric cancer patients, who already warrant close attention, higher sensitivity may be the more important property of a frailty instrument. Additionally, the optimal cutoff of TFI for diagnosing frailty was identical to the original cutoff, whereas the optimal cutoffs of mFI-11 (from 0.27 to 0.09) and mFI-5 (from 0.4 to 0.2) were both lower, with more balanced sensitivity and specificity. This indicates that the cutoffs of mFI-11 and mFI-5 can be adjusted to make them more suitable for hospitalized elderly gastric cancer patients.

Our findings revealed that mFI-11 predicted total complications and increased hospital costs better than mFI-5 among elderly gastric cancer patients undergoing surgery. This is possibly because mFI-5 lacks some important variables that reflect cardiovascular and cerebrovascular health, such as history of transient ischemic attack, cerebrovascular accident, and prior percutaneous coronary intervention, which are closely related to adverse outcomes (32, 33). Meanwhile, the differences between the AUCs of TFI and mFI-11 for predicting adverse outcomes were not statistically significant, and both sets of AUCs were lower than 0.7, implying that TFI and mFI-11 had comparable ability to predict adverse outcomes but that neither had adequate predictive performance. These results are consistent with prior studies. Andreasen et al. found that TFI had inadequate accuracy for predicting adverse outcomes in acutely admitted elderly patients (AUC=0.64) (34). Another study concluded that mFI-11 had poor ability to predict major complications among hospitalized elderly patients (AUC=0.645) (35). The inadequate predictive performance of these frailty instruments for adverse clinical outcomes may be explained by the characteristics of the participants in both our study and previous studies. Elderly hospitalized patients, especially elderly gastric cancer patients undergoing surgery, have many problems besides frailty, such as malnutrition and sarcopenia (23, 36), all of which increase the risk of adverse outcomes. This indicates that a frailty instrument used alone has limited ability to predict adverse outcomes in elderly hospitalized patients. A previous study found that adding frailty instruments to commonly used surgical risk assessment tools, including the American Society of Anesthesiologists (ASA) score and the Estimation of Physiologic Ability and Surgical Stress (E-PASS) score, can improve the prediction of complications among elderly patients undergoing surgery (37). This implies that frailty instruments should be combined with other effective risk assessment tools in clinical practice to improve the quality of in-hospital care for elderly gastric cancer patients undergoing surgery.

Finally, the items of mFI-11 and mFI-5 can be found in electronic health records, which makes these instruments feasible and time-efficient in practice (38). However, given their inadequate diagnostic performance for frailty and the poor predictive ability of mFI-5 for adverse outcomes, neither mFI-11 nor mFI-5 seems to be a suitable frailty instrument for elderly gastric cancer patients undergoing surgery. TFI is simple and convenient to use as a self-report scale. It can also be administered by e-mail and telephone, which improves the practicability of identifying frailty and predicting adverse outcomes among elderly gastric cancer patients during hospitalization and after discharge (39). Based on its good diagnostic performance for FP-defined frailty and acceptable predictive accuracy for adverse outcomes, TFI appeared to be the most suitable instrument in our study.

Some limitations of our study should be noted. First, our participants were recruited from a single hospital, which might limit the generalizability of the findings to other hospitals with different medical technology and resources. Second, we compared only three instruments. Given that more and more frailty instruments are being developed and applied, the performance of other validated tools needs to be explored among elderly patients with gastric cancer. Third, the adverse outcomes in the present study were limited to in-hospital indicators; important long-term outcomes should be considered in future studies.

Conclusion

In summary, the TFI performed slightly better than mFI-11 and mFI-5 in our study. However, further work is needed to identify an optimal instrument that can both diagnose frailty well and accurately predict the risk of adverse outcomes among elderly gastric cancer patients. In practice, the detection of frailty should be integrated into routine assessments to improve the quality of perioperative management among elderly patients with gastric cancer. Finally, in addition to objective outcomes, further studies should also examine the ability of diverse frailty instruments to predict meaningful subjective indicators such as quality of life.