Gastric cancer is the third leading cause of cancer-related deaths worldwide.1 Neoadjuvant chemotherapy is increasingly used in clinical practice and has many advantages, such as increasing patient tolerance to chemotherapy, reducing potential micrometastasis, and increasing the rate of radical surgery. Previous phase III clinical trials have preliminarily confirmed that perioperative chemotherapy combined with radical surgery can significantly improve the patient’s prognosis.2 The perioperative treatment modality of neoadjuvant chemotherapy combined with radical surgery has been recommended as a standard option by the clinical guidelines (evidence level 1, NCCN Guidelines 2021v4).3 Meanwhile, neoadjuvant chemotherapy is an in vivo drug sensitivity test that can help assess drug efficacy by evaluating tumor changes during chemotherapy and provide a basis for precise treatment of patients.

At present, imaging evaluation in gastric cancer mainly adopts the World Health Organization (WHO) efficacy evaluation criteria or the Response Evaluation Criteria in Solid Tumors (RECIST).4 However, because the stomach is a hollow organ, the thickness of the stomach wall is affected by the degree of gastric cavity filling, and the staging of the primary gastric lesion is based on the depth of tumor invasion rather than the size of the tumor. Gastric cancer therefore differs from solid tumors such as breast cancer and liver cancer, and many difficulties remain in evaluating the effect of neoadjuvant chemotherapy. The primary lesion cannot be used as an evaluable lesion, so if there is no obviously enlarged lymph node, it is difficult to evaluate the response accurately according to the existing standards. In some patients, although the size of the lesion does not change significantly during chemotherapy, the tumor cell density may have changed, so the change in diameter alone cannot reflect the tumor response. It has been reported that only 17% of patients with gastric cancer achieve a pathological tumor regression grade (TRG) of 1 after neoadjuvant chemotherapy.5 This suggests that, because response evaluation is inaccurate, a large number of patients who receive neoadjuvant chemotherapy for gastric cancer may receive excessive chemotherapy, miss the optimal time for surgery, and have reduced survival.

A major focus and challenge in gastric cancer is how to effectively evaluate the efficacy of neoadjuvant chemotherapy. With the widespread application of artificial intelligence and deep learning in medicine,6 it has become possible to use radiomics to provide more valuable clinical information. The results of an earlier study conducted in our center suggest that radiomics can help identify occult peritoneal metastases.7 In recent years, studies have also suggested that radiomics can predict patients’ pathological stage or TRG.8,9 However, two research gaps remain. First, the ypT stage alone or the TRG alone captures only part of the information about the effectiveness of chemotherapy. Moreover, although Becker’s study10 reported that the TRG is correlated with the pathological T stage after neoadjuvant chemotherapy, the two are not always linearly correlated. In the study by Ott et al.,11 ypT3 and ypT4 disease accounted for as much as 40% of the 231 patients with residual tumor cells < 10%, and for as much as 54.3% of the patients with residual tumor cells < 10% who had not achieved a pathological complete response (pCR) of the primary lesion. Second, the existing studies are primarily based on pre-treatment radiomics, whereas the changes between pre-treatment and post-treatment radiomics may better reflect the changes in the tumor during treatment and may be more representative of the therapeutic effect.

Therefore, we conducted this retrospective single-center cohort study to explore whether baseline and post-treatment radiomic signatures can better predict prognosis of patients with gastric cancer.

Patients and Methods

Patients

The study population included patients diagnosed with gastric adenocarcinoma by gastroscopic pathology from June 2009 to July 2015 at the Gastrointestinal Cancer Center of Peking University Cancer Hospital. Patients were included in the analyses if they met the following criteria: (a) confirmed gastric adenocarcinoma by gastroscopic pathology; (b) underwent preoperative staging of a locally advanced stage without distant metastasis; (c) received neoadjuvant chemotherapy based on platinum combined with fluorouracil regimen; (d) received baseline CT scan within 10 days before the initiation of the first cycle of chemotherapy and presurgery CT scan within 1 month after the completion of the last cycle of chemotherapy; (e) underwent radical distal gastrectomy or total gastrectomy with R0 resection; and (f) had regular follow-up in our hospital and complete clinical pathological data. Patients were excluded if they (a) died during the perioperative period; (b) received chemotherapy for other tumors within half a year before the diagnosis; (c) received preoperative neoadjuvant radiotherapy, targeted therapy, or immunotherapy; (d) had gastric remnant cancer; (e) received preventive intraperitoneal infusion chemotherapy; or (f) had poor CT image quality. The current study was approved by the Institutional Review Committee of Peking University Cancer Hospital (2019YJZ26).

CT Image Acquisition

All patients underwent enhanced CT examination of the abdomen and pelvis after fasting for more than 6–8 h. Before CT examination, 10 mg anisodamine (654-2, Hangzhou Minsheng Pharma) was injected intramuscularly, and 6 g gas-producing crystals with 10 ml warm water were administered orally to distend the gastric wall. The CT scan was performed on either the LightSpeed 64 VCT or the Discovery CT750 HD.

Tumor Segmentation

Baseline and presurgery enhanced CT images in the arterial and venous phases were reviewed by two radiologists on ITK-SNAP (v.3.6.0, http://www.itksnap.org). The most evident part of the primary lesion on axial CT was manually delineated by one radiologist (doctor A with 15 years of experience in radiology) under the supervision of a senior radiologist (doctor B with 20 years of experience in gastrointestinal radiology). Both radiologists were blinded to all patients’ clinicopathological information but knew the location of the gastric cancer from the endoscopy results.

Measurements

The following clinical and pathological characteristics were collected: age, sex, comorbidity status, height, weight, Eastern Cooperative Oncology Group (ECOG) score, American Society of Anesthesiologists (ASA) score, number of preoperative chemotherapy cycles, preoperative chemotherapy regimen, operation duration, intraoperative blood loss, surgical approach, tumor location, tumor pathological type, degree of differentiation, pathological T stage (ypT), pathological N stage (ypN), vascular tumor thrombus, number of dissected lymph nodes, number of metastatic lymph nodes, and time between the completion of neoadjuvant chemotherapy and surgery.

Overall survival (OS) was defined as the time from diagnosis to death due to any reason or the last follow-up. On the basis of OS, we constructed a new outcome of death within 3 years, with 1 indicating that the patient died due to any cause within 3 years from the time of diagnosis.
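The construction of this binary outcome can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that OS is recorded in months together with a death indicator, and it returns no label for patients censored before 3 years, because the source does not describe how such patients were handled. The function and constant names are hypothetical.

```python
from typing import Optional

THREE_YEARS_MONTHS = 36.0  # assumption: OS is stored in months


def death_within_3_years(os_months: float, died: bool) -> Optional[int]:
    """Label a patient for the 3-year death outcome.

    Returns 1 if the patient died within 3 years of diagnosis, 0 if the
    patient is known to have survived at least 3 years, and None if the
    patient was alive at last follow-up but followed for less than 3 years
    (handling of that case is an assumption, not stated in the source).
    """
    if died and os_months <= THREE_YEARS_MONTHS:
        return 1
    if os_months >= THREE_YEARS_MONTHS:
        return 0
    return None
```

For example, a patient who died 20 months after diagnosis is labeled 1, while a patient who died at 40 months is labeled 0.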

Radiomic Feature Extraction

Image preprocessing was applied before feature extraction. For each patient, 408 radiomic features and 4 handcrafted features were extracted from the masked lesion area (Supplementary Figure S1). Radiomic features were calculated using the standardized algorithms available in pyradiomics. These features encompass first-order statistics that describe the distribution of individual voxel values, shape-based features that describe the shape of the region of interest, and texture features that characterize voxel patterns and relationships, including the gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), neighboring gray tone difference matrix (NGTDM), and gray level dependence matrix (GLDM). The four handcrafted features were computed on the basis of the skeleton of the lesion mask, all of which aimed to measure the length of the lesion.

Radiomic Feature Building

We applied the Yeo-Johnson power transformation to normalize the radiomic features. After transformation, the intraclass correlation coefficient (ICC) was used to evaluate the reproducibility and robustness of the extracted features. We applied perturbations to the lesion mask to simulate multiple raters for the same feature, and features with ICC < 0.8 were excluded. Fewer than half of the radiomic features (185/408) were reproducible and robust (ICC ≥ 0.8) under mask perturbation. To further eliminate redundant features, we applied consensus clustering with Pearson’s correlation coefficient as the distance metric to group the features into clusters, and 12 features with the highest average consensus index in each cluster were selected (Supplementary Figure S2). These features were then divided into two groups. One group consisted of the radiomic features after neoadjuvant chemotherapy and therefore contained information on treatment results. The other group consisted of the differences in radiomic features before and after chemotherapy and therefore contained information on treatment sensitivity. An overall radiomic score was built for each group of features by Cox regression with OS as the outcome.
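The ICC-based filtering step can be illustrated with a short sketch. The source does not state which ICC form was used, so the code below assumes ICC(2,1) (two-way random effects, absolute agreement, single rater), treating each perturbed mask as a rater. The function names and the input layout are hypothetical.

```python
from typing import Dict, List


def icc_2_1(ratings: List[List[float]]) -> float:
    """ICC(2,1) for one feature (assumed form; not confirmed by the source).

    ratings[i][j] is the feature value for lesion i computed on
    perturbed mask ("rater") j.
    """
    n = len(ratings)     # number of lesions (subjects)
    k = len(ratings[0])  # number of perturbed masks (raters)
    if n < 2 or k < 2:
        raise ValueError("need at least 2 lesions and 2 perturbations")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def reproducible_features(
    features: Dict[str, List[List[float]]], threshold: float = 0.8
) -> List[str]:
    """Keep the names of features whose ICC is at least the threshold."""
    return [name for name, r in features.items() if icc_2_1(r) >= threshold]
```

A feature whose values are identical across all perturbations has an ICC of 1 and is retained, whereas a feature that varies strongly between perturbations of the same lesion falls below 0.8 and is dropped.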

Model Development and Validation

Associations of the radiomic signatures and clinical characteristics with death within 3 years were examined by univariable logistic regression. Variables were selected into the prediction models on the basis of their contributions to the prediction accuracy. We built four models: one model with ypT stage only (T model), one model with only radiomic features (RS model), one model with ypT stage and clinical characteristics (T+ model), and one model with radiomic features (same as the RS model) and clinical characteristics (same as the T+ model but excluding the ypT stage) (RS+ model). The clinical characteristics included in the latter two models were histological grade, tumor location, vessel carcinoma embolus, and ypN stage.

We used tenfold cross-validation to validate the model performance. We assessed and compared the model accuracy using the AUC statistic. Model calibration was assessed by calibration plots and Hosmer–Lemeshow tests. Decision curve analysis was conducted to evaluate the models’ clinical usefulness by quantifying the net benefit at different threshold probabilities.
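The cross-validated AUC described above can be sketched in plain Python. This is an illustrative outline under stated assumptions: the folds are formed by a simple seeded shuffle without stratification, the out-of-fold predictions are pooled before a single AUC is computed, and the model is passed in as a hypothetical `fit` callable that returns a prediction function. None of these details are given in the source, which used STATA and R.

```python
import random
from typing import Callable, List, Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case scores above a random
    non-case, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def kfold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Split indices 0..n-1 into k shuffled, near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auc(
    X: Sequence,
    y: Sequence[int],
    fit: Callable[[list, list], Callable],
    k: int = 10,
    seed: int = 0,
) -> float:
    """Pool out-of-fold predictions over k folds and compute one AUC."""
    out_of_fold = [0.0] * len(y)
    for test_idx in kfold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(y)) if i not in held_out]
        # Hypothetical interface: fit returns a function mapping x to a score.
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in test_idx:
            out_of_fold[i] = predict(X[i])
    return auc(y, out_of_fold)
```

Because each patient is scored by a model that never saw that patient during fitting, the cross-validated AUC is typically lower than the apparent AUC computed on the full development data.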

Statistical analysis was conducted with STATA software (version 16) and R (version 4.2.1). A two-sided P-value < 0.05 was used to indicate statistical significance.

Results

Patient Characteristics

The entire study cohort comprised 205 patients with gastric adenocarcinoma, of whom 71 (34.6%) died within 3 years (Table 1). Approximately 81% (n = 166) of the cohort was male, and the mean age was 59.9 years (SD 10.3). Patients with adenocarcinoma made up 80.0% (n = 164) of the study population. The most common pathological stages were ypN0 (n = 81, 39.5%) and ypT4 (n = 105, 51.2%).

Table 1 Association of patient clinicopathological characteristics with death within 3 years.

Univariable Analyses

Univariable analyses showed that several patient characteristics were associated with death within 3 years, including BMI, degree of differentiation, tumor location, vessel carcinoma embolus, pathological short diameter, pathological long diameter, ypN stage, and ypT stage (all P < 0.05) (Table 1).

Seven candidate radiomic features were used for selection into the final models, of which three were significantly associated with death within 3 years (Table 2). After cross-validation, three radiomic features (score_after, after_Sphericity, after_long) had an AUC value that was significantly higher than that of ypT stage (all P < 0.05). The individual cross-validated AUC statistic ranged from 0.534 to 0.670.

Table 2 Association of candidate radiomic features with death within 3 years.

Model Performance

Table 3 presents the models’ AUC values. The non-cross-validated AUCs were 0.597 (95% CI 0.525–0.669), 0.654 (95% CI 0.579–0.729), 0.823 (95% CI 0.760–0.885), and 0.833 (95% CI 0.777–0.890) for the T, RS, T+, and RS+ models, respectively. The cross-validated AUC of the RS model was significantly higher than that of the T model (0.598 versus 0.516, P = 0.009). The cross-validated AUC of the RS+ model was significantly higher than that of the T model (0.769 versus 0.516, P < 0.001), the RS model (0.769 versus 0.598, P < 0.001), and the T+ model (0.769 versus 0.738, P = 0.023).

Table 3 Model AUCs and their comparisons

The calibration plot (Figure 1A) and Hosmer‒Lemeshow test (P = 0.92) indicated that the RS+ model performed well in terms of calibration when not subjected to cross-validation. After cross-validation, the calibration plot (Figure 1B) revealed a discrepancy between the observed and expected outcomes. The Hosmer‒Lemeshow test was statistically significant (P = 0.026), indicating a departure from the ideal fit between the predicted and observed outcomes. In the decision curve plot (Figure 2), the T+ and RS+ models had comparable net benefits greater than the T and RS models at all threshold probabilities.

Fig. 1 Calibration curves of the RS+ model in the development cohort (A) and after cross-validation (B).

Fig. 2 Decision curve analysis for all models.
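For readers unfamiliar with decision curve analysis, the net benefit plotted on such a curve can be computed as in the sketch below. It uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names are hypothetical, and the code is an illustration rather than the analysis pipeline used in the study.

```python
from typing import Sequence


def net_benefit(
    y_true: Sequence[int], probs: Sequence[float], threshold: float
) -> float:
    """Net benefit of acting on patients whose predicted risk is at or
    above the threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every patient as high risk."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is always 0.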

Discussion

In the current study, we built a radiomic model predicting prognosis of patients with gastric cancer after neoadjuvant chemotherapy. The final model included both pre-treatment and post-treatment radiomic signatures, along with clinicopathological characteristics. The model demonstrated excellent prediction accuracy in the development set, and its accuracy remained acceptable after cross-validation.

Compared with patients who undergo surgery alone, the prognosis of patients who receive neoadjuvant chemotherapy is influenced by more factors, including the baseline tumor stage, the response to chemotherapy, and surgical radicality. It is therefore more difficult to evaluate the prognosis of patients with gastric cancer after neoadjuvant chemotherapy. From the perspective of pathological staging alone, the reported AUC of pTNM staging for patients undergoing direct surgery was 0.719,12 while the AUC of ypTNM staging in patients after neoadjuvant chemotherapy was 0.657,13 which illustrates the difficulty of prognostic evaluation after neoadjuvant chemotherapy and suggests that the existing standards fall short of clinical needs. Past research on gastric cancer radiomics has mainly focused on the prediction of peritoneal metastasis, lymph node metastasis, and pathological staging. In terms of prognosis, the only available study reported an AUC of 0.860 in the training set for a model consisting of radiomic and clinicopathological characteristics.14 For the efficacy of neoadjuvant chemotherapy, previous studies have used pre-treatment radiomics to predict patient TRG, with an AUC of 0.736 in the training set and 0.679 in the validation set.8 Other studies have predicted pathological ypT0–1 stage from baseline radiomics and clinicopathological characteristics, with an AUC of 0.763 in the training set and 0.744 in the validation set.9 A study predicting tumor downstaging from baseline or post-treatment CT scans yielded AUCs between 0.750 and 0.966.15 Our study is the first to use radiomics and clinicopathological information to predict the survival of patients after neoadjuvant chemotherapy, and it reached AUCs comparable to those of previous studies (0.769–0.833).

The prognosis of patients after neoadjuvant chemotherapy depends on two aspects: the pathological stage of the patient before chemotherapy and the tumor’s response to chemotherapy. In the current analysis, we included the radiomic information of patients at baseline and after chemotherapy. We also achieved delineation and calculation of the tumor long diameter through radiomic artificial intelligence. On the basis of the cross-validation results, it was found that three radiomic features, namely score_after, after_Sphericity, and after_long, had significantly higher AUC values than the ypT stage. The score_after variable represents the radiomic signature constructed using post-chemotherapy features. The after_Sphericity variable provides information on the 2D shape sphericity of the post-chemotherapy primary lesion. Finally, our after_long indicator measures the size of irregular lesions through radiomics.
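As a point of reference for after_Sphericity, the sketch below computes 2D sphericity from a lesion's area and perimeter. It assumes the feature follows the pyradiomics 2D shape definition, 2√(πA)/P, which is the ratio of the perimeter of a circle with the same area as the lesion to the lesion's actual perimeter. pyradiomics derives the area and perimeter from a mesh of the mask, whereas this illustration takes them as inputs, and the function name is hypothetical.

```python
import math


def sphericity_2d(area: float, perimeter: float) -> float:
    """2D sphericity: 2 * sqrt(pi * area) / perimeter.

    Equals 1 for a perfect circle and decreases toward 0 as the outline
    becomes more irregular or elongated.
    """
    if area <= 0 or perimeter <= 0:
        raise ValueError("area and perimeter must be positive")
    return 2.0 * math.sqrt(math.pi * area) / perimeter
```

A circular cross-section gives a value of 1, while a square gives about 0.89, so lower values indicate a less round lesion outline on the delineated slice.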

In addition to the acceptable-to-excellent prediction accuracy of the RS+ model, another important clinical implication of the current study is that it provides a new approach for evaluating chemotherapy efficacy in patients with gastric cancer undergoing neoadjuvant chemotherapy. At present, the RECIST standard is used in clinical practice to evaluate the effect of neoadjuvant chemotherapy for gastric cancer and is also regarded as the gold standard for solid tumors.16 However, its application in gastric cancer has limitations. The stomach is a hollow organ, and its primary lesion cannot be evaluated as a target lesion. Studies have suggested that the RECIST evaluation result is not an independent prognostic factor for patients with gastric cancer after neoadjuvant chemotherapy.17,18 Four of the radiomic features in the current study were independently associated with patient prognosis. These features may provide a more accurate evaluation of neoadjuvant chemotherapy efficacy for patients with gastric cancer. For patients with a good predicted prognosis, surgery may be considered, while for patients with a poor predicted prognosis, it may be more appropriate to consider increasing the number of chemotherapy cycles or adjusting the chemotherapy regimen to achieve more precise treatment. At the same time, the lack of target lesions that limits the RECIST standard was avoided because the region of interest (ROI) of the radiomics mainly focused on the primary lesion.

Our study has several limitations. A major one is the choice of ROI. We delineated only the CT slice showing the most evident part of the tumor and did not delineate the lesion layer by layer. Because of the heterogeneity of gastric cancer, layer-by-layer delineation might improve the prediction accuracy of the model, but it would also reduce the feasibility of radiomics because it requires manual delineation. In addition, the ROI covered only the primary tumor and did not involve the lymph nodes. We did not delineate the lymph node area for two reasons. On the one hand, clinical imaging has limited accuracy in judging whether there is lymph node metastasis, generally approximately 64–66%.19,20,21 With radiomics, the AUC for predicting lymph node metastasis can be increased to 0.9319 in the training set and 0.8546 in the validation set. For patients after neoadjuvant chemotherapy, however, whether there was lymph node metastasis before chemotherapy cannot be pathologically verified, and lymph node metastasis after chemotherapy is more difficult to judge because of the chemotherapy. On the other hand, some studies have suggested that the imaging characteristics of the primary tumor could indicate whether the lymph nodes have metastases.22,23,24

Other limitations include the following. We used a retrospective dataset from a single center to develop the models, and the sample size was not very large. This may limit the generalizability of the results. Due to data availability and accuracy issues, we were unable to analyze important prognostic factors, such as clinical T stage before neoadjuvant chemotherapy and TRG. The use of the model requires researchers to manually delineate the ROI on the CT scan. To simplify the model, we dichotomized patient survival into alive or dead within 3 years, which caused some information on the length of survival time to be lost. The clinical meaning of some extracted radiomic features can be elusive and inexplicable. The associations of these features with patient clinical manifestations should be further investigated. The AUCs of the RS+ model were high, but the calibration after cross-validation was poor. The model’s robustness needs to be further validated in future research.

In summary, we developed and validated a pre- and post-treatment double-sequential-point dynamic radiomic prediction model on the basis of CT phenotypes and clinical characteristics for the prediction of prognosis among patients with gastric cancer after neoadjuvant chemotherapy. The model had good accuracy and could be used as a decision aid tool in clinical practice to differentiate patient prognosis.