Introduction

Gastric cancer (GC) remains prevalent and ranks fifth in incidence and fourth in mortality worldwide [1]. One of the most widely used histopathological classifications of gastric adenocarcinoma is Lauren’s classification, which divides lesions into intestinal, diffuse, and mixed types [2]. Tumors with different histological types have distinct molecular features and clinical behaviors [3,4,5]. Studies have shown that patients with diffuse-type GC have a less favorable prognosis and a higher postoperative recurrence rate than patients with intestinal-type GC [6]. More importantly, recent studies have revealed that intestinal-type GC may show better overall survival and progression-free survival after neoadjuvant chemotherapy [7, 8]. Another study showed that diffuse-type GC requires a more extensive resection than intestinal-type GC because it often invades widely, and that patients with diffuse-type GC typically require chemotherapy after surgery [9].

Thus, accurate identification of intestinal-type adenocarcinomas may improve the prognostic system and facilitate more personalized treatment.

Lauren GC type is determined by pathological analysis of postoperative GC samples, so the diagnosis arrives too late to guide the choice of a treatment plan before surgery. Preoperative biopsy has a limited perspective and cannot evaluate invasion outside the gastric wall; in addition, few tissue specimens can be obtained, which greatly influences the diagnostic accuracy [10]. Therefore, there is an urgent need for a more effective preoperative method to differentiate between Lauren GC subtypes.

Radiomic features, which are quantitative characteristics of GC derived from medical images, have been extensively studied. They have been correlated with distant metastasis [11], therapeutic responses [12], and prognoses [13, 14], as well as with the differentiation of Lauren GC subtype. Most studies have used CT-based radiomics to differentiate Lauren GC subtype; however, these were limited to 2D ROIs from the largest tumor areas in single-phase images (e.g., venous phase [VP]). Previous studies have shown that 2D ROIs may not sufficiently represent the heterogeneity of tumors [15,16,17,18]. Meanwhile, Wang et al. [19] found that a nomogram combining multi-phase images (arterial phase [AP], VP, and delayed phase) could better distinguish intestinal-type GC from diffuse- or mixed-type GC; however, only gastric adenocarcinomas were included in their study. To the best of our knowledge, no study has combined clinical features with radiomic features extracted from the whole tumor volume in multi-phase images to predict Lauren GC subtype.

In this study, we constructed and compared four predictive models to preoperatively differentiate between intestinal- and diffuse-type GC, namely clinical, AP radiomics, VP radiomics, and combined models. We also developed and validated a radiomics nomogram combining multi-phase radiomic features and clinical features for the non-invasive preoperative prediction of Lauren GC subtype.

Materials and methods

Patients

This retrospective study was approved by the Institutional Review Board, which waived the requirement for written informed consent. A total of 95 consecutive patients (66 male and 29 female) treated between July 2014 and October 2018 were included in this study (mean age, 60.0 ± 10.5 years; age range, 27–84 years). The inclusion criteria were as follows: (1) pathologically confirmed GC patients who underwent surgery, (2) patients with abdominal contrast-enhanced CT examinations 1 week prior to surgery, and (3) Lauren intestinal- or diffuse-type GC confirmed by postoperative pathology. The exclusion criteria were as follows: (1) patients who had received preoperative chemotherapy, radiation therapy, or both; (2) patients with concurrent abdominal tumors; (3) patients with missing clinical information; and (4) patients with Lauren mixed-type GC. To determine the Lauren classification, the pathologists stained the whole excised specimen with hematoxylin and eosin following surgery. The following clinical characteristics were collected from the electronic medical charts: age, sex, tumor location, tumor thickness at the thickest point, CT_T stage, CT_N stage, CT_M stage, and clinical stage. CT_T stage, CT_N stage, CT_M stage, and clinical stage were determined according to the American Joint Committee on Cancer-International Union Against Cancer (seventh edition). All enrolled patients were randomly allocated to either the training cohort (n = 66) or the test cohort (n = 29) at a ratio of 7:3.
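The random 7:3 allocation can be illustrated with a minimal sketch. The function name, the fixed seed, and the choice to round the test cohort up are assumptions made for illustration; the text does not state how the split was implemented or whether it was stratified.

```python
import math
import random


def split_cohort(patient_ids, test_fraction=0.3, seed=0):
    """Randomly split patient IDs into (training, test) cohorts."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed is an assumption, used only for reproducibility
    rng.shuffle(ids)
    n_test = math.ceil(test_fraction * len(ids))  # test cohort size is rounded up
    return ids[n_test:], ids[:n_test]


# 95 patients at a 7:3 ratio give cohorts of 66 and 29 patients.
train_ids, test_ids = split_cohort(range(95))
print(len(train_ids), len(test_ids))
```

With 95 patients, rounding 28.5 up to 29 reproduces the cohort sizes reported above.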

CT examination

All patients were asked to fast for 8 h, drink 800–1000 ml of water, and practice holding their breath prior to CT examination. All CT examinations were performed using a 64-slice multi-detector spiral CT system (SOMATOM Definition AS+, Siemens or Light Speed-XT, GE Medical System). Patients were administered scopolamine hydrochloride intramuscularly before the examination to reduce gastrointestinal motility artifacts. A dose of 1.5 ml/kg of ioversol contrast agent (320–370 mg/ml) was injected via an automatic high-pressure syringe at an injection speed of 3.0 ml/s. AP and VP images were acquired with 30–35-s and 70–75-s delays after the injection of the contrast material, respectively. Patients were placed in the supine position with a scanning range that included the entire abdominal region. All images were acquired with a tube voltage of 120 kV, a slice thickness and spacing of 5 mm, and automatic tube current modulation.

Tumor segmentation

We used 3D-Slicer software (http://www.slicer.org) to segment the tumor on both the AP and VP images. Two radiologists (radiologist A with 10 years of experience and radiologist B with 6 years of experience) delineated the total tumor volume. As shown in Fig. 1, the tumor was segmented with the tumor edge removed, and intraluminal fluid and gas were carefully excluded. During segmentation, the coronal and sagittal planes were used as references. After radiologist A had defined the tumor ROIs for all 95 patients, radiologist B performed segmentation on 30 randomly selected patients. During determination of tumor ROIs, both radiologists were blinded to the clinical information and pathologic results, with the exception of surgically proven lesion locations.

Fig. 1 Workflow of the radiomics process

Radiomics feature extraction

Using a spline interpolation algorithm, all CT images were resampled to the same voxel size (1 mm × 1 mm × 1 mm), regardless of the scanner from which they were acquired. We then extracted radiomic features using the PyRadiomics software (https://pyradiomics.readthedocs.io/). For each patient, 1211 radiomic features were extracted from the AP and VP CT images. The radiomic features were classified into seven groups: shape features, first-order features, gray-level co-occurrence matrix features, gray-level dependence matrix features, gray-level run length matrix features, gray-level size zone matrix features, and neighborhood gray-tone difference matrix features. Features were extracted from three types of images, namely original, Laplacian of Gaussian (LoG), and wavelet images. Wavelet images are the decompositions obtained by wavelet filtering: a high-pass (H) or low-pass (L) filter can be applied along each of the three dimensions, resulting in eight possibilities: LHL, HHL, HLL, HHH, HLH, LHH, LLH, and LLL. LoG images were generated by applying the LoG filter with a sequence of sigma values; the sigma values used in this study were 2, 3, and 5.
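The image types described above can be enumerated in a short sketch. The settings dictionary mirrors the key names used in PyRadiomics parameter files as we understand them (`resampledPixelSpacing`, `interpolator`, `imageType`); it is an assumed configuration for illustration, not the file used in this study, and the sketch does not call PyRadiomics itself.

```python
from itertools import product

# One high-pass (H) or low-pass (L) choice per spatial axis: 2**3 = 8 decompositions.
wavelet_bands = ["".join(band) for band in product("LH", repeat=3)]

# Hypothetical extraction settings (assumed key names and values, for illustration only).
extraction_settings = {
    "setting": {
        "resampledPixelSpacing": [1, 1, 1],  # 1 mm x 1 mm x 1 mm voxels
        "interpolator": "sitkBSpline",       # spline interpolation (assumed identifier)
    },
    "imageType": {
        "Original": {},
        "LoG": {"sigma": [2.0, 3.0, 5.0]},   # sigma values reported in the text
        "Wavelet": {},
    },
}

# Original image + one LoG image per sigma + eight wavelet decompositions.
n_image_types = 1 + len(extraction_settings["imageType"]["LoG"]["sigma"]) + len(wavelet_bands)

print(wavelet_bands)
print(n_image_types)
```

Under these assumptions, texture and first-order features are computed on 12 derived images per phase, whereas shape features depend only on the segmentation mask.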

Feature selection and predictive model building

In this study, dimensionality reduction of radiomic features was performed in three steps. First, to evaluate the reproducibility of the features, the intra-class correlation coefficient (ICC) was calculated from the re-segmentation data, and features with ICC values > 0.75 were retained as stable features. Second, features significantly associated with Lauren GC subtype were selected using analysis of variance (ANOVA). Third, least absolute shrinkage and selection operator (LASSO) regression was performed in the training cohort, and the features with non-zero coefficients were retained for Lauren type classification.
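The reproducibility filter in the first step can be sketched as follows. The text does not state which ICC form was used; this sketch assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), and the feature names and values are hypothetical.

```python
def icc2_1(ratings):
    """ICC(2,1) for a table with one row per patient and one column per reader."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-patient mean square
    msc = ss_cols / (k - 1)              # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def stable_features(feature_tables, threshold=0.75):
    """Keep the names of features whose ICC exceeds the threshold."""
    return [name for name, table in feature_tables.items() if icc2_1(table) > threshold]


# Hypothetical feature values from two readers (columns) for three patients (rows).
example = {
    "feature_reproducible": [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]],
    "feature_unstable": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0]],
}
print(stable_features(example))
```

Only features passing this filter would proceed to the ANOVA and LASSO steps.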

Four prediction models were constructed after feature selection, namely the clinical model, AP radiomics model, VP radiomics model, and combined model. The RAD-score was calculated for each patient using the AP and VP radiomics models, which were based on LASSO regression; in each model, the selected features were weighted by their coefficients. Radiomic signature development and feature selection were performed in the training cohort.
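A RAD-score of this kind is a linear combination of the selected features. The sketch below assumes an optional intercept term; the feature names and coefficient values are hypothetical and are not the coefficients fitted in this study.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Weighted sum of the selected features; features without a coefficient are ignored."""
    return intercept + sum(coefficients[name] * features[name] for name in coefficients)


# Hypothetical non-zero LASSO coefficients and one patient's feature values.
coefficients = {"wavelet_LLH_glcm_Contrast": 0.5, "log_sigma_3_firstorder_Mean": 0.25}
patient_features = {
    "wavelet_LLH_glcm_Contrast": 2.0,
    "log_sigma_3_firstorder_Mean": -1.0,
    "original_shape_Sphericity": 9.0,  # dropped by LASSO in this example, so it has no weight
}
score = rad_score(patient_features, coefficients, intercept=0.1)
print(score)
```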

Statistical analysis

R software (version 3.5.1; http://www.R-project.org) and Python (version 3.7.12) were used for all statistical analyses. A p value < 0.05 indicated a statistically significant difference. For categorical variables (age, sex, and tumor location), Chi-square tests were used, whereas for radiomic features, Mann–Whitney U tests were used. Four classification models were created using the Scikit-Learn Python package, and their performance was evaluated using receiver-operating characteristic (ROC) curves. For each of the four prediction models, area under the ROC curve (AUC) values were computed for the training and test cohorts. DeLong tests were used to compare the AUC values between pairs of models. As part of the validation process, a calibration curve was used to verify that the prediction results of the nomogram corresponded with the actual clinical findings, and a decision curve was used to test the value of the nomogram in clinical practice.
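For reference, the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The following minimal sketch computes it by direct pairwise comparison; the label coding (1 for the positive class) and the example scores are illustrative assumptions.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores: three of the four positive/negative pairs are ordered correctly.
print(auc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

This pairwise form is quadratic in cohort size, which is not a concern for cohorts of the size used here.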

Results

Patient characteristics

The baseline clinical characteristics of patients in both the training and test cohorts are summarized in Table 1. Patients with diffuse-type GC comprised 45.45% (30/66) and 44.83% (13/29) of the training and test cohorts, respectively, and patients with intestinal-type GC comprised 54.55% (36/66) and 55.17% (16/29), respectively. The training and test cohorts were balanced in terms of age (p = 0.737), sex (p = 0.370), tumor location (p = 0.391–0.893), tumor size (p = 0.799), CT TNM stage (p = 0.299–0.920), and clinical stage (p = 0.539) (Table 1).

Table 1 Clinical characteristics between intestinal type and diffuse type in the training and test cohorts

Construction and validation of the radiomic signature models

After exclusion of non-reproducible and redundant features, three AP and eight VP radiomic features remained. AP and VP radiomic signatures were constructed and compared in the training and test cohorts based on the two feature sets. The VP radiomics model had better performance, with an AUC of 0.832 (95% CI 0.735, 0.929) in the training cohort and 0.760 (95% CI 0.580, 0.940) in the test cohort. The AP radiomics model showed an AUC of 0.701 (95% CI 0.575, 0.827) in the training cohort and 0.553 (95% CI 0.327, 0.779) in the test cohort (Table 2 and Fig. 2).

Table 2 Predictive performances of each model on the training and test cohorts
Fig. 2 ROC curves in the training cohort (a) and test cohort (b). The performance of the combined model in predicting Lauren diffuse-type GC was the best among four prediction models, with an AUC of 0.849 in the training cohort and 0.793 in the test cohort. ROC, receiver-operating characteristic

Development of individualized radiomics nomogram

Compared to the clinical model or radiomic signature models, the combined model performed better, generating an AUC of 0.849 (95% CI 0.758, 0.940) in the training cohort and 0.793 (95% CI 0.629, 0.957) in the test cohort. The radiomics nomogram of the combined model was constructed by combining a single clinical factor (age) and the radiomic signatures (Fig. 3). The calibration curve showed good agreement between the nomogram predictions and the actual clinical outcomes (Additional Fig. 1). Decision curve analysis showed that using the nomogram to identify Lauren GC type was beneficial when the threshold probability was within 0.0–1.0 in the test cohort (Additional Fig. 2).
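A decision curve plots net benefit against the threshold probability. The sketch below uses the standard net-benefit definition, true-positive rate minus false-positive rate weighted by the odds of the threshold; the predicted probabilities and labels are hypothetical, and the formula is undefined at a threshold of exactly 1.0.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted probability is >= threshold."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)  # requires threshold < 1


def net_benefit_treat_all(labels, threshold):
    """Reference strategy that treats every patient regardless of the prediction."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical nomogram probabilities for four patients.
probs, labels = [0.9, 0.7, 0.4, 0.2], [1, 0, 1, 0]
nb_model = net_benefit(probs, labels, 0.3)
nb_all = net_benefit_treat_all(labels, 0.3)
print(nb_model, nb_all)
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all reference and zero (the treat-none reference).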

Fig. 3 Nomogram with visualization and interpretability, indicating that gastric cancer patients with younger age and greater “VP rad-scores” (extracted from venous-phase images) are more likely to be diagnosed with Lauren diffuse-type GC

Performance comparison of the different predictive models

The DeLong test indicated that the clinical model had significantly lower predictive performance than the VP radiomics model (training cohort, AUC = 0.622 vs. 0.832, p = 0.003) and the combined model (training cohort, AUC = 0.622 vs. 0.849, p < 0.001) (Additional Table 1), and the same results were observed in the test cohort (AUC = 0.490 vs. 0.760, p = 0.036; AUC = 0.490 vs. 0.793, p = 0.016) (Additional Table 2). A significant difference in predictive performance was also observed between the AP radiomics and combined models in both cohorts (training cohort: AUC = 0.701 vs. 0.849, p = 0.037; test cohort: AUC = 0.553 vs. 0.793, p = 0.044). The VP radiomics model showed a trend toward better performance than the AP radiomics model, but the difference did not reach statistical significance (training cohort, p = 0.074; test cohort, p = 0.077). Compared to the VP radiomics model, the combined model achieved similar AUCs and demonstrated improvements in both accuracy and sensitivity.
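The paired comparison of two AUCs can be sketched with the structural-components form of the DeLong test. This is an illustrative implementation under the assumption that both models score the same patients, not the routine used for the analyses above; the example scores are hypothetical.

```python
import math


def _psi(x, y):
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _components(pos, neg):
    """Structural components: one value per positive case and one per negative case."""
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(scores_a, scores_b, labels):
    """Return (AUC of A, AUC of B, two-sided p value) for two models on the same cases."""
    pos_idx = [i for i, y in enumerate(labels) if y == 1]
    neg_idx = [i for i, y in enumerate(labels) if y == 0]
    v10_a, v01_a = _components([scores_a[i] for i in pos_idx], [scores_a[i] for i in neg_idx])
    v10_b, v01_b = _components([scores_b[i] for i in pos_idx], [scores_b[i] for i in neg_idx])
    auc_a, auc_b = sum(v10_a) / len(v10_a), sum(v10_b) / len(v10_b)
    # Variance of the AUC difference from the positive-case and negative-case components.
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / len(pos_idx)
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / len(neg_idx))
    if var <= 0:
        return auc_a, auc_b, float("nan")  # test statistic is undefined without variance
    z = (auc_a - auc_b) / math.sqrt(var)
    return auc_a, auc_b, math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Hypothetical scores from two models for four positive and four negative cases.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.35, 0.6, 0.3, 0.2, 0.1]
model_b = [0.6, 0.4, 0.7, 0.2, 0.5, 0.3, 0.65, 0.1]
auc_a, auc_b, p_value = delong_test(model_a, model_b, labels)
print(auc_a, auc_b, p_value)
```

With small cohorts the variance estimate is itself imprecise, so a p value near the 0.05 cutoff should be read with caution.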

Discussion

In this study, we developed a combined nomogram incorporating AP and VP images and clinical risk factors for predicting pathological Lauren GC subtype preoperatively. The results showed that the VP radiomics model performed better than the clinical and AP radiomics models. The combined model showed the best overall performance in the distinction between intestinal- and diffuse-type GC, with an AUC of 0.849 (95% CI 0.758, 0.940) in the training cohort and 0.793 (95% CI 0.629, 0.957) in the test cohort.

Previous quantitative magnetic resonance imaging (MRI) techniques, including diffusion-weighted imaging (DWI), dynamic contrast-enhanced MRI (DCE-MRI), and diffusion kurtosis imaging (DKI), have proven their value in predicting Lauren GC subtype. Ma et al. demonstrated that the Ve and K-trans values extracted from DCE-MRI scans in diffuse-type GC were higher than those in intestinal-type GC scans [20]. Karaman et al. [21] found that the diffusion kurtosis coefficient (K value) from DKI could differentiate diffuse-type GC from intestinal or mixed-type GC with an AUC of 0.737, which was higher than the apparent diffusion coefficient (0.649) and corrected diffusion coefficient (0.572). In another study [22], a fractional-order calculus diffusion model (based on DWI images) parameter μ (a microstructural quantity) produced the best performance (AUC = 0.739; 95% CI 0.588, 0.889) in the assessment of Lauren subtype. The combinations of μ, D (diffusion coefficient), and β (intravoxel diffusion heterogeneity) produced the best performance, with an AUC of 0.793 (95% CI 0.657, 0.929). However, all these quantitative parameters or ROIs were measured on the most significant slice of the tumor, which could not completely represent the entire tumor or eliminate possible sampling inconsistencies. In addition, these studies may not be applicable in clinical settings because of their low sample numbers and lack of model construction. This study quantified and compared four prediction models and constructed a visual and interpretable nomogram. This graphical calculation device can be used for calculations by simply drawing several lines. In addition, the predictive accuracy (75.8%) of the combined model in our study achieved equivalent or higher efficiency values than those in the above studies, and it was also higher than that in preoperative gastroscopic biopsy (64.7%) [23].

An interesting observation of this study was that the predictive performance of the VP radiomics model was better than that of the AP radiomics model. Although the DeLong test did not show a significant difference, there was a noticeable trend, and the difference between the models may not have reached significance because of the small sample size. Previous studies have shown that CT enhancement patterns differ among histological types of gastric cancer [24, 25]; these differences may be due to the abundant neovascularization in the immature fibrous stroma [26]. Meanwhile, micro-vessel density can reflect the abundance of intratumoral neovascularization. Chen et al. [27] used the iodine concentration (IC) value, which represents micro-vessel density, to identify Lauren types of GC, and their results also showed that the IC value in the VP was better than that in the AP at showing the differences between Lauren types. These findings might explain the differences in diagnostic performance between the arterial and portal venous phases. Wang’s study demonstrated that the VP radiomics model was significantly superior to the AP radiomics model in discriminating Lauren subtype (AUC 0.815 compared to 0.754), which is consistent with the trend observed in the current study [19]. Unfortunately, only patients with gastric adenocarcinomas were included in that study. According to our results, the combined model performed slightly better than the VP radiomics model; the DeLong test showed no significant difference, but the combined model had enhanced accuracy and sensitivity. We speculate that the rapid enhancement captured in AP images may be responsible for the improved accuracy and sensitivity.

Nomograms for the prediction of Lauren GC subtype in some studies have also included clinical factors and radiomic signatures. For example, Wang et al. [28] demonstrated that a combined model constructed with age, CT_T stage, CT_N stage, and radiomic signatures showed the best performance compared to other models, with a training AUC of 0.745 and validation AUC of 0.758. The efficacy of this nomogram was slightly lower than that in a previous study [29]. Sun et al. [29] showed higher predictive performance (training cohort AUC, 0.846; test cohort AUC, 0.864) with their nomogram model than with clinical and radiomic signature models in predicting Lauren GC subtype, and these results are similar to the results of this study. However, these studies only used 2D ROIs of the tumors to extract radiomic features, whereas previous studies have shown that radiomic features extracted from 3D ROIs of whole tumors could better represent the heterogeneity of tumors, which may lead to better performance than a single slice [15,16,17,18]. In this study, the whole volume of the tumor was segmented by drawing a 3D region along the tumor edge, which is the preferred method [15]. In addition, Zhao et al. demonstrated that 3D features are more reproducible than 2D features in different imaging settings [30].

This study has several limitations. First, because this was a single-center study, the performance of the radiomics nomogram needs to be confirmed through external validation in a larger cohort. Second, only surgical cases were included in this study; a validation study that includes other cases, such as patients who underwent chemotherapy, would strengthen the results. In addition, Choi et al. [31] showed that patients with mixed-type GC had the same survival outcomes as those with diffuse-type GC, so the exclusion of patients with mixed-type GC may have introduced statistical bias. Finally, the discrepancies between the surgical biopsy specimens and the definitive results were not compared, and we did not compare the performance of the radiomics model with that of gastroscopic biopsy.

In conclusion, the nomogram developed in this study, which integrates radiomic features of the gastric tumor and clinical characteristics, is easy to use, visual, and interpretable, and it was effective in predicting Lauren GC subtype before surgery.