Introduction

Although its incidence has declined significantly over the past decades, gastric cancer is still the third leading cause of cancer-related death worldwide [1]. Lymph node (LN) metastasis is one of the main prognostic factors for gastric cancer, and it plays a pivotal role in the selection of appropriate candidates for neoadjuvant chemotherapy (NAC) [2,3,4,5]. According to the newly published Japanese Gastric Cancer Guidelines (Ver5), NAC is recommended for locally advanced cases only in the presence of bulky LN metastasis [5, 6]. A prospective multi-institutional validity study carried out by the Japan Clinical Oncology Group proposed adding LN metastasis as a criterion for selecting patients to receive NAC [2]. Thus, accurate prediction of LN metastasis in gastric cancer is crucial for clinical decision-making and the improvement of prognosis. Traditional methods for determining nodal status using imaging tools are mainly based on the size of the LNs. However, reactive or inflammatory LNs can be enlarged, whereas normal-sized or minimally enlarged nodes account for a considerable proportion of malignant nodes. Hence, a proportion of patients are at high risk for inaccurate clinical nodal staging [7,8,9]. Computed tomography (CT) is the most commonly used imaging modality for the preoperative assessment of LN status, but its reported accuracy is unsatisfactory, at only approximately 60% [7, 10].

A noninvasive technique is therefore needed to supplement the routinely used imaging tools. Previous studies reported that tumors with poor differentiation, vascular invasion, and epidermal growth factor receptor overexpression were more prone to LN metastasis [11,12,13]. An alternative strategy for evaluating LN status could therefore rely on the comprehensive characteristics of the primary lesion. However, most of those features can only be obtained from postoperative pathological examination. Radiomics is an emerging field that offers a noninvasive approach to extracting quantitative features from medical images [14,15,16], and it has exhibited great potential in oncological practice, including differential diagnosis, prediction of histological classification, LN metastasis, therapeutic response, and prognosis [17,18,19,20,21,22,23,24,25,26]. Currently, radiomics studies on the prediction of LN metastasis in gastric cancer are rare, and most of them are MRI-based [20, 21]. The purpose of this study was to investigate the role of radiomics derived from standard-of-care CT images in the prediction of LN metastasis in gastric cancer.

Materials and methods

Patients

This retrospective study was approved by the institutional review board with a waiver of written informed consent. A total of 247 patients (167 males and 80 females; mean age, 60.7 ± 11.2 years) with pathologically proven gastric cancer were consecutively enrolled at our institution between June 2014 and November 2018. The inclusion criteria were pathologically confirmed gastric cancer with a definite LN stage, no NAC before surgery, and a preoperative enhanced abdominal CT examination within 2 weeks prior to the operation. The exclusion criteria were as follows: NAC or radiotherapy before surgery, image quality insufficient for diagnosis due to artifacts or poor distention, lack of definite information on postoperative LN status, and small lesions that were hardly visible on CT images. Demographic data, clinical information, and pathological node stage were derived from medical records. Although distant LN metastasis should be classified as M1 [27], these patients were also included and labeled as positive for LN metastasis, as the study mainly focused on predicting the presence of LN metastasis. We hypothesized that the radiomics model could improve the accuracy of LN metastasis prediction to 82.3% [20], compared with the reported accuracy of 64% for routine CT [10]; the sample size estimation formula indicated that 73 patients would provide 95% confidence and 80% power to detect the expected superiority of radiomics [28]. Because the primary purpose of the study was to construct a radiomics model for the prediction of LN metastasis from retrospectively collected CT images, and because a large population is recommended in radiomics studies to give the predictive classifier more power [29], all eligible subjects were retained.

CT examination and CT-reported LN status

The CT examinations were performed on three scanners: two 64-channel CT scanners (Discovery CT750, GE Medical Systems; IQon-Spectral, Philips) and one dual-source CT scanner (SOMATOM Definition Flash, Siemens). Patients were required to fast for no less than 6 h. Before the CT scan, all patients drank approximately 1000 ml of water to distend the stomach. In addition, 20 mg of anisodamine was injected intravenously to reduce gastric peristalsis. Patients were required to hold their breath while the examination was performed. Following the unenhanced scan, patients were infused with 1.5 ml/kg of nonionic contrast material (Ultravist 370, Bayer Schering Pharma) at a rate of 3.0 ml/s into the antecubital vein using an automatic pump injector. Arterial phase (AP), portal phase, and delay phase contrast-enhanced images were acquired at 25–30-s, 65–70-s, and 110–120-s delay after the injection of contrast material, respectively. The other acquisition parameters were as follows: tube voltage of 80 to 120 kVp; tube current of 120 to 640 mA; detector collimation of 0.625 or 0.6 mm; image matrix of 512 × 512; and reconstruction slice thickness of 0.625 mm or 1 mm. Two experienced abdominal radiologists, blinded to the pathological information, reviewed the CT images in consensus to evaluate the presence of LN metastasis. Any disagreement was resolved by consultation. LN metastasis was reported if the short-axis diameter was larger than 6 mm for perigastric LNs and larger than 8 mm for extraperigastric LNs [7].
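The size criterion described above can be written as a simple rule. The following Python sketch is illustrative only: the thresholds come from the text, whereas the function name and the input format (a list of short-axis diameter and perigastric-flag pairs) are assumptions and not part of the study's workflow.

```python
# Thresholds from the text: a node is suspicious if its short-axis diameter is
# larger than 6 mm (perigastric) or larger than 8 mm (extraperigastric).
# The helper below and its input format are hypothetical.
PERIGASTRIC_MM = 6.0
EXTRAPERIGASTRIC_MM = 8.0


def ct_reported_ln_positive(nodes):
    """Return True if any node exceeds the size threshold for its location.

    nodes: iterable of (short_axis_mm, is_perigastric) tuples.
    """
    for short_axis_mm, is_perigastric in nodes:
        limit = PERIGASTRIC_MM if is_perigastric else EXTRAPERIGASTRIC_MM
        if short_axis_mm > limit:  # strictly larger, as stated in the text
            return True
    return False


# A single 5-mm perigastric node is below the threshold.
small_node_case = ct_reported_ln_positive([(5.0, True)])
# A 13-mm perigastric node exceeds the threshold.
enlarged_node_case = ct_reported_ln_positive([(13.0, True)])
```

A purely size-based rule of this kind is what the radiomics model is later compared against.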

Tumor segmentation and feature extraction

Tumor segmentation was performed using a dedicated semi-processing prototype software “Radiomics” (syngo.via Frontier, Version 1.0.0, Siemens). One radiologist with 6 years of experience in abdominal radiology segmented the lesions of all subjects. A senior radiologist (15 years of experience in abdominal radiology) segmented 50 cases randomly selected from all samples (25 with pathologically proven LN metastasis and 25 free of LN metastasis) to evaluate the inter-operator variability. When performing the segmentation, the radiologists were informed of the surgically proven location of the tumor but were blinded to other clinical information and pathologic results. In prior studies, texture features extracted from AP images appeared to perform slightly better than those from portal phase images [17, 19]. Therefore, the current study retrieved AP enhanced images in Digital Imaging and Communications in Medicine (DICOM) format for radiomics feature extraction. The entire volume of the lesion was segmented semi-automatically. The contours were drawn carefully to avoid involving adjacent fluid or air. The top and bottom slices were also excluded to reduce bias caused by partial volume effects. A total of 844 radiomics features were extracted from the volume of interest (VOI). Four types of radiomics features were obtained: (1) 16 shape and size features, related to the three-dimensional size and shape of the tumor; (2) 18 first-order statistics, describing the distribution of voxel intensities within the selected region; (3) 74 texture-based features, describing patterns or the spatial distribution of voxel intensities, which were calculated from the gray-level co-occurrence matrix (GLCM) and gray-level run-length matrix (GLRLM); and (4) 736 wavelet features, wavelet decompositions of the first-order statistics and texture features. For the texture features, voxel intensities were resampled into equally spaced bins using a bin width of 25 HU, as discretization of the voxel intensity values within the VOI was necessary to determine the texture matrix representations. Discretization not only reduces image noise but also normalizes intensities across all patients, allowing a direct comparison of all calculated textural features between patients. The wavelet transform effectively decouples textural information by decomposing the original image into low and high frequencies, in a manner similar to Fourier analysis. Figure 1 shows the workflow of tumor segmentation and feature extraction.
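The fixed-bin-width discretization step can be illustrated with a short Python sketch. It assumes a commonly used convention in which bin indices start at 1 for the bin containing the lowest intensity in the VOI; the exact convention of the prototype software is not stated in the text, and the function name and sample HU values are hypothetical.

```python
import math


def discretize_fixed_bin_width(hu_values, bin_width=25.0):
    """Map HU values to integer gray levels using equally spaced bins.

    Assumed convention: level = floor(x / W) - floor(min(x) / W) + 1,
    so the lowest occupied bin gets gray level 1.
    """
    lowest_bin = math.floor(min(hu_values) / bin_width)
    return [math.floor(v / bin_width) - lowest_bin + 1 for v in hu_values]


# Hypothetical voxel intensities (HU) inside a VOI, not study data.
voi = [38.0, 52.5, 74.9, 75.0, 101.2, 126.0]
levels = discretize_fixed_bin_width(voi)
print(levels)  # [1, 2, 2, 3, 4, 5]
```

Because the bin width is fixed in HU rather than the number of bins, the same gray level corresponds to the same attenuation range in every patient, which is what makes texture features comparable across patients.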

Fig. 1

Workflow of tumor segmentation and feature extraction. a Axial AP enhanced CT image shows the tumor (white arrow). b, c Tumor segmentation. The edge of the lesion is drawn carefully to avoid involving adjacent fluid or air (b). The yellow area represents the selected region on the slice (c). d–g Extraction of four categories of radiomics features

Building up the radiomics model

The radiomics model was built with in-house software programmed with the Python Scikit-learn package (Python version 3.7, Scikit-learn version 0.21, http://scikit-learn.org/). All patients were first split into a training cohort and a test cohort at a ratio of 8:2 using a computer-generated random sequence [30, 31]: 197 patients (mean age, 61.0 ± 10.9 years; positive LN metastasis, 146 cases; negative LN metastasis, 51 cases) were allocated to the training cohort; 50 patients (mean age, 59.5 ± 12.6 years; positive LN metastasis, 37 cases; negative LN metastasis, 13 cases) were allocated to the test cohort. Random under-sampling, a popular data-preprocessing method in machine learning, was applied to the training cohort to handle the imbalance between LN statuses (positive vs negative, 146/51), with the purpose of avoiding bias toward the majority class and achieving a high classification rate [32]. The random forest (RF) method was used to construct the prediction model because of its favorable bias-variance trade-off. Features with an intraclass correlation coefficient (ICC) higher than 0.8 were considered stable and selected for model construction [33, 34]. Further feature selection was skipped because, with the RF method, it helps to reduce computational time more than to improve prediction accuracy [35]. RF ensembles are a series of decision trees that act as weak classifiers individually yet form a robust prediction in aggregate. The method randomly draws bootstrap samples and features to build each classification and regression tree (CART). Each time, 2/3 of the samples and a number of features equal to the square root of the total were randomly selected, with replacement, for the training process. Gini impurity was used as the splitting criterion. The final prediction was made by voting or averaging over the trees, as in the bagging method. Fivefold cross-validation was used to improve the model performance [30]. The hyperparameters, including the number of trees in the forest and the minimum number of samples per leaf node, were optimized through the cross-validation process. Finally, the best model was selected and validated on the test cohort. The predicted LN status and radiomics score for each case were recorded.
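The modeling steps described above (8:2 split, random under-sampling of the training cohort, RF with Gini impurity and square-root feature sampling, fivefold cross-validated tuning of the number of trees and minimum leaf size) can be sketched with scikit-learn as follows. This is a minimal sketch on synthetic data, not the in-house software: the stratified split, the ROC-AUC tuning metric, the candidate grid values, the random seeds, and the `random_under_sample` helper are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split


def random_under_sample(X, y, rng):
    """Hypothetical helper: keep an equally sized random subset of every class."""
    classes, counts = np.unique(y, return_counts=True)
    n_keep = counts.min()  # size of the minority class
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_keep, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)
    return X[keep], y[keep]


# Synthetic stand-in for the stable features (NOT study data); roughly 74% positives.
X, y = make_classification(n_samples=247, n_features=20, n_informative=5,
                           weights=[0.26, 0.74], random_state=0)

# 8:2 split; stratification is an assumption, the text only says "random".
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Balance the training cohort only; the test cohort is left untouched.
rng = np.random.default_rng(0)
X_bal, y_bal = random_under_sample(X_train, y_train, rng)

# Tune the two hyperparameters named in the text with fivefold cross-validation.
param_grid = {"n_estimators": [50, 100], "min_samples_leaf": [1, 5]}
search = GridSearchCV(
    RandomForestClassifier(criterion="gini", max_features="sqrt",
                           bootstrap=True, random_state=0),
    param_grid, cv=5, scoring="roc_auc")
search.fit(X_bal, y_bal)

# The predicted probability of the positive class plays the role of the radiomics score.
radiomics_score = search.predict_proba(X_test)[:, 1]
test_auc = roc_auc_score(y_test, radiomics_score)
print(search.best_params_, round(test_auc, 3))
```

Under-sampling is applied after the split so that the test cohort keeps the original class ratio and gives an unbiased estimate of performance.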

Radiomics nomogram construction

All available clinical variables, including age, gender, location of the tumor, tumor morphology, laboratory tests, CT-reported LN status, and radiomics scores, were evaluated with univariate logistic regression in the training set. Variables with p < 0.2 in the univariate analysis were entered into a multivariate logistic regression using forward stepwise selection. The likelihood ratio test based on the maximum partial likelihood estimates was applied to select the significant predictors of LN metastasis. Finally, a radiomics nomogram was constructed based on the multivariate logistic regression model incorporating the selected predictors with the R software package (version 3.5.2: http://www.Rproject.org).

Statistical analysis

Statistical analysis was performed with SPSS software (version 19), SAS (version 9.2), R software package (version 3.5.2: http://www.Rproject.org), and the Python Scikit-learn package (Python version 3.7, Scikit-learn version 0.21, http://scikit-learn.org/). Qualitative variables were presented as frequencies. Differences between qualitative variables were compared with the chi-square test. The continuous variables were tested for normality by using the Kolmogorov-Smirnov test. Normally distributed variables were shown as the mean ± SD (standard deviation). Differences between normally distributed variables were compared with the t test. Data were presented as the median and interquartile range (IQR) when the variables were nonnormally distributed, and differences between these variables were analyzed with the Mann-Whitney U test. Inter-operator variability of the radiomics features was assessed with the ICC. A violin plot, a hybrid of a box plot and a kernel density plot, was used to illustrate the distribution of the radiomics scores and their probability density in both cohorts [36]. The performance of the radiomics model and nomogram in both cohorts was evaluated with receiver operating characteristic (ROC) curves. The ROC curves of the radiomics model in the two cohorts were compared with the DeLong test to evaluate whether overfitting occurred. The calibration of the radiomics nomogram was assessed with a calibration curve. The goodness-of-fit of the nomogram in both cohorts was assessed with the Osius test, McCullagh test, and Hosmer-Lemeshow test. Decision curve analysis (DCA) was performed to determine the clinical usefulness of the radiomics model and nomogram by calculating the net benefits at different threshold probabilities in the training and test cohorts. In brief, the net benefit is equivalent to the proportion of net true positives [37].
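As an illustration of how inter-operator agreement can be quantified and used to keep stable features, the sketch below computes a two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1). The text does not state which ICC form was used, so this choice is an assumption, and the feature names and ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a list of rows; each row holds one case's value per rater.

    The ICC form is an assumption; the text does not specify it.
    """
    n = len(ratings)     # number of cases
    k = len(ratings[0])  # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


# Hypothetical feature values from two readers on four cases (not study data).
features = {
    "feature_a": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.0]],
    "feature_b": [[1.0, 3.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]],
}

# Keep features whose ICC exceeds the 0.8 cutoff mentioned in the text.
stable = [name for name, r in features.items() if icc_2_1(r) > 0.8]
print(stable)  # ['feature_a']
```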

Results

Clinical characteristics

Among the 247 eligible patients, 183 patients had LN metastasis (181 patients with regional LN metastasis and 2 patients with distant LN metastasis); the remaining 64 patients had no LN metastasis. Except for the albumin level, which was lower in patients with LN metastasis than in those without LN metastasis in the test cohort, no significant differences were found between patients with and without LN metastasis in either cohort in terms of age, gender, tumor location, tumor morphology, or the other laboratory tests. The accuracy of routine CT in the diagnosis of LN metastasis was 62.4% in the training cohort and 62% in the test cohort (p = 0.911). Details of the demographic data and clinical characteristics of the training and test cohorts were summarized in Table 1.

Table 1 Clinical characteristics in the training and test cohorts

Radiomics model building and evaluation

Altogether, 150 radiomics features were shown to be stable, including 9 shape and size features, 11 first-order features, 35 texture features, and 95 wavelet features. To take the effects of scanning parameters and patient variance into consideration, the radiomics model was built based on 5 general features (scanning kilo-voltage, tube current, slice thickness, patient age, and gender) as well as these 150 radiomics features. The radiomics scores in patients with LN metastasis were significantly higher than those in patients without LN metastasis in both the training cohort (0.67 ± 0.24 vs 0.32 ± 0.22, p < 0.001) and the test cohort (0.53 ± 0.23 vs 0.24 ± 0.19, p < 0.001). The violin plot of the radiomics scores in both cohorts was displayed in Fig. 2a. The radiomics model showed a favorable discriminatory ability in the training cohort with an area under the curve (AUC) of 0.844 (95% CI, 0.759 to 0.909), which was confirmed in the test cohort with an AUC of 0.837 (95% CI, 0.705 to 0.926) (Fig. 2b). There was no significant difference in the ROC curves between the two cohorts (DeLong test, p = 0.9002). The accuracy of the radiomics model was 80% (sensitivity, 90%; specificity, 71%) in the training cohort and 84% (sensitivity, 86%; specificity, 77%) in the test cohort. The radiomics model outperformed conventional routine CT in the prediction of LN metastasis (training cohort, 80% vs 62.4%; test cohort, 84% vs 62%).
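The relationship between accuracy, sensitivity, and specificity can be checked with a short worked example. The confusion-matrix counts below are not reported in the study; they are back-calculated assumptions chosen to be consistent with the test cohort composition (37 positive, 13 negative) and the rounded percentages given above.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)  # true-positive rate among LN-positive cases
    specificity = tn / (tn + fp)  # true-negative rate among LN-negative cases
    return accuracy, sensitivity, specificity


# ASSUMED counts for the test cohort (37 pLN+, 13 pLN-); not reported in the text.
acc, sens, spec = diagnostic_metrics(tp=32, fn=5, tn=10, fp=3)
print(f"{acc:.0%} {sens:.0%} {spec:.0%}")  # 84% 86% 77%
```

With these assumed counts, 42 of 50 test cases are classified correctly, which matches the reported test accuracy of 84%.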

Fig. 2

a Violin plot of the radiomics scores in the training cohort and test cohort. pLN −, pathological negative LN metastasis; pLN +, pathological positive LN metastasis. The difference between radiomics scores was compared with the t test. Wider sections of the plot represent a higher probability that patients of the group would take on the given value; skinnier sections represent a lower probability. The red lines represent the median value. The green lines represent the interquartile range. b ROC curves of the radiomics model in the training and test cohorts

Construction and validation of radiomics nomogram

The results of the univariate and multivariate regression analysis were summarized in Table 2. Multivariate analysis revealed that radiomics scores and CT-reported LN status were significant independent predictors of LN metastasis. A quantitative nomogram incorporating these two variables was built to predict the individual LN status (Fig. 3a). The nomogram showed good performance for discriminating LN metastasis with AUCs of 0.886 (95% CI, 0.808 to 0.941) and 0.881 (95% CI, 0.759 to 0.956) in the training and test cohorts, respectively (Fig. 3b). The Osius test (training cohort, p = 0.688; test cohort, p = 0.707), McCullagh test (training cohort, p = 0.674; test cohort, p = 0.652), and Hosmer-Lemeshow test (training cohort, p = 0.210; test cohort, p = 0.822) yielded nonsignificant p values in both cohorts, indicating an acceptable goodness-of-fit of the nomogram. Figure 4 displayed the calibration curves of the nomogram in both cohorts. The nomogram yielded an accuracy of 83% (sensitivity, 81%; specificity, 85%) in the training cohort and 84% (sensitivity, 65%; specificity, 94%) in the test cohort. Examples of using the nomogram to predict the risk of LN metastasis were presented in Fig. 5.

Table 2 Results of univariate and multivariate regression analysis
Fig. 3

a CT-based radiomics nomogram for the prediction of LN metastasis in patients with gastric cancer. b ROC curves of the radiomics nomogram in the training and test cohorts

Fig. 4

Calibration curves of the nomogram in the training cohort (a) and test cohort (b). The 45° dotted line represents a perfect prediction. The solid line represents the predictive performance of the nomogram. The solid line has a close fit to the dotted gray line, which indicates good predictive capability of the nomogram. The blue belt represents the 95% CI of the predictive performance of the nomogram

Fig. 5

Examples of using the nomogram to predict the individual risk of LN metastasis by manually placing straight lines across the diagram. Three steps are required to obtain the individual risk. First, draw a vertical line for each of the patient's variables (CT-reported LN status, blue lines; radiomics score, green lines). Then, sum the values intersected by the lines on the “Points” scale to obtain the total points. Finally, draw a vertical line (the red lines in b, d, and f) on the “Total points” scale to read the “Risk” of LN metastasis. a, b A 62-year-old female with gastric cancer (a, arrowhead). A small (diameter, 5 mm) LN (a, white arrow) was identified; all the other LNs were smaller than this one and are not displayed in the figure. The radiomics score was 0.94. Vertical lines of each variable were drawn. The values on the “Points” scale intersected by the lines were added to obtain the total points (0 + 94 = 94). Drawing a vertical line on the “Total points” scale revealed that the risk of LN metastasis was about 86%. Postoperative pathology proved positive LN metastasis. c, d A 70-year-old male with gastric cancer (c, arrowhead). An enlarged (diameter, 13 mm) LN (c, white arrow) was found. Routine CT reported positive LN metastasis. The radiomics score was 0.0615. The total points (32 + 5 = 37) were obtained by summing the values on the “Points” scale intersected by the vertical lines of each variable. Drawing a vertical line on the “Total points” scale showed that the risk of LN metastasis was about 18%. Postoperative pathology revealed that the patient had no LN metastasis. e, f A 70-year-old female with gastric cancer (e, arrowhead). An enlarged (diameter, 12 mm) LN (e, white arrow) was detected. Conventional CT reported positive LN metastasis and the radiomics score was 0.876. The total points (32 + 86 = 118) were calculated by summing the values on the “Points” scale intersected by the lines of each variable. After drawing a vertical line on the “Total points” scale, the nomogram displayed that the risk of LN metastasis was over 95%. Pathological examination confirmed the LN metastasis
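Because the nomogram is built on a logistic regression model, the three worked examples in Fig. 5 can be cross-checked numerically. The sketch below assumes that risk is a logistic function of total points and back-calculates the slope and intercept from two of the examples (94 points, about 86%; 37 points, about 18%). These are approximate graph readings, not the model coefficients reported in Table 2, and all function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Two (total points, approximate risk) pairs taken from the Fig. 5 examples.
POINTS_1, RISK_1 = 94, 0.86
POINTS_2, RISK_2 = 37, 0.18

# ASSUMPTION: logit(risk) is linear in total points; fit the line through both pairs.
slope = (logit(RISK_1) - logit(RISK_2)) / (POINTS_1 - POINTS_2)
intercept = logit(RISK_1) - slope * POINTS_1


def total_points(ct_reported_positive, radiomics_points):
    # Per the examples, a positive CT report contributes 32 points and a negative one 0.
    return (32 if ct_reported_positive else 0) + radiomics_points


def risk_from_points(points):
    return sigmoid(intercept + slope * points)


# Third example: CT-reported positive (32 points) plus 86 points from the radiomics score.
third_case_risk = risk_from_points(total_points(True, 86))
print(round(third_case_risk, 2))
```

Under this assumption, the third example (118 total points) yields a risk above 95%, in line with the value read from the nomogram.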

Clinical usefulness of the radiomics model and nomogram

DCA showed that both the nomogram and the radiomics model would offer net benefits over the “treat-all” or “treat-none” scheme within a certain range of threshold probabilities (radiomics model, between 20 and 90%; nomogram, between 20 and 95%) in the training cohort (Fig. 6a). Similar results were found in the test cohort (radiomics model, between 30 and 90%; nomogram, between 30 and 95%) (Fig. 6b).
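The net benefit plotted in a decision curve follows the standard definition: the true-positive fraction minus the false-positive fraction weighted by the odds of the threshold probability. The Python sketch below applies it to the test cohort prevalence (37 of 50 positive); the model's true-positive and false-positive counts are illustrative assumptions, not values reported in the study.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    odds = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * odds


def treat_all_net_benefit(n_positive, n, threshold):
    # Treating everyone makes every positive a TP and every negative an FP.
    return net_benefit(n_positive, n - n_positive, n, threshold)


TREAT_NONE = 0.0  # no one is treated, so there are no true or false positives

pt = 0.5
# ASSUMED counts for a model at this threshold (illustrative only).
model_nb = net_benefit(tp=32, fp=3, n=50, threshold=pt)
all_nb = treat_all_net_benefit(n_positive=37, n=50, threshold=pt)
print(round(model_nb, 2), round(all_nb, 2))  # 0.58 0.48
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the comparison the decision curves display across thresholds.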

Fig. 6

DCA for the nomogram and radiomics model in training cohort (a) and test cohort (b). The net benefit versus the threshold probability is plotted. Gray line represents the assumption that all patients have LN metastasis. Black line represents the assumption that all patients have negative LN metastasis. Red curve represents the radiomics model. Blue curve represents the nomogram. The x-axis shows the threshold probability. The y-axis shows the net benefit. A model is only clinically useful if it has a higher net benefit than the default treat-all (all cases have LN metastasis) and treat-none (none of the cases have LN metastasis) strategies. It is clear from the graph that the radiomics model and nomogram are superior to either treat-all or none strategy within certain ranges of risk threshold

Discussion

In the present study, we developed and validated a CT-based radiomics nomogram for the preoperative prediction of LN metastasis in patients with gastric cancer. The radiomics nomogram holds the potential to facilitate a noninvasive individualized preoperative identification of a higher risk of LN metastasis in gastric cancer.

Accurate prediction of LN status is important for selecting the optimal therapeutic plan in patients with gastric cancer. Imaging plays a crucial role in assessing LN stage in clinical practice [38]. Nevertheless, LN staging remains challenging with routinely used imaging modalities such as endoscopic ultrasonography or CT. The documented accuracy in the identification of LN metastasis was 64% for endoscopic ultrasonography and ranged from 61 to 64% for CT [7, 10, 39]. In the current study, routine CT displayed a similar capability, with an accuracy of about 62%. Recently, the emergence of radiomics has made it possible to predict LN status using the features of the primary tumor. The works conducted by Liu et al [20, 21] revealed that whole-lesion-based apparent diffusion coefficient histograms were valuable in the discrimination of LN metastasis, with the best accuracy reaching 82.3%. Because CT is more widely used and serves as the standard-of-care imaging tool for the preoperative evaluation of LN status, building radiomics models using features extracted from routinely acquired contrast-enhanced CT images could be more convenient and efficient. Our study revealed that the CT-based radiomics model showed good performance in discriminating LN metastasis, with AUCs of 0.844 and 0.837 in the training and test cohorts, respectively. In addition, the model outperformed routine CT in the discrimination of cases with LN metastasis, with the accuracy increased to 80–84%. Feng et al [40] reported that a radiomics model based on a support vector machine algorithm displayed AUCs of 0.824 and 0.764 in the training and test cohorts, respectively. That model exhibited an accuracy of 76.4% in the training cohort and 71.2% in the test cohort, which was also superior to routine CT in the identification of LN metastasis. Although different machine learning methods were used, the work by Feng et al [40] and ours reached the similar conclusion that the radiomics approach holds the potential to enrich image interpretation and supplement routine CT in the evaluation of LN status in gastric cancer.

Furthermore, this study also constructed a nomogram, an easy-to-use graphical analog computation device that allows clinicians to obtain results quickly and reliably by simply drawing several lines. Li et al [41] developed a nomogram based on the iodine concentration of the tumor and the Borrmann classification for the identification of LN metastasis in patients with gastric cancer, which yielded AUCs of 0.760 and 0.793 and accuracies of 0.700 and 0.757 in the training and test cohorts, respectively. In contrast, the radiomics-based nomogram in the present study showed higher AUCs of 0.886 and 0.881, as well as higher accuracies of 83% and 84%, in the training and test cohorts, respectively. The superior performance in this study was presumably attributable to the inclusion of the radiomics scores, which incorporated numerous quantitative features, especially parameters that are not easily appreciated by visual analysis or measured with routine imaging tools. The nomogram carries clinical significance in assisting clinicians with the selection of appropriate candidates for NAC treatment. Despite its promise in downstaging the disease and improving the radical resection rate, evidence of a survival benefit of NAC compared with surgery alone has been lacking until now. A problem with intensive NAC is that patients who may be curable by surgery alone could be included and experience grave adverse events from unnecessary chemotherapy [2]. Identification of the presence of LN metastasis serves as a key point in the selection of appropriate candidates for NAC [2, 5]. Both the radiomics model and the nomogram in the current study yielded good performance in the identification of LN metastasis and could offer net benefits over the “treat-all” or “treat-none” strategy within certain ranges of threshold probability.

The study had several limitations. Firstly, due to the retrospective design of the study, carcinoembryonic antigen tests close to the time of CT scanning were performed outside our institution for a subset of patients and the results were not available; thus, this variable was not included. Secondly, the study focused only on the binary classification of patients with negative or positive LN metastasis. The role of radiomics in the prediction of the detailed stage (N1-N3b) and anatomical location (16 stations) of LN metastasis was not investigated and merits further study. Thirdly, the study included only a small number of subjects; future research with a larger population and external validation is warranted. Finally, the radiomics features were extracted from three-dimensional VOIs, the segmentation of which was labor-intensive; further studies using two-dimensional region-of-interest-based features should be explored.

The study showed that the radiomics nomogram holds promise for the preoperative individualized prediction of LN metastasis in patients with gastric cancer.