Gastrointestinal stromal tumors (GISTs) are the most frequently diagnosed gastrointestinal mesenchymal tumors. The prevalence of GIST is 130 per million,1,2 and the annual incidence is 4.3–21.1 per million.3 Owing to developments in pathology and diagnostic technology, as well as increased levels of physician knowledge regarding GISTs, a gradual upward trend in the diagnosis of GISTs has been reported.4,5 In 1983, Mazur and Clark observed that some gastric wall tumors did not originate from smooth muscle, but instead originated from perineurial or mesenchymal nerve sheath cells.6 In 1998, Hirota et al. indicated that GISTs are derived from the interstitial cells of Cajal (ICCs).7 Approximately 75–80% of GISTs harbor a mutation in KIT,7,8 and approximately 10% contain a mutation in platelet-derived growth factor receptor-α (PDGFR-α).9,10

GISTs can be benign or malignant. R0 resection is the gold standard in the treatment of localized primary GISTs; however, the risks of recurrence and metastasis persist in some patients after resection. Imatinib can improve the prognosis of patients with unresectable or metastatic GISTs11 and, when given as adjuvant therapy, that of high-risk patients.12,13 Imatinib is generally well tolerated, but almost all patients experience at least one adverse event.13 For these reasons, it is critical to predict the risk of recurrence and metastasis as a means of assisting with treatment planning.

Several risk-prediction models for GIST have been published.14,15,16,17,18,19,20,21,22 The first consensus approach—the National Institutes of Health (NIH) consensus14—defines the risk of aggressive GIST behavior based on tumor size and mitotic count. The Armed Forces Institute of Pathology (AFIP) criteria15 subdivided GIST patients into eight subgroups (1–6b) based on tumor size and mitotic count, and stratified them into five risk groups based on tumor site. The modified NIH consensus criteria17 stratified GIST patients into four risk groups through the addition of two other risk factors—tumor site and rupture. Notably, in those studies, only a small portion of patients were Asian.16,19 In the current study, we collected data from 23 hospitals in Shandong Province, China, and used the data to examine prognostic factors in Chinese patients and establish a new recurrence-free survival (RFS) prediction model.

Methods

Patients

We retrospectively analyzed data of patients with a confirmed diagnosis of GIST from 23 hospitals in Shandong Province from 2001 to 2014. All the experimental protocols were approved by the Ethics Committee of Qingdao University, China. Demographic and clinical data were obtained from case records. Data on tumor location, size, mitotic count, rupture, and surgical margin were acquired from the operative reports and pathology reports.

We excluded patients who did not undergo endoscopic or surgical resection, as well as those with positive surgical margins. Individuals with multiple neoplasms, primary metastatic disease, recurrent tumors, or other malignancies were also excluded, as were patients who received imatinib either before or after surgery, those who died perioperatively, and those whose age, sex, or date of surgery was unknown.

Statistical Analysis

The main endpoint of the study was RFS, while secondary endpoints included disease-specific survival (DSS) and overall survival (OS). RFS, DSS, and OS were estimated using the Kaplan–Meier method and compared using log-rank tests. To determine the hazard ratio (HR) for survival associated with each variable after radical resection, the univariate Cox proportional hazards model was employed. Multivariable Cox regression analysis was used to detect prognostic factors, and variables with a p value < 0.05 were entered into the final model to establish 1-, 3-, and 5-year RFS prediction models. The discrimination of the RFS prediction model was measured by the concordance index (C-index), and its calibration was assessed by comparing predicted RFS with actual RFS. A C-index of 0.5 means that the model has no predictive power, whereas a C-index of 1 means that the model’s predictions are completely consistent with the observed outcomes. We used the bootstrapping method for internal validation: 1000 bootstrap samples drawn from our database were analyzed in the same way as the study sample.23,24 All statistical analyses were performed using the SPSS 24.0 statistical software package (IBM Corporation, Armonk, NY, USA), Stata 15.1 software (StataCorp LLC, College Station, TX, USA), and R version 2.3 (The R Foundation for Statistical Computing, Vienna, Austria). All p values were two-sided and were considered significant at a level of 0.05.
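As an illustration of the discrimination measure described above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is not the authors' code (the study used SPSS, Stata, and R); the function name `harrell_c_index` and the toy follow-up data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times:       follow-up time for each patient
    events:      1 if recurrence was observed, 0 if the patient was censored
    risk_scores: higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an
        # observed event; tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier recurrence had the higher score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (months, event indicator, risk score); not study data.
toy_times = [5, 10, 20, 30, 40]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [3.0, 2.5, 1.0, 1.5, 0.5]

print(harrell_c_index(toy_times, toy_events, toy_scores))
```

On this toy input every comparable pair is ranked correctly, so the index is 1; giving every patient the same score yields 0.5, matching the interpretation of the two reference values given in the text.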

Results

Demographic Data

We retrospectively analyzed the data of 5285 patients with GISTs from 23 hospitals in Shandong Province, China, from 2001 to 2014. A total of 50.3% (n = 2648) of patients were men and 49.1% (n = 2615) were women. The median age at diagnosis was 59 years (range 4–91) (Table 1).

Table 1 Demographic data and tumor characteristics

Tumor Characteristics

Tumor location was known in all analyzed cases. More than half of the tumors were located in the stomach (56%, n = 2958), with the second most common location being the small intestine (24.1%, n = 1275). In addition, 11.6% (n = 661) of the tumors were located outside of the gastrointestinal tract, mainly in the omentum, mesentery, and retroperitoneum. The median tumor size was 5.0 cm (range 0.1–50 cm) in diameter. Regarding mitotic count, values of ≤ 5 per 50 high-power fields (HPFs) accounted for the largest proportion of cases (71.0%, 1916 of 2697). The tumor rupture group included tumors that ruptured either preoperatively or intraoperatively; 6.2% (291/4700) of tumors fell into this group (Table 1).

Treatment

Based on tumor location and size, 7.7% (n = 408) of patients underwent endoscopic resection, while 83.1% (n = 4394) of patients underwent surgical resection, including open surgery, laparoscopy, and robotic surgery. A total of 0.9% (n = 46) of patients underwent laparoscopic-assisted endoscopic polypectomy. In 95.6% (n = 5050) of patients, the tumors were microscopically resected with negative resection margins (R0).

Survival Analysis

Of the 5285 enrolled GIST patients, 175 had missing data regarding age, sex, or date of surgery. Overall, 143 and 34 patients underwent R1 and R2 resection, respectively. Fifty-eight patients underwent biopsy only, and one patient underwent transarterial chemoembolization (TACE). A total of 217 patients had other malignancies. One hundred and sixty-eight patients were diagnosed with multiple neoplasms or primary metastatic disease. Five patients died perioperatively owing to complications and 25 patients had recurrent/metastatic GIST. Three and 407 patients received imatinib before and after surgery, respectively. A total of 4216 patients met the inclusion criteria. Complete follow-up data were obtained for 3363 patients, a follow-up rate of 79.8%. The median follow-up was 41 months (range 1–156 months). Of these 3363 patients, 2704 (80.4%) were alive without recurrence or metastasis and 236 (7.0%) were alive with disease. A total of 320 patients (9.5%) died of GIST, while 103 patients (3.1%) died of unrelated disease. In most cases, tumor recurrence or metastasis occurred within the first 5 years after resection, with a maximum interval of 9 years after resection. The liver and abdomen were the most common sites of metastasis.

One-, 3-, and 5-year RFS was 94.6% (95% confidence interval [CI] 93.8–95.4), 85.9% (95% CI 84.7–87.1), and 78.8% (95% CI 77.0–80.6), respectively; 1-, 3-, and 5-year DSS was 97.6% (95% CI 97.0–98.2), 90.7% (95% CI 89.7–91.7), and 88.9% (95% CI 87.7–90.1), respectively; and 1-, 3-, and 5-year OS was 97.1% (95% CI 96.5–97.7), 89.1% (95% CI 87.9–90.2), and 84.9% (95% CI 83.3–86.5), respectively (Fig. 1).

Fig. 1

(a) Recurrence-free survival and (b) overall survival. Num number
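The survival rates reported above are Kaplan–Meier estimates. As a rough sketch of how the product-limit estimator turns follow-up records into a survival curve, the example below uses a hypothetical helper `kaplan_meier` and made-up toy data; it does not reproduce the study's figures and omits confidence intervals.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. recurrence) was observed, 0 if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        n_events = sum(1 for ti, ei in zip(times, events) if ti == t and ei)
        n_leaving = sum(1 for ti in times if ti == t)
        if n_events:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= n_leaving
    return curve


# Hypothetical toy data (months, event indicator); not study data.
toy_times = [6, 12, 12, 24, 36, 48]
toy_events = [1, 0, 1, 1, 0, 0]

for month, s in kaplan_meier(toy_times, toy_events):
    print(f"{month:>3} months: S = {s:.3f}")
```

Censored patients contribute to the number at risk until their last follow-up but never cause a drop in the curve, which is why incomplete follow-up can still be used.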

After stratification by the modified NIH consensus criteria, no statistically significant differences were observed in the RFS (Chi square = 0.001, p = 0.982), DSS (Chi square = 0.03, p = 0.868), or OS (Chi square = 0.83, p = 0.361) rates between very low-risk and low-risk patients. Additionally, no statistically significant difference was observed between the OS rates of very low-risk patients and intermediate-risk patients (Chi square = 3.35, p = 0.067). There were significant differences among the other groups (Chi square = 15.71–258.28, all p ≤ 0.003) (Fig. 2).

Fig. 2

(a) Recurrence-free survival, (b) disease-specific survival, and (c) overall survival, stratified by the modified National Institutes of Health consensus criteria. Num number

In univariate analyses, sex (HR 1.357, 95% CI 1.148–1.603, p < 0.001), age (HR 1.011, 95% CI 1.004–1.018, p = 0.002), tumor location, size (HR 1.094, 95% CI 1.085–1.103, p < 0.001), mitotic count (6–10 per 50 HPFs vs. ≤ 5 per 50 HPFs: HR 5.940, 95% CI 4.618–7.640, p < 0.001; > 10 per 50 HPFs vs. ≤ 5 per 50 HPFs: HR 12.196, 95% CI 9.328–15.947, p < 0.001), and rupture (HR 5.286, 95% CI 4.148–6.736, p < 0.001) were predictive of outcome. Compared with patients whose tumors originated in the stomach, RFS was worse for those with tumors that arose from the small bowel (HR 2.454, 95% CI 2.026–2.972, p < 0.001), the colorectum (HR 3.525, 95% CI 2.411–5.155, p < 0.001), or outside the gastrointestinal tract (HR 2.689, 95% CI 2.134–3.388, p < 0.001). In contrast, RFS was better for those with tumors located in the esophagus than for those whose tumors originated in the stomach (HR 0.228, 95% CI 0.057–0.918, p = 0.038) (Fig. 3).

Fig. 3

Recurrence-free survival by (a) sex, (b) tumor site, (c) size, (d) mitotic count, and (e) rupture. HPF high-power field, E-GIST extra-gastrointestinal stromal tumor

In multivariate analyses, esophageal GISTs were excluded because of the small number of patients. Log-rank tests did not reveal any significant RFS differences between small intestinal GISTs, colorectal GISTs, and E-GISTs (extra-gastrointestinal stromal tumors). For this reason, we combined these three groups into a single group for comparisons with gastric GISTs. The six variables that were significant in univariate analyses (p < 0.05) were entered into the multivariate Cox regression model. Sex (HR 1.310, 95% CI 1.052–1.632, p = 0.016), tumor location (HR 1.419, 95% CI 1.144–1.760, p = 0.001), size (HR 1.100, 95% CI 1.081–1.120, p < 0.001), mitotic count (6–10 per 50 HPFs vs. ≤ 5 per 50 HPFs: HR 4.231, 95% CI 3.261–5.489, p < 0.001; > 10 per 50 HPFs vs. ≤ 5 per 50 HPFs: HR 7.585, 95% CI 5.678–10.134, p < 0.001), and rupture (HR 3.522, 95% CI 2.573–4.822, p < 0.001) were the variables that showed significant and independent associations with prognosis (Table 2).

Table 2 Multivariate analysis of recurrence-free survival
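In a Cox model, each regression coefficient is the natural logarithm of the corresponding hazard ratio, which is how the multivariate HRs relate to the weights in the prognostic index. The short check below recomputes those logarithms from the rounded HRs quoted in the text; the dictionary labels are ours, and the assumption that the size HR is expressed per centimeter is inferred from the index formula rather than stated with the HR.

```python
import math

# Multivariate hazard ratios as quoted in the text (rounded to 3 decimals).
hazard_ratios = {
    "male sex": 1.310,
    "non-gastric location": 1.419,
    "tumor rupture": 3.522,
    "mitoses 6-10 per 50 HPFs": 4.231,
    "mitoses >10 per 50 HPFs": 7.585,
    "size (per cm, assumed)": 1.100,
}

# Cox coefficient = ln(hazard ratio).
log_hazard = {name: round(math.log(hr), 3) for name, hr in hazard_ratios.items()}

for name, beta in log_hazard.items():
    print(f"{name:<28} ln(HR) = {beta:.3f}")
```

The recomputed values agree with the published weights to three decimals except for tumor size, where ln(1.100) is about 0.095 against a published coefficient of 0.096; a difference of this size is what rounding of the reported HR would produce.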

Based on the estimated regression coefficients, the prognostic index (PI) was calculated as PI = 0.000 (if female) + 0.270 (if male) + 0.000 (if gastric GIST) + 0.350 (if non-gastric GIST) + 0.000 (if no tumor rupture) + 1.259 (if tumor rupture) + 0.000 (tumor mitotic count < 6 per 50 HPFs) + 1.442 (tumor mitotic count between 6 and 10 per 50 HPFs) + 2.026 (tumor mitotic count > 10 per 50 HPFs) + 0.096 × tumor size (cm). Predictions of 1-, 3-, and 5-year RFS were obtained from the model as follows: $S(12, X) = 0.9926^{\exp(\mathrm{PI})}$, $S(36, X) = 0.9739^{\exp(\mathrm{PI})}$, and $S(60, X) = 0.9471^{\exp(\mathrm{PI})}$, respectively. For instance, for a male patient with a ruptured small intestinal GIST 10 cm in diameter and with > 10 mitoses per 50 HPFs, the PI is 0.270 + 0.350 + 1.259 + 2.026 + 0.096 × 10 = 4.865 and the 1-, 3-, and 5-year RFSs are 38.2%, 3.2%, and 0.1%, respectively. As another example, for a female patient with an unruptured gastric GIST 10 cm in diameter and with ≤ 5 mitoses per 50 HPFs, the PI is 0.96 and the 1-, 3-, and 5-year RFSs are 98.1%, 93.3%, and 86.8%, respectively.
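The formula above can be evaluated directly. The sketch below is an illustrative re-implementation, not the authors' calculator: the coefficients and baseline survival values are copied from the text, while the function names, argument names, and category labels are hypothetical. Summing the terms for the first worked example gives a PI of 4.865.

```python
import math

# Coefficients and baseline RFS values as printed in the text.
COEF_MALE = 0.270
COEF_NON_GASTRIC = 0.350
COEF_RUPTURE = 1.259
COEF_MITOSES = {"<=5": 0.000, "6-10": 1.442, ">10": 2.026}  # per 50 HPFs
COEF_SIZE_PER_CM = 0.096
BASELINE_RFS = {12: 0.9926, 36: 0.9739, 60: 0.9471}  # months -> baseline value


def prognostic_index(male, non_gastric, ruptured, mitoses, size_cm):
    """Sum the weights that apply to one patient (reference levels add 0)."""
    pi = COEF_SIZE_PER_CM * size_cm + COEF_MITOSES[mitoses]
    if male:
        pi += COEF_MALE
    if non_gastric:
        pi += COEF_NON_GASTRIC
    if ruptured:
        pi += COEF_RUPTURE
    return pi


def predicted_rfs(pi):
    """Predicted RFS at 12, 36 and 60 months: baseline ** exp(PI)."""
    return {months: s0 ** math.exp(pi) for months, s0 in BASELINE_RFS.items()}


# Worked example 1: male, ruptured small intestinal GIST, 10 cm, >10 mitoses.
pi_high = prognostic_index(True, True, True, ">10", 10.0)
# Worked example 2: female, unruptured gastric GIST, 10 cm, <=5 mitoses.
pi_low = prognostic_index(False, False, False, "<=5", 10.0)

for label, pi in (("example 1", pi_high), ("example 2", pi_low)):
    rates = ", ".join(f"{m} mo: {100 * s:.1f}%" for m, s in predicted_rfs(pi).items())
    print(f"{label}: PI = {pi:.3f}; {rates}")
```

Because the PI enters through an exponent of an exponential, modest differences in PI translate into large differences in predicted RFS, as the two examples show.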

The overall apparent C-index of the RFS prediction model was 0.850. In internal validation, the model’s performance was assessed using the bootstrapping method; the bootstrap-corrected C-index was 0.850 (95% CI 0.830–0.870). Calibration was assessed by comparing the predicted RFS with the actual RFS (Fig. 4).

Fig. 4

Calibration of the RFS prediction model. Actual RFS is shown compared with the RFS prediction model at (a) 1, (b) 3, and (c) 5 years. RFS recurrence-free survival

To assess accuracy, we used receiver operating characteristic (ROC) curve analyses to compare the proposed RFS prediction model with the modified NIH consensus criteria, the AFIP criteria, and Gold’s nomogram. The area under the ROC curve (AUC) of the RFS prediction model was 0.879 (95% CI 0.859–0.899, p < 0.001), whereas the AUC of the modified NIH consensus criteria was 0.833 (95% CI 0.812–0.855, p < 0.001), the AUC of the AFIP criteria was 0.853 (95% CI 0.831–0.876, p < 0.001), the AUC of Gold’s 2-year nomogram was 0.857 (95% CI 0.836–0.879, p < 0.001), and the AUC of Gold’s 5-year nomogram was 0.857 (95% CI 0.835–0.878, p < 0.001) (Fig. 5).

Fig. 5

Receiver operating characteristic curve analysis comparing our RFS prediction model with the modified NIH consensus criteria, the AFIP criteria, and Gold’s nomogram. RFS recurrence-free survival, NIH National Institutes of Health, AFIP Armed Forces Institute of Pathology
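For readers unfamiliar with the AUC used in this comparison, it can be read as the probability that a randomly chosen patient with recurrence receives a higher risk score than a randomly chosen patient without recurrence. The sketch below computes it by pairwise comparison; the function name `roc_auc` and the toy labels and scores are hypothetical, and the text does not specify how censored follow-up was handled in the ROC analysis, so this treats the outcome as a simple binary label.

```python
def roc_auc(labels, scores):
    """AUC by pairwise comparison of positive and negative cases.

    labels: 1 for patients with the outcome (e.g. recurrence), 0 otherwise
    scores: predicted risk; higher means higher predicted risk
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case correctly ranked higher
            elif p == n:
                wins += 0.5   # tied scores count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data; not study data.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]

print(round(roc_auc(toy_labels, toy_scores), 3))
```

In the toy data, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9.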

Discussion

In this study, we retrospectively analyzed the records of patients with a confirmed diagnosis of GIST in Shandong Province, China, as a means of evaluating the risk factors for recurrence of GIST after radical resection, and of developing an RFS prediction model that would be applicable to the Chinese population. We then assessed the accuracy of the RFS prediction model through comparison with the modified NIH consensus criteria, the AFIP criteria, and Gold’s nomogram. In terms of the epidemiological features, our study showed an almost-balanced sex ratio (male:female = 1.01). Most of the tumors were located in the stomach, followed by the small intestine and colorectum. These results are consistent with previous studies.3

Our study revealed that Chinese GIST patients had a better prognosis than has been reported in Western countries. In our study cohort, the 5-year RFS and DSS rates were 78.8% and 88.9%, respectively. By comparison, in an analysis of pooled population-based cohorts, Joensuu et al. reported a 5-year RFS of 70.5%,19 and Mucciarini et al. reported a 5-year DSS of 84.6% for patients treated with radical surgery.25

This discrepancy may be partly explained by the higher percentage of small GISTs (≤ 2 cm) in our cohort (22.9%) than in the cohorts of Joensuu et al. (12.3%) and Mucciarini et al. (17.5%). It is also possible that the differences in prognosis result from genetic differences associated with ethnicity, although additional research is needed to test this explanation.

We determined that male sex, non-gastric GISTs, large tumor size, high mitotic count, and ruptured tumors were adverse, independent prognostic factors. On the other hand, female sex, gastric GISTs, small tumor size, low mitotic count, and unruptured tumors were associated with better survival.

Tumor size and mitotic count are recognized independent prognostic factors. In 2002, Fletcher et al. first published a consensus approach (the NIH consensus) for predicting GIST outcomes.14 Using this approach, the risk of aggressive behavior in GISTs is staged according to tumor size and mitotic count. Similarly, in the current study, these two factors were independently related to patient outcomes in both univariate analyses and multivariate Cox regression.

In 2006, the AFIP criteria15 were developed by Miettinen et al. based on four long-term follow-up studies.26,27,28,29 The suggested guidelines indicate that the clinical behavior of GISTs varies by site; this suggestion agrees with our results. In the current study, significant RFS differences were observed between esophageal GISTs and gastric GISTs; however, no significant difference was observed in terms of DSS or OS. Esophageal GISTs were associated with better survival in univariate analysis, a finding that may be related to the pathological features of tumors in our database. In addition, we observed that the prognosis of small intestinal GISTs, colorectal GISTs, and E-GISTs was worse than that of gastric GISTs.

In 2007, Rutkowski et al.30 and Takahashi et al.31 suggested that tumor rupture was a significant prognostic factor. The following year, Joensuu et al.17 modified the NIH consensus through the addition of two factors: tumor site and rupture. Consistent with this modification, our multivariate analysis of RFS showed that tumor rupture was associated with an HR of 3.52.

Whether sex is an independent risk factor among GIST patients remains controversial. The results of the current study were consistent with previous reports from Joensuu et al.,19 Rutkowski et al.,30 and Lv et al.5 that male sex may be associated with a worse prognosis; however, Huang et al.16 observed no relationship between sex and survival. These discrepancies deserve further study.

This study had a multicenter retrospective design with a large patient population. Unlike the modified NIH consensus criteria and the AFIP criteria, which assign patients to risk groups, the new RFS prediction model estimates the risk for the individual patient, which may help doctors decide which GIST patients to treat with adjuvant imatinib. Importantly, detailed, large-scale data on GIST patients who were not treated with imatinib are both difficult to acquire and valuable, because they reflect the original clinical and pathological features of GIST in Chinese patients. Moreover, our results showed that Chinese patients with GIST had a better prognosis than GIST patients from Europe and America, suggesting that the clinicopathological features and prognosis of GIST vary geographically or ethnically. These results also suggest that region-specific standards for monitoring and treating GIST may be needed.

This study is subject to some limitations. First, gene mutations and SDHB protein expression were not included in the prediction model. In 2013, a National Comprehensive Cancer Network (NCCN) guideline first recommended genetic analysis of the tumor if the planned treatment involved tyrosine kinase inhibitors; before then, genetic analysis was not routinely performed.32 In 2014, another NCCN guideline suggested that SDHB immunohistochemistry should be performed if no mutations were observed in KIT or PDGFR-α. Since our database included patient data from 2001 to 2014, SDHB immunohistochemistry was not examined in most cases;33 because the mutation data were incomplete, we did not evaluate the effects of gene mutations and SDHB protein expression on prognosis. Prospective studies and long-term follow-up are needed to address these issues. Second, as this was a retrospective study with its inevitable limitations, missing mitotic count data might influence the stability of the results. Third, no other cohort was available, so we could not perform external validation; however, the internal validation suggested good predictive power.

To our knowledge, this study included the largest cohort of Asian GIST patients that has been investigated to date. In this large, multicenter, retrospective study, we determined that sex, tumor location, size, mitotic count, and rupture were independent prognostic factors. Based on the independent prognostic factors and the estimated regression coefficients, we developed an RFS prediction model that was capable of estimating the risks of recurrence and metastasis at 1, 3, and 5 years.

Conclusions

Our RFS prediction model can predict the risk of recurrence for individual patients and may be of use to health policymakers and doctors in formulating treatment strategies. The RFS prediction model is available for use at the following URL: http://www.gistriskcalculator.com/.