Introduction

Treatment outcomes for gastric cancer are usually evaluated by postoperative pathologic tumor–node–metastasis (TNM) staging [1–4]. Although the selection of a treatment strategy is primarily based on the preoperative clinical stage, reports regarding the results of treatments selected on this basis are very limited because gastric cancer has traditionally been treated with standard surgery alone, regardless of clinical T (cT) and clinical N (cN) grades.

However, as mentioned at the beginning of the TNM textbook, TNM clinical stages can be used as a guideline for deciding on the therapeutic strategy. Since the early twenty-first century, clinical staging assessments have been considered important because of the increased diversity of treatment modalities such as endoscopic submucosal dissection [5], limited surgery for early cancer, and multimodality treatments, including neoadjuvant chemotherapy, for advanced cancer [6, 7].

Another major factor underlying the lack of clinical-stage-based investigations for gastric cancer is the difficulty in preoperative staging with regard to tumor depth and lymph node (LN) metastasis. With advancements in endoscopy [8], endoscopic ultrasonography [9], and multidetector computed tomography (MDCT) [10], several techniques have been established that provide satisfactory diagnostic data regarding the depth of invasion. A previous study regarding LN evaluation reported the improved diagnostic accuracy of MDCT [11]. Despite these improvements in diagnostic technology, it is still difficult to accurately assess the depth of invasion and the presence or absence of LN metastasis preoperatively.

The first edition of the 2001 Japanese gastric cancer treatment guidelines specifically displayed the clinical-stage-based therapeutic strategy in a flowchart [12]. Since the establishment of our hospital in 2002, we have been aware of the importance of clinical stage data and have attempted to implement and record a clinical staging system based on specific diagnostic criteria.

The aim of the present study was to clarify whether the preoperative clinical stage based on the seventh edition of the TNM clinical staging system of the American Joint Committee on Cancer [13] is an effective predictor of survival in patients with gastric cancer and, in turn, whether it can serve as an appropriate indicator for selecting the treatment modality.

To quantify the ability of staging to predict survival in absolute values, we used a concordance index (Harrell’s C-index, a measure of the separation of survival distributions) based on time-dependent receiver operating characteristic (ROC) curve analysis [14]. Reports on objective C-index-based evaluation are available for other cancers [15, 16] but not for gastric cancer.

Materials and methods

Patient population

A total of 3165 patients underwent surgery for gastric cancer at the Division of Gastric Surgery, Shizuoka Cancer Center, between October 2002 and March 2012. We excluded 110 patients who received neoadjuvant chemotherapy, as well as patients with special histologic types [17] other than adenocarcinoma, such as neuroendocrine carcinoma (n = 15), squamous cell carcinoma (n = 5), and carcinosarcoma (n = 2), because the prognosis of patients with these histologic types is considered to be quite different from that of patients with adenocarcinoma. The remaining 3033 patients were included, and the study was approved by the Research Ethics Committee of Shizuoka Cancer Center (no. 27-J119-27-1-3).
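The cohort size can be checked by simple subtraction. The sketch below uses the counts given above; the variable and function names are illustrative, and it assumes that the exclusion groups do not overlap.

```python
from typing import Dict

# Counts taken from the text; names are illustrative only.
OPERATED = 3165
EXCLUDED: Dict[str, int] = {
    "neoadjuvant chemotherapy": 110,
    "neuroendocrine carcinoma": 15,
    "squamous cell carcinoma": 5,
    "carcinosarcoma": 2,
}


def remaining_patients(total: int, exclusions: Dict[str, int]) -> int:
    """Return the cohort size after subtracting every exclusion group.

    Assumes no patient belongs to more than one exclusion group.
    """
    return total - sum(exclusions.values())


included = remaining_patients(OPERATED, EXCLUDED)
print(included)
```

Under that assumption the result matches the 3033 patients reported as included.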

Table 1 presents the patient characteristics. Gastrectomy was performed in 95 % of cases. Lesions were most frequently observed in the middle part of the stomach. Esophagogastric junction (EGJ) tumors were found in 168 patients (5.5 %); 118 patients (70.2 %) had Siewert type 2 tumors and 50 patients (29.8 %) had Siewert type 3 tumors. Generally, an EGJ tumor should be classified as per esophageal cancer staging, according to the seventh edition of the TNM clinical staging system. However, Japanese gastric cancer treatment guidelines [18] consider these patients as having gastric cancer, so EGJ tumors were included in the present study. Approximately half of the patients had poorly differentiated cancers in terms of histologic type, including signet ring cell carcinoma and mucinous adenocarcinoma.

Table 1 Patient characteristics, surgical procedures, and final residual tumor

Among the 146 nonresected cases with single or multiple noncurable factors, half of the patients underwent gastrojejunostomy and the remainder underwent exploratory laparoscopy, depending on the symptoms at surgery. Regarding lymphadenectomy, approximately 40 % of the patients underwent standard D2 or extended D2+ lymphadenectomy, and approximately 60 % underwent limited D1 or D1+ surgery. Final pathology examinations showed residual tumors in approximately 17 % of patients, with the achievement of an R1 resection margin (positive peritoneal cytology findings or positive microscopic surgical margin) in 6 % and an R2 resection margin (unresected or palliative resection) in 11 %.

Adjuvant chemotherapy was not routinely performed until 2006; however, from 2007, 1 year of S-1 treatment was administered to patients with pathologic stage II or stage III cancer, following the publication of the Japanese ACTS-GC trial of adjuvant chemotherapy for gastric cancer [19].

Data were collected from the prospective records of individual patients, and preoperative staging into clinical stages IA–IV was conducted at multidisciplinary team conferences, which included the participation of endoscopists, radiologists, oncologists, pathologists, and surgeons. Findings from staging laparoscopy were not included for determination of the clinical stage.
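For orientation, the grouping of cT, cN, and M categories into stages IA–IV under the seventh edition of the TNM system can be written as a lookup table. The following is a minimal sketch, not the procedure used at the conferences; the function name `clinical_stage` and the string labels are assumptions made for illustration.

```python
# Stage groups of the seventh-edition TNM system for gastric cancer,
# indexed by T category (rows) and N category (columns N0, N1, N2, N3).
STAGE_TABLE = {
    "T1":  ["IA",   "IB",   "IIA",  "IIB"],
    "T2":  ["IB",   "IIA",  "IIB",  "IIIA"],
    "T3":  ["IIA",  "IIB",  "IIIA", "IIIB"],
    "T4a": ["IIB",  "IIIA", "IIIB", "IIIC"],
    "T4b": ["IIIB", "IIIB", "IIIC", "IIIC"],
}
N_COLUMNS = ("N0", "N1", "N2", "N3")


def clinical_stage(t: str, n: str, m1: bool = False) -> str:
    """Map T, N, and M categories to a stage group (hypothetical helper).

    Subcategories T1a/T1b and N3a/N3b are collapsed because they do not
    change the stage group.
    """
    if m1:
        return "IV"  # any T, any N with distant metastasis
    t_key = "T1" if t in ("T1a", "T1b") else t
    n_key = "N3" if n in ("N3a", "N3b") else n
    return STAGE_TABLE[t_key][N_COLUMNS.index(n_key)]


print(clinical_stage("T4a", "N0"))
```

With this table, a cT4aN0 tumor falls into stage IIB.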

Criteria for determining the preoperative tumor depth

The preoperative tumor depth classification (cT1–cT4) was determined by normal endoscopy and chromoendoscopy. The most important diagnostic measure was the macroscopic type [20].

When type 0 cancer (superficial, flat tumors with or without minimal elevation or depression) was diagnosed, a case was classified as cT1. Even in early cancer types, tumor depth was considered deeper than cT2 if mucosal convergence was significant, a submucosal-tumor-like image was noted, or extensive submucosal tumor spreading was suspected. Endoscopic ultrasonography was performed in only 160 patients with early cancer (11.0 %), whose tumor was presumed to be cT1 but might have been cT2. The depth was classified as cT3 or deeper if the advanced type 1/2 cancer was larger than 50 mm. If the surrounding mound was relatively high or the ulcer floor was deep, cT4a was diagnosed, even in type 2 cancer. Invasive type 3/4 cases were primarily classified as cT4a.

Upper gastrointestinal tract contrast radiography was performed in eligible patients without stenosis. Egashira et al. [21] determined invasion depth on the basis of gastric wall hypertrophy findings, insufficient wall extensibility, wall rigidity, and tumor size. Early cancers were classified as elevated or ulcerative. For elevated types, the depth was classified as cT1a if the cancers were pedunculated. In sessile elevated types, the classification was cT1a for tumors not larger than 3 cm and cT1b for tumors larger than 3 cm. However, a classification of cT2 was made if the lateral view showed trapezoidal deformity because the stomach wall was judged to be rigid and thick. In ulcerative-type cases, the classification was cT1b if the ulcer was deep or was encircled by a large nodular elevation or if a relatively thick coalescence of folds was observed. Otherwise, the case was classified as cT1a. In cases of advanced cancer, the classification was cT2 when neither wall rigidity nor insufficient wall extensibility was observed, and cT3 was indicated if the tumor was 50 mm or larger or if there was either wall rigidity or insufficient extensibility, regardless of tumor size. If both wall rigidity and insufficient extensibility were observed, the classification was cT4a, even for type 1/2 cancers. Type 3/4 invasive cancers were classified as cT4a.
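The radiographic rules for advanced cancer can be read as a short decision cascade. The sketch below is one possible encoding under stated assumptions: the function and parameter names are hypothetical, type 3/4 cancers are mapped to cT4a in line with the endoscopic criteria, and cT2 is taken to require a tumor smaller than 50 mm in addition to the absence of both wall findings.

```python
def radiographic_ct_advanced(
    macroscopic_type: int,
    size_mm: float,
    wall_rigidity: bool,
    poor_extensibility: bool,
) -> str:
    """Return a cT category for an advanced cancer from radiographic findings.

    Illustrative encoding of the criteria in the text, not a validated tool.
    """
    # Invasive type 3/4 cancers are classified as cT4a outright (assumption:
    # cT4a, as in the endoscopic criteria).
    if macroscopic_type in (3, 4):
        return "cT4a"
    # Both wall findings together indicate cT4a, even for type 1/2 cancers.
    if wall_rigidity and poor_extensibility:
        return "cT4a"
    # Either finding alone, or a tumor of 50 mm or larger, indicates cT3.
    if size_mm >= 50 or wall_rigidity or poor_extensibility:
        return "cT3"
    # Neither finding and a tumor smaller than 50 mm.
    return "cT2"


print(radiographic_ct_advanced(2, 30, False, False))
```

The order of the checks matters: the most advanced category is tested first so that a tumor meeting several criteria receives the deepest applicable classification.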

Endoscopic observations were considered important for early cancers, whereas radiography more clearly showed the tumor size and extent of wall rigidity in advanced cancers. Therefore, when radiography showed a deeper invasion of advanced cancer than that shown by endoscopy, the radiographic diagnosis took precedence.

For diagnoses of serosal invasion, MDCT findings were also considered [10]. A classification of cT4a was made when all layers of the gastric wall were visible on MDCT, indicating hypertrophy of the entire gastric wall and irregularities in the surrounding fatty tissue structures. When the serosal invasion was judged to directly affect an adjacent organ, the classification was cT4b, and ultrasound imaging was additionally performed to detect invasion into other organs [22].

Criteria for determining preoperative clinical nodal involvement

The diagnosis of LN metastasis in all 3033 cases was made with 16-row MDCT. Scanning was performed with 2-mm slices from the chest to the abdomen and pelvis, and images were obtained at 5-mm intervals. LNs were deemed positive in the following cases: (1) the longest diameter was 10 mm or greater, (2) the longest diameters were less than 10 mm but showed convergence, or (3) the LNs presented the same enhanced pattern as the primary lesion [23].
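The three positivity criteria, combined with the node-count thresholds of the seventh-edition N categories, can be sketched as follows. The class and function names are hypothetical, and "convergence" is represented here by a simple flag on the assumption that the radiologist has already judged whether small nodes are clustered.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class LymphNode:
    long_axis_mm: float                  # longest diameter on MDCT
    clustered: bool = False              # convergence of small nodes (assumed flag)
    enhances_like_primary: bool = False  # same enhancement pattern as the primary


def is_node_positive(node: LymphNode) -> bool:
    """Apply criteria (1)-(3): size, convergence, or enhancement pattern."""
    return (
        node.long_axis_mm >= 10
        or node.clustered
        or node.enhances_like_primary
    )


def cn_category(positive_count: int) -> str:
    """Map a count of positive nodes to a seventh-edition cN category."""
    if positive_count == 0:
        return "cN0"
    if positive_count <= 2:
        return "cN1"
    if positive_count <= 6:
        return "cN2"
    if positive_count <= 15:
        return "cN3a"
    return "cN3b"


def classify_nodes(nodes: Iterable[LymphNode]) -> str:
    """Count positive nodes and return the cN category."""
    return cn_category(sum(is_node_positive(n) for n in nodes))


example = [LymphNode(12.0), LymphNode(6.0, clustered=True), LymphNode(5.0)]
print(classify_nodes(example))
```

In the example, two of the three nodes meet a criterion, which corresponds to cN1.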

Statistical analysis

The Kaplan–Meier method and the log-rank test were used for survival analysis. We considered p < 0.05 to be statistically significant. To objectively evaluate the ability of prognostic predictions, Harrell’s C-index (a measure of the separation of survival distributions) was calculated on the basis of the time-dependent ROC curve [14]. The area under the curve (AUC), calculated by regular ROC analysis, ranged from 0.5 to 1.0; the predictive ability was defined to be moderate for AUCs of 0.7–0.8 and excellent for AUCs greater than 0.8 [24]. Likewise, a C-index of 0.8 or greater reflected excellent predictive ability for survival outcomes [25]. SPSS version 18.0 (SPSS, Chicago, IL, USA) and STATA/SE 12 (StataCorp, College Station, TX, USA) were used for statistical analyses.
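To illustrate what a concordance index measures, the following sketch implements the basic pairwise definition of Harrell's C-index in plain Python. It is not the STATA computation used in this study and does not reproduce the time-dependent ROC approach; the function name and the use of an ordinal stage number as the risk score are assumptions for illustration.

```python
from typing import Sequence


def harrell_c_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Pairwise concordance between risk scores and observed survival.

    A pair (i, j) is comparable when patient i had an event strictly before
    the observed time of patient j. The pair is concordant when patient i
    also has the higher risk score; tied scores count as half. Pairs with
    tied observation times are skipped in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: survival in months, event flag (1 = death), ordinal stage as risk.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
stages = [4, 3, 2, 1]
print(harrell_c_index(times, events, stages))
```

A value of 0.5 corresponds to a score with no discriminative ability and 1.0 to perfect ordering. Because a staging system has few categories, many pairs share the same score and contribute 0.5 each, which pulls the index toward 0.5.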

Results

cT, cN, and clinical stage classifications

Approximately half of the patients were classified as having cT1 cancer (i.e., the diagnosis was early cancer); cT4a was the most frequent classification for advanced cancer. Around 70 % of patients were classified as having cN0 cancer. Regarding clinical stage, 1722 patients (57 %) had clinical stage I cancer; 521 patients (17.2 %) had clinical stage II cancer; 626 patients (20.7 %) had clinical stage III cancer; and 164 patients (5.4 %) had clinical stage IV cancer (Table 2).

Table 2 Incidence of each clinical T (cT) category, clinical N (cN) category, and clinical stage

Correlation between clinical staging and pathologic staging

For T staging, the overall accuracy was 67.2 % (1939/2887). The concordance rate was 88.6 % (1290/1456) in early cancers and 45.4 % (649/1431) in advanced cancers. Understaging was seen in 320 patients (11.1 %) and overstaging was seen in 628 patients (21.8 %) (Table 3). For N staging, the overall concordance rate was 63.5 % (1713/2699). Even among the cN0 cancer patients, 554 (27.8 %) were pN positive. The rate of underdiagnosed nodal metastases reached 29.8 % (805/2699), whereas the rate of overdiagnosis was only 6.7 % (181/2699; Table 3). The concordance between clinical and pathologic stages is shown in Table 3. Concordance was relatively high for stages I and IV (84.2 and 90.9 %, respectively); however, it was quite low for stages II and III (32.4 and 36.1 %, respectively).
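The reported rates follow directly from the counts of concordant, understaged, and overstaged cases. A minimal sketch with a hypothetical helper name, using the counts given above:

```python
from typing import Dict


def staging_rates(concordant: int, understaged: int, overstaged: int) -> Dict[str, float]:
    """Return concordance, understaging, and overstaging rates in percent."""
    total = concordant + understaged + overstaged
    return {
        "total": total,
        "concordance": round(100 * concordant / total, 1),
        "understaging": round(100 * understaged / total, 1),
        "overstaging": round(100 * overstaged / total, 1),
    }


t_rates = staging_rates(concordant=1939, understaged=320, overstaged=628)
n_rates = staging_rates(concordant=1713, understaged=805, overstaged=181)
print(t_rates)
print(n_rates)
```

The three counts sum to the denominators of 2887 (T staging) and 2699 (N staging), and the rounded rates agree with the percentages in the text.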

Table 3 Correlation between clinical depth (cT)/nodal involvement (cN)/clinical stage and final pathologic depth (pT)/final pathologic nodal metastases (pN)/final pathologic stage

Survival outcomes according to the preoperative cT, cN, and clinical stage classifications

Figure 1a shows survival curves for the cT classifications. The 5-year survival rates were 94.7 % (cT1a), 91.7 % (cT1b), 84.1 % (cT2), 66.0 % (cT3), 42.9 % (cT4a), and 18.1 % (cT4b). Significant differences were noted between all cT groups (p < 0.001) except between the cT1a and cT1b groups (p = 0.149). As shown in Fig. 1b, significant differences in survival outcomes were observed between all cN groups (p < 0.001). The 5-year survival rates according to the cN classification were 84.5 % (cN0), 62.2 % (cN1), 44.4 % (cN2), 22.8 % (cN3a), and 0.0 % (cN3b). Figure 1c shows the survival curve for each clinical stage. The 5-year survival rates were 94.3 % (clinical stage IA), 84.7 % (clinical stage IB), 71.7 % (clinical stage IIA), 56.1 % (clinical stage IIB), 55.7 % (clinical stage IIIA), 42.3 % (clinical stage IIIB), 22.8 % (clinical stage IIIC), and 9.1 % (clinical stage IV). Although no significant difference in survival outcome was observed between the clinical stage IIB and clinical stage IIIA groups (p = 0.826), significant differences were noted between all other stages (p = 0.007 between clinical stage IIIA and clinical stage IIIB only and p < 0.001 between all other groups).

Fig. 1 Overall survival curves of gastric cancer patients according to clinical factors. a Clinical tumor invasion depth (cT) categories, b nodal involvement (cN) categories, and c clinical stages

The cN and cT C-indexes both exceeded 0.7. The clinical stage C-index exceeded 0.8, being 0.825 (95 % confidence interval 0.819–0.831; Table 4).

Table 4 Concordance index (Harrell’s C-index) of the clinical stage according to the seventh edition of the TNM classification

Survival outcomes according to the postoperative pathologic stage classifications

Survival curves for the pathologic TNM classifications were plotted for the same cohort as for the clinical staging analysis (Fig. 2). The 5-year survival rates for pathologic stages IA, IB, IIA, IIB, IIIA, IIIB, IIIC, and IV were 95.4, 93.2, 81.8, 77.3, 70.4, 54.3, 41.6, and 11.2 %, respectively. The C-index was as high as 0.850 (95 % confidence interval 0.839–0.852). A statistically significant difference between contiguous stages was observed in only four intervals: between pathologic stages IB and IIA (p < 0.001), pathologic stages IIIA and IIIB (p = 0.006), pathologic stages IIIB and IIIC (p = 0.016), and pathologic stages IIIC and IV (p < 0.001).

Fig. 2 Overall survival curves for gastric cancer patients according to the pathologic stage

Discussion

The present study is the first to demonstrate the significant impact of the preoperative clinical stage of gastric cancer, classified according to the seventh edition of the TNM staging system, on patient survival. The relationship between clinical stage and survival rates was shown to have a satisfactory C-index. The TNM staging system was originally designed for determining treatment strategy, including surgery, chemotherapy, and radiotherapy. The Japanese gastric cancer treatment guidelines [18], and also the National Comprehensive Cancer Network [26] and European Society for Medical Oncology [27] guidelines, suggest that therapeutic planning for gastric cancer should be determined on the basis of clinical staging. However, the feasibility and efficacy of clinical staging had not been fully evaluated for gastric cancer. Only a limited number of studies evaluating the survival outcomes by clinical staging have been published, although physicians have generally acknowledged the importance of the system [28]. Regarding the current version of the staging system, no reports have previously been published.

Ahn et al. [3] emphasized that the seventh TNM staging system provides a more detailed classification of prognosis than the sixth edition by evaluating the postoperative final pathologic stage. In the present study, we successfully demonstrated the prognostic significance of the seventh edition of the system in the preoperative clinical staging process as well. The most important change in cT on moving to the seventh edition was that cT2b was upgraded to cT3. Previously cT2 had been subclassified as cT2a and cT2b, although the stage was the same. Previous reports indicated significant differences in survival outcomes between pT2a and pT2b [29, 30]. In the present study, the survival outcome with cT3 was significantly poorer than that with cT2, and the 5-year survival curves clearly differed according to the cT classification. The 5-year survival curves obtained according to cN were also clearly separated. It is a somewhat complicated procedure to count metastatic LNs from a computed tomography scan. However, a significant difference in the survival outcome was clearly observed between cN3a (7–15 metastatic LNs) and cN3b (more than 15 metastatic LNs). These results suggest that the counting of metastatic LNs is useful for predicting a patient’s survival, even if there are seven or more.

Pathologic staging systems are considered to be robust because the survival curves are clearly separated and the hazard rates for each stage are uniform. In the present study, we used a C-index calculated on the basis of time-dependent ROC curve analysis to quantify the ability to predict survival time in absolute values. There have been reports on scientific, objective C-index-based evaluations of the staging system in terms of the predictive ability for survival outcomes in colorectal cancer and endocrine cell cancer [15, 16]; however, no similar reports have been published related to gastric cancer. In the present study, the C-index exceeded 0.7, which is the threshold for moderate predictive ability, even when based on either cT or cN alone. Surprisingly, the C-index for the clinical stage (0.825) exceeded the threshold for excellent predictive ability (0.8). However, there was no significant difference in survival between clinical stages IIB and IIIA. A possible reason for this is as follows. Most patients with type 4 disease received a diagnosis of cT4aN0 cancer. A substantial proportion of these patients did not receive R0 resection because of peritoneal metastasis found during the operation. The R0 resection rate was poorer in clinical stage IIB cancer patients (68.4 %) than in clinical stage IIIA cancer patients (73.1 %). The difference in the R0 resection rate may have influenced the survival outcome in these patients.

We also investigated survival outcomes according to the pathologic stage for the same cohorts. Although the survival curves seem to be very close together, the C-index was higher than for the clinical stage, indicating that the pathologic stage is a robust indicator of prognosis. However, within a single pathologic stage, survival still differed according to the clinical stage. Figure S1a shows survival according to the clinical stage for pathologic stage II cancer patients. Survival was significantly worse for the clinical stage II/III cancer patients than for the clinical stage I cancer patients (p = 0.021). Similarly, survival for pathologic stage III cancer patients was significantly poorer for clinical stage II/III than for clinical stage I (p = 0.014) (Fig. S1b). These results also support the prognostic significance of the clinical stage independently of the pathologic stage for gastric cancer.

We analyzed the concordance rates between clinical and pathologic stages for T and N. The concordance rate for depth (T) was 67.2 %. In studies based on MDCT, the concordance between cT and pT was reported to be 76–89 % [11, 31–34]; however, the number of participants in these studies was small, ranging from 50 to 108, making the conclusions unreliable. Furthermore, the reports differed in the classification methods and the number of categories for cancer depth, making a generalized comparison impossible. Only a limited number of Japanese publications (with English abstracts) have reported diagnostic criteria according to T category (T1–T4) [20, 21]. Two articles have also reported diagnostic criteria for T category by means of MDCT [31, 33]. However, in most of the T1 and T2 tumors in our series, we failed to detect a tumor with MDCT. It is unlikely that MDCT criteria will be adopted for clinical T staging in gastric cancer.

The cN–pN concordance rate in our data was 63.5 %, and it has been reported as 59–87 % in previous studies [11, 31–34]. These results imply a limitation to the diagnosis of LN metastasis. Noda et al. [35] reported that if LNs of size 5 mm or smaller were disregarded after the removal and processing of resected specimens following surgery for gastric cancer, approximately 40 % of metastasis-positive LNs were missed, resulting in possible stage migration. There is also a major problem regarding diagnostic criteria for cN category. D’Elia et al. [31] and Hur et al. [33] adopted the criteria that regional LNs were considered involved when the short-axis diameter was more than 6 mm for perigastric LNs and more than 8 mm for extraperigastric LNs, and demonstrated predictive accuracy of 87 and 67 %, respectively. Habermann et al. [11] considered an LN as positive if the shorter diameter was more than 8 mm, and reported a concordance of 70 %. Similarly, Chen et al. [32] used the criteria that regional LNs were considered to represent local metastases if they were solitary or separate nodes 8 mm or greater in the long-axis diameter with marked enhancement, and reported an overall accuracy of 78 %. Thus, there have been no comprehensive criteria or cross-sectional studies that have established the optimal criteria for diagnosing LN metastasis. The present study is the largest-scale study to evaluate the predictive accuracy of MDCT for LN metastasis.

If validation using a large database system were possible, this would allow optimal criteria for clinical staging to be established. However, this is not currently possible because data for cT and cN have not been collected even by the Surveillance, Epidemiology, and End Results database [36] or the national gastric cancer registry of the Japanese Gastric Cancer Association. We consider it essential that a database system is established to collect clinical stage data prospectively.

In Japan, adjuvant chemotherapy using S-1 was introduced after the ACTS-GC trial in pathologic stage II cancer patients and pathologic stage III cancer patients who underwent R0 resection. We thus investigated whether survival rates of clinical stage II cancer patients and clinical stage III cancer patients changed after the introduction of adjuvant chemotherapy by grouping patients as those treated before or after January 1, 2007. Survival was significantly better for the latter than the former for clinical stage II cancer patients (p = 0.039) (Fig. S2a); however, there was no significant difference for clinical stage III cancer patients (p = 0.933) (Fig. S2b). In clinical stage III cancer, 40 % of patients were classified as having pathologic stage IV cancer, and in practice only one third of patients received adjuvant chemotherapy. This may explain the lack of an improvement in the survival rate in clinical stage III cancer patients after the introduction of adjuvant chemotherapy.

Among clinical stage III cancer patients, a substantial number received a diagnosis of pathologic stage IV cancer and underwent R1 or R2 surgery. The major reason for the stage discrepancy was positive peritoneal cytology findings or peritoneal metastasis. Song et al. [37] emphasized the usefulness of staging laparoscopy with a detection rate greater than 60 % for unsuspected peritoneal metastases. Similarly, Sarela et al. [38] reported the overall accuracy of staging laparoscopy with a sensitivity for distant metastases of 89 % and a specificity of 100 % in gastric cancer patients with cT3/cT4 and M0 disease who were treated with curative intent. Muntean et al. [39] reported the safety and effectiveness of this staging modality by performing staging laparoscopy in 98 cases. If an institution can safely perform staging laparoscopy, this would be best performed for clinical stage III cancer patients to select patients suitable for neoadjuvant chemotherapy.

We obtained good results from the investigation of the prognostic ability of the eight classifications based on the clinical staging system in the seventh edition of the TNM system. However, a clinical staging system with eight stages may seem rather complicated, and detailed subdivisions may not be required when the clinical stage is to be used for determining the actual therapeutic strategy. The five major classifications are considered sufficient, including the application of endoscopic treatment, limited surgery, routine surgery, neoadjuvant chemotherapy, and nonsurgical treatment. Given the complexity of the TNM clinical staging system, a simpler classification system should be recommended as a common guideline for use worldwide.

The present study has limitations. It was based on retrospective data collected from a single institution; a multicenter prospective study where all institutions share clear diagnostic criteria is warranted. In addition, only patients who received surgical treatment were included, and thus most of the patients who received palliative treatment were missing; survival in patients who received chemotherapy (without surgery) or palliative treatment is therefore unknown. However, patients who received chemotherapy or palliative treatment received a diagnosis of clinical stage IV cancer, and survival in those patients was considered to be similar to that for the clinical stage IV cancer patients in the present cohort. Extending the use of the clinical stage system to all gastric cancer patients will require a prospective cohort study that includes all gastric cancer patients.

Nevertheless, the present study was meaningful in that diagnostic experts prospectively classified and recorded the preoperative cT category, cN category, and clinical stage according to a uniform-criteria-based principle.

In conclusion, preoperative staging makes it possible to predict survival outcomes in gastric cancer. Clinical staging could become a complete criterion for selecting treatment strategies from limited surgery to extended surgery with neoadjuvant chemotherapy.