Introduction

While the incidence has been decreasing overall, gastric cancer (GC) remains one of the most common and lethal malignancies worldwide1. Currently, resection with appropriate lymphadenectomy (routinely D2 in Asia) remains one of the major treatment approaches assuring survival of most GC patients.2,3,4

With the advancement of minimally invasive techniques, laparoscopic surgery is gaining popularity in treating digestive cancers.5,6,7,8,9,10 Laparoscopic gastrectomy (LG) has become a common option to treat patients with early GC (EGC) especially in Asia. In the current Japanese guidelines3, distal gastrectomy by the laparoscopic approach has been recommended for cTNM stage I cancer, while many other indications for LG remain investigational. Meta-analyses11,12 have supported that for EGC located in the distal stomach, postoperative recovery is favored in patients treated with LG compared to conventional open gastrectomy (OG). Following the KLASS13,14 and JCOG0703 studies15 showing the non-inferiority in safety of LG compared to OG concerning morbidity and mortality for EGC, the Chinese Laparoscopic Gastrointestinal Surgery Study (CLASS) group led by the Nanfang Hospital (NFH) recently published the short-term results of a large multicenter randomized controlled trial (RCT)16 comparing laparoscopic versus open D2 distal gastrectomy for advanced GC (AGC), revealing that LG with D2 lymphadenectomy could be safely performed for AGC by experienced surgeons.

Limited high-quality evidence on LG-associated long-term survival is available, especially for specific subgroups. Results of pivotal phase III studies on EGC conducted in Japan (JCOG0912,17) and Korea (KLASS01,13) are awaited. As for more advanced cancer, there is currently no evidence to recommend a laparoscopic approach since RCTs investigating long-term oncologic and survival outcomes are currently ongoing (JLSSG0901,18 KLASS02,19 and CLASS01,16). For distal GC, an Italian single-center RCT20 reported that the outcomes from LG were not inferior to those from OG. Still, especially for cases with inferior patient and/or tumor characteristics, the uncertain long-term results preclude the advancement and widespread utilization of LG in GC management and the conduct of RCTs, due to lack of support from large prospective data. The application of D2 LG remains contentious, especially in the elderly and for cancers with unfavorable stage or technically challenging location. Furthermore, long-term survival-associated factors in LG-managed patients remain largely elusive since it is difficult to assemble a relevant cohort large enough for adequate analysis. Before LG can become a universally applicable surgery for most GC patients, it is important to analyze the long-term outcomes using large high-quality data, and the potential of observational studies remains untapped21.

Before and during the CLASS01 trial, data on laparoscopically or openly operated patients in the NFH have been prospectively registered since 2004. This comprehensive investigation aimed to assess the long-term outcomes of LG versus OG for GC overall and in specific subgroups according to age, performance status, tumor location, stage, and gastrectomy type, by analyzing the long-term follow-up data of a large prospective cohort. We further conducted in-depth analyses of survival-associated factors in the growing population of GC patients undergoing LG.

Methods

Patients

A prospective GC database including information on both LG and OG derived from electronic medical records has been maintained in the NFH since 2004. Data monitoring was performed by a dedicated medical recorder with approximately 10 years of relevant work experience. The recorded variables covered demographic, clinical, pathologic, surgical, and long-term follow-up data of all consecutive patients undergoing gastrectomy for GC. The resection approach (LG or OG) and reconstruction methods were standard following guidelines.2,3,4,22

Patients eligible for analysis herein had biopsy-proven gastric carcinoma with microscopic confirmation resected via D2 gastrectomy with clear margin of the primary site, and were operated on by surgeons who had performed ≥ 50 gastrectomies. Endoscopic ultrasonography (EUS) was not used as part of the preoperative staging in our study. D2 lymphadenectomy followed the Japanese GC treatment guidelines3. Exclusion criteria included previous gastrectomy, endoscopic gastric surgery, and other upper abdominal surgery (excluding cholecystectomy), non-resectional surgery, emergency surgery, simultaneous surgery for other diseases, and preoperative anticancer therapy, which is not routine in China. Patients who had another cancer within the past 5 years or had incomplete follow-up data were also excluded. This study was approved by the Ethics Committee of the NFH, and informed consent was obtained from all participants.

Cancer stage was determined or recoded according to the AJCC/UICC TNM staging system (seventh edition).23 Tumor location was categorized as the upper, middle, or lower third of the stomach, and lateral invasion included involvement of the lesser and greater curvatures and the anterior and posterior walls. Performance status was quantified using the Eastern Cooperative Oncology Group (ECOG) score. Complications were stratified according to the Clavien-Dindo classification24. Disease-specific survival (DSS) was defined as the interval between surgery and GC-related death or end of follow-up. Disease-free survival (DFS) was defined as the interval between surgery and cancer recurrence (diagnosed by clinical, radiologic, and/or endoscopic examination), death, or end of follow-up.

The follow-up schedule and indication for adjuvant treatment followed the guidelines3. Fluorouracil-based regimens with/without platinum were applied, and follow-ups were scheduled at 3-month intervals for the first 2 years, at 6-month intervals for the next 3 years, and then annually until patient death. The follow-up program consisted of physical examination, laboratory tests, endoscopy, ultrasonography, and computed tomography.

Statistics

Data analyses were performed using R 3.4.1 (https://www.r-project.org/). Age was categorized into groups of < 50, 50–59, 60–69, and ≥ 70 years. To control confounding by different indications for a specific surgical approach between arms, we performed both unmatched and matched analyses. In case-matched analysis aiming to balance high-dimensional observed covariates, propensity score matching (PSM) was applied using the MatchIt package with calipers of 0.1 of the standard deviation of the propensity score. Matching factors included year of diagnosis, sex, age, body mass index (BMI), surgical histories, comorbidities, infections, performance status, tumor location, lateral invasion, stages, size, lesion number, histology, and grade. The χ2, t, or Fisher’s exact test was used for intergroup comparisons where appropriate. Determinants for LG versus OG application were evaluated using multivariable logistic regression. DSS and DFS were computed using the Kaplan-Meier method.
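To make the matching step concrete, the following is a minimal sketch of 1:1 nearest-neighbour matching within a caliper, written in plain Python rather than R. It is not the MatchIt routine used in the study: the function name caliper_match, the toy propensity scores, and the choice of greedy matching without replacement are assumptions made for illustration only. The caliper is taken as 0.1 of the standard deviation of the propensity score, as described above.

```python
from statistics import pstdev

def caliper_match(treated, control, caliper_sd=0.1):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, control: dicts mapping a patient id to a propensity score.
    caliper_sd: caliper width as a fraction of the SD of all scores.
    Returns a list of (treated_id, control_id) pairs.
    """
    scores = list(treated.values()) + list(control.values())
    caliper = caliper_sd * pstdev(scores)
    available = dict(control)
    pairs = []
    # Process treated patients in a fixed order so the result is reproducible.
    for t_id in sorted(treated):
        if not available:
            break
        t_score = treated[t_id]
        # Closest remaining control; ties are broken by id.
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        # Accept the pair only if the score difference is within the caliper.
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical propensity scores (estimated probability of receiving LG).
lg = {"L1": 0.62, "L2": 0.70, "L3": 0.95}
og = {"O1": 0.61, "O2": 0.71, "O3": 0.30}
matched = caliper_match(lg, og)
```

In this toy example, the patient with a score of 0.95 has no control within the caliper and is left unmatched, which is how caliper matching trades sample size for covariate balance.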

Associations of LG versus OG with survival overall and in stratifications by age group, performance status, tumor location, pTNM stage, and gastrectomy type were computed and survival-associated factors were evaluated using multivariable Cox proportional hazards (PH) regression adjusting for year of diagnosis, sex, age, comorbidities, hepatitis B, performance status, tumor location, differentiation, histology, pT, pN, and pM stages, size, vascular and neural invasion, gastrectomy type, blood loss, adverse event grade, and intervals between surgery and flatus, activity, and food intake. The adjusted factors were based on multivariable backward selection. The PH and linearity assumptions for continuous variables were examined using restricted cubic splines25. Continuous variables were transformed as appropriate to satisfy the assumptions. For categorical variables, log-log survival plots were used to check the PH assumption, and all variables satisfied the assumption. Results were considered statistically significant at two-sided P < 0.05.
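For reference, the Cox PH model behind the reported hazard ratios can be written in its standard textbook form. The symbols below (baseline hazard h_0, coefficients beta_j, covariates x_j) are generic notation rather than notation from the original analysis, and the interval shown is the usual Wald interval, which is assumed here rather than stated in the text.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j),
\qquad
95\%\ \mathrm{CI}_j = \exp\!\left(\hat{\beta}_j \pm 1.96\,\mathrm{SE}(\hat{\beta}_j)\right)
```

Under this model, an HR below 1 for LG versus OG indicates a lower hazard of the event in the LG group at any given time, with the other covariates held fixed.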

Nomogram Construction and Validation

The survival and rms packages were used. Variables were selected using the backward stepwise selection method in the Cox regression model. Based on the predictive models with the identified prognostic factors, nomograms were constructed for predicting 3- and 5-year DSS and DFS. Nomogram validation consisted of discrimination and calibration. Discrimination was evaluated using Harrell’s concordance index (C-index).26 Generally, a C-index value > 0.75 is considered to represent relatively good discrimination. Calibration was performed by comparing the means of predicted survival with those of actual observed survival estimated by the Kaplan-Meier method. The nomogram was then validated.
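As an illustration of the discrimination measure, the sketch below computes Harrell's C-index directly from its pairwise definition in plain Python. It is not the rms implementation used in the study; the function name harrell_c_index and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
def harrell_c_index(times, events, risk):
    """Harrell's concordance index for right-censored data.

    times: follow-up times; events: 1 if the event was observed, 0 if censored;
    risk: predicted risk scores (higher means worse expected survival).
    Pairs with tied follow-up times are skipped for simplicity.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0   # higher risk failed earlier
                elif risk[i] == risk[j]:
                    concordant += 0.5   # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable

# Hypothetical data: months of follow-up, event indicator, risk score.
times = [10, 20, 30, 40, 50]
events = [1, 1, 0, 1, 0]
risk = [0.9, 0.4, 0.5, 0.3, 0.1]
c = harrell_c_index(times, events, risk)
```

Here 7 of the 8 usable pairs are ordered correctly, giving a C-index of 0.875; a value of 0.5 would correspond to chance-level discrimination.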

Results

Patient and Surgery Characteristics

In total, 2119 patients (1353 receiving LG and 766 OG) operated on between January 2004 and September 2016 were prospectively registered. After excluding 167 patients in the LG group and 75 in the OG group, 1877 patients (LG, 1186; OG, 691) were analyzed (Fig. S1). Both DSS and DFS could be assessed for these patients. Before PSM (Table 1), male proportions were 67% and 68% in the LG and OG groups, respectively. Mean age was 56 years in the LG group and 55 years in the OG group, and the largest age group was 50–59 years (LG, 31%; OG, 33%). BMI was comparable between the two groups (mean, 22.0 vs. 21.5 kg/m2). Laparoscopically operated patients overall had poorer performance statuses than openly operated ones. Patients receiving LG had smaller tumors (mean size, 4.2 vs. 4.5 cm) and shallower invasion depth compared to those receiving OG. pTNM stage IA, IB, IIA, IIB, IIIA, IIIB, IIIC, and IV tumors were identified in 15%, 5%, 6%, 13%, 11%, 11%, 21%, and 18% of patients in the LG group and in 9%, 6%, 4%, 16%, 10%, 16%, 17%, and 22% of patients in the OG group. No significant differences in tumor location, differentiation, histology, lesion number, or metastatic lymph nodes were observed between the two groups. Median follow-up was 61 and 64 months for LG and OG, respectively. Balance of variables between groups markedly improved after PSM.

Table 1 Patient and tumor characteristics

Surgical and perioperative parameters are shown in Table S1. Distal, total, and proximal gastrectomies were performed in 54%, 32%, and 3% of patients in the LG group, and in 62%, 21%, and 13% in the OG group. Multivisceral resection was performed in 8% and 10% of patients in the laparoscopic and open groups, respectively. Clavien-Dindo Grade I, II, and III+ morbidities occurred in 14%, 6%, and 3% of patients receiving LG and in 9%, 2%, and 5% of patients undergoing OG. Perioperative death occurred in 3 (0.3%) and 0 (0.0%) of patients receiving LG and OG, respectively.

Overall, larger tumors were less often laparoscopically managed (odds ratio = 0.99, 95% confidence interval (CI) = 0.99–0.99), while the choice between laparoscopic and open approaches was not significantly associated with the other clinicopathologic factors (Table S2).

Long-Term Survival of Patients Undergoing LG Versus OG

Before PSM, overall, both DSS and DFS were comparable between the LG and OG groups after multivariable adjustment (Table 2). In stratification analyses according to age group, performance status, tumor location, pTNM stage, and gastrectomy type, while survival was similar between arms in most subgroups, interestingly, LG was associated with better DSS among patients ≥ 70 years (hazard ratio (HR) = 0.41, 95% CI = 0.22–0.78), those with upper GC (HR = 0.50, 95% CI = 0.30–0.84), and those with metastatic disease (HR = 0.65, 95% CI = 0.45–0.94). DFS was similar between arms in all subgroups.

Table 2 Association of laparoscopic versus open gastrectomy with disease-specific and disease-free survival for gastric cancer using multivariable Cox regression

After PSM, LG remained associated with superior DSS within patients ≥ 70 years (HR = 0.33, 95% CI = 0.15–0.72) and those with proximal GC (HR = 0.51, 95% CI = 0.29–0.91); for metastatic disease, however, the association became insignificant. Notably, LG was significantly associated with better DFS for upper GC after case-matching (HR = 0.60, 95% CI = 0.37–0.99). All the other associations remained insignificant.

Unadjusted Survival of Patients Undergoing LG

Within laparoscopically operated patients (Fig. 1; Table S3), the median DSS was 75 months, and the 3- and 5-year DSS rates were 63% and 52%, respectively. The median DFS was 42 months, and the 3- and 5-year rates were 55% and 38%, respectively. Five-year DSS rates decreased dramatically with more advanced pTNM stage (I, 93%; II, 77%; III, 45%; IV, 14%). Interestingly, the rates increased with more distal tumor location (upper, 45%; middle, 50%; lower, 57%). Similar patterns were observed for 5-year DFS, which decreased with advancing stage (I, 85%; II, 69%; III, 36%), and increased with more distal location (upper, 36%; middle, 41%; lower, 49%).
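The median survival and the 3- and 5-year rates quoted above are read off Kaplan-Meier curves. The following is a minimal sketch of that calculation in plain Python, using hypothetical follow-up data rather than study data; the helper names kaplan_meier, survival_at, and median_survival are assumptions for illustration. For DSS, only GC-related deaths would count as events, with all other patients censored at their last follow-up.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) steps.

    times: follow-up in months; events: 1 for an observed event, 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= leaving
    return curve

def survival_at(curve, t):
    """Survival probability at time t (right-continuous step function)."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
    return s

def median_survival(curve):
    """First time at which estimated survival is 0.5 or below, else None."""
    for time, surv in curve:
        if surv <= 0.5:
            return time
    return None

# Hypothetical follow-up data for eight patients.
times = [6, 12, 18, 24, 36, 48, 60, 72]
events = [1, 0, 1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
rate_3y = survival_at(curve, 36)   # 3-year survival estimate
rate_5y = survival_at(curve, 60)   # 5-year survival estimate
median = median_survival(curve)
```

With these toy data the 3- and 5-year estimates are 7/12 and 7/18, and the median survival is 48 months, the first time the curve falls to 0.5 or below.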

Fig. 1

Cancer-specific and disease-free survival in overall laparoscopic and open gastrectomy (a and e), and in laparoscopic gastrectomy according to age (b and f), tumor location (c and g), and TNM stage (d and h). The left panel shows cancer-specific survival, and the right panel disease-free survival

Using multivariable regression (Table 3), older age (HR = 1.01), positive hepatitis B (HR = 1.86), poorer performance status (e.g., HR for ECOG score ≥ 2 vs. 1 = 1.53), more advanced pT (e.g., HR for pT4b vs. pT1 = 6.83), pN (e.g., HR for pN3 vs. pN0 = 2.49), and pM (HR = 2.12) stages, and vascular invasion (HR = 1.47) were associated with inferior DSS, and hepatitis B (HR = 1.54), poorer performance status (e.g., HR for ECOG score ≥ 2 vs. 1 = 1.43), more advanced pT (e.g., HR for pT4b vs. pT1 = 4.14), pN (e.g., HR for pN3 vs. pN0 = 2.30), and pM (HR = 2.14) stages, and vascular invasion (HR = 1.47) were associated with poorer DFS. The associations were mostly stronger for DSS than for DFS. Notably, mucinous adenocarcinoma was associated with both better DSS (HR = 0.62) and DFS (HR = 0.64).

Table 3 Association of disease-specific and disease-free survival with demographic and clinicopathologic factors for laparoscopically resected gastric cancer using multivariable Cox regression

Prognostic Nomogram for LG

The constructed nomogram (Fig. 2) can assign survival probability by adding up the scores identified on the points scale for each variable. The total scores projected to the bottom scales indicate the probability of 3- and 5-year survival. C-indexes were 0.814 (95% CI, 0.793–0.835) for DSS and 0.809 (95% CI, 0.788–0.830) for DFS, which were superior to the seventh edition of TNM staging (DSS 0.769, 95% CI, 0.748–0.790, P = 0.004; DFS 0.751, 95% CI, 0.732–0.770, P = 0.001). For calibration, the actual survival corresponded closely with the predicted survival. When applying the nomogram to the OG group, C-indexes markedly dropped (DSS 0.759, 95% CI, 0.735–0.783; DFS 0.744, 95% CI, 0.722–0.766).

Fig. 2

Nomograms predicting 3- and 5-year disease-specific (upper panel) and disease-free (lower panel) survival after laparoscopic D2 gastrectomy for gastric cancer and the corresponding calibration plots. The nomogram is used by adding up the points identified on the points scale for each variable. The total points projected on the bottom scales indicate the probability of 3- and 5-year survival. For calibration plots of the nomogram, the x-axis represents the nomogram-predicted survival, and the y-axis represents actual survival measured by Kaplan-Meier analysis

Discussion

Controversy persists regarding the application of LG in GC management, especially for patients with more advanced disease and more complex situations (e.g., old age and junction cancer) that are more surgically challenging, owing to the paucity of valid evidence with sufficient power supporting its long-term oncologic and survival benefits. By investigating a homogeneous population of consecutive unselected patients and covering wide subgroups by age group, performance status, tumor location, stage, and gastrectomy type, this comprehensive analysis of the large prospective NFH database showed that LG was associated with survival non-inferior to OG for GC overall and in extensive stratifications. Better LG-associated survival (especially DSS) was observed in patients ≥ 70 years, with upper GC, or with metastatic disease. The findings raise the novel hypothesis that the minimally invasive approach is potentially non-inferior to traditional open surgery regarding long-term results for patients with inferior, vulnerable, or challenging conditions who are rarely investigated in prospective or randomized studies. They potentially point to novel indications for LG that warrant randomized validation.

We found that overall both DSS and DFS were similar between the laparoscopic and open groups, and further addressed locally advanced and metastatic GCs beyond early disease. Both LG-associated DSS and DFS were non-inferior in all stage-specific subgroups. Although LG has gained popularity in the management of EGC in Asia,14,15,27,28 its role in the treatment of AGC remains debated. The feasibility, safety, and perioperative benefits of the laparoscopic technique in select patients with AGC have been demonstrated by the CLASS01 study16. The long-term outcomes of laparoscopic surgery for AGC remain to be defined, with only a limited number of small studies29,30,31,32,33 reporting relevant experience in highly selected patients. A large retrospective Korean study27 showed that LG was non-inferior to OG for surgically manageable GC concerning long-term overall survival in different stage groups, even in patients with AGC, which is consistent with our findings. Notably, in the Korean study27 89% of patients underwent distal gastrectomy in the laparoscopic group. Only 16% of these patients had AGC, and only 56% of them underwent ≥ D2 lymphadenectomy. In our study, percentages of pT1 disease and distal gastrectomy in the LG group were 14% and 54%, respectively, allowing outcomes in the other subgroups to be addressed with greater power.

While metastatic GC is often regarded as a contraindication to resection34, many metastatic diseases are detected only during or after operation35. There are also some technically resectable metastatic diseases.3 Patients with metastatic disease are generally more vulnerable, with more disturbed immune and inflammatory statuses. Notably, distant metastasis is significantly associated with worse survival. While metastatic cancer patients undergoing LG had 5-year DSS of 14% in our center, this does not justify resection in patients with metastatic disease. The appropriateness of resection of metastatic disease can only be determined by prospective randomized controlled trials.

Although research on LG in elderly patients has recently become more active,36,37 such patients are often neglected in prospective or randomized studies and are generally considered at higher risk when facing major surgery because of decreased functional reserve and more comorbidities. The application of LG for elderly GC patients has been investigated recently, but the investigations were mostly retrospective assessments of a particular procedure with recognized selection bias.38,39,40 It has been suggested that LG could be safely performed for older patients.36,37 We herein further showed that for patients ≥ 70 years, LG was associated with better DSS compared to OG, while DFS was comparable between arms. However, it should be noted that sample size in this subgroup was relatively small. Systemic stress and inflammatory response caused by OG contribute to overall higher risks in the elderly14. Laparoscopic surgery has been associated with attenuated stress responses and improved preservation of immune function.41 Elderly patients might benefit from the lesser trauma accompanying laparoscopic surgery. An experimental study42 reported that less surgical trauma associated with the use of laparoscopic techniques reduces tumor recurrence. However, the potential impact of carbon dioxide pneumoperitoneum on circulatory and respiratory dynamics warrants further investigation.

Interestingly, we observed that for the management of upper GC, LG was associated with both better DSS and DFS after case-matching. Proximal GC might be more biologically aggressive43. Surgical management of junction or cardiac cancers could be more challenging if adequate proximal margin and lymphadenectomy were to be assured. Total gastrectomy is preferred for upper GC treatment; however, no prospective trial has been reported regarding the corresponding laparoscopic approach. There remain some technical issues for anastomosis after laparoscopic total gastrectomy.13,44 Still, the relatively small subgroup size could increase the possibility of chance findings.

To our knowledge, our nomograms were the first LG-specific ones developed based on a large prospective cohort of patients undergoing D2 gastrectomy. Tumor size was excluded herein after backward selection, which is consistent with a previous nomogram for D2 open gastrectomy45. Using the nomograms with significant clinicopathologic variables including age, hepatitis B infection, performance status, gastrectomy type, tumor histology, and vascular invasion, individual patient survival could be significantly more precisely and accurately predicted compared to the TNM classification. We believe that the LG-specific nomograms can be useful in this era when D2 gastrectomy is more widely accepted worldwide. Notably, the nomograms were not applicable for openly operated patients, highlighting the importance of developing such LG-specific nomograms in this laparoscopic era. For generalized use of the nomograms by other institutions or in other regions, it is important to externally validate the nomogram and to minimize the effect of differences in surgical strategy. While hepatitis B was included in the nomogram as a prognostic factor, this does not impact the generalizability of the nomogram since in areas with low HBV infection rates it would only be needed to select the “No” category for this variable.

Our study was mainly limited by its observational, retrospective nature, which was associated with various biases (e.g., those caused by the fact that larger tumors were less often laparoscopically managed), and was potentially confounded by factors not registered (e.g., patient preference and socioeconomic status). The results of this study might not be applicable to GC patients treated in Western countries due to the differences in patient population, surgical expertise, and non-surgical treatment. It is an institutional study from a high-volume specialized national tertiary hospital, and included patients might be selected, potentially limiting the generalizability of the findings. Importantly, a mean BMI of 22 kg/m2 is rarely seen in Western series. EUS was not used as part of the preoperative staging in our study, which further limits the comparison of our findings with the results of Western series. The differences in patient population and treatment principles limit the use of the nomogram in Western populations. While potentially more accurate staging and meticulous histopathologic evaluation of surgical specimens and improved surgical skills and techniques over time might introduce bias to the analyses, diagnosis time was matched and adjusted for to account for the temporal shifts. Statistical significance in some subgroups might be limited by the relatively small number of patients. Observed differences could be attributed to selection bias. To overcome the potential asymmetric distribution of patient and/or tumor characteristics and resection extent between groups caused by different indications for selection, we performed PSM46 adjusting for important confounding factors. We extensively identified and included preoperative factors related to the selection of treatment approaches. After careful data cleaning, patients receiving OG were matched with those undergoing LG at a 1:1 ratio. Notably, even after PSM, some variables remained significantly different between groups (e.g., hepatitis and ECOG status), although the comparability markedly increased. We further applied extensive stratifications and multivariable adjustment. Moreover, genetic or molecular information was not available.

Despite the limitations, this work is a well-designed comprehensive investigation focusing on long-term outcomes with adequate power to offer novel hypotheses regarding application of LG with D2 lymphadenectomy in various subgroups, most of which have not been adequately addressed previously. It is the first to show the comparable long-term results for laparoscopic versus open procedures using large prospective cohorts including nearly 2000 patients over a 12-year period, and to suggest non-inferior survival associated with LG in patients with inferior characteristics, providing solid background data for potential future RCTs. All data were collected uniformly, and cases were matched for ~ 20 factors potentially associated with surgical procedure selection, to offer reliable conclusions to the extent possible by an observational study.

Surprisingly, in subgroups with significant associations, DSS appeared to be more profoundly impacted by surgical approaches than DFS. This might suggest that the laparoscopic and open surgeries mostly performed equally in oncologic clearance, but that the smaller LG-associated trauma might contribute to disease-specific survival benefits especially in vulnerable populations. Further investigations into reasons for the differences observed between the associations of treatment with DSS and with DFS are warranted. Although indication for LG in GC is currently limited to patients with early-stage diseases of the distal stomach, we envision that our results might serve as preliminary evidence for RCTs validating the efficacy of LG for patients with AGC and those with other unfavorable conditions.

In conclusion, for GC management LG was associated with long-term outcomes non-inferior to those of OG overall and in various subgroups by age group, performance status, tumor stage and location, and gastrectomy type using the PSM method, in a large prospective cohort from a high-volume specialized Eastern GC center. There could still be biases even after PSM due to confounders not accounted for in this observational study. The findings should be validated by well-designed RCTs. Nomograms predicting long-term DSS and DFS after laparoscopic D2 gastrectomy for GC were then developed and internally validated. For the generalized use of the nomograms, validation by external cohorts is encouraged.