Introduction

Although multidisciplinary treatment with radiation and/or chemotherapy has been developed for esophageal cancer, surgery remains the primary modality for treating localized cancer [1,2,3]. Compared with procedures performed for other gastrointestinal cancers, the postoperative complications after esophagectomy can be severe or fatal and may compromise postoperative quality of life (QOL) [4, 5]. Postoperative complications and decreased QOL are both associated with worse prognoses [6, 7]. Although several predictors for postoperative complications have been reported, identifying reliable and useful predictors in clinical practice remains challenging [8,9,10].

The Japanese National Clinical Database (NCD) is a nationwide web-based registration database that was created in 2010 and began registering surgical and other medical data in January 2011 [11]. The NCD has registered more than 95% of all surgical cases from over 5000 institutions in Japan, and its data reflect real-world clinical practice [11]. Takeuchi et al. [12] developed a risk model for esophagectomy using the data of 5354 patients registered in the NCD in 2011. The aim of the present study was to develop the best possible model for predicting mortality using preoperative data registered in the NCD between 2012 and 2017, to help reduce mortality in the future.

Methods

Data collection

A total of 32,779 cases registered in the NCD between January 2012 and December 2017 were included in this analysis. All were esophagectomies performed via a thoracic approach for malignant esophageal tumors, with reconstruction using other organs. Patients who underwent surgery without thoracic manipulation or with two-stage reconstruction were excluded. All variables and definitions for the NCD were accessible to participating institutions on the NCD website (http://www.ncd.or.jp).

Endpoints

The primary outcome in this study was the 30-day mortality rate after esophagectomy. The secondary outcomes were the rates of operative mortality and postoperative complications such as pneumonia, anastomotic leakage, recurrent laryngeal nerve palsy, atelectasis, chylothorax, postoperative blood transfusion, postoperative artificial respiration for over 48 h, unplanned intubation, and unplanned reoperation within 30 days. We defined 30-day mortality as death within 30 days of surgery and operative mortality as death within the period of hospitalization or 90-day mortality.

Risk factors

The risk factors considered as covariates were as follows: age, sex, history of smoking within 1 year before the surgery, preoperative requirement of assistance in activities of daily living (ADL) 30 days before surgery, weight loss of more than 10% within 6 months before surgery, transfers to the emergency room, radiotherapy within 90 days before surgery, diabetes mellitus, excessive alcohol consumption, respiratory distress within 30 days before surgery, chronic obstructive pulmonary disease (COPD), preoperative pneumonia on the operative day (pneumonia diagnosed using chest X-ray and/or computed tomography, or positive sputum bacterial culture), esophageal varices within 6 months before surgery, hypertension within 30 days before surgery, congestive heart failure within 30 days before surgery, angina pectoris within 30 days before surgery, previous cardiac surgery other than for pacemaker insertion, symptoms of peripheral vascular disease (PVD), previous surgery to treat symptoms caused by PVD, history of percutaneous coronary intervention, preoperative dialysis within 14 days before surgery, advanced cancer with multiple metastases, chronic steroid use, a bleeding disorder just before surgery, blood transfusion within 72 h before surgery, white blood cell counts, hemoglobin values, hematocrit values, platelet count, albumin, total bilirubin, aspartate aminotransferase, alanine aminotransferase, alkaline phosphatase, urea nitrogen, creatinine, serum sodium, C-reactive protein, prothrombin time, activated partial thromboplastin time (APTT), prothrombin time–international normalized ratio, American Society of Anesthesiologists physical status classification (ASA-PS), and clinical T factor of primary tumor and N factor of regional lymph nodes according to the classification of the Union for International Cancer Control, 7th edition.

Statistical analysis

Risk models for the operative morbidities were developed and validated using NCD data registered between 2012 and 2017. While constructing the risk models, the extracted data (32,779 records) were divided randomly into training data (29,501 records; 90%) and test data (3278 records; 10%) by stratified sampling using the dependent variable of each risk model. A frequency distribution table and cross tables of the outcomes and the candidate risk factors were compiled. Categorical variables were expressed as numbers and percentages, and comparisons were performed using the chi-squared or Fisher’s exact test. Continuous variables are presented as medians with ranges or means with standard deviations.
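The stratified 90/10 split described above can be illustrated with a minimal sketch. It is not the authors' code (the analysis was run in R); the function name, the seed, and the synthetic outcome vector are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def stratified_split(outcomes, test_fraction=0.10, seed=0):
    """Split record indices into train/test, sampling test_fraction within each outcome class.

    Hypothetical helper for illustration; not taken from the study.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, outcome in enumerate(outcomes):
        by_class[outcome].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic outcome vector (assumption): 1% events, similar in scale to a rare outcome.
outcomes = [1] * 100 + [0] * 9900
train_idx, test_idx = stratified_split(outcomes)
test_event_rate = sum(outcomes[i] for i in test_idx) / len(test_idx)
train_event_rate = sum(outcomes[i] for i in train_idx) / len(train_idx)
```

Because sampling is done within each outcome class, the event rate is the same in both subsets up to rounding, which matters when the outcome is as rare as 30-day mortality.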

For the training data set, multivariable stepwise logistic regression analysis was performed with the positive or negative occurrence of each endpoint as the dependent variable, and the candidate explanatory variables (risk factors) in the covariate section as the explanatory variables. A forward–backward stepwise selection method based on Akaike’s Information Criterion (AIC) was used to develop each risk model, and for each outcome the model that minimized the AIC was chosen. The stepAIC function in the MASS package of R was used for this procedure. The estimated odds ratio (OR) and its 95% confidence interval (CI) for each risk factor were calculated. The discriminative performance of each model was evaluated by the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC). The C-index, which is equivalent to the AUC in logistic regression, is a measure of model discrimination, with a value closer to 1 indicating better discrimination. The calibration was evaluated by a calibration plot. To evaluate the generalization performance of the risk models, the discriminative performance and the calibration were evaluated with not only the training data set, but also the test data set.
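To make the C-index concrete, the following minimal sketch computes it as the proportion of (event, non-event) pairs in which the patient with the event received the higher predicted risk, with ties counted as one half. This pairwise definition is equivalent to the AUC for a binary outcome. The function name and the toy data are assumptions for illustration; the study itself used R.

```python
def c_index(y_true, y_score):
    """Concordance index for a binary outcome (equal to the area under the ROC curve).

    Compares every event with every non-event, so the cost is quadratic in sample size;
    this is adequate for a small illustration but not for a large registry.
    """
    event_scores = [s for y, s in zip(y_true, y_score) if y == 1]
    non_event_scores = [s for y, s in zip(y_true, y_score) if y == 0]
    if not event_scores or not non_event_scores:
        raise ValueError("both events and non-events are required")

    concordant = 0.0
    for event_score in event_scores:
        for non_event_score in non_event_scores:
            if event_score > non_event_score:
                concordant += 1.0
            elif event_score == non_event_score:
                concordant += 0.5
    return concordant / (len(event_scores) * len(non_event_scores))


# Toy data (assumption): three events and three non-events with predicted risks.
y_true = [0, 0, 1, 0, 1, 1]
y_score = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]
auc = c_index(y_true, y_score)  # 6 concordant pairs out of 9
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of events from non-events.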

All tests were two-tailed, and the significance level was set at 0.05. R version 3.6 or later (R Foundation, Vienna, Austria, http://www.r-project.org/) was used for the statistical analyses.

Results

Risk profile of the study population

Table 1 and Supplementary Table 1 show the characteristics of the study population and the associated risk profile. In total, 82.2% of the NCD-registered esophagectomy patient population was older than 60 years, and 83.2% were male. Among the total patients, 38.6% had smoked within 1 year before the surgery, and 64.9% had a history of habitual alcohol consumption. Furthermore, 1.4% of patients required assistance with ADL before surgery, 7.8% had lost more than 10% of their body weight within 6 months before surgery, and 7.9% had ASA-PS scores higher than class 3. Finally, preoperative comorbidities included diabetes mellitus in 14%, respiratory distress within 30 days in 1.3%, COPD in 7.7%, and hypertension in 33.1%. In terms of clinical stage of cancer, 61.6% of patients had a greater than clinical T2 primary tumor and 54.6% had lymph node metastasis.

Table 1 Patients’ clinical characteristics

Morbidity

The 30-day mortality rate after esophagectomy was 1.0% for the study population and the operative mortality rate was 2.3% (Supplementary Table 2). Postoperative complications included pneumonia (13.8%), anastomotic leakage (13.2%), recurrent laryngeal nerve palsy (11.1%), atelectasis (4.9%), and chylothorax (2.5%). Furthermore, postoperative blood transfusion and postoperative artificial respiration over 48 h were required in 8.1 and 7.8% of patients, respectively, whereas unplanned intubation and unplanned reoperation within 30 days postoperatively were performed for 6.2 and 6.0%, respectively.

Model results

In this study, 30-day mortality, operative mortality, pneumonia, anastomotic leakage, recurrent laryngeal nerve palsy, postoperative blood transfusion, postoperative artificial respiration over 48 h, unplanned intubation, unplanned reoperation within 30 days postoperatively, atelectasis, and chylothorax were selected as outcomes in the risk model. Supplementary Table 3 shows the occurrence rates of the predictive risk factor candidates for each outcome included in the training data.

Risk factors for each outcome were selected using the AIC, a measure of the expected generalization performance of a predictive model (Table 2 and Supplementary Tables 4–6). Variable selection was done separately for each outcome, so the risk factors chosen for each risk model may differ. In Table 2, variables with an OR of “-” (smoking within 1 year and APTT in operative mortality) were not selected as risk factors for that outcome. Multivariable logistic regression analysis revealed that old age, especially 75–80 years of age, was a common risk factor for 30-day mortality, operative mortality, pneumonia, postoperative artificial respiration, and unplanned intubation (Table 2 and Supplementary Tables 4–6). In terms of 30-day mortality, 75–80 years of age was the strongest risk factor (OR: 5.68; 95% CI: 3.31–10.4), followed by advanced cancer with multiple metastases (OR: 5.38; 95% CI: 2.72–9.70) (Table 2). The 30-day mortality rate of patients with advanced cancer with multiple metastases was 5.5% (12/218), which was higher than that of other patients (0.9%; 302/32561). The number of patients with 30-day mortality was 314, including 3.8% (12/314) who had advanced cancer with multiple metastases and 87.6% (275/314) with postoperative complications. The logistic regression models with ORs for respiratory complications including pneumonia, postoperative artificial respiration, and unplanned intubation are summarized in Supplementary Tables 4–6.
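As a worked check of the counts quoted above, the sketch below recomputes the two 30-day mortality rates and the crude (unadjusted) odds ratio for advanced cancer with multiple metastases. The crude value is derived here from the reported counts only and is not a result stated by the authors; the function name is an assumption for illustration.

```python
def crude_odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Unadjusted odds ratio from a 2x2 table given event counts and group sizes."""
    odds_exposed = events_exposed / (n_exposed - events_exposed)
    odds_unexposed = events_unexposed / (n_unexposed - events_unexposed)
    return odds_exposed / odds_unexposed


# Counts reported in the text: 12/218 with multiple metastases, 302/32561 without.
rate_with_metastases = 12 / 218
rate_without_metastases = 302 / 32561
crude_or = crude_odds_ratio(12, 218, 302, 32561)

print(f"{rate_with_metastases:.1%} vs. {rate_without_metastases:.1%}; crude OR = {crude_or:.2f}")
```

The crude odds ratio comes out at about 6.2, somewhat higher than the adjusted OR of 5.38 from the multivariable model, which is the expected direction when part of the crude association is explained by other covariates in the model.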

The prediction equation is expressed by the following formula, \(\hat{p} = \frac{1}{{1 + \exp \left( { - \theta^{T} x} \right)}}\) where \(\hat{p}\) is an estimate of the predicted probability of an event occurring, \(\theta\) is the vector of regression coefficients, and \(x\) is the vector of explanatory variables chosen for the predictive model. The vector \(x\) includes the bias term (intercept).

The explanatory variables chosen for the predictive model are shown in the table for each outcome. The odds ratios for the explanatory variables \({x}_{i}\) shown in the table were obtained by \({\mathrm{OR}}_{i}=\mathrm{exp}({\theta }_{i})\). Categorical variables with multiple levels were entered into the prediction equation as binary variables (dummy variables) with the number of levels minus one. Variable selection was done as a set of the original categorical variables, not separately for those dummy variables.
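The relationship between the odds ratios in the table and a predicted probability can be sketched as follows, using the standard logistic function with coefficients recovered as the logarithm of each OR. The two ORs are those quoted in the text for 30-day mortality; the intercept, the variable names, and the restriction to two covariates are assumptions for illustration only, since the fitted intercept and the full coefficient set are not given in the text.

```python
import math


def predicted_probability(intercept, odds_ratios, x):
    """Logistic prediction p = 1 / (1 + exp(-(theta_0 + sum_i theta_i * x_i))),
    where theta_i = log(OR_i) and x_i is a 0/1 dummy variable."""
    linear_predictor = intercept + sum(
        math.log(odds_ratios[name]) * value for name, value in x.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# ORs quoted in the text for the 30-day mortality model (Table 2).
odds_ratios = {"age_75_80": 5.68, "advanced_cancer_multiple_metastases": 5.38}

# HYPOTHETICAL intercept: corresponds to a 0.5% baseline risk, chosen only for illustration.
intercept = math.log(0.005 / 0.995)

reference = predicted_probability(
    intercept, odds_ratios, {"age_75_80": 0, "advanced_cancer_multiple_metastases": 0}
)
aged_75_80 = predicted_probability(
    intercept, odds_ratios, {"age_75_80": 1, "advanced_cancer_multiple_metastases": 0}
)
```

Under this assumed baseline, switching the age dummy from 0 to 1 multiplies the odds (not the probability) by 5.68. A categorical variable with k levels would contribute k - 1 such dummy variables, with the reference level coded as all zeros.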

Model performance

To evaluate the performance of the risk models, both the C-index and the model calibration across the risk groups were evaluated using the training data. The C-index, which is a measure of model discrimination, was represented by the AUC of the ROC curve. In this study, risk models with sufficient predictive performance could not be produced for anastomotic leakage, recurrent laryngeal nerve palsy, unplanned reoperation within 30 days postoperatively, atelectasis, or chylothorax. Therefore, we limited our analysis to mortality and respiratory complications including pneumonia, postoperative artificial respiration for over 48 h, and unplanned intubation. The C-indices were 0.738 for 30-day mortality (95% CI: 0.707–0.768), 0.744 for operative mortality (95% CI: 0.725–0.763), 0.647 for pneumonia (95% CI: 0.638–0.656), 0.658 for postoperative artificial respiration (95% CI: 0.647–0.670), and 0.654 for unplanned intubation (95% CI: 0.641–0.667) (Supplementary Fig. 1).

To further validate the reliability of the risk models, the ROC curves were calculated using the test data, and the C-index was evaluated. The C-indices were 0.694 for 30-day mortality (95% CI: 0.603–0.784), 0.712 for operative mortality (95% CI: 0.650–0.774), 0.610 for pneumonia (95% CI: 0.582–0.637), 0.594 for postoperative artificial respiration (95% CI: 0.556–0.632), and 0.609 for unplanned intubation (95% CI: 0.568–0.650) (Fig. 1). The calibration data used for training and testing the risk model are presented in Supplementary Figs. 2 and 3, respectively.
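A calibration plot compares predicted risk with observed event rates across risk groups. The sketch below shows one common way to build the underlying table, by sorting patients on predicted risk and cutting them into groups of nearly equal size. The grouping scheme, the function name, and the toy data are assumptions for illustration; the text does not specify how the authors grouped patients.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Return one (mean predicted risk, observed event rate, group size) row per risk group.

    Patients are sorted by predicted risk and split into n_bins groups of near-equal size.
    """
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        group = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            continue
        mean_predicted = sum(p for p, _ in group) / len(group)
        observed_rate = sum(y for _, y in group) / len(group)
        rows.append((mean_predicted, observed_rate, len(group)))
    return rows


# Toy data (assumption): the high-risk group is predicted at 50% but only 30% have events.
y_prob = [0.1] * 10 + [0.5] * 10
y_true = [1] + [0] * 9 + [1, 1, 1] + [0] * 7
rows = calibration_table(y_true, y_prob, n_bins=2)
```

In a well-calibrated model the two columns agree in every group. A group whose observed rate falls below its mean predicted risk, as in the second row of this toy example, indicates overestimation of risk in that range.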

Table 2 Risk model for mortalities of patients undergoing esophagectomy

Fig. 1 Receiver operating characteristic curves of mortality and complications in the test data set. A 30-day mortality, B operative mortality, C pneumonia, D postoperative artificial respiration, E unplanned intubation

Discussion

In this study, our analysis focused primarily on patients who underwent esophagectomy via a thoracic approach for esophageal cancer, which is a major procedure in Japan. The results of this report are based on real-world data rather than on selected patients, as in a randomized controlled study, and thus reflect the current status of esophagectomy in Japan. Most randomized controlled studies apply eligibility criteria that exclude patients who are elderly or at high surgical risk. A risk model built from unselected registry data is therefore more useful for clinical practice. The risk calculator currently used to predict mortality and morbidity after esophagectomy is based on analyses of data from 2011 [12]. However, according to NCD data, the complication rate for esophagectomy in Japan has increased gradually, from 17.9% in 2011 to 22% in 2017 and 2018 [11]. These results suggest that the risk calculator based on the previous risk model may not accurately predict the risk of current cases. Therefore, the development of a new and improved risk model will be meaningful and beneficial for future clinical use.

In the present study, the 30-day mortality and operative mortality rates were lower than those in our previous report (1.0% vs. 1.2%, and 2.3% vs. 3.4%, respectively) [12]. Furthermore, the 30-day mortality rate (1.0%) obtained in our study is lower than those in other national databases: 3.5% in the Netherlands, 4.3% in England, and 4.7% in the United States [13,14,15]. However, with the accumulation of new data, the outcomes reported for other national databases may have improved. In fact, Kajaer et al. [16] reported that the 30-day mortality rate in Denmark had decreased from 4.5% in 2004 to 1.7% in 2013. In terms of changes over time, the number of esophagectomy cases in Japan has increased from 4916 in 2011 to 6207 in 2018, according to NCD data [11]. Although the 30-day mortality rate has not changed significantly, the 90-day mortality rate has decreased from 3.2% in 2011 to 1.9% in 2018 in Japan [11]. Conversely, the rates of complications assessed in the present study, such as pneumonia (13.8% vs. 15.4%, respectively) or anastomotic leakage (13.2% vs. 13.3%, respectively), did not show much improvement from the previous analysis [12]. This suggests that the management of complications has improved, leading to a decrease in mortality, even though the complication rates have not decreased compared with previous results.

There was a notable change between the previous report based on 2011 data and the present study in relation to the frequency of artificial respirator management over 48 h after surgery (11.4% vs. 7.8%, respectively). This may be a result of the contribution of a previous risk model to perioperative risk assessment and operative management. Another reason for this may be the increased application of minimally invasive esophagectomy (MIE) [17]. MIE was performed in 32.7% of patients in 2011 and in 52.5% of those between 2012 and 2016 [12, 17]. In some reports, MIE resulted in fewer pulmonary complications than open esophagectomy [18, 19]. Yoshida et al. reported that MIE in patients who received no preoperative treatment was significantly associated with a lower incidence of artificial respirator management over 48 h compared with patients who underwent open esophagectomy in their analysis of NCD data between 2012 and 2017 [17]. Moreover, perioperative management may have changed because of the growing popularity of the concept of enhanced recovery after surgery [20]. In addition, more patients have been extubated and mobilized early for rehabilitation in recent years [21, 22].

The 30-day mortality and operative mortality risk models showed that old age, especially 75–80 years of age, was the strongest risk factor (Table 2). Advanced age is independently associated with mortality or cardiopulmonary complications after esophagectomy [23, 24]. However, the OR for patients aged > 80 years was lower than that for patients aged 75–80 years. This suggests that there may be a selection bias, because patients aged > 80 years in good health and with a low operative risk might have been selected carefully for surgery, or a less invasive procedure might have been used for these patients. Elderly patients with multiple comorbidities present a high operative risk and do not have the associated survival benefits of younger patients [25]. Another potential reason for the lower OR of patients aged > 80 years vs. 75–80 years might be that many surgeons used the risk calculator more often for older patients than for younger patients, because the indication for esophagectomy in patients aged > 80 years should be considered carefully. If the OR for patients aged > 80 years decreased with more consistent use of the risk calculator, this would support its applicability. We hope that the new risk model-based calculator will be applied for patients aged 75–80 years in the same way, which will improve mortality rates. However, the number of esophagectomies being performed in patients aged > 75 years is increasing: NCD data show that the proportion has risen from 16.9% in 2011 to 23.3% in 2018 [11]. This might explain the increase in complications over time in the NCD data [11]. It is generally accepted that elderly patients are more prone to complications as their age increases [26]. Even for patients aged 75–80 years, preoperative risk should be assessed using a risk calculator, and operative procedures should be considered carefully to reduce invasiveness in some patients.

The association between clinical stage and mortality was not examined in the 2011 study because the TNM status was not included in the NCD data at the time [12]. The NCD has since become more comprehensive, and data on the TNM stage are now available. The present study showed that, besides old age, advanced cancer with multiple metastases is a risk factor for 30-day mortality and operative mortality. In several cases of advanced cancer with metastasis, the patient’s general preoperative condition might have deteriorated because of cachexia and poor nutritional status. In the pneumonia risk model, the OR of T3 was lower than that of T0 and T1 (OR: 0.88) (Supplementary Table 4). Several risk factors for pneumonia have been reported, including habitual smoking, decreased respiratory function, aspiration of oral bacteria, and preoperative malnutrition [27]. Preoperative rehabilitation, maintaining oral hygiene, and nutritional intervention are all beneficial for preventing respiratory complications [27]. In Japan, preoperative chemotherapy is the standard treatment for advanced stage II/III esophageal squamous cell carcinoma [3]. Prehabilitation and multidisciplinary care during preoperative chemotherapy are supplemental in preventing pneumonia and might reduce the incidence of pneumonia in patients with advanced cancer [27]. However, there are some limitations to developing a risk model based on preoperative factors alone, since postoperative pneumonia may be influenced by other factors, such as surgical technique or perioperative management. Our findings suggest that pneumonia may not be correlated with tumor progression.

The C-indices of the 30-day mortality and operative mortality models in the test data were 0.694 and 0.712, respectively (Fig. 1). These results suggest that the risk model formulated in this study for mortality may be reliable in clinical practice. Takeuchi et al. reported that the C-indices of the 30-day mortality and operative mortality models in the validation data were 0.767 (95% CI: 0.654–0.880) and 0.742 (95% CI: 0.666–0.819), respectively [12]. The results of the present study seem to have inferior prediction accuracy, probably because the mortality has improved in recent years and the prediction performance has decreased slightly in line with the reduced mortality. These results were calculated based on real-world data in Japan. A more realistic prediction may be provided based on only preoperative information. We believe that these data will be very useful for adjusting the preoperative risk for comparison among hospitals. Furthermore, the C-indices of the models for postoperative respiratory complications, such as pneumonia, postoperative artificial respiration over 48 h, and unplanned intubation, were approximately 0.6, indicating relative accuracy for a respiratory prediction model developed as a simple model using only preoperative data. Conversely, no useful predictive models for other operative complications, such as anastomotic leakage, recurrent laryngeal nerve palsy, or chylothorax, could be developed. Hence, precise preoperative prediction of these surgical complications is still difficult, because they might be more contingent on the surgical technique and operative factors than on preoperative patient status or comorbidities. As for the calibration data, relatively good results were obtained for the training data (Supplementary Fig. 2). Moreover, while some of the test data were not well calibrated, they were in an acceptable range (Supplementary Fig. 3). Overall, predicted probabilities tended to be high, and events did not occur as frequently as predicted. In other words, this risk model could overestimate the risk.

This study has some limitations. First, the details of the surgical procedures and reconstruction methods were not analyzed. Differences in surgical approach (such as the use of thoracoscopy or open thoracotomy), field of lymph node dissection (such as two- or three-field lymphadenectomy), and abdominal approach (such as laparoscopy or laparotomy) could have affected the outcomes of esophagectomy. Second, intraoperative factors such as operative time and blood loss could also result in complications, but these factors were not analyzed, because data on these variables are unavailable before surgery. We aimed to develop a model for preoperative prediction using only preoperative factors. Third, appropriate perioperative management is important for preventing and reducing complications. Each institution has its own approach to perioperative management, which might have affected the reduction of complications. Surgeon or hospital volume has a negative association with operative mortality rates [28]. These factors were also not considered. The prediction accuracy would be improved if surgical factors and hospital factors were included in the analysis of the risk model. However, the main aim of this study was to improve upon the existing risk calculator in Japan using only factors that can be obtained preoperatively, excluding those chosen by the surgeon, such as the operative procedure. Intraoperative factors are, of course, not known preoperatively. Hospital factors were not included, because this risk model is designed to be used as an average risk calculator for all high- and low-volume hospitals in Japan to establish indications for surgery. If the hospital factors were included as risk factors, then the risk model could not be used to compare performance between hospitals.

In conclusion, we developed a new risk model for esophagectomy using data from the NCD. This model is relatively useful for predicting 30-day and operative mortalities preoperatively, obtaining informed consent, and making decisions about the optimal surgical procedures for patients with some preoperative risk factors for mortality.