Oesophageal cancer is the seventh most common malignancy and the sixth leading cause of cancer-related death globally.1 The current standard of care for resectable oesophageal cancer involves multimodal therapy, combining perioperative chemotherapy or chemoradiotherapy with surgery.2,3 While surgery is the mainstay of curative treatment, oesophagectomy is associated with high morbidity (17–74%) and mortality (3–5%), even when performed in high-volume centres.4,5,6,7

Morbidity following oesophageal resection is in part due to the substantial iatrogenic trauma of the surgery but also is related to the overall fitness of patients undergoing treatment.8 Clinicians therefore consider physical health status, particularly cardiopulmonary fitness, as a key determinant of suitability for such a major operation.9

Accurate preoperative risk stratification is important to predict and manage postoesophagectomy outcomes.10 Risk stratification plays an important role in patient selection for surgery and guiding preoperative optimisation.11 While several prognostic tools have been developed to predict adverse events after major surgery, not all of these have demonstrated good discriminative ability.12,13,14,15,16

Cardiopulmonary exercise testing (CPET) is an objective method of assessing physical fitness based on cardiopulmonary function.17 This test measures the patient’s peak oxygen uptake (\(\dot{{\text{V}}}\)O2peak) as well as their sustainable aerobic activity, defined as the physiological point at which anaerobic metabolism exceeds aerobic metabolism (AT). In theory, CPET determines the ability to meet the metabolic demands of surgery based on recording oxygen delivery and utilisation in response to a graded increase in exercise intensity.18 Previous systematic reviews have demonstrated that a low \(\dot{{\text{V}}}\)O2peak and AT value correlate strongly with increased morbidity and mortality risk across other surgical disciplines.11,19,20 However, despite a growing number of centres implementing CPET before planning an oesophageal resection, its predictive value for postoperative outcomes in this setting has not yet been validated in a systematic review.

The objective of this review is to evaluate the value of \(\dot{{\text{V}}}\)O2peak and AT, as determined by CPET, in predicting postoperative complications, unplanned intensive care unit (ICU) admissions, and 1-year survival in patients undergoing oesophagectomy.

Methods

Search Strategy

A review of the literature was conducted in October 2019 to search for articles relevant to the association between preoperative CPET variables, \(\dot{{\text{V}}}\)O2peak and AT, and postoperative outcomes following oesophagectomy. The CINAHL, Cochrane Library, EMBASE, MEDLINE, PubMed, and Scopus databases were included. The following key search terms were applied in multiple different combinations: “Cardiopulmonary exercise testing”; “CPEX”; “CPET”; “\(\dot{{\text{V}}}\)O2peak”; “Anaerobic threshold”; “Esophagectomy”; and “Oesophagectomy.” A complete search strategy for a single database defined with all keywords and subject headings is included (Supplementary data, Appendix S1). A manual search was performed on references of relevant published studies. Screening of articles and their selection was performed by two authors (JS, HS).

Eligibility Criteria

For inclusion in the review, studies were required to meet the following selection criteria: (a) involve subjects undergoing CPET before an oesophagectomy; (b) measure outcomes of interest, including cardiopulmonary complications, noncardiopulmonary complications, unplanned ICU admissions, and 1-year survival; (c) report on comparisons between preoperative CPET variables and outcomes of interest; (d) be an original paper with independent data; and (e) be published as a full-text article in a peer-reviewed journal in English. Studies were excluded according to the following criteria: (a) clinical data unavailable for the study; (b) research data duplicated in other studies; and (c) editorial letters and conference abstracts. No study design restriction was used.

Critical Appraisal

Two authors (JS, HS) independently assessed the quality of the methodology and the risk of bias for eligible studies using the Quality in Prognosis Studies (QUIPS) tool,21 with any discrepancies resolved with discussion and consensus. This scale assesses the quality of prognostic studies across six domains: study participation, attrition, prognostic factor measurement, outcome measurement, study confounding, and statistical analysis and reporting.

Data Extraction

Data extraction was conducted by the lead investigator (JS). The recorded data included study design, patient characteristics, method of CPET, type of procedure, mean values for \(\dot{{\text{V}}}\)O2peak and AT, as well as the incidence for any outcomes of interest. Outcomes of interest included cardiopulmonary complications, noncardiopulmonary complications, unplanned ICU admissions, and survival at 1 year postoperatively. Authors were contacted to obtain an original data set to improve the uniformity of the results.

Statistical Analysis

Data were analysed using an inverse-variance random-effects model. We decided a priori to use the random-effects model for meta-analyses because we assumed clinical heterogeneity across the included studies. Results were presented as standardised mean difference (SMD) with 95% confidence interval and displayed using forest plots. SMD is used as a standardisation statistic that accounts for the variable methods of reporting the same outcome across the multiple studies. It expresses the size of the effect in each study relative to the variability observed in that study. Where possible, individual data for included studies were assessed for normality using histograms and Quantile–Quantile (QQ) plots. We assessed heterogeneity between studies by calculating tau-squared (τ2) and I-squared (I2). Tau-squared indicates the variance of the true effect sizes, and the I2 statistic indicates the proportion (expressed as a percentage) of variance remaining if sampling error is removed. I2 values of 25%, 50%, and 75% could be considered low, moderate, and high, respectively. All calculations were performed using Stata 15.1 (Stata Corporation, College Station, TX).
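The pooling described above can be illustrated with a minimal Python sketch. It assumes Cohen's d as the SMD and the DerSimonian-Laird estimator for τ2; the text does not state which SMD variant or τ2 estimator was used in Stata, so both are assumptions, and the function names and study values below are hypothetical and serve only as an illustration.

```python
import math


def smd_and_variance(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d (group 1 minus group 2) and its approximate sampling variance."""
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / pooled_sd
    var = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return d, var


def random_effects_pool(effects, variances):
    """Inverse-variance random-effects pooling (DerSimonian-Laird, assumed)."""
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "smd": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "z": pooled / se,
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical studies: (mean, SD, n) with complications vs without, in mL/kg/min.
studies = [
    (18.5, 4.0, 40, 20.5, 4.2, 80),
    (19.0, 3.8, 30, 19.4, 4.1, 70),
    (17.2, 4.5, 55, 20.9, 4.4, 90),
]
pairs = [smd_and_variance(*s) for s in studies]
result = random_effects_pool([d for d, _ in pairs], [v for _, v in pairs])
print(result)
```

A negative pooled SMD in this sketch would indicate lower values in the group with complications, matching the sign convention used in the forest plots.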

Standards of Reporting

This review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.22 The review protocol was registered in the PROSPERO international prospective register of systematic reviews (CRD42019147102) on September 11, 2019.

Results

The search strategy yielded 410 studies. After the removal of 58 duplicates and the exclusion of 204 articles based on abstract screening, 148 full-text publications were reviewed (Fig. 1). A total of seven consecutive cohort publications, with a combined sample size of 912 patients undergoing oesophagectomy, were deemed suitable for the qualitative analysis (Table 1).23,24,25,26,27,28,29 One study was conducted in Japan, whereas the remainder were undertaken in the United Kingdom. Corresponding authors of three studies were able to provide individual patient data.27,28,29 Patel et al. were also able to provide an updated dataset with an additional 43 patients analysed since their original publication, which has been included in this analysis.27

Fig. 1 PRISMA flow diagram

Table 1 Study characteristics

All seven papers studied the role of CPET for preoperative risk assessment in patients undergoing oesophagectomy. One study combined this cohort with patients undergoing gastrectomy; to account for this, a sensitivity analysis was performed to assess for any discernible effect when this study was excluded.26 The CPET variables AT and \(\dot{{\text{V}}}\)O2peak were reported in all studies, whereas the ventilatory equivalent for carbon dioxide (\(\dot{{\text{V}}}\)E/\(\dot{{\text{V}}}\)CO2) was measured in only two studies and was therefore not included in this analysis. Across all seven studies, CPET was performed with a stationary cycle ergometer and involved an incremental exercise protocol until the patient’s maximum tolerated level was reached. In two studies,25,27 \(\dot{{\text{V}}}\)O2peak was determined from the highest or average value achieved in the final 30 s of the test. Another two studies defined it as the maximum \(\dot{{\text{V}}}\)O2peak value during the entire exercise program,23,26 and three studies did not specifically report how this value was derived.24,28,29 The value for AT was determined using the V-slope method in five studies,23,24,25,27,28,30 whereas two studies did not detail how this value was derived.26,29 While all other studies presented values for CPET variables in standard units of mL/kg/min, Nagamatsu et al. divided these values by body surface area in square metres (m2) to minimise any variability due to differences in physique.23 CPET was undertaken preoperatively in all studies; four of them reported performing the test at the time of diagnosis prior to starting neoadjuvant therapy,24,26,28,31 one study performed the test immediately prior to surgery and after the completion of any neoadjuvant systemic therapy,29 and two studies did not disclose the specific timing of CPET.23,25

Mean values for \(\dot{{\text{V}}}\)O2peak and AT are presented for each study outcome in Table 2. All outcomes were defined from the time of surgery until follow-up or death. The seven studies used a variety of classifications for defining complications, including the Common Terminology Criteria for Adverse Events,24,26,32 Clavien-Dindo classification,25,27,33,34 Accordion score,28,35 or otherwise a self-defined measure of adverse cardiac or respiratory events dependent on whether treatment was required.23 Where overall complications were recorded, these were further divided by organ system into cardiopulmonary or noncardiopulmonary. One study did not detail the definition or classification system for recording complications.29 Given that standard practice is for oesophagectomy patients to be managed in critical care postoperatively as Level 2 or 3 care, unplanned ICU admissions were defined as a return to ICU after discharge from critical care. The quality of included studies was assessed using the QUIPS tool (Table 3). All studies were considered to be of good quality; however, there was moderate risk of confounding bias in the majority of them.

Table 2 Study estimates for each outcome
Table 3 Risk of bias assessment: QUIPS

The studies included in the meta-analysis contain large sample sizes per group and therefore can assume normal distribution based on the Central Limit Theorem.36 For the three original datasets provided,27,28,29 histogram assessments approximating a normal curve and QQ plots demonstrate data points relatively close to the straight line, indicative of normality (Supplementary data, Appendix S2).

Cardiopulmonary Complications

The results from seven studies were pooled to examine the relationship between \(\dot{{\text{V}}}\)O2peak levels and the incidence of cardiopulmonary complications (Fig. 2).23,24,25,26,27,28,29 Mean \(\dot{{\text{V}}}\)O2peak values in those who developed cardiopulmonary complications were significantly lower than in those who did not (SMD = −0.43; 95% CI −0.77 to −0.09; test for overall effect: z = −2.476; p = 0.013), although significant heterogeneity was noted (I2 = 80.4%; τ2 = 0.19). Because the study from Moyes et al. included results from gastrectomy patients as well as oesophagectomy patients, a sensitivity analysis was conducted in which the model was refitted without this study. The results were very similar to those from the model including all studies (SMD = −0.45; 95% CI −0.84 to −0.05; test for overall effect: z = −2.206; p = 0.027; I2 = 83.6%; τ2 = 0.19). Mean AT levels from six studies were pooled to determine the summative effect of AT levels on cardiopulmonary complications (Fig. 2).23,24,25,27,28,29 The mean value for AT in those who developed cardiopulmonary complications was not significantly different from that in those who did not (SMD = −0.17; 95% CI −0.42 to 0.09; test for overall effect: z = −1.248; p = 0.212), with moderate heterogeneity detected (I2 = 62.0%; τ2 = 0.06).
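As a consistency check on the statistics reported above, the z value and two-sided p value can be approximately recovered from a pooled SMD and its 95% confidence interval. The following Python sketch assumes a symmetric normal-based interval; the function name is hypothetical, and small discrepancies from the reported values are expected because the published estimates are rounded.

```python
from statistics import NormalDist


def z_and_p_from_ci(estimate, lower, upper, level=0.95):
    """Approximate z statistic and two-sided p value from an estimate and its CI.

    Assumes the interval is estimate +/- crit * SE under a normal distribution.
    """
    crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% CI
    se = (upper - lower) / (2 * crit)              # standard error from CI width
    z = estimate / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Pooled SMD for VO2peak and cardiopulmonary complications, as reported above.
z, p = z_and_p_from_ci(-0.43, -0.77, -0.09)
print(f"z = {z:.3f}, p = {p:.3f}")
```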

Fig. 2 Forest plot: difference in \(\dot{{\text{V}}}\)O2peak and AT between the patients who did and did not experience cardiopulmonary complications. a \(\dot{{\text{V}}}\)O2peak: test for overall effect: z = −2.476; p = 0.013. b AT: test for overall effect: z = −1.248; p = 0.212

Noncardiopulmonary Complications

As demonstrated by the five studies in Fig. 3,24,25,27,28,29 \(\dot{{\text{V}}}\)O2peak in patients who developed noncardiopulmonary complications was not significantly different from that in those who did not (SMD = −0.02; 95% CI −0.18 to 0.13; test for overall effect: z = −0.303; p = 0.762; I2 = 0.0%; τ2 = 0.0). The same group of studies was used to compare AT levels in patients who did and did not develop noncardiopulmonary complications.24,25,27,28,29 AT was not significantly different between the two groups (SMD = 0.01; 95% CI −0.15 to 0.17; test for overall effect: z = 0.107; p = 0.915; I2 = 0.0%; τ2 = 0.0).

Fig. 3 Forest plot: difference in \(\dot{{\text{V}}}\)O2peak and AT between the patients who did and did not experience noncardiopulmonary complications. a \(\dot{{\text{V}}}\)O2peak: test for overall effect: z = −0.303; p = 0.762. b AT: test for overall effect: z = 0.107; p = 0.915

Unplanned Return to ICU

Figure 4 shows that in three studies \(\dot{{\text{V}}}\)O2peak was significantly lower in those who required an unplanned admission to ICU compared with those who did not (SMD = −0.34; 95% CI −0.60 to −0.08; test for overall effect: z = −2.555; p = 0.011; I2 = 0.0%; τ2 = 0.0).24,27,28 Similarly, these studies also showed that AT levels in patients who had an unplanned return to ICU were significantly lower compared with their counterparts (SMD = −0.34; 95% CI −0.61 to −0.07; test for overall effect: z = −2.543; p = 0.014; I2 = 0.0%; τ2 = 0.0).24,27,28 No heterogeneity was detected in either test for this outcome.

Fig. 4 Forest plot: difference in \(\dot{{\text{V}}}\)O2peak and AT between the patients who did and did not have an unplanned return to ICU. a \(\dot{{\text{V}}}\)O2peak: test for overall effect: z = −2.555; p = 0.011. b AT: test for overall effect: z = −2.543; p = 0.014

One-Year Survival

A survival assessment was also performed by comparing preoperative CPET variables between patients who were and were not alive at 1 year post-oesophagectomy. While other studies measured survival as an outcome, this was not reported in their final publication due to the low number of mortality events available for analysis.24,25 As a result, only three studies with individual patient data provided by authors could be assessed.27,28,29 As presented in Fig. 5, \(\dot{{\text{V}}}\)O2peak levels in patients who survived until 1 year postoperatively were significantly higher than in those who had not survived to this point (SMD = 0.31; 95% CI 0.02–0.61; test for overall effect: z = 2.003; p = 0.045), with no heterogeneity present (I2 = 0.0%; τ2 = 0.0). The association between survival and AT followed a similar trend; AT in those who had survived was significantly higher compared with those who died within 1 year postoperatively (SMD = 0.34; 95% CI 0.00–0.68; test for overall effect: z = 1.969; p = 0.049), with low heterogeneity noted (I2 = 7.4%; τ2 = 0.01).

Fig. 5 Forest plot: difference in \(\dot{{\text{V}}}\)O2peak and AT between the patients who did and did not survive until 1 year postoperatively. a \(\dot{{\text{V}}}\)O2peak: test for overall effect: z = 2.003; p = 0.045. b AT: test for overall effect: z = 1.969; p = 0.049

Discussion

Preoperative CPET is a dynamic assessment of functional capacity that measures gas exchange during exercise.37 This evaluates the integrated function of cardiac, circulatory, respiratory, and metabolic systems during physiological stress.38 Its role was originally defined in cardiothoracic surgery but is now being implemented in many centres as part of a risk stratification tool before undergoing major abdominal and thoracic surgery.

The physiological principle underpinning the association between CPET and post-oesophagectomy outcomes is that surgery places a significant burden on the patient’s cardiopulmonary reserve. Oxygen demand increases by 40–50% in the early postoperative period, requiring an increase in ventilation and cardiac output.39 Patients who are unable to compensate for this demand are thought to be at an increased risk of cardiopulmonary complications.40,41 Preoperative measurement of cardiopulmonary reserve may help to stratify how well patients tolerate the physiological insult of an oesophageal resection and reconstruction. The two key CPET variables in our study assess the cardiopulmonary reserve through recording the patient’s efficiency in delivering oxygen from the environment to cellular mitochondria (\(\dot{{\text{V}}}\)O2peak) and the ability to cope with increased peripheral oxygen requirements without entering a state of anaerobic metabolism (AT).19 While these two variables may correlate with one another, they also are distinct measures as \(\dot{{\text{V}}}\)O2peak relates to performance during maximum activity and AT relates to endurance during sustained nonmaximum activity.42

This meta-analysis demonstrates that with respect to patients undergoing oesophagectomy, preoperative CPET variables correlate with postoperative cardiopulmonary complications, unplanned critical care admissions, and survival at 1 year after surgery.

Cardiopulmonary-specific complications are common following oesophagectomy in part due to the physiological stress of surgery on the patient’s cardiopulmonary reserve. The Esophageal Complications Consensus Group (ECCG) reported the postoperative incidence of cardiac and pulmonary complications as 16.8% and 27.8%, respectively.7 The most common of these complications include new-onset arrhythmias and pneumonias, often occurring in combination with each other.7 The predictive value of CPET for cardiopulmonary complications demonstrated by this meta-analysis is consistent with findings using CPET in other surgical disciplines. A systematic review on the role of CPET in hepato-pancreatico biliary surgery demonstrated a strong correlation between a low AT value and increased incidence of complications, although this did not specifically evaluate cardiopulmonary complications.43 There was limited data available in the included studies to appropriately evaluate the effect of \(\dot{{\text{V}}}\)O2peak on postoperative morbidity. A systematic review and quantitative analysis of colorectal cancer surgery patients undergoing preoperative CPET identified low anaerobic threshold as predictive of cardiovascular and pulmonary complications.44 A low \(\dot{{\text{V}}}\)O2peak was able to predict for postoperative complications in general; however, no analysis was performed with respect to cardiopulmonary complications.

Postoperative cardiopulmonary complications are also the largest factor contributing to unplanned ICU admissions following oesophagectomy.45,46 This is likely to account for the relationship between CPET variables and unplanned ICU admissions. The findings of the present analysis also suggest that CPET variables may indicate patients’ physiological capacity for tolerating complications and therefore predict the likelihood of returning to ICU. Prior reviews in hepato-pancreatico biliary surgery and colorectal surgery have not reported on the prognostic value of CPET-derived metrics for unplanned ICU admissions. However, a prospective consecutive series study from Older et al. evaluating patients undergoing major intra-abdominal surgery validated AT as a robust predictor of whether patients may be safely managed on the ward compared with requiring a critical care bed.47 In a recent survey of centres in the United Kingdom with CPET facilities, many reported already using the results to determine allocation of the appropriate postoperative level of care across all surgical disciplines.48

The ability of \(\dot{{\text{V}}}\)O2peak and AT to prognosticate survival is also partially explained by the fact that cardiopulmonary complications account for up to 70% of mortality events after oesophagectomy procedures.49 In comparison, both \(\dot{{\text{V}}}\)O2peak and AT were found in a previous systematic review to be beneficial in predicting mortality for patients undergoing hepatic resection for cancer.50 With regard to pancreatic cancer surgery, preoperative AT exhibited no significant relationship with mortality, and \(\dot{{\text{V}}}\)O2peak was not assessed in any available studies.50 A meta-analysis assessing CPET in patients undergoing surgery for colorectal cancer was unable to determine its efficacy in predicting mortality due to the low number of events.44 Although poor physical status, as represented by low \(\dot{{\text{V}}}\)O2peak and AT, is directly linked with poor survival postoperatively,51 recognising this relationship between CPET and mortality may have reduced relevance given the low incidence of deaths.

CPET is a risk prediction tool that supplements preoperative clinical assessment. It identifies patients who may require further optimisation or prehabilitation in the form of inspiratory muscle training and endurance exercises, and it may support decision making for patients who should avoid oesophageal surgery because their operative risk exceeds the potential benefit from surgery.52,53 Several international guidelines have endorsed the implementation of CPET as a preoperative stratification tool and have provided recommendations for its appropriate use and interpretation across all surgical disciplines.54,55,56,57

Despite the predictive value of CPET, several shortcomings limit the universal application as part of routine workup before oesophagectomy. First, no single benchmark value is able to delineate between positive and negative outcomes following oesophagectomy. In this meta-analysis, this could not be defined due to the lack of individual patient data available for a pooled analysis, as well as the lack of consistent thresholds between studies in defining an abnormal CPET result. The cutoff values for defining high-risk patients from articles in the present review include an AT value of 9–11 mL/kg/min and \(\dot{{\text{V}}}\)O2peak value of 14 mL/kg/min; however, these values were selected after use in nonsurgical studies.58,59 To improve the clinical applicability of CPET to surgical patients, future research should define an optimal decision threshold for CPET variables based on a receiver operating characteristic curve analysis. While other articles across all surgical disciplines have adopted an arbitrary cutoff of 11 mL/kg/min for AT and 14 mL/kg/min for \(\dot{{\text{V}}}\)O2peak to predict favourable outcomes following surgery, there is no definitive evidence that these individual values are useful in discriminating high-risk from low-risk patients.9,24,26,60,61
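One way such a decision threshold could be derived from a receiver operating characteristic analysis is by maximising Youden's J statistic (sensitivity + specificity − 1). The Python sketch below is only an illustration of that approach: the choice of Youden's J is an assumption rather than a method used in the included studies, the function name is hypothetical, and the patient values are invented and not taken from any study in this review.

```python
def youden_threshold(values, events):
    """Return (threshold, sensitivity, specificity) maximising Youden's J.

    A patient is classified as high risk when value <= threshold, because
    lower CPET values are expected to indicate higher risk. Ties in J keep
    the lowest threshold.
    """
    n_pos = sum(1 for e in events if e)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one patient with and one without the event")
    best = None
    for t in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, events) if e and v <= t)
        tn = sum(1 for v, e in zip(values, events) if not e and v > t)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical VO2peak values (mL/kg/min) and complication flags (1 = complication).
vo2peak = [12.0, 13.5, 14.0, 15.5, 17.0, 18.5, 20.0, 22.0]
complication = [1, 1, 1, 0, 1, 0, 0, 0]
print(youden_threshold(vo2peak, complication))
```

In practice, a threshold derived this way would need validation in an independent cohort before clinical use, since a cutoff optimised on one dataset tends to overstate its own performance.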

Second, as neoadjuvant therapy frequently forms part of the treatment paradigm for oesophageal cancer, especially for patients with locally advanced disease,2,62,63 this may be a potential confounder for the clinical application of CPET. Previous studies have demonstrated that systemic treatment in this cohort diminishes preoperative cardiopulmonary reserve with a significant reduction in \(\dot{{\text{V}}}\)O2peak and AT in patients who underwent neoadjuvant chemotherapy.64,65,66 With regards to the seven studies included in this review, CPET was performed at variable timepoints in the preoperative period, either before or after neoadjuvant chemotherapy. This emphasises the importance of performing CPET at an appropriate and standardised time in the patient’s treatment pathway.

Finally, it is also important to consider that performing this test is relatively time consuming and expensive.67 This is due to the need for equipment and for trained staff who can perform the test and interpret the results. These costs should be weighed against the potential savings of reducing complications in this cohort.68 As a result of these expenses, there is commonly poor access to CPET facilities outside of high-volume centres.48

The results of this study should be interpreted in the setting of several limitations. Most of the studies were from centres in the United Kingdom, and as such, these population differences may be a potential source of selection bias. This may restrict the external validity of our findings to other centres worldwide. Four of the included studies were retrospective in design and thus subject to potential observational bias in the data collection process. All articles did provide strict guidelines for their data collection process in an attempt to minimise this bias. There was variation in the classification method for postoperative complications between studies, and this may also contribute to the risk of bias in comparing these studies. This meta-analysis yielded high heterogeneity with regard to cardiopulmonary complications, which may have been due to differences between the study cohorts in demographics, comorbidities, and operative approach. This was accommodated through the use of a random-effects model. While \(\dot{{\text{V}}}\)O2peak and AT are the emphasis of CPET research studies, CPET also involves the routine collection of other cardiorespiratory variables, namely the ventilatory equivalents for oxygen (\(\dot{{\text{V}}}\)E/\(\dot{{\text{V}}}\)O2) and carbon dioxide (\(\dot{{\text{V}}}\)E/\(\dot{{\text{V}}}\)CO2), as well as spirometry variables. A future analysis that incorporates a combination of these variables to determine a patient’s fitness for surgery may be more reliable than solely assessing \(\dot{{\text{V}}}\)O2peak and AT.

Conclusions

Preoperative CPET is a useful predictor of adverse postoperative outcomes following oesophagectomy, including cardiopulmonary complications, unplanned ICU admissions, and 1-year mortality. CPET can play a role as part of multidisciplinary preoperative assessment and optimisation before oesophagectomy. The clinical usefulness of CPET is limited by the lack of defined cutoff values for \(\dot{{\text{V}}}\)O2peak and AT to delineate between high-risk and low-risk patients.