Introduction

Patients with esophageal carcinoma have a dismal prognosis, with an overall 5-year survival rate estimated at 12 %. The majority of patients with esophageal cancer will have locally advanced disease at diagnosis. Curative surgical treatment of esophageal cancer involves the complete macroscopic and microscopic removal of the tumor (R0-resection). Even for patients with stage II or III disease who undergo surgical resection, 5-year survival rates are poor, estimated at 34 and 15 %, respectively [1]. This may be due to occult metastatic disease present at the time loco-regional therapy is attempted.

In an attempt to downstage tumors, achieve better local tumor control, and improve outcome, preoperative chemotherapy and radiotherapy have been evaluated in multiple clinical trials [2–11]. An updated meta-analysis of 24 clinical trials showed strong evidence for a survival benefit of neoadjuvant chemoradiotherapy (NACR) or chemotherapy over surgery alone in patients with esophageal carcinoma [12]. A couple of prospective studies have also shown a significant benefit for neoadjuvant therapy in esophageal and gastroesophageal junction tumors. In one of these trials, the 3-year survival was improved for patients who underwent chemoradiation before surgery, as compared to those who had surgery alone (32 vs. 6 %, p = 0.01) [10]. A recently published prospective, randomized controlled trial including 366 patients with potentially resectable esophageal cancer has shown a significantly higher rate of R0 resection and improved median overall survival (49.4 vs. 24 months) for patients who undergo preoperative chemoradiation [11].

Previous studies have demonstrated that the maximal benefit from NACR is for patients who achieve a complete pathological response; however, this occurs in only 15–30 % of cases [4, 13]. A possible explanation for the improved outcome in this group of patients is that a significant pathologic response at the primary tumor site implies treatment of occult micrometastases as well.

The management of patients with locally advanced esophageal cancer (LAEC) is evolving. A randomized controlled trial including 259 patients with LAEC who responded to induction chemoradiation therapy (defined as a decrease in tumor length of at least 30 % and improvement in dysphagia) has shown that in this patient population, especially those with the epidermoid subtype, there is no benefit from adding surgery after chemoradiation compared with continuing chemoradiation. Patients with squamous cell carcinoma of the esophagus are usually elderly, often with a clinical history of alcohol and tobacco abuse and comorbidities, limiting therapy tolerance. Therefore, given the potential morbidity and mortality associated with surgery, accurate assessment of response to neoadjuvant therapy may be beneficial in identifying a subset of patients who would most benefit from surgical intervention [14], or when considering chemoradiation or chemotherapy as the sole therapeutic modality.

Positron emission tomography (PET) has been shown to improve staging of patients with esophageal cancer. The major advantages of 18F-fluorodeoxyglucose (FDG) PET/CT are improved nodal staging and better detection of metastatic disease beyond the celiac axis [15]. As compared with CT staging, baseline staging with FDG-PET may upstage up to 20 % of patients. However, conflicting data exist on the performance of FDG-PET in non-invasively predicting response of esophageal cancer to neoadjuvant therapy [16–26]. Prior studies have evaluated the use of PET in early response assessment to chemoradiation therapy, or after completion of neoadjuvant therapy. Variable results have been published, with sensitivities and specificities ranging from 20 to 100 % and 30 to 100 %, respectively [27, 28]. Study methodology and interpretation methods have varied, and inconsistent semiquantitative parameters have been suggested to differentiate responders from nonresponders. Recently, Wahl et al. [29] proposed the PET response criteria in solid tumors (PERCIST) as a standardized method for semiquantitative assessment of metabolic response to therapy. The purpose of the current study was to correlate metabolic response to NACR on FDG-PET/CT, assessed using PERCIST-based criteria, with pathologic response, clinical response, and survival in patients with LAEC.

Patients and methods

Patient population and study design

This is a retrospective analysis of 45 patients with LAEC who underwent PET/CT before and after neoadjuvant therapy prior to planned surgical resection. Demographic data are summarized in Table 1. The definition used for LAEC was malignant disease limited to the esophagus or gastroesophageal junction and regional lymph nodes. Thus, the clinical stages included were: T1 N1 M0, T2–3 N0–1 M0, or T1–3 N0–1 M1a, according to the American Joint Committee on Cancer (AJCC, 6th edition) TNM staging system. The neoadjuvant protocol comprised induction chemotherapy with irinotecan and cisplatin, combined chemotherapy and external beam radiotherapy, and radiotherapy boost phase. Study design is summarized in Fig. 1. Approval was obtained from the institutional ethics review board and informed consent was waived.

Table 1 Demographic data
Fig. 1 Study design flow chart. Mean time interval I was 27.9 days (median 28; range 6–60). Surgery was performed in all patients except one who had progressive disease and was considered inoperable. Mean time interval II was 22.4 days (median 20.5; range 6–45)

PET/CT: acquisition and interpretation criteria

All whole-body PET scans were performed in 3D mode with a dedicated in-line PET/CT scanner (Siemens, Biograph). Patients were asked to fast for at least 6 h before undergoing the examination. Data were acquired 63 ± 10 min after an intravenous injection of approximately 5 MBq/kg body weight of FDG (up to 550 MBq). Mean uptake time on follow-up exams was within 9.7 ± 8.5 min of uptake time on baseline scan. First, a spiral CT scan from the neck to the pelvis was obtained using the following parameters: 130-kV peak; 105 mAs; scan width, 5 mm; and feed/rotation, 8.4 mm. Immediately on completion of the CT, PET scans of the same area were acquired for 3 min per bed position, with 5–7 bed positions per patient.
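The dosing and acquisition arithmetic described above can be illustrated with a minimal sketch. The function names are hypothetical and introduced here only for illustration; the sketch assumes the weight-based dose is simply capped at 550 MBq, as the text suggests, and ignores any rounding or vial-size conventions that a clinical protocol might apply.

```python
def injected_activity_mbq(weight_kg, dose_per_kg=5.0, cap_mbq=550.0):
    """Approximate FDG activity: about 5 MBq/kg, capped at 550 MBq (values from the text)."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return min(weight_kg * dose_per_kg, cap_mbq)


def pet_acquisition_minutes(bed_positions, minutes_per_bed=3.0):
    """Total PET emission time for a scan of the given number of bed positions."""
    if bed_positions <= 0:
        raise ValueError("bed_positions must be positive")
    return bed_positions * minutes_per_bed


if __name__ == "__main__":
    print(injected_activity_mbq(70))    # 70 kg patient, below the cap
    print(injected_activity_mbq(120))   # 120 kg patient, cap applies
    print(pet_acquisition_minutes(5), pet_acquisition_minutes(7))
```

With 5–7 bed positions at 3 min each, the emission acquisition would take roughly 15–21 min per patient, in addition to the CT.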

Interpretation criteria

Interpretation criteria used in this study were based on PERCIST, version 1.0 [29]. Readers were blinded to pathologic response and clinical outcome data. Tumor standardized uptake values (SUV) were normalized to lean body mass (SUL). SULpeak (hereafter SUL) was obtained from all tumors using a region of interest (ROI) measuring 1.2 cm in diameter centered on the most metabolically active portion of the lesion, and two measurements were obtained from normal liver with a 3 cm ROI. To minimize partial volume effect, a smaller ROI, not exceeding three-quarters of the tumor diameter, was used for small tumors and at follow-up when the diameter of residual tumor was estimated to be less than 1.6 cm (the size at which the standard 1.2 cm ROI equals three-quarters of the tumor diameter).
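As an illustration of the normalization and ROI rule described above, the following sketch converts a body weight-normalized SUV to SUL and picks an ROI diameter. It rests on two assumptions not stated in the text: lean body mass is estimated with the James formula (weight in kg, height in cm), and for tumors under 1.6 cm the largest permitted ROI (three-quarters of the tumor diameter) is chosen. All function names are hypothetical.

```python
def lean_body_mass_kg(weight_kg, height_cm, sex):
    """Lean body mass by the James formula (assumed here; the text does not name a formula)."""
    ratio_sq = (weight_kg / height_cm) ** 2
    if sex == "male":
        return 1.10 * weight_kg - 128.0 * ratio_sq
    if sex == "female":
        return 1.07 * weight_kg - 148.0 * ratio_sq
    raise ValueError("sex must be 'male' or 'female'")


def sul_from_suv_bw(suv_bw, weight_kg, height_cm, sex):
    """Rescale a body weight-normalized SUV to lean body mass (SUL)."""
    return suv_bw * lean_body_mass_kg(weight_kg, height_cm, sex) / weight_kg


def roi_diameter_cm(tumor_diameter_cm, standard_roi_cm=1.2, small_tumor_cm=1.6):
    """Standard 1.2 cm ROI, or three-quarters of the tumor diameter for small tumors."""
    if tumor_diameter_cm < small_tumor_cm:
        # Assumption: use the upper bound allowed by the text (3/4 of tumor diameter).
        return 0.75 * tumor_diameter_cm
    return standard_roi_cm


if __name__ == "__main__":
    print(round(sul_from_suv_bw(10.0, 70.0, 175.0, "male"), 2))
    print(roi_diameter_cm(2.0), roi_diameter_cm(1.0))
```

The two ROI rules meet at a tumor diameter of 1.6 cm, where three-quarters of the diameter equals the standard 1.2 cm ROI, so the chosen diameter changes continuously with tumor size.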

At baseline and at follow-up, SUL of tumor, as well as SUL of liver from two separate regions of interest in the right lobe of the liver, were recorded. At follow-up after NACR, presence or absence of focal residual uptake was recorded. Focal FDG uptake at the site of tumor was considered suspicious for residual disease if it exceeded background physiological uptake of FDG in the esophagus or gastroesophageal junction. Regardless of subjective assessment, SUL measurements were obtained from the tumor site. For patients without visible focal residual uptake, SUL measurements were obtained from the location of the tumor as determined by the baseline scan. For all tumors, the SUL tumor-to-liver ratio (SUL TLR) and the % change in SUL before and after NACR (%∆SUL) were calculated. Furthermore, any new sites of disease on the follow-up exams, including nodal or distant metastases, were recorded.
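The two derived parameters can be written out as a short sketch. It assumes that the two liver measurements are averaged before forming the ratio (the text records two liver ROIs but does not say how they were combined) and that %∆SUL is expressed relative to the baseline value; function names are hypothetical.

```python
def sul_tumor_to_liver_ratio(tumor_sul, liver_sul_1, liver_sul_2):
    """SUL TLR: tumor SUL divided by the mean of two liver SUL measurements (assumed averaging)."""
    liver_mean = (liver_sul_1 + liver_sul_2) / 2.0
    if liver_mean <= 0:
        raise ValueError("liver SUL must be positive")
    return tumor_sul / liver_mean


def percent_change_sul(baseline_sul, followup_sul):
    """%∆SUL relative to baseline; negative values indicate a decrease after therapy."""
    if baseline_sul <= 0:
        raise ValueError("baseline SUL must be positive")
    return (followup_sul - baseline_sul) / baseline_sul * 100.0


if __name__ == "__main__":
    # SULpeak values taken from the Fig. 3 caption: baseline 17.9, follow-up 5.2.
    print(round(percent_change_sul(17.9, 5.2), 1))
    # Liver values below are illustrative only, not study data.
    print(round(sul_tumor_to_liver_ratio(6.0, 1.9, 2.1), 2))
```

For the patient shown in Fig. 3, this gives a decrease of about 71 % from baseline.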

Response assessment criteria were as follows: complete metabolic response (CMR) = no focal residual FDG uptake above background physiological uptake (Fig. 2); partial metabolic response (PMR) = residual focal uptake ≥30 % below baseline (and a decrease of at least 0.8 units in SUL; Fig. 3); stable metabolic disease (SMD) = uptake similar to baseline (change of <30 %); progressive metabolic disease (PMD) = uptake increasing in intensity or extent, or new sites of disease (Fig. 4).
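The response categories can be expressed as a small decision function. This is a sketch under stated assumptions rather than the study's actual procedure: the text does not give a numeric threshold for "increasing in intensity", so the symmetric PERCIST 1.0 threshold (an increase of at least 30 % and at least 0.8 SUL units) is assumed here, and the function and parameter names are hypothetical.

```python
def classify_metabolic_response(baseline_sul, followup_sul, focal_residual_uptake,
                                new_lesions=False, increased_extent=False):
    """Return 'CMR', 'PMR', 'SMD' or 'PMD' following the criteria described in the text."""
    if baseline_sul <= 0:
        raise ValueError("baseline_sul must be positive")

    # New sites of disease or greater tumor extent mean progression, whatever the SUL.
    if new_lesions or increased_extent:
        return "PMD"

    # Qualitative screen: no focal uptake above physiological background.
    if not focal_residual_uptake:
        return "CMR"

    change = followup_sul - baseline_sul
    pct_change = change / baseline_sul * 100.0

    # Partial response: at least 30 % decrease and at least 0.8 SUL units.
    if pct_change <= -30.0 and change <= -0.8:
        return "PMR"

    # Assumed PERCIST 1.0 threshold for an increase in intensity.
    if pct_change >= 30.0 and change >= 0.8:
        return "PMD"

    return "SMD"


if __name__ == "__main__":
    # Fig. 3 values: baseline SULpeak 17.9, follow-up 5.2, with residual focal uptake.
    print(classify_metabolic_response(17.9, 5.2, focal_residual_uptake=True))
```

Checking progression first reflects the criterion that new sites of disease define PMD even when uptake at the primary site has resolved.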

Fig. 2 A 77-year-old man with poorly differentiated squamous cell carcinoma of distal esophagus. PET/CT at baseline (a): marked focal uptake of FDG identified in mid-esophageal tumor (arrow). Follow-up PET/CT (b) showing complete metabolic response, with degree of uptake at tumor site indistinguishable from physiological uptake (arrow)

Fig. 3 A 58-year-old man with moderately differentiated adenocarcinoma of the distal esophagus. PET/CT at baseline (a): marked focal uptake of FDG is identified in gastroesophageal tumor (arrow). Follow-up PET/CT (b) showing partial response to therapy with decrease in extent and degree of FDG uptake (arrow). Baseline SULpeak = 17.9; follow-up SULpeak = 5.2

Fig. 4 A 61-year-old woman with well-differentiated adenocarcinoma of mid-thoracic esophagus. PET/CT at baseline (a): marked focal uptake of FDG identified in mid-esophageal tumor (arrow). Follow-up PET/CT (b) showing progressive disease: more extensive tumor infiltration along the esophagus (arrows), and new uptake in a small left supraclavicular lymph node (arrowhead)

Standard of reference

The PET data were compared to three reference standards. First, esophageal tumor regression after NACR was assessed in the surgical pathology specimens. One patient had locally progressive disease on follow-up, proven endoscopically, and surgery was contraindicated. For all patients who underwent esophagectomy (n = 44), regression of tumor in surgical specimens was evaluated using a 5-point tumor regression scheme (pathology regression grade, PRG; Table 2) [30]. It has been previously shown that FDG-PET is unable to distinguish minimal residual tumor burden from no tumor burden [31, 32]; therefore, for the purpose of final analysis, grades 1 and 2 were grouped together (CR or near-complete CR). Furthermore, grades 3 and 4, both of which represent partial response to therapy with varying degrees of fibrosis, were also grouped together (PR).

Table 2 Pathology regression grade (PRG)

Pathology regression grade alone may not be indicative of the patient’s true disease status, as a patient may develop nodal or distant metastatic disease during therapy. Therefore, a clinical response assessment score incorporating local tumor regression on pathology, nodal status at surgery, and disease status on restaging procedures (second PET/CT and contrast-enhanced CT) was also recorded. Evidence of persistent nodal disease despite complete or near-complete regression of the esophageal tumor was considered partial response; new sites of disease, regardless of local tumor regression, were considered progressive disease.

Furthermore, to determine the prognostic significance of metabolic imaging parameters collected, we compared them with survival data. Surveillance data were available for all patients (range 116–3030 days; median 646 days). At time of data censoring, 16 of the 45 study patients (35.6 %) were alive.

Statistical analysis

Statistical analyses were performed using SPSS software version 20 (IBM SPSS, Chicago, Ill). Demographic variables and time intervals are described with mean, standard deviation and range. The Kruskal–Wallis test was used to assess the association between quantitative PET parameters, pathology regression grade and clinical response [33]. Fisher’s exact test was used to assess the association between metabolic response, pathology regression grade and clinical response. The association between pathology regression grade and survival time was examined using the Kaplan–Meier survival analysis method. The same method was used to examine the association between clinical response and survival time. Kaplan–Meier estimates of the mean survival time along with its standard error (SE) and 95 % confidence interval (CI) are reported for each category of pathology regression grade and clinical response. Further, the survival functions of the pathology regression grade categories were compared using the log-rank test. Cox regression was used to assess the correlation between pre- and post-treatment quantitative PET parameters and survival; hazard ratios along with 95 % confidence intervals are reported. A p value <0.05 was considered statistically significant.
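For readers unfamiliar with the Kaplan–Meier method mentioned above, the product-limit estimator can be sketched in a few lines of plain Python. This is an illustrative implementation only, not the SPSS procedure used in the study, and the follow-up times in the usage example are invented for demonstration.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  -- follow-up time for each patient (e.g. days)
    events -- 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up data, not taken from the study.
    for t, s in kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Censored patients contribute to the number at risk up to their last follow-up but do not lower the survival estimate, which is why the method suits a cohort in which 16 of 45 patients were alive at data censoring.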

Results

Baseline and post-treatment semiquantitative measures

The correlation between baseline and post-therapy SUL, SUL ratio and % change in these parameters after NACR, and the correlation between pre- and post-treatment quantitative PET parameters and survival, are presented in Tables 3 and 4, respectively. SUL of liver was measured on both PET examinations; the median difference was 0.23 (7.2 %) and the mean difference was 0.27 (10.3 %). None of the baseline PET parameters had a significant correlation with pathology or clinical response. A positive correlation was found between post-therapy SUL ratio, %∆SUL and %∆SUL ratio and clinical response (p = 0.025, 0.035, 0.030, respectively). A weak correlation was found between post-therapy SUL ratio and PRG (p = 0.049), but %∆SUL and %∆SUL ratio were not significant. There was a trend towards positive correlation between absolute post-therapy SUL and clinical response, although this did not reach statistical significance (p = 0.053). Pretreatment PET parameters were correlated with survival. In particular, the hazard was reduced by 11 % for every increase of one unit in SULm measurement and by 20.7 % for every increase of one unit in SULm ratio. Post-treatment PET parameters were not correlated with survival.
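The per-unit hazard reductions reported above can be related to hazard ratios with a brief sketch. It assumes the standard Cox proportional hazards interpretation, in which the hazard ratio per unit is the exponential of the regression coefficient and scales multiplicatively with the size of the change; the function names are hypothetical, and the exact values in Table 4 are not reproduced here.

```python
import math


def hazard_ratio_from_reduction(percent_reduction):
    """Hazard ratio per unit implied by a stated percent reduction in hazard."""
    return 1.0 - percent_reduction / 100.0


def cox_coefficient(hazard_ratio):
    """Cox regression coefficient (log hazard ratio) for a given hazard ratio."""
    return math.log(hazard_ratio)


def hazard_ratio_for_change(hazard_ratio_per_unit, delta_units):
    """Hazard ratio for a change of delta_units, assuming a log-linear effect."""
    return hazard_ratio_per_unit ** delta_units


if __name__ == "__main__":
    hr_sul = hazard_ratio_from_reduction(11.0)    # per unit of baseline SUL
    hr_ratio = hazard_ratio_from_reduction(20.7)  # per unit of baseline SUL ratio
    print(round(hr_sul, 3), round(hr_ratio, 3))
    print(round(hazard_ratio_for_change(hr_sul, 2), 4))  # two-unit difference
```

Under this assumption, an 11 % reduction per unit corresponds to a hazard ratio of 0.89, and a two-unit difference corresponds to 0.89 squared (about 0.79) rather than a 22 % reduction.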

Table 3 Correlation between PET parameters, PRG and clinical response
Table 4 Correlation between pre- and post-treatment quantitative PET parameters and survival

Response assessment criteria

A strong correlation was found between the metabolic response score and PRG (p = 0.002) as well as between metabolic response score and clinical response (p < 0.001) (Table 5).

Table 5 Metabolic response versus pathology regression grade and clinical response

Survival data

Correlations of PRG and of metabolic response with survival are provided in Table 6. There was no clear correlation between PRG and survival (p = 0.183). There was a significant difference between the distributions of survival time of the various metabolic response groups (p < 0.0001) (Fig. 5). Patients with CMR or PMR had significantly longer survival as compared to those with SMD or PMD.

Table 6 Correlation between PRG and metabolic response to survival
Fig. 5 Kaplan–Meier (KM) survival analysis curve. Cum Survival, cumulative survival. Survival is displayed in days

Discussion

There is a relatively prolonged time for measurable tumor shrinkage to occur after cell death, limiting early prediction of response to therapy using tumor size. Anatomic imaging modalities may also be limited in distinguishing viable residual tumor from fibrosis, and it is not uncommon in post-therapy evaluation of esophageal cancer to have significant residual thickening of the esophagus on CT, regardless of treatment outcome. Metabolic changes, which often precede change in tumor size, may enable earlier and more reliable response assessment. The inconsistent results of prior studies evaluating PET in predicting response to local and systemic treatment of LAEC [16–26] may be attributed to lack of standardization in obtaining SUV measurements and lack of standardized metabolic response assessment criteria. SUV measurements are dependent on multiple variables, including uptake time, blood glucose level, body weight, injection technique, camera calibration, reconstruction method, matrix size, and partial volume effect [34].

PERCIST 1.0 criteria, recently introduced by Wahl et al., offer some standardization. The use of SUL (lean body mass-normalized SUV) reduces dependence on patient weight as compared to the standard body weight-normalized SUV (SUVbw), and SULpeak reduces potential inconsistencies of single-pixel measurements due to noise [35]. PERCIST 1.0 also promotes use of a uniform ROI, which may allow for more reproducible measurements [35, 36]. Furthermore, comparison to reference tissue (such as normal liver or blood pool) may account for time point variability in radiotracer uptake.

The metabolic response criteria in the current study incorporate qualitative and quantitative data. Modifications to the diagnostic criteria proposed by Wahl et al. [29] were made to account for specific features of esophageal cancer. First, by PERCIST 1.0, a complete metabolic response (CMR) is defined as visual disappearance of all metabolically active tumors and a drop in SULpeak to that of the background. It has been previously shown that in healthy individuals, there is a wide range of physiological uptake of FDG in the esophagus, with SUVmean ranging from 1.13 to 3.23 (mean ± SD, 1.61 ± 0.61) [28]. Given the wide variability in reported SUV in the normal esophagus, in the current study protocol, only qualitative, pattern-based assessment of resolution of focal abnormal FDG uptake was used to differentiate between CR and other response groups (PR, SD or PD). Other PERCIST 1.0 criteria to differentiate partial responders from those with stable or progressive disease were unaltered. Our data show that a qualitative screen to determine presence or absence of residual focal uptake results in a stronger correlation with clinical outcome than SUL measurements alone, likely due to the improved separation of patients with complete or near-complete response from others. A further modification to PERCIST 1.0 was to allow for a change in ROI size for small residual tumors where the standard ROI would result in partial volume averaging from surrounding tissues. As in the literature, most patients showed complete or partial response to NACR, often resulting in small residual focal uptake of FDG, if any. Although we acknowledge that measurement with a smaller ROI may be less reproducible, the use of an ROI with a diameter of 1.2 cm in these patients would have underestimated the lesions’ SUL.

The study shows a correlation between certain post-therapy quantitative parameters (SUL ratio, %∆SUL, %∆SUL ratio) and clinical outcome, and a strong correlation between the metabolic response score and PRG as well as overall clinical response. Using qualitative assessment to determine whether there has been a CR or not and then stratifying PR from SD and PD according to quantitative parameters correlated well with overall clinical response and survival. PET tended to overstage therapy response more than understage. For those with CR (n = 14) and PR (n = 25) by clinical response criteria, 5 (35.7 %) and 4 (16 %) patients were overstaged by PET, respectively. This may be attributed to therapy-induced inflammation or an insufficient interval between the end of NACR and the second PET.

Therapy for LAEC has been evolving in recent years, and researchers have been evaluating the use of chemotherapy or chemoradiation therapy alone in the management of patients with LAEC, especially in patients with multiple comorbidities and high surgical risk who show some response to initial therapy. If PET is to be utilized as a tool to guide therapy in these patients, overstaging residual disease may be less detrimental than understaging and omitting surgery in a patient with residual malignancy. Our findings are in line with a recently published retrospective comparison of RECIST and PERCIST criteria in evaluating response to neoadjuvant chemotherapy alone in 51 patients with LAEC [37]. In that study, PERCIST was found to be the strongest independent predictor of patient outcome, and the authors concluded that PERCIST might be considered more suitable than RECIST for evaluating the chemotherapeutic response of esophageal cancer. However, these results and the current study results need to be interpreted with caution, and further confirmation in a large-scale, prospective trial would be needed before the use of PET can be endorsed for risk-adapted management of LAEC.

The current study has several limitations. First, it is retrospective. As such, there was variability in study parameters. For example, although the mean FDG uptake time and the difference in uptake times between baseline and follow-up studies are within the recommended range as defined by PERCIST 1.0, there were outliers. Furthermore, there was variability in the time interval between NACR and PET and between PET and surgery; however, the therapy protocol and imaging parameters (PET scanner and protocol used) were uniform. Second, due to the relatively small sample size, and as most patients respond, at least partially, to the current neoadjuvant therapy regimen, there was a small number of patients with SD or PD. Third, as it is not expected that PET would identify microscopic or small-volume residual disease, PRG 1 and 2 (complete and near-complete responders on pathology) were grouped. Although this may be a significant limitation of PET, the metabolic response criteria outperformed PRG alone in predicting patient outcome and overall survival. This may be due to the ability of PET to assess disease status beyond the esophagus or the surgical specimen.

In summary, PERCIST-based metabolic response assessment to NACR in LAEC may correlate with clinical outcome and survival. In the current study, patients with CMR or PMR had a significantly longer survival than those with SMD or PMD. It remains to be determined in a prospective, large-scale trial, whether FDG-PET/CT can be used to guide risk-adapted management in patients with LAEC.