Introduction

Esophagectomy remains the mainstay of treatment for patients with locally advanced esophageal cancer. Neoadjuvant therapy followed by surgery, however, has recently been found to improve survival compared with surgery alone in patients with esophageal cancer [14]. The standard strategy for locally advanced esophageal cancer in Western countries consists of neoadjuvant chemoradiotherapy followed by surgery, whereas the standard strategy in Japan consists of chemotherapy followed by esophagectomy [4, 5]. Although cisplatin plus 5-fluorouracil (5-FU) (CF) is the standard chemotherapeutic regimen for esophageal cancer, it may not substantially improve outcomes [6]. Triplet chemotherapy, consisting of docetaxel in combination with CF (DCF), recently demonstrated good efficacy as induction chemotherapy in patients with node-positive esophageal cancer [7]. We also reported that DCF showed strong antitumor activity in esophageal cancer and may confer survival benefits when used as preoperative chemotherapy [8], without increasing postoperative morbidity [9].

Responses to treatment can be assessed clinically using the response evaluation criteria in solid tumors (RECIST) [10] and pathologically by evaluating pathological response grade (pGrade), based on the ratio of viable residual tumor cells in postoperative cancerous tissue [11]. RECIST has limitations in evaluating responses in patients with esophageal cancer, as the primary tumor sites cannot be targeted, and small metastatic lymph nodes are often not evaluable. pGrade also has limitations in evaluating responses precisely, as fibrosis caused by chemotherapy can confuse the evaluation and pGrade may not always reflect the size of the original tumor before chemotherapy.

A draft framework has been proposed for positron emission tomography (PET) RECIST (PERCIST) [11], a new standardized method for the quantitative assessment of metabolic tumor response [12]. Although PERCIST has been reported to be useful for evaluating esophageal cancer, its use has been limited in some institutions because of the need to calculate peak standardized uptake values (SUVpeak) and lean body mass. Calculations of SUVpeak require specialized software that is not widely available, and determining lean body mass for each fluorodeoxyglucose (FDG)-PET examination is time-consuming for clinicians. We therefore propose a simple and convenient evaluation method using maximum SUV (SUVmax) evaluation. Furthermore, although PERCIST 1.0 defined complete metabolic response (CMR) relative to SUV in the right lobe of the liver, this parameter may be affected by liver damage in patients with esophageal cancer. We therefore formulated a novel method that uses the normal esophagus as a comparator (i.e., tumor/normal esophagus [T/N] ratio) to determine CMR.

This study was designed to determine the accuracy and usefulness of SUVmax evaluation with T/N ratio, and to identify the most appropriate method other than PERCIST 1.0, for evaluating response to neoadjuvant DCF chemotherapy followed by esophagectomy in patients with esophageal squamous cell carcinoma (ESCC). This study compared the ability of SUVmax evaluation with T/N ratio with that of other evaluation methods to predict pathological response and patient prognosis.

Patients and methods

Patients

This retrospective analysis evaluated data from 73 patients with histologically confirmed squamous cell carcinoma (SCC) of the thoracic esophagus who underwent FDG-PET/computed tomography (CT) before and after two cycles of neoadjuvant chemotherapy prior to planned surgical resection at the Department of Gastroenterological Surgery, Graduate School of Medical Sciences, Kumamoto University, between November 2008 and March 2014. Of the 336 patients who underwent esophagectomy during this time period, 263 were excluded, including 196 who did not receive preoperative therapy, 31 who received radiation therapy, 24 who did not receive two courses of DCF, six who were not diagnosed with SCC, and six who did not complete FDG-PET (Fig. 1).

Fig. 1 Patient flow diagram of SUVmax evaluation

At baseline, all 73 included patients underwent upper endoscopy, esophagography, and enhanced CT from the neck to the abdomen, with tumors staged according to the American Joint Committee on Cancer (6th edition) TNM staging system. Patients with any tumor depth (clinical T1–3) and regional lymph node involvement (N1) without distant metastases were considered eligible for neoadjuvant chemotherapy. All patients were younger than 80 years and had normal heart, liver, kidney, and bone marrow function. The study protocol was approved by the institutional Ethics Review Board of Kumamoto University, and all patients provided written informed consent.

Treatment

All patients received neoadjuvant DCF chemotherapy [8], consisting of 60 mg/m² docetaxel on day 1, administered intravenously for 2 h; 350 mg/m² 5-FU on days 1–5 as a 24-h continuous intravenous infusion; and 6 mg/m² cisplatin on days 1–5, administered intravenously for 1 h. This regimen was administered every 3 weeks, and two scheduled courses were administered before esophagectomy. Responses were evaluated by PET/CT 2 weeks after finishing chemotherapy. Surgery, consisting of en bloc esophagectomy, reconstruction with a gastric tube, and two- or three-field lymphadenectomy, was performed 3 weeks after completing neoadjuvant chemotherapy.

PET/CT imaging

All patients underwent co-registered PET/CT using a hybrid PET/CT imager (Gemini GXL16, Philips Medical Solutions, Amsterdam, the Netherlands), consisting of a dedicated GSO full-ring PET scanner and a 16-slice helical CT scanner. Patients fasted for at least 5 h prior to the examination. Emission scans were acquired in 3-D mode with a 144 × 144 matrix 60 min after intravenous injection of 185–300 MBq 18F-FDG, immediately after urination. PET/CT transmission data were acquired for the area from the base of the skull to the proximal thighs.

Image analysis using SUVmax

Tumor SUVmax was determined by drawing spherical regions of interest in all slices containing the tumor and recording the highest value. To be considered measurable, each baseline (pretreatment) tumor SUVmax had to be at least 1.5 times the mean liver SUV plus two standard deviations (SDs). If the liver was diseased, the minimal metabolically measurable tumor activity according to PERCIST 1.0 was defined as 2.0 times the blood-pool 18F-FDG activity in the mediastinum plus two SDs. The T/N ratio was calculated for each patient using the SUVmean of the normal cervical esophagus as the reference. SUVmax in subsequent scans could be located in a lesion differing from that with SUVmax at baseline, as long as that lesion had been present at baseline. If SUVmax at baseline did not exceed the background value, the patient was not eligible for response evaluation.

If SUVmax had decreased at the first follow-up, the response was calculated as ΔSUVmax between baseline and follow-up, divided by baseline SUVmax × 100 %. If SUVmax had increased, the response was calculated as ΔSUVmax between the lowest registered value and the actual follow-up value, divided by the lowest registered SUVmax × 100 %.

SUVmax response was classified as complete metabolic response (CMR), defined as complete resolution of 18F-FDG uptake by all lesions to a level less than or equal to mean liver activity and indistinguishable from background blood-pool levels; partial metabolic response (PMR), defined as at least a 30 % reduction in SUVmax; progressive metabolic disease (PMD), defined as at least a 30 % increase in SUVmax or the appearance of a new 18F-FDG-avid lesion; and stable disease (SD), defined as a change intermediate between PMR and PMD. Response based on SUVmax with T/N ratio was classified similarly, with CMR defined as complete resolution of 18F-FDG uptake within all lesions to a level less than or equal to mean normal cervical esophagus activity and indistinguishable from background blood-pool levels; PMR, SD, and PMD were defined according to the same criteria as for SUVmax alone.
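
The response rules described in this section can be sketched in code. The following Python example is a minimal illustration and not the software used in this study. The function names, the `reference_mean` parameter (mean liver SUV for SUVmax alone, or SUVmean of the normal cervical esophagus for the T/N variant), and the reduction of the CMR criterion to a single numeric comparison are assumptions made for illustration. The requirement that uptake be indistinguishable from background blood-pool levels is a visual judgment and is not modeled.

```python
from typing import Optional


def is_measurable(baseline_suvmax: float, liver_mean: float, liver_sd: float) -> bool:
    """Baseline eligibility: tumor SUVmax of at least 1.5 x mean liver SUV + 2 SDs."""
    return baseline_suvmax >= 1.5 * liver_mean + 2.0 * liver_sd


def percent_change(baseline: float, follow_up: float,
                   lowest: Optional[float] = None) -> float:
    """Signed percent change in SUVmax (negative means a decrease).

    An increase is referenced to the lowest registered SUVmax;
    a decrease is referenced to the baseline SUVmax.
    """
    nadir = baseline if lowest is None else min(lowest, baseline)
    reference = nadir if follow_up > nadir else baseline
    return 100.0 * (follow_up - reference) / reference


def classify_response(baseline: float, follow_up: float, reference_mean: float,
                      new_lesion: bool = False,
                      lowest: Optional[float] = None) -> str:
    """Return 'CMR', 'PMR', 'SD', or 'PMD' for one patient (simplified rules)."""
    if new_lesion:
        return "PMD"  # a new FDG-avid lesion counts as progression
    if follow_up <= reference_mean:
        return "CMR"  # uptake at or below the chosen reference tissue
    change = percent_change(baseline, follow_up, lowest)
    if change <= -30.0:
        return "PMR"
    if change >= 30.0:
        return "PMD"
    return "SD"
```

With hypothetical values, `classify_response(10.0, 6.0, reference_mean=2.5)` corresponds to a 40 % reduction and would be labeled PMR. Switching the reference from the liver to the normal esophagus changes only the CMR threshold; the PMR, SD, and PMD rules are unaffected.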

Image analysis according to CT evaluation of tumor shrinkage

The lesion with the highest SUVmax was considered the primary tumor site. In addition to evaluating 18F-FDG uptake, the change in size of the primary site was determined. Enhanced CT and PET/CT images were transferred to workstations, and the longest diameter was measured. The long axis of each tumor on coronal CT images was measured on the monitor using calipers, with reference to multiplanar reconstruction images (axial, coronal, and sagittal imaging) and enhanced CT images. On the basis of CT evaluations of tumor shrinkage, patients were classified as responders if tumor diameter was reduced by at least 30 %, and as non-responders otherwise.

Pathologic response grade (pGrade)

Pathologic response was evaluated according to the Japanese Classification of Esophageal Cancer (10th edition) and based on the percentage of viable cancer cells per tumor. Grade 0 was defined as the absence of any recognizable cytological or histological effect; grade 1a as viable cancer cells accounting for two-thirds or more of the tumor; grade 1b as viable cancer cells accounting for one-third to two-thirds of the tumor tissue; grade 2 as viable cancer cells accounting for less than one-third of the tumor tissue; and grade 3 as the apparent absence of any viable cancer cells (pathologic complete response; pCR) [13].
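
The grade boundaries above can be expressed as a simple mapping. The following Python sketch is illustrative only: in practice the proportion of viable cancer cells is estimated by a pathologist, and the function name and the `treatment_effect_seen` flag (needed because grade 0 is defined by the absence of any effect rather than by a proportion) are assumptions introduced here.

```python
def pathologic_grade(viable_fraction: float, treatment_effect_seen: bool = True) -> str:
    """Map the fraction of viable cancer cells (0.0 to 1.0) to a pGrade label."""
    if not 0.0 <= viable_fraction <= 1.0:
        raise ValueError("viable_fraction must lie between 0 and 1")
    if not treatment_effect_seen:
        return "0"    # no recognizable cytological or histological effect
    if viable_fraction == 0.0:
        return "3"    # no viable cancer cells (pCR)
    if viable_fraction < 1.0 / 3.0:
        return "2"    # less than one-third viable
    if viable_fraction < 2.0 / 3.0:
        return "1b"   # one-third to two-thirds viable
    return "1a"       # two-thirds or more viable
```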

Statistical analysis

Categorical variables were compared using χ² tests [14]. Disease-free survival (DFS) was calculated by the Kaplan–Meier method and compared by log-rank tests. Associations between DFS and clinicopathological features were evaluated by univariate Cox proportional hazards regression analysis. Parameters significant by univariate analysis were included in multivariate Cox proportional hazards regression analysis. All statistical analyses were performed using JMP Version 9 (SAS Institute, Tokyo, Japan) and Excel 2010 (Microsoft, Redmond, WA, USA). A P value of 0.05 or less was considered statistically significant.
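
For readers unfamiliar with the Kaplan–Meier method, the product-limit estimate can be sketched in a few lines of Python. This is an independent illustration, not a reproduction of the JMP analysis; the function name and input layout (a follow-up time and an event flag per patient, with 1 for recurrence and 0 for censoring) are assumptions.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, estimated survival) at each distinct time with an event."""
    leaving = Counter(times)  # patients leaving the risk set at each time
    failing = Counter(t for t, e in zip(times, events) if e)  # events per time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = failing.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]  # events and censored patients both leave
    return curve
```

For a hypothetical cohort with follow-up times of 3, 5, 5, 8, and 12 months and event flags 1, 1, 0, 1, and 0, the estimate steps down to 0.8, 0.6, and 0.3 at months 3, 5, and 8. Censored patients reduce the number at risk without lowering the curve.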

Results

The characteristics of the 73 patients included in this study are summarized in Table 1. In evaluating tumor shrinkage by CT, we found that the longest lesion diameter could not be measured by CT in 22 patients because these tumors were not visible on the CT images. Of the remaining 51 patients, 24 were classified as responders on the basis of CT evaluation of tumor shrinkage, and 27 as non-responders. Assessment of pathologic responses to chemotherapy showed that 49 patients achieved grade 1a, seven achieved grade 1b, ten achieved grade 2, and seven achieved grade 3. Responses based on CT evaluation of tumor shrinkage did not correlate significantly with pathologic responses (Table 2).

Table 1 Characteristics of 73 patients who received neoadjuvant chemotherapy for esophageal squamous cell carcinoma
Table 2 Comparison of treatment response assessments between pathologic response and clinical evaluations

SUVmax values were determined by PET/CT in all 73 patients. Table 2 shows the relationships between pGrade and SUVmax evaluations, with and without T/N ratio. pGrade was significantly correlated with SUVmax evaluation, both with (P = 0.015) and without (P = 0.027) T/N ratio. The sensitivity, specificity, and accuracy of SUVmax for pGrade 3 were 71.4, 71.2, and 71.2 %, respectively, whereas the sensitivity, specificity, and accuracy of SUVmax evaluation with T/N ratio were 57.1, 89.3, and 86.3 %, respectively.
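
The reported sensitivity, specificity, and accuracy follow from a 2 × 2 table of CMR versus pGrade 3. The Python sketch below recomputes the values for SUVmax alone. The counts are reconstructed from figures given elsewhere in this article (73 patients, seven with pGrade 3, and 24 classified as CMR) under the assumption that the five pGrade 3 patients mentioned in the Discussion are among the 24 CMR patients; the function and variable names are illustrative.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, and accuracy (in percent) from a 2 x 2 table."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + fp + fn + tn),
    }


# Counts for SUVmax alone, reconstructed from the text (assumption, see lead-in).
total, pgrade3, cmr, cmr_and_pgrade3 = 73, 7, 24, 5
tp = cmr_and_pgrade3        # CMR and pGrade 3
fp = cmr - tp               # CMR but not pGrade 3
fn = pgrade3 - tp           # pGrade 3 but not CMR
tn = total - tp - fp - fn   # neither CMR nor pGrade 3

suvmax_metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in suvmax_metrics.items():
    print(f"{name}: {value:.1f} %")
```

Under these assumptions the recomputed values round to 71.4, 71.2, and 71.2 %, matching those reported for SUVmax alone.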

DFS rates stratified by CT-determined tumor shrinkage and pGrade are shown in Fig. 2. Classification based on CT evaluation of tumor shrinkage did not significantly correlate with prognosis, and 22 patients could not be evaluated. Although no patient classified as pGrade 3 experienced tumor recurrence, the other grades did not correlate significantly with DFS (P = 0.11, data not shown).

Fig. 2 Kaplan–Meier analyses of disease-free survival (DFS) stratified by (a) clinical response (CT evaluation of tumor shrinkage) and (b) pathological response (pGrade). Groups were compared by log-rank tests

Figure 3 shows DFS rates following curative resection in patients classified by SUVmax, with and without T/N ratio. Among patients classified by SUVmax alone, those who achieved CMR had a poorer prognosis than those who achieved PMR (Fig. 3a). In contrast, when patients were classified by SUVmax evaluation with T/N ratio, responses correlated significantly with DFS (P = 0.0011) (Fig. 3b). Only one patient who achieved CMR, as determined by SUVmax evaluation with T/N ratio, experienced a relapse during the observation period. In addition, receiver operating characteristic curve analysis showed that SUVmax evaluation with T/N ratio was the best predictor of pGrade 3 (Fig. 3c).

Fig. 3 Kaplan–Meier analyses of DFS stratified by (a) SUVmax and (b) SUVmax evaluation with T/N ratio. Groups were compared by log-rank tests. (c) Receiver operating characteristic curves for pGrade-based RECIST, SUVmax, and SUVmax evaluation with T/N ratio

We also examined the correlations between patient classification by SUVmax evaluation with T/N ratio and clinicopathological features and prognosis (Fig. 4a, Fig. S1). pGrade 2 and 3 patients were classified as achieving CMR or PMR; therefore, we investigated the association between SUVmax evaluation with T/N ratio and prognosis in pGrade 1 patients. CMR/PMR stratified by SUVmax evaluation with T/N ratio was significantly associated with a better prognosis (Fig. 4b). Table 3 shows the associations between DFS and clinicopathological features among pGrade 1 patients, determined using univariate and multivariate regression analyses. Resection level and SUVmax evaluation with T/N ratio were significantly associated with DFS. Multivariate analysis that included factors significant in univariate analysis found that resection level [hazard ratio (HR), 4.85; 95 % confidence interval (CI) 2.108–11.57; P = 0.0002] and SUVmax evaluation with T/N ratio (HR 3.01; 95 % CI 1.318–6.839, P = 0.0095) were significant independent predictors of DFS.

Fig. 4 Kaplan–Meier analyses of disease-free survival (DFS) in complete/partial responders and non-responders, as determined by SUVmax evaluation with T/N ratio, among (a) all patients and (b) patients classified as pGrade 1. Groups were compared by log-rank tests

Table 3 Univariate and multivariate Cox proportional hazards regression analysis for prediction of DFS

Discussion

PET/CT-based evaluation of response to treatment is valuable in patients with several types of solid tumors [15–17]. This method is especially useful for assessing the effects of treatments that induce changes not necessarily represented by tumor shrinkage [18]. It is therefore important to understand the differences among response evaluations that use existing PET/CT criteria. Although RECIST 1.1 [10] is widely used to evaluate tumor response, it has limitations associated with its dependence on morphologic changes [12]. PET/CT, however, can overcome these limitations by assessing metabolic changes, which are closely related to the malignant potential of tumors. Thus, PET/CT may be the most accurate noninvasive imaging modality for initial staging and response assessment in patients with esophageal cancer. PERCIST 1.0 criteria, including a determination of SUVpeak, have recently been proposed as a standardized method for evaluating metabolic tumor response. We hypothesized that SUVmax evaluation with T/N ratio may be equally accurate, while overcoming the need to determine SUVpeak.

Of the 73 patients in this study, 22 had primary lesions not evaluable by CT; therefore, CT could not be used to assess tumor shrinkage after neoadjuvant chemotherapy in these patients. However, SUVmax, with or without T/N ratio, could evaluate therapeutic responses in all 73 patients. Moreover, of the three methods of evaluating response (CT, SUVmax, and SUVmax evaluation with T/N ratio), the last was the best predictor of pGrade 3, with a specificity of 89.3 %. Patients who achieve pGrade 3 could, in principle, avoid invasive surgery after neoadjuvant chemotherapy. It remains difficult, however, to decide whether to perform surgery on the basis of SUVmax evaluation alone. Patients evaluated by SUVmax, especially those who achieve CMR, may require assessments by other diagnostic modalities, including esophagoscopy, endoscopic ultrasonography, and biopsy. This is particularly necessary for patients at high risk for surgery. Furthermore, SUVmax evaluation with T/N ratio may be a useful tool in choosing other treatment options, such as radiotherapy or other chemotherapy regimens, prior to surgery.

SUVmax evaluation classified 24 patients in the present study as having achieved CMR, although pathological examination classified only five of them as achieving pGrade 3. We therefore proposed using the normal esophagus as a reference (T/N ratio) for diagnosing CMR, enhancing the ability of SUVmax to evaluate CMR. Only one patient who achieved CMR, as assessed by T/N ratio, relapsed, compared with seven of the 24 assessed as CMR by SUVmax evaluation relative to liver SUVmean. Patients classified as achieving CMR on the basis of T/N ratio had a significantly better prognosis than those who did not achieve CMR (Fig. S2). SUVmax-determined CMR is based on the mean SUV in the right lobe of the liver. Liver damage resulting from alcohol intake, fatty liver, or viral infection, however, is not uncommon in patients with esophageal cancer [19, 20]. Therefore, the mean SUV of their livers may be higher than that of normal liver, which may affect the determination of CMR. Using a normal site of the primary cancerous organ as a comparator is also useful, because baseline blood sugar concentrations are the same in the normal and cancerous esophagus. When using the normal esophagus as a reference, it is better to use the upper or middle esophagus because of the potential effects of reflux esophagitis in the lower esophagus.

Although no patient classified as pGrade 3 experienced tumor recurrence, the other pathological grades were not correlated with patient prognosis. Thus, although pGrade 3 may accurately predict improved survival, the prognosis of other grades does not reflect their pathological responses. In this study, pGrade 1 responders (CMR and PMR) classified by SUVmax with T/N ratio had significantly better prognoses, suggesting that SUVmax evaluation with T/N ratio might be a useful predictor in patients with non-pathological CR. In addition, both univariate and multivariate regression analyses found that T/N ratio was a significant prognostic factor.

The limitations of this study included its retrospective nature. However, the chemotherapy and PET/CT protocols and timing were similar in all patients, and all patients underwent PET/CT scans during the first week after completion of chemotherapy.

Conclusions

This study demonstrated that SUVmax using the normal esophagus as a reference may be more suitable than CT determination of tumor shrinkage in evaluating response to neoadjuvant chemotherapy in patients with esophageal cancer. To date, however, there is no consensus on the quantification of PET/CT responses, indicating the need for further prospective studies evaluating the ability of PET/CT to improve the treatment of patients with esophageal cancer.