Introduction

Locally advanced squamous cell oesophageal cancer has a dismal prognosis, mostly due to the high rate of metachronous distant metastases (DM), but also due to local failures [1,2,3]. The optimal primary treatment approach is controversial. Definitive radiochemotherapy (RCT) and preoperative RCT followed by radical surgery appear to be comparable regarding overall survival (OS); however, conclusive clinical data from phase III studies are lacking. Moreover, some recent population-based analyses suggest better OS after trimodality treatment [4, 5]. Since local recurrences appear to occur more frequently after definitive RCT, trimodality treatment is usually the treatment of choice in medically fit patients; however, in some patients, the location (mostly cervical) or extent of the primary tumour impedes radical tumour excision, or the patient is unwilling to undergo surgery. As up to one third of patients show pathological complete remission after low-dose preoperative RCT, identification of these highly RCT-sensitive patients should be a pivotal issue for future approaches to treatment individualization [6].

PET using the tracer 18F-FDG at the end of preoperative RCT is often suggested to guide the treatment decision (stop RCT after the neoadjuvant dose, or continue to higher-dose definitive RCT). However, data for PET parameters in this setting are inconclusive. Conventional tumour PET parameters, including standardized uptake values (SUV), probably cannot be used to guide treatment [7,8,9]. One drawback of response assessment by FDG PET during RCT is inflammation of the tissue surrounding the tumour, including non-tumour-affected oesophagus (NTO). As inflammation (i.e. radiation-induced mucositis/oesophagitis) leads to increased FDG uptake, it hampers accurate delineation of the tumour on restaging PET [10, 11].

Several retrospective studies have suggested that pronounced acute side effects of RCT, including mucositis and oesophagitis, may be associated with a favourable local tumour response and better OS [12, 13]. In patients receiving trimodality treatment, increased NTO FDG uptake is strongly associated with a favourable outcome [14]. As clinical scoring is highly observer-dependent and only indirectly reflects mucosal inflammation in oesophageal cancer, we have used FDG PET of NTO as a potentially more objective tool for assessing oesophageal inflammation. In the current study we sought to determine if FDG uptake in the NTO may be used for future treatment guidance. The aim of this study was to assess if this parameter is also prognostic in patients treated with definitive RCT at another institution and with different patient characteristics.

Materials and methods

Patient characteristics

In the present study 72 consecutive patients (59 men, 13 women; mean age 58 years, range 42 to 75 years) with FDG PET/CT-staged oesophageal carcinoma were analysed retrospectively. All patients received definitive RCT with curative intent between May 2009 and October 2013. The patients were a subgroup of patients analysed recently [15], matching the following inclusion criteria: age >18 years, histologically confirmed oesophageal squamous cell carcinoma, FDG PET/CT before and during week 5 of RCT, no evidence of DM on initial PET/CT and definitive RCT with curative intent (defined as prescribed radiation dose of at least 50 Gy and concomitant administration of chemotherapy). Additionally a pretreatment FDG PET/CT scan and a first interim FDG PET/CT scan during treatment (PET2 in the original publication) had to be available for further analysis. At this time the administered radiation dose was similar to that investigated in the above-mentioned neoadjuvant cohort (average dose 40.9 Gy). Evaluation of the data was approved by the Institutional Ethics Committee (EA2/122/17) and all patients provided signed written informed consent.

Treatment and follow-up

Patients were treated with normofractionated RCT with a single dose of 1.8 or 2 Gy per fraction. Gross tumour volume (GTV) was delineated separately for the primary tumour and affected lymph nodes using information from high-resolution contrast-enhanced CT and FDG PET. The clinical target volume (CTV) of the primary tumour was generated by enlarging the primary GTV by 4 cm along the oesophageal wall and by 0.5 cm radially. Additionally, regional lymph node regions were included in the nodal CTV, and all CTVs were adapted to anatomical structures (excluding bones, lungs or large vessels). The planning target volume 50 Gy (PTV 50Gy) comprised the CTV with additional margins of 0.5 cm. After administration of 50 Gy, an additional radiation boost of 4–16 Gy (average 58.9 Gy) was prescribed to a reduced treatment volume comprising only the GTV with reduced safety margins (PTV boost). Radiation treatment was mostly delivered as intensity-modulated radiotherapy (65%), and other radiation planning methods including 3D conformal techniques or a mix of different techniques were less frequently applied (35%). Concomitant chemotherapy consisted of two cycles of cisplatin (25 mg/m2/day, days 1–3 and days 29–31) and either paclitaxel (135 mg/m2/day, day 1 and day 29) or 5-fluorouracil (500 mg/m2/day, days 1–5 and days 29–33). Follow-up consisted of an FDG PET/CT scan 1 month after completing radiotherapy and usually clinical examination and CT scans of the thorax and abdomen every 3 to 6 months thereafter. Additional diagnostic procedures including endoscopic examinations were performed as indicated at the discretion of the treating physician.

FDG PET/CT protocol

All patients underwent a hybrid 18F-FDG PET/CT scan prior to therapy. Scans (3D PET acquisition, 90 s per bed position) were performed with a Discovery STE (General Electric Medical Systems, Milwaukee, WI, USA). A second scan was performed during the last week of RCT using the same PET/CT device. Data acquisition was started 67 ± 22 min (range 50–140 min) after injection of 142–548 MBq 18F-FDG. PET data were reconstructed using CT-based attenuation-weighted OSEM reconstruction (two iterations, 20 subsets, 6 mm FWHM Gaussian filter). The resulting image data had a voxel size of 5.47 × 5.47 × 3.27 mm.

Data analysis

Tracer uptake in the NTO was determined using a roughly cylindrical volume of interest (VOI) which was manually delineated as described previously [14]. The minimum longitudinal distance to the tumour or affected lymph nodes was 20 mm. The VOI had to be in the high-dose elective treatment volume and the minimum volume was 5 ml (minimum longitudinal length 20 mm). Supplementary Fig. 1 shows pretreatment and restaging PET delineations in an example patient. The delineating observer (S.Z.) was blinded to patient outcome. For the resulting VOIs, SUVmax, SUVmean and metabolic tumour volume (MTV) were computed. Since in this retrospective study tracer uptake time was not standardized, all SUVs were corrected for scan time to T0 = 75 min after injection using the following formula:

$$ {\mathrm{SUV}}_{\mathrm{tc}}=\mathrm{SUV}\times {\left(\frac{T_0}{T}\right)}^{\left(1-b\right)} $$

where SUVtc is the time-corrected SUV, T is the time at which the SUV was actually measured and b = 0.31 describes the shape and decrease of the arterial input function over time [16]. Since only time-corrected values were investigated the index ‘tc’ is omitted in the following. The fractional differences in SUVmax and SUVmean between the first and second scan were computed as follows:

$$ \Delta \mathrm{SUV}=100\times \frac{{\mathrm{SUV}}_2-{\mathrm{SUV}}_1}{{\mathrm{SUV}}_1} $$

where the indices 1 and 2 refer to pretherapy and restaging PET scans, respectively.
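As an illustration of the two formulas above, the following minimal Python sketch applies the scan-time correction and then computes the fractional difference. It is not the software used in the study; the function names and the example SUVs are hypothetical, and the `risk_group` helper simply mirrors the ∆SUV NTO > 0 convention used later in Table 5.

```python
T0_MIN = 75.0   # reference uptake time in minutes (from the text)
B = 0.31        # shape parameter of the arterial input function (from the text)


def time_corrected_suv(suv, uptake_min, t0=T0_MIN, b=B):
    """SUV_tc = SUV * (T0 / T) ** (1 - b)."""
    if uptake_min <= 0:
        raise ValueError("uptake time must be positive")
    return suv * (t0 / uptake_min) ** (1.0 - b)


def delta_suv(suv_1, suv_2):
    """Fractional difference in percent between pretherapy (1) and restaging (2)."""
    return 100.0 * (suv_2 - suv_1) / suv_1


def risk_group(delta, cutoff=0.0):
    """Hypothetical helper: low risk if uptake increased beyond the cut-off."""
    return "low" if delta > cutoff else "high"


# Hypothetical NTO SUVmax values measured 60 min and 90 min after injection.
suv_1 = time_corrected_suv(2.1, 60.0)
suv_2 = time_corrected_suv(2.9, 90.0)
change = delta_suv(suv_1, suv_2)
print(round(change, 1), risk_group(change))
```

Scans acquired earlier than 75 min are scaled up and later scans are scaled down, so that both values entering ∆SUV refer to the same nominal uptake time.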

For comparison, we also determined the SUVmax of the primary tumour on the pretherapy and subsequent restaging PET scans and computed the fractional difference as described above. In the following we refer to these quantities as SUVlesion/1, SUVlesion/2 and ∆SUVlesion. For determination of tumour parameters the metabolically active part of the primary tumour was delineated on the pretherapy and restaging PET scans by an automatic algorithm based on adaptive thresholding considering the local background [17, 18]. Delineations were visually inspected, checked for plausibility and, if necessary, manually corrected by an experienced observer (S.Z.). Manual correction of tumour delineations was required in 46 of 144 scans (32%, all but one on the restaging PET scan) exhibiting only low diffuse tracer accumulation in the respective lesion. For the resulting VOIs the MTV and total lesion glycolysis (TLG = MTV × SUVmean) were computed. VOIs were defined and analysed using the ROVER software, version 3.0.34 (ABX GmbH, Radeberg, Germany).
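The volume-based quantities can be sketched in a few lines of Python, assuming a VOI is available as a flat list of voxel SUVs (a hypothetical representation; ROVER's internal data model is not described here). The voxel dimensions are those given in the PET/CT protocol, and all function names are illustrative.

```python
import math

# Voxel dimensions from the reconstruction protocol, in mm.
VOXEL_MM = (5.47, 5.47, 3.27)
VOXEL_ML = VOXEL_MM[0] * VOXEL_MM[1] * VOXEL_MM[2] / 1000.0  # 1 ml = 1000 mm^3


def voi_metrics(voxel_suvs):
    """Return SUVmax, SUVmean, MTV (ml) and TLG = MTV * SUVmean for one VOI."""
    if not voxel_suvs:
        raise ValueError("VOI contains no voxels")
    suv_max = max(voxel_suvs)
    suv_mean = sum(voxel_suvs) / len(voxel_suvs)
    mtv_ml = len(voxel_suvs) * VOXEL_ML
    return {"SUVmax": suv_max, "SUVmean": suv_mean,
            "MTV": mtv_ml, "TLG": mtv_ml * suv_mean}


def min_voxels_for_volume(volume_ml):
    """Smallest voxel count whose total volume reaches volume_ml."""
    return math.ceil(volume_ml / VOXEL_ML)


# Hypothetical three-voxel VOI, for illustration only.
print(voi_metrics([2.0, 4.0, 6.0]))
print(min_voxels_for_volume(5.0))
```

With this voxel size, one voxel corresponds to roughly 0.098 ml, so the 5 ml minimum volume required for the NTO VOI spans several dozen voxels.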

Statistical analysis

Survival analysis was performed with respect to OS, local tumour control (LC), freedom from DM and treatment failure (TF, defined as any recurrence or occurrence of DM) from the start of therapy to death and/or event. Patients who did not keep follow-up appointments and for whom information on survival or tumour status was thus unavailable were censored at the date of the last follow-up examination. The associations between endpoints and clinically relevant parameters (gender, age, tumour grade, T stage, N stage, UICC stage and localization) as well as quantitative PET parameters were analysed using univariate Cox proportional hazards regression in which the PET parameters were included as binarized parameters. The clinical parameter Karnofsky performance status was not included because the values were asymmetrically distributed (Table 1). The cut-off values used for binarization were calculated by performing a univariate Cox regression for each measured value. The value leading to the hazard ratio (HR) with the highest significance was used as the cut-off value. To avoid groups being too small, only values within the interquartile range were considered as potential cut-off values. The cut-off values were separately computed for all endpoints. For cut-off values leading to p < 0.05, a stability test was performed. In this test the range of cut-off values still leading to a significant effect in univariate analysis was computed by successively decreasing/increasing the cut-off value (starting at the optimal value) and repeating the univariate Cox regression.
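The cut-off search and the stability test can be sketched as follows. This is a minimal pure-Python illustration, not the R code used in the study, and it rests on several assumptions that the text does not specify: significance is taken from a Wald test, ties are handled with the Breslow approximation, quartiles come from Python's `statistics.quantiles` default method, the high-risk group is defined as values at or below the cut-off (the convention of Table 5 for ∆SUV NTO), and the stability walk runs over all observed values. All function names and the example data are hypothetical.

```python
import math
import statistics


def cox_univariate(times, events, x, max_iter=100):
    """Fit h(t) = h0(t) * exp(beta * x) for a single covariate.

    Damped Newton-Raphson on the Breslow partial likelihood.
    Returns (hazard_ratio, two_sided_wald_p).
    """
    beta, info = 0.0, 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue  # censored patients contribute only via risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, x):
                if t_j >= t_i:  # patient j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            mean = s1 / s0
            score += x_i - mean
            info += s2 / s0 - mean * mean
        if info < 1e-12:
            return math.exp(beta), 1.0  # no information to compare groups
        step = max(-1.0, min(1.0, score / info))   # damping (assumption)
        beta = max(-20.0, min(20.0, beta + step))  # guard against separation
        if abs(step) < 1e-9:
            break
    z = beta * math.sqrt(info)
    return math.exp(beta), math.erfc(abs(z) / math.sqrt(2.0))


def binarize(values, cutoff):
    """1 = high risk (value <= cutoff), 0 = low risk (assumed direction)."""
    return [1 if v <= cutoff else 0 for v in values]


def best_cutoff(values, times, events):
    """Scan observed values inside the interquartile range; keep the lowest p."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    best = None
    for c in sorted({v for v in values if q1 <= v <= q3}):
        hr, p = cox_univariate(times, events, binarize(values, c))
        if best is None or p < best[2]:
            best = (c, hr, p)
    return best


def stable_range(values, times, events, cutoff, alpha=0.05):
    """Widen the cut-off range around the optimum while p stays below alpha."""
    grid = sorted(set(values))
    lo = hi = grid.index(cutoff)

    def significant(c):
        return cox_univariate(times, events, binarize(values, c))[1] < alpha

    while lo > 0 and significant(grid[lo - 1]):
        lo -= 1
    while hi < len(grid) - 1 and significant(grid[hi + 1]):
        hi += 1
    return grid[lo], grid[hi]


# Hypothetical data: delta-SUV in percent, follow-up in months, event flags.
values = [-30, -20, -15, -10, -5, 0, 5, 10, 20, 30, 40, 50]
times = [6, 30, 9, 14, 40, 11, 35, 50, 22, 55, 48, 60]
events = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0]
cutoff, hr, p = best_cutoff(values, times, events)
print(cutoff, stable_range(values, times, events, cutoff))
```

Because the cut-off is chosen to minimize the p value, the resulting p value is optimistically biased; the stability range gives a rough impression of how sensitive the finding is to that choice.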

Table 1 Patient and tumour characteristics

The probabilities of survival were computed and rendered as Kaplan-Meier curves. The independence of NTO parameters and tumour parameters (PET and clinical) was analysed by multivariate Cox regression. Statistical significance was assumed for p values less than 0.05. Statistical analysis was performed using R version 3.4.3 [19].
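For readers unfamiliar with the product-limit method behind the Kaplan-Meier curves, the following short Python sketch shows the calculation on hypothetical data. It is illustrative only and is not the R code used for the analysis; the function names are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= leaving  # events and censored patients leave the risk set
    return curve


def survival_at(curve, t):
    """Read the step function: survival probability at time t."""
    s = 1.0
    for time, surv in curve:
        if time <= t:
            s = surv
    return s


# Hypothetical follow-up in months; 1 = event, 0 = censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(curve)
print(survival_at(curve, 2.5))
```

Stratified curves are obtained by calling `kaplan_meier` separately for each risk group, for example for patients with ∆SUVmax NTO above and at or below the chosen cut-off.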

Results

The 2-year, 3-year, and 5-year OS rates were 47%, 42% and 27%, respectively. These values are in line with data from current literature [20]. Overall, 74% of the patients died during the observation period. The median follow-up time of the survivors was 63 months (range 52 to 102 months). The rates for LC, freedom from DM, and absence of TF at 5 years were 42%, 69%, and 28%, respectively. In the univariate Cox regression ∆SUVmax NTO and MTV were prognostic factors for all investigated clinical endpoints (OS, TF, LC, DM). In the univariate analysis SUVmax NTO, ∆SUVmean NTO, SUVlesion/1, SUVlesion/2 and ∆SUVlesion were also prognostic for OS; ∆SUVmean NTO, SUVlesion/1, ∆SUVlesion and TLG were prognostic for TF; SUVmax NTO and ∆SUVmean NTO were prognostic for LC; and TLG was prognostic for DM (see Tables 2 and 3). Kaplan-Meier curves for ∆SUVmax NTO and ∆SUVmean NTO are shown in Figs. 1 and 2.

Table 2 Univariate Cox regression analysis with respect to overall survival and treatment failure
Table 3 Univariate Cox regression with respect to local control and distant metastases
Fig. 1 Kaplan-Meier curves for overall survival (OS), absence of treatment failure (TF), local control (LC) and freedom from distant metastases (DM) in patients stratified by ∆SUVmax of non-tumour-affected oesophagus (NTO)

Fig. 2 Kaplan-Meier curves for overall survival (OS), absence of treatment failure (TF), local control (LC) and freedom from distant metastases (DM) in patients stratified by ∆SUVmean of non-tumour-affected oesophagus (NTO)

In the multivariate analysis ∆SUVmax NTO and MTV were included. Confounding factors were UICC stage and SUVlesion/2 for OS, N stage and SUVlesion/1 for TF, SUVlesion/2 for LC, and SUVlesion/1 for DM. In all four analyses ∆SUVmax NTO was an independent prognostic factor (HR = 2.5, p = 0.002, for OS; HR = 2.24, p = 0.025, for TF; HR = 4.75, p < 0.001, for LC; HR = 3.92, p = 0.019, for DM; Table 4). In repeated multivariate analysis with high-risk and low-risk groups resulting from the application of the cut-off values determined by Zschaeck et al. [14] ∆SUVmax NTO was an independent prognostic factor for OS (HR = 1.88, p = 0.038, cut-off 32%), TF (HR = 2.11, p = 0.048, cut-off 0.5%) and DM (HR = 3.02, p = 0.047, cut-off −9.1%). Supplementary Table 1 shows the results of the multivariate analysis using the original cut-off values.

Table 4 Multivariate Cox regression for noncorrelated parameters with at least a trend in univariate testing

Since ∆SUVmax NTO cut-off values varied considerably for different endpoints, cut-off stability tests were performed, which are shown in Supplementary Table 2. As can be seen, the previously published cut-off values are within the cut-off range of ∆SUVmax NTO. Additionally, a cut-off value of 0% (i.e. no increased FDG uptake during treatment) is clearly within the cut-off range of ∆SUVmax NTO for all investigated clinical endpoints and also within the range of ∆SUVmean NTO for OS, TF and LC. The results of univariate Cox regression using this cut-off value for discrimination between high-risk and low-risk patients are shown in Table 5.

Table 5 Univariate Cox regression of ∆SUV NTO parameters. In all cases low risk was defined as ∆SUV NTO >0

Discussion

We investigated the prognostic value of FDG uptake in the NTO during RCT in a cohort of patients treated by a nonsurgical approach for (mostly locally advanced) squamous cell oesophageal cancer. The major finding of our analysis was that FDG uptake in the NTO, especially when measured as the fractional difference between pretherapeutic and restaging (response assessment) PET, has a significant and strong impact on LC, DM and OS. Most interestingly, the prognostic value of FDG uptake in the NTO was found to be independent of other parameters including clinical characteristics and various PET parameters investigated in this study. Additionally, evaluation of FDG uptake in the NTO seems to be relatively robust, as not only did a plethora of parameters show prognostic significance (SUVmax, SUVmean and ∆ values) but also previously published cut-off values could be applied to successfully distinguish between low-risk and high-risk patients in this cohort.

FDG PET/CT is an established imaging biomarker for the evaluation of response to induction chemotherapy for oesophageal adenocarcinoma [21] and the first retrospective data indicate that PET-driven treatment modification might be beneficial in nonresponders [22]. However, data are much more inconsistent for PET-based treatment evaluation after or during neoadjuvant RCT [9, 23,24,25]. Since radiation fields are relatively large and radiation-induced inflammation is seen as early as at the end of the second week of treatment, restaging PET scans are often difficult to interpret due to pronounced inflammation of surrounding tissue. We have recently reported data showing a strong association between increased FDG uptake in the NTO and favourable treatment outcomes in a mixed cohort of patients with squamous cell and adenocarcinoma undergoing trimodality treatment [14]. We postulated that due to its high prognostic value this phenomenon may be used to stratify patients with squamous cell carcinoma (continue RCT to higher cumulative doses if inflammation of the NTO on PET is pronounced versus stop neoadjuvant treatment followed by surgery).

Therefore, we sought to determine if the FDG uptake cut-off values in the NTO could be applied in patients with squamous cell carcinoma treated at another institution with definitive RCT, and having the restaging PET scan at a similar time during radiotherapy. Using previously published cut-off values (optimized for the corresponding group) NTO parameters were still independent prognostic factors for OS, TF and DM. This is a remarkable result, since the two patient groups were markedly different (in terms of tumour site, ethnic background, histology and treatment). However, this only holds for the percentage change in uptake between the staging and restaging PET scans. The results for tracer uptake in the NTO in the restaging PET scan alone were much less convincing.

For ∆SUVmax NTO and ∆SUVmean NTO we found very stable cut-off values (see Supplementary Table 2), meaning that significant effects were found for a wide range of cut-off values for all investigated clinical endpoints (except ∆SUVmean NTO for DM). These cut-off value ranges include not only the previously published values but also a cut-off value of 0, and the application of a cut-off value of 0 still led to clinically relevant effect sizes for both ∆SUVmax NTO and ∆SUVmean NTO (see Table 5). However, it is important to note that ∆SUVmean NTO, like any SUVmean, strongly depends on the VOI delineation. In the current study NTO was delineated manually, and it is well known that manual delineation is prone to interobserver variability. Thus, our results for ∆SUVmean NTO might not be reproducible by other observers. On the other hand, maximum values can be determined unambiguously for a given target structure. Oesophagus delineation (and therefore determination of SUVmax NTO) can be regarded as highly reproducible as it is a common organ at risk in thoracic radiotherapy [26]. Therefore, an obvious interpretation of our results is that ∆SUVmax NTO >0 is the parameter of choice when using tracer uptake in the NTO for risk stratification. This, of course, needs to be confirmed by further investigations.

Our results are in line with those of another recent study in which the prognostic values of PET parameters were validated and which showed that fractional changes are the most robust parameter, especially if patient baseline characteristics differ considerably [27]. Of note, the prognostic significance of baseline MTV and TLG reported recently [15] in patients without concomitant chemotherapy was confirmed in this independent evaluation including only patients with concomitant chemotherapy and with longer clinical follow-up information. The high prognostic impact of MTV has also been found in other studies [28, 29]. A recent study investigated patients undergoing definitive RCT for oesophageal cancer with a restaging FDG PET scan when about 50 Gy radiation dose had been administered. Some patients with an incomplete metabolic response received an additional boost dose of 10–20 Gy. This increased radiation dose led to an improved OS in partial responders compared with the standard dose regimen of 50.4 Gy [30]. Another innovative study combined lymph node and primary tumour responses on PET. Patients with favourable response characteristics had similar outcomes after definitive RCT to those treated with a trimodality approach [31].

These studies indicate that individualized treatment based on PET response evaluation might be a promising approach for definitive RCT in patients with oesophageal cancer. However, as evaluation of tumour response is restricted by RCT-induced inflammation, very conflicting data exist regarding the prognostic significance of restaging tumour PET parameters, as summarized in a recent review by Cremonesi et al. [32]. Besides inflammation, other factors relating to insufficient PET standardization may have led to these contradictory findings, especially as most studies were performed retrospectively. Besides correction of scan time, as performed in our analysis, further standardization calculating tumour-to-blood ratios may further improve the diagnostic accuracy of PET restaging [16, 33, 34]. The combination of optimal baseline and restaging tumour and NTO PET parameters for treatment stratification is currently under evaluation in a retrospective multicentre study (ZSF201720) that aims to identify the ideal combination of tumour and nontumour PET parameters to guide treatment decisions.

Although our study had several limitations that are inherent to retrospective analyses and due to the limited proportion of patients analysed, the study’s findings increase the evidence for the prognostic relevance of radiation-induced uptake in non-tumour tissues. The underlying biological reason for this phenomenon is unknown but may be related to similar genetically determined radiosensitivity of tumours and tissue of tumour origin, radiation-induced immunological reactions or both, as discussed previously [35]. Another limitation of this study includes the different radiation treatment techniques that were used. However, two thirds of the patients received intensity-modulated radiotherapy, and retrospective data suggest that there are no survival differences between 3D conformal and intensity-modulated techniques, and also some planning studies have shown only slight differences in normal tissue radiation exposure between the two techniques [36,37,38].

Using established PET parameters including SUV, MTV and TLG, both metabolic tumour parameters and interim PET data of the NTO deliver independent prognostic information. Further analysis of the tumour by textural analysis and deep learning algorithms has the potential to improve the stratification between low-risk and high-risk patients considerably [39]. However, these methods have several drawbacks and need to be validated in larger cohorts of patients. Furthermore, a recent study has shown that textural analysis may only have limited additional prognostic value [40]. Due to these and other limitations textural analysis was not performed in the current study but will be the subject of ongoing research. Additionally, further analysis of textural features may be able to identify patients with high radiation-induced uptake in the NTO on CT imaging as suggested by a recent study [41].

Conclusion

The results of our study indicate that inclusion of the measurement of FDG uptake in the NTO on PET/CT allows discrimination between high-risk and low-risk patients in a multicentre setting with heterogeneous patient, tumour and treatment characteristics. Additionally, NTO parameters were independent of tumour parameters and therefore provide important additional prognostic value. In particular, ∆SUVmax NTO is a potential biomarker for risk stratification. Further investigations are necessary to confirm these promising results.