Introduction

Neoadjuvant therapy (NT) is the standard treatment for patients with locally advanced gastro-oesophageal cancer (GEC) to improve local disease control, making radical surgery possible and improving survival [1]. The response to NT is variable and highly individual [2]. Non-responsiveness to NT is associated with a worse prognosis [3], related to therapy-induced side effects and delay to surgery [1]. It is therefore essential to distinguish patients who respond to conventional treatment from those who do not, preferably early during NT, so that patient-tailored treatments can be offered to improve outcome.

Response to NT is difficult to assess: clinical response remains undefined and correlates poorly with survival [4]; dimensional response is often used in clinical practice, but it is a relatively late event that may not precisely reflect the residual viable tumour because of the presence of fibrotic/necrotic tissue. Moreover, the reproducibility of tumour size assessment in GEC may be affected by visceral distension and cancer volume changes during NT. Most trials evaluated fluorodeoxyglucose (FDG) positron-emission tomography to assess the response to NT, with conflicting results [2, 5–7]. In addition, about 20–25 % of oesophageal carcinomas and 40 % of gastric carcinomas are not FDG-avid [8, 9] and, therefore, metabolic response assessment is impossible in these tumours [9]. Endoscopic ultrasound has shown high accuracy during the initial staging of GEC, but no experience of the assessment of response to NT has been reported. Therefore, there is a need to define a universally accepted technique to evaluate responsiveness to treatment.

Diffusion-weighted imaging (DWI) is based on the degree of mobility of water protons, quantifiable by the apparent diffusion coefficient (ADC). The ADC measures the degree of free diffusion of water molecules within tissues, which is mainly influenced by cell organisation, size and density. Cell death leads to a loss of cell membrane integrity and density, which results in an increase in ADC values. This explains why the ADC has recently emerged as a potential biomarker of the response to cancer therapy [10, 11]. Although the application of DWI to predict and monitor treatment response has been investigated in different types of neoplasm [12–16], no data in the literature have reported the correlation between ADC modifications and objective histological parameters of treatment response or investigated GEC response to NT using DWI.

Tumour regression grade (TRG) is a five-grade scoring system, based on the percentage of viable residual neoplastic cells in relation to fibrosis/necrosis, that proved to be a prognostic marker for patients affected by locally advanced oesophageal [17] and rectal cancer [18, 19].

The aim of our study was to determine if DWI features can help to define GEC responsiveness to NT and to assess whether changes in ADC values after NT correlate with the histopathological response expressed by TRG.

Materials and methods

Our investigation followed the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines. Our institutional review board approved this prospective study and written informed consent was obtained from all patients.

Patients

Between November 2009 and May 2012, 32 consecutive patients affected by biopsy-proven GEC (24 men, 8 women; mean age, 60 years; age range, 33–76 years) were enrolled and subjected to the following protocol: (1) pre-NT 1.5-T magnetic resonance imaging (MRI) including DWI; (2) in case of locally advanced disease (T ≥ 3 or suspected positive lymph nodes) patients underwent NT; (3) post-treatment MRI; (4) radical surgery with histopathological evaluation, including TRG.

The inclusion criteria were: (1) histological diagnosis of GEC; (2) no contraindications to surgery; (3) no contraindications to NT; (4) written informed consent; (5) time period between second MRI and surgery up to 30 days.

The exclusion criteria were: (1) previous NT; (2) stage IV disease; (3) peritoneal seeding (demonstrated by peritoneal washing); (4) MRI contraindications.

Lesion locations were: middle oesophageal third (4/32, 12.50 %), distal oesophageal third (3/32, 9.38 %), gastro-oesophageal junction (GEJ) (9/32, 28.12 %), divided according to the Siewert classification [20] into Siewert I (2/32, 6.25 %), Siewert II (1/32, 3.12 %) and Siewert III (6/32, 18.75 %), and gastric (16/32, 50 %: fundus 4/32, 12.50 %; angulus 1/32, 3.12 %; antrum 5/32, 15.62 %; lesser and greater curvature 6/32, 18.75 %). The final pathological diagnosis, obtained in all cases from the surgical specimen, was squamous cell carcinoma in 6 out of 32 patients (18.75 %) and adenocarcinoma in 26 out of 32 (81.25 %).

Neoadjuvant treatment

Patients with oesophageal and Siewert I lesions were treated with combined radio-chemotherapy: 50.4 Gy/28 fractions and intravenous injection of cisplatin (75–100 mg/m2 of body surface area/day/28 days). Before radiotherapy, patients with oesophageal carcinomas underwent two cycles of cisplatin (60 mg/m2) and 5-fluorouracil (200 mg/m2/day). Patients with Siewert II–III or gastric adenocarcinomas received three preoperative cycles of intravenous cisplatin (60 mg/m2) and epiadriamycin (50 mg/m2) every 21 days, and continuous infusion of 5-fluorouracil (200 mg/m2/day) or oral capecitabine (1,250 mg/m2/day) for 21 days. Median time between the end of NT and MR was 10 ± 3 and between post-NT MR and surgery/histology was 8 ± 7 days. MRI was performed between 7 and 15 (10 ± 2) days after the end of NT and 21 ± 6 before surgery.

Histological reference

Surgical specimens were evaluated by a dedicated pathologist (L.A.), experienced in GEC. TRG score was adopted to grade therapeutic response [17]: TRG 1 corresponds to complete response without histologically residual cancer and extensive fibrotic reaction; in TRG 2 and 3 the fibrosis is higher than the neoplastic cellularity; in TRG 4 the residual tumour is outgrowing fibrosis and TRG 5 shows a complete absence of NT response.

MRI technique

All patients were examined using a 1.5-T MR system (Achieva; Philips Medical Systems, Best, The Netherlands), using a five-channel phased-array cardiac coil positioned according to tumour location. Visceral distension was obtained by oral administration of 300–500 ml water and Ferumoxsil (Lumirem; Guerbet, Roissy, France); intramuscular injection of scopolamine-butylbromide (20 mg, Buscopan; Boehringer, Ingelheim, Germany) was performed after patient positioning, in the absence of contraindication.

The MRI protocol (Table 1) consisted of T2-weighted multiplanar single-shot fast spin-echo sequences, with and without fat suppression, T2-weighted fast spin-echo with cardiac and respiratory gating, DWI using single-shot echo planar imaging with cardiac and respiratory gating (b factors of 0 and 600 s/mm2), and dynamic T1-weighted 3D gradient-echo with fat suppression during intravenous injection of 0.1 ml/kg body weight of gadobutrol (Gadovist; Bayer Schering Pharma, Berlin, Germany) with an automatic injector (Spectris MR; Medrad Europe, Maastricht, The Netherlands) at a rate of 2 ml/s. Total imaging time was approximately 40 min.

Table 1 MRI standard protocol

MRI and ADC analysis

MR images were analysed by two experienced radiologists (F.D.C. and M.C., with 18 and 5 years of experience in abdominal MRI, respectively), blinded to clinical information and histopathological results.

The ADC maps were calculated with a dedicated workstation (Viewforum; Philips Medical Systems, Best, The Netherlands). Image quality was sufficient to evaluate ADC in all patients. ADC values were obtained from regions of interest (ROIs) traced on lesion borders on T2-weighted images and automatically transferred to the ADC map, section by section. ROIs were traced around the entire lesion and, in the case of necrotic components, only around the solid components identified on the contrast-enhanced images. Mean tumour ADC values were calculated by averaging the tumour ROI ADC from each of the sections. Tumour volumes (V) were automatically calculated on a second remote multi-technique workstation (Vitrea Vital, Minnetonka, MN, USA) by summing the cross-sectional volumes obtained by manually tracing the lesion borders on each section, considering both the T2-weighted images and the contrast-enhanced dynamic study.
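As a rough illustration of the quantities described above, the following Python sketch assumes the standard two-point mono-exponential model, ADC = ln(S0/Sb)/b, for the two b factors of the protocol (0 and 600 s/mm2), and then averages per-section ROI means into a single tumour value. The algorithm actually used by the vendor workstation is not described here, so this is an assumption; the function names and sample numbers are hypothetical.

```python
import math
from statistics import mean

def adc_two_point(s0, sb, b=600.0):
    """ADC in mm^2/s from signals at b = 0 and b = `b` s/mm^2.

    Assumes a mono-exponential signal decay (an assumption of this sketch).
    """
    if s0 <= 0 or sb <= 0 or b <= 0:
        raise ValueError("signals and b value must be positive")
    return math.log(s0 / sb) / b

def mean_tumour_adc(section_roi_means):
    """Unweighted mean of the per-section ROI ADC values."""
    if not section_roi_means:
        raise ValueError("at least one section is required")
    return mean(section_roi_means)

# Hypothetical voxel: signal falls from 1000 to 1000*exp(-0.9) at b = 600 s/mm^2
voxel_adc = adc_two_point(1000.0, 1000.0 * math.exp(-0.9))

# Hypothetical per-section ROI means, in units of 10^-3 mm^2/s
tumour_adc = mean_tumour_adc([1.10, 1.25, 1.28])
```

Averaging section means without weighting by ROI area follows the wording of the text; an area-weighted mean would be a different design choice.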

Readers determined: pre-NT ADC; post-NT ADC; percentage changes in ADC (ΔADC), calculated as: \( \Delta\mathrm{ADC} = \frac{\mathrm{ADC}_{\mathrm{post\text{-}NT}} - \mathrm{ADC}_{\mathrm{pre\text{-}NT}}}{\mathrm{ADC}_{\mathrm{pre\text{-}NT}}} \times 100 \); pre-NT V; post-NT V; percentage changes in volume (ΔV), calculated as: \( \Delta\mathrm{V} = \frac{\mathrm{V}_{\mathrm{post\text{-}NT}} - \mathrm{V}_{\mathrm{pre\text{-}NT}}}{\mathrm{V}_{\mathrm{pre\text{-}NT}}} \times 100 \).
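The percentage-change definition can be checked with a few lines of Python. This is only a worked example: the helper name is hypothetical, and the input values are the mean ADC values quoted in the captions of Figs. 5 and 6.

```python
def percent_change(pre, post):
    """Percentage change relative to the pre-NT value: (post - pre) / pre * 100."""
    if pre == 0:
        raise ValueError("pre-NT value must be non-zero")
    return (post - pre) / pre * 100.0

# Mean ADC values (x 10^-3 mm^2/s) quoted in the captions of Figs. 5 and 6
delta_adc_fig5 = percent_change(1.21, 2.11)  # responder, TRG 2
delta_adc_fig6 = percent_change(1.73, 1.58)  # non-responder, TRG 4
```

The first case gives an increase of roughly 74 % and the second a decrease of roughly 9 %; the same function applies unchanged to tumour volumes for ΔV.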

Statistical analysis

Continuous variables are presented as mean ± standard deviation. Categorical data are presented as frequencies and percentages. Interobserver agreement in measuring ADC was evaluated by means of Spearman’s correlation coefficient and the intraclass correlation coefficient (ICC). Confidence intervals (CIs) at the 0.95 level were estimated by bootstrap with the adjusted percentile method. Measurements were then averaged between the two observers for further analyses. Differences between the means of responder and non-responder patients were tested by means of t-type test statistics. Receiver operating characteristic (ROC) curve analysis was performed to determine the overall performance of pre- and post-NT ADC values, ΔADC and ΔV in differentiating responders from non-responders. The optimal cut-off was selected as the one minimising the distance between the curve and the point of ideal performance (sensitivity = 1; specificity = 1). P values were computed by means of permutation methods to avoid any distributional assumption.
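The permutation approach to P values can be sketched as follows. This is a minimal Python illustration, not the R code used for the analysis: it takes the absolute difference in means as the test statistic (a simplification of the t-type statistics mentioned above), and the function name, the number of permutations and the sample values are all hypothetical.

```python
import random

def _mean(values):
    return sum(values) / len(values)

def permutation_p_value(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided Monte Carlo permutation P value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(_mean(group_a) - _mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled values
        diff = abs(_mean(pooled[:n_a]) - _mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)

# Hypothetical ΔADC values (%) for responders and non-responders
responders = [25.0, 31.5, 18.2, 40.1, 22.7, 35.0]
non_responders = [-5.0, 3.2, -8.7, 1.5, 6.0, -2.1]
p_value = permutation_p_value(responders, non_responders)
```

Because group labels are reshuffled rather than a reference distribution being assumed, the resulting P value does not depend on normality of the measurements.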

The probability of each patient’s outcome was estimated by fitting a logistic model. The model optimising the Akaike Information Criterion (AIC) was selected via stepwise selection and retained tumour localisation and ΔADC as independent variables.

Sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV) and accuracy of the optimal model were evaluated by leave-one-out cross-validation.
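The leave-one-out procedure itself is simple to sketch. The Python example below is illustrative only: in place of the logistic model it uses a deliberately simple stand-in classifier (a threshold halfway between the two class means of a single variable), and all names and data are hypothetical.

```python
def fit_midpoint_threshold(xs, ys):
    """Stand-in model: threshold halfway between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0

def leave_one_out_predictions(xs, ys, fit=fit_midpoint_threshold):
    """Predict each case with a model fitted on all the other cases."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # every case except the i-th
        train_y = ys[:i] + ys[i + 1:]
        threshold = fit(train_x, train_y)
        predictions.append(1 if xs[i] > threshold else 0)
    return predictions

# Hypothetical ΔADC values (%) and outcomes (1 = responder, 0 = non-responder)
delta_adc = [35.0, 22.0, 48.0, 15.0, 5.0, -8.0, 10.0, -3.0]
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = leave_one_out_predictions(delta_adc, outcome)
loo_accuracy = sum(p == y for p, y in zip(predicted, outcome)) / len(outcome)
```

In this toy data set the responder with ΔADC = 15 % is misclassified once it is held out, because the threshold fitted on the remaining cases moves above it; this is the kind of optimism that cross-validation is meant to expose.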

To assess the association between ΔADC and TRG values and between ΔV and TRG values, Spearman’s rank correlation coefficients were calculated. All statistical analyses were performed in the R environment. P values < 0.05 were considered statistically significant.
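Because TRG takes only five values, ties are unavoidable and the rank correlation has to handle them. A minimal Python sketch, assuming the usual definition of Spearman’s rho as the Pearson correlation of tie-averaged ranks, is given below; the function names and the eight data points are hypothetical and are not study data.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of tie-averaged ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical TRG grades (1-5) and ΔADC values (%) for eight patients
trg = [1, 2, 2, 3, 3, 4, 4, 5]
delta_adc = [60.0, 45.0, 38.0, 30.0, 20.0, 5.0, -2.0, -10.0]
rho = spearman_rho(trg, delta_adc)
```

For these made-up values rho is strongly negative, mirroring the direction (not the magnitude) of the association examined in this study.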

Results

Clinicopathological analysis

All patients underwent radical surgery after NT: nine patients (9/32; 28.12 %) had an Ivor Lewis oesophagectomy and the others (23/32; 71.87 %) a total gastrectomy.

Regarding TRG, the results of histopathological analysis were: 2/32 (6.25 %) TRG 1, 4/32 (12.5 %) TRG 2, 11/32 (34.37 %) TRG 3, 9/32 (28.12 %) TRG 4, and 6/32 (18.75 %) TRG 5. According to TRG, patients were divided into responders (n = 17: TRG 1-2-3) and non-responders (n = 15: TRG 4-5).

MR analysis

ICC and Spearman’s correlation coefficient between the two readers were calculated for pre- and post-NT ADC values. The interobserver reproducibility was very good both for pre-NT (Spearman’s rho = 0.8160, CI = 0.5646–0.9313; ICC = 0.8993, CI = 0.8070–0.9586) and post-NT (Spearman’s rho = 0.8357, CI = 0.5981–0.9320; ICC = 0.8663, CI = 0.7618–0.9276) measurements (Fig. 1).

Fig. 1 Bland-Altman plots representing the interobserver reproducibility between the two readers for (a) pre-NT and (b) post-NT ADC values
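For readers unfamiliar with Bland-Altman analysis, the quantities drawn in such a plot can be computed as in the Python sketch below. It assumes the conventional definition (bias as the mean of the paired differences, limits of agreement as bias ± 1.96 standard deviations); the function name and the readings are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(reader_1, reader_2):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(reader_1) != len(reader_2) or len(reader_1) < 2:
        raise ValueError("need two paired series with at least two values")
    diffs = [a - b for a, b in zip(reader_1, reader_2)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the differences
    return bias, bias - spread, bias + spread

# Hypothetical ADC readings (x 10^-3 mm^2/s) from two readers
r1 = [1.20, 1.55, 1.80, 1.35, 2.05]
r2 = [1.25, 1.50, 1.85, 1.30, 2.00]
bias, lower, upper = bland_altman(r1, r2)
```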

At baseline (pre-NT) MR analysis, no significant differences in ADC values were found either between gastric and oesophageal cancers or between the two histotypes (squamous cell carcinoma and adenocarcinoma). After NT, ADC increased more significantly in oesophageal tumours and in squamous cell carcinomas (Table 2). No differences in tumour volumes or volume changes were found between responders and non-responders; conversely, significant differences were found for ADC (Table 3). Pre-NT ADC values were significantly lower in responders and increased significantly after NT. These results were similar when the two tumour sites were analysed separately, demonstrating consistent results for gastric and oesophageal cancer (Table 4).

Table 2 Mean values of pre- and post-NT tumour volumes and ADC, ΔV, and ΔADC for the overall population, for the oesophageal and gastric tumours and for the different histological subgroups
Table 3 Mean values of pre- and post-NT tumour volumes and ADC, ΔV, and ΔADC considering the responder and non-responder patients
Table 4 ΔV and ΔADC values analysed considering the tumour site

Table 5 shows the results for the multivariate logistic regression analysis.

Table 5 Multivariate logistic regression analysis fitted for the patients’ outcome prediction based on leave-one-out cross-validation

A highly significant, strong inverse correlation was found between ΔADC and TRG values (r = −0.71, P = 0.000004; Fig. 2), while no evidence of correlation between ΔV and TRG was found (r = −0.02, P = 0.883; Fig. 3).

Fig. 2 Spearman’s rank correlation line between tumour regression grade (TRG) and ΔADC showing strong inverse correlation. All patients, except one, showing a decrease in ADC values after treatment correspond to TRG 4 and 5, while patients with an increase in ADC values correspond to lower TRG grades

Fig. 3 Spearman’s rank correlation line between TRG and ΔV showing the absence of correlation

The ROC curves in Fig. 4 show that responders cannot be reliably differentiated from non-responders on the basis of pre-NT ADC values or ΔV. With a cut-off for pre-NT ADC below 1.50 × 10−3 mm2/s, responders are detected with a sensitivity of 35.29 %, specificity of 60 %, PPV of 50 %, NPV of 45 % and accuracy of 46.87 % (AUC = 0.688; P = 0.070; Fig. 4a). A volume decrease of >57 % identifies responders with similarly poor performance (sensitivity = 35.29 %, specificity = 66.66 %, PPV = 54.54 %, NPV = 47.61 %, accuracy = 50 %; AUC = 0.6431; P = 0.635; Fig. 4b).

Fig. 4 Analysis of ROC curves to find an optimal cut-off to distinguish responders from non-responders on the basis of the pre-neoadjuvant treatment (NT) ADC value (a), ΔV (b), post-NT ADC value (c) and ΔADC (d)
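The cut-off criterion used for these curves, the point closest to ideal performance (sensitivity = 1, specificity = 1), can be illustrated with a short Python sketch. The function name and the nine data points are hypothetical, and the sketch assumes that higher values indicate a responder; for an index where lower values indicate response, such as pre-NT ADC, the scores would be negated first.

```python
def closest_to_corner_cutoff(scores, labels):
    """Cut-off minimising the distance to (sensitivity = 1, specificity = 1).

    A case is called positive when its score is >= the cut-off.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        dist = ((1 - sens) ** 2 + (1 - spec) ** 2) ** 0.5
        if best is None or dist < best[0]:
            best = (dist, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical ΔADC values (%) and outcomes (1 = responder, 0 = non-responder)
delta_adc = [35.0, 22.0, 48.0, 10.0, 28.0, 5.0, -8.0, 18.0, -3.0]
outcome = [1, 1, 1, 1, 1, 0, 0, 0, 0]
cutoff, sens, spec = closest_to_corner_cutoff(delta_adc, outcome)
```

Only observed values are tried as candidate cut-offs, which is sufficient because sensitivity and specificity change only at those values.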

ROC analysis of post-NT ADC values yielded a cut-off of 1.84 × 10−3 mm2/s: patients with post-NT ADC values above this cut-off are responders (sensitivity = 70.6 %, specificity = 80 %, PPV = 80 %, NPV = 70.6 %, accuracy = 75 %; AUC = 0.837; P = 0.0007; Fig. 4c). Therefore, post-NT ADC values may help to discriminate responders from non-responders.

For ΔADC, patients with an ADC increase of more than 13.6 % are responders (sensitivity = 88.2 %, specificity = 86.7 %, PPV = 88.2 %, NPV = 86.7 %, accuracy = 87.5 %; AUC = 0.909; P = 0.000001; Fig. 4d).
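The reported ΔADC performance figures are mutually consistent, as the Python sketch below shows. The underlying 2 × 2 counts are not reported in the text; the counts used here are inferred from the group sizes (17 responders, 15 non-responders) and the published percentages, so they should be read as an assumption. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV, NPV and accuracy, as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
        "accuracy": 100.0 * (tp + tn) / (tp + fn + tn + fp),
    }

# Counts inferred (not reported) for the ΔADC cut-off:
# 15 of 17 responders and 13 of 15 non-responders correctly classified
metrics = diagnostic_metrics(tp=15, fn=2, tn=13, fp=2)
rounded = {name: round(value, 1) for name, value in metrics.items()}
```

With these counts the rounded values match those quoted above: 88.2, 86.7, 88.2, 86.7 and 87.5 %.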

Discussion

This study investigated volumetric modifications and changes in the water diffusivity of gastro-oesophageal cancer after NT and was aimed at assessing their accuracy in differentiating between responders and non-responders, considering the histological TRG as the standard of reference.

Volume reduction after NT of responders and non-responders was not significantly different, and no correlation was found between ΔV and TRG, confirming that dimensional criteria alone are not good indicators of treatment response in GEC. This result may be explained by different factors: first, volume analysis can be affected by tumour shape irregularity, as already stated for rectal cancer [21], and by different grades of visceral distension; second, dimensional criteria are unable to differentiate residual viable tumour from fibrosis.

To the best of our knowledge, this is the first study to show a strong inverse correlation between ΔADC and TRG; in our opinion, this is because ADC provides specific information about tumour structure and cellular density. Our results are in line with those of Lambrecht et al. [22], who observed in 20 patients with locally advanced rectal cancer (LARC) that volumetric measurements showed lower PPV and NPV than ADC measurements in predicting complete response to therapy.

Responders showed significantly lower pre-NT ADC than non-responders. To the best of our knowledge, no similar data have been published on gastric cancer; regarding oesophageal lesions, we found only one study [23], with apparently opposite results, stating that patients with high pre-NT ADC showed a better survival rate and a better response to chemo/radiotherapy than patients with lower ADC. However, in the study by Aoyagi et al. [23], patients were divided into responders and non-responders using RECIST criteria as the reference standard instead of the histological marker of response (TRG) employed in our study; moreover, the two studies are poorly comparable because of a substantial difference in tumour histology. Our results are consistent with several previous studies on rectal cancers [21, 22, 24, 25]: Lambrecht et al. [22] reported a pre-treatment ADC of 0.92 ± 0.12 × 10−3 mm2/s for responders, significantly lower than that of the non-responders (1.19 ± 0.22 × 10−3 mm2/s). Also, Sun et al. [24], in 37 LARCs, observed that the pre-NT ADC of the T-downstaged group (1.07 ± 0.13 × 10−3 mm2/s) was lower than that of the T-non-downstaged group (1.19 ± 0.15 × 10−3 mm2/s), and Dzik-Jurasz et al. [25], in a study including 14 LARCs, showed a significant inverse correlation between pre-NT ADC values and tumour response. A similar correlation was also found in different types of neoplasms and metastases [26–29]: the most accredited hypothesis remains that high pre-treatment ADC values are related to the presence of necrotic components, poor perfusion and a hypoxic environment, leading to a reduced sensitivity to NT. Therefore, DWI shows the potential to identify pre-treatment features affecting the tumour response to NT.

After treatment, all responders, except one, showed a significant increase in the ADC values (Fig. 5), with higher values than non-responders (Fig. 6). This increase can be explained by the loss of tumour structure integrity in response to therapy, with apoptosis, necrosis and cellularity reduction. Our results are consistent with those of a previous study [30] that included different types of neoplasms, among them oesophageal and gastric cancer. Rising ADC values following successful NT have also been observed in different cancer types, such as liver metastases [26], breast cancer [31], cervical cancer [14], soft tissue sarcomas [29] and head and neck lesions [32], but only a few studies correlated ADC changes with histopathological parameters; most studies used lesion size modifications as the reference, according to the RECIST [33]. GEC is not considered “measurable” by RECIST, so theoretically it cannot be assessed by those criteria; moreover, one-dimensional measurement of gastric wall thickness is critically dependent on stomach distension during examination [1], which severely limits measurement reproducibility. Considering the difficulty of morphological evaluation in differentiating the viable residual tumour from fibrosis, we thought it was more appropriate to consider the TRG as an indicator of response to treatment.

Fig. 5 Axial T2 image showing a lesion of the middle oesophageal wall (a); we drew an ROI on DWI (b value, 600 s/mm2) (b) along the lesion border, and then copied it to the ADC map (c), calculating a pre-NT ADC value of 1.21 ± 0.26 × 10−3 mm2/s. Restaged after NT, we observed a reduction in wall thickening (d); drawing an ROI (b value, 600 s/mm2) (e) and then copying it to the map, we observed a significant rise in the ADC values (2.11 ± 0.33 × 10−3 mm2/s) (f). This patient was found to be a responder with TRG 2

Fig. 6 Axial T2 with (a) and without (b) fat suppression showing a lesion of the subcardial region extended to the gastric fundus; we drew an ROI on DWI (b value, 600 s/mm2) along the lesion border (c) and then copied it to the ADC map (d), calculating a pre-NT ADC value of 1.73 ± 0.29 × 10−3 mm2/s. Restaged after NT, we observed a slight reduction of the wall thickening (e, f); drawing an ROI (b value, 600 s/mm2) (g) and then copying it to the map, we observed no significant rise in the ADC values (1.58 ± 0.23 × 10−3 mm2/s) (h). This patient was found to be a non-responder with TRG 4

Our calculated optimal post-NT ADC and ΔADC cut-offs for the definition of responders were 1.84 × 10−3 mm2/s and 13.63 %, respectively. DWI could represent a reliable tool to detect tumour necrosis, as necrosis increases ADC values. This consideration is fundamental to the clinical outcome because necrosis indicates therapeutic efficacy. The relationship between early ADC changes and clinical outcome seems more relevant than ADC changes after a longer treatment period; this conclusion could be due to tissue repair mechanisms such as the decrease in oedema and the organisation of necrosis, showing that ADC changes may represent a good indicator of early evaluation of clinical response.

There are some limitations to this study: first, the small population examined; second, the different NT regimens used; third, the presence of only one time point of imaging assessment, at the end of treatment. In future studies, it would be interesting to assess patients at different time points during NT and necessary to investigate the usefulness of DWI in the assessment of early treatment response, as suggested in many trials [26, 34–36].

The 7th edition of the TNM classification [37] clearly segregates Type I to III gastro-oesophageal cancers from gastric malignancies, and the NT of these two entities is different; at the same time, even squamous cell cancers require a different neoadjuvant approach from adenocarcinomas, making our study population heterogeneous.

Interestingly, no significant differences in ADC values were found at baseline (pre-NT) MR with regard to location (gastric vs oesophageal) or histotype (squamous cell carcinoma vs adenocarcinoma). After NT, ADC increased more significantly in oesophageal tumours and in squamous cell carcinomas, suggesting that these tumours respond better to NT.

Considering the different anatomical sites of the tumours, the same statistically significant differences in ΔADC for the entire group and also for both oesophageal and gastric cancers were found; this result was not found for ΔV, suggesting that ADC changes could be more reliable than dimensional criteria in the assessment of NT.

From the histopathological point of view, we divided our population into two groups, obtaining statistically significant differences for ΔADC and not for ΔV in adenocarcinomas, while no statistically significant differences were found for either ΔADC or ΔV in squamous cell carcinomas; in our opinion, this result is due to the small number of patients affected by squamous cell carcinoma (6/32, 18.75 %).

For this reason, we combined the two populations to carry out statistical analysis and obtain statistically significant results.

In conclusion, ADC changes can be considered a reliable non-invasive indicator of GEC response to NT when compared with histological assessment based on TRG. Moreover, our results suggest that patients with lower pre-NT ADC values have a greater chance of responding to NT, but the pre-NT ADC value alone is a poor predictor of NT response in the individual patient.

These findings open up a new window of opportunity in the assessment of NT response in patients affected by GEC, in order to provide tailored treatment regimens.