Introduction

Neoadjuvant therapy (NT) has been widely accepted as a standard treatment option for locally advanced esophageal carcinoma [1]. With the development of paclitaxel, neoadjuvant chemotherapy (NAC) has become more widely used in East Asian countries [2]. Many studies have reported that NAC can reduce tumor size, increase surgical resectability, and improve the prognosis of patients with esophageal squamous cell carcinoma (ESCC) [2, 3]. However, the response to NAC varies widely [4]. The prognosis of patients who do not respond to NAC is worse than that of responders, partially due to therapy-induced side effects and the delay in surgical resection [5]. Therefore, it is essential to accurately predict the response of patients with esophageal carcinoma to NAC. This will help clinicians optimize therapeutic options by modifying or discontinuing ineffective treatment.

At present, the evaluation of the response to NAC for esophageal carcinoma mainly relies on the measurement of tumor size by traditional modalities such as endoscopic ultrasonography (EUS) and computed tomography (CT), which have shown high accuracy in the diagnosis and staging of esophageal carcinoma. However, they are limited in distinguishing between residual tumor, fibrosis, and inflammation [6, 7]. In addition, traditional imaging modalities cannot evaluate therapeutic efficacy until at least 4 weeks after the end of therapy, and cannot predict early response to NT. Therefore, to date, assessment of response to NT is still unsatisfactory. Fluorine-18 fluorodeoxyglucose positron emission tomography-computed tomography (18F-FDG PET/CT) has shown conflicting results and insufficient accuracy for response evaluation, along with potentially high radiation exposure and increased cost [8, 9]. It is essential to define a safe, non-invasive, and effective imaging modality to assess and predict the response to NAC accurately.

Functional MR imaging techniques can reflect the biological and microstructural characteristics of tumors, so they can quantify therapy-induced changes before anatomic variation in tumor size occurs. Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) can predict early response to chemoradiotherapy in esophageal carcinoma [10, 11]. However, DCE-MRI requires complex pharmacokinetic modelling to calculate the vascular volume fraction of angiogenesis [12]. Diffusion-weighted imaging (DWI) is a functional imaging technique that is widely used in characterizing malignant tumors and in monitoring and predicting therapeutic efficacy [13,14,15,16]. However, the DWI-derived apparent diffusion coefficient (ADC) cannot separate diffusion and perfusion information [13].

Intravoxel incoherent motion (IVIM) can differentiate and quantify pure water molecular diffusion and microcirculatory perfusion of the tissue through three parameters calculated by the bi-exponential model: true diffusion coefficient (D), pseudodiffusion coefficient (D*), and pseudodiffusion fraction (f) [17, 18]. The IVIM technique is increasingly being investigated and has been proven to be more valuable than the ADC value in tumor evaluation and in prediction of therapeutic efficacy of NT [19,20,21,22,23,24,25]. However, compared with the conventional DWI sequence, the IVIM sequence requires higher quality imaging and higher signal-to-noise ratio (SNR). To date, research on the application of IVIM sequence in esophageal cancer is still scarce [26].
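For reference, the bi-exponential IVIM model and the mono-exponential model underlying the ADC are commonly written as below. The symbols S(b) (signal at diffusion weighting b) and S_0 (signal at b = 0) are notation introduced here for illustration; they do not appear in the original text.

```latex
% Bi-exponential IVIM model (perfusion and diffusion compartments)
\frac{S(b)}{S_0} = f \, e^{-b D^{*}} + (1 - f) \, e^{-b D}

% Mono-exponential model used for the conventional ADC
\frac{S(b)}{S_0} = e^{-b \cdot \mathrm{ADC}}
```

Because D* is typically much larger than D, the first term decays rapidly with b, so signal at low b values carries most of the perfusion information while signal at high b values is dominated by true diffusion.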

Although functional MR imaging techniques show powerful capabilities in assessment of tumor microenvironment, histopathological evaluation remains the gold standard. Based on the percentage of viable residual neoplastic cells in relation to fibrosis/necrosis, tumor regression grade (TRG) can classify tumor pathological response after NT. Compared with tumor size and pathologic lymph node status, only TRG can predict disease-free survival in patients with esophageal cancer, and TRG should be considered when evaluating therapeutic efficacy [27]. Because TRG classification requires postoperative pathological specimens, it cannot be used to predict treatment response to NT before treatment.

The primary purpose of our study was to investigate the value of IVIM-DWI sequence for the assessment and prediction of pathologic response to NAC in locally advanced ESCC.

Materials and methods

Study population

This prospective study was approved by the Ethics Committee of our hospital, and all participants provided written informed consent. Between September 2015 and May 2018, sixty-three consecutive patients pathologically diagnosed with locally advanced ESCC who underwent NAC followed by radical resection were prospectively enrolled.

The inclusion criteria were (1) histological diagnosis of ESCC (stage III); (2) written informed consent; (3) pre-NAC and post-NAC magnetic resonance imaging (MRI) including the IVIM sequence; (4) no previous history of any malignancy or anticancer treatment; (5) radical surgery for ESCC, with histopathological evaluation.

The exclusion criteria were (1) MRI contraindications; (2) contraindications to NAC; (3) uncompleted NAC; (4) inferior quality of IVIM images (e.g., numerous motion artifacts, low SNR) that would be considered insufficient for further analysis.

Neoadjuvant chemotherapy protocol

All patients underwent two cycles of standard NAC regimen, which consisted of weekly intravenous administration of nedaplatin (50 mg/m2) and paclitaxel (120~150 mg/m2). Radical surgery was performed 3 to 4 weeks (median 23 days) following the completion of NAC.

MRI techniques and imaging analysis

All patients underwent two series of MR scanning: within 1 week (median 3 days) before NAC, and 2 to 3 weeks (median 16 days) after completion of NAC, prior to surgery. The IVIM sequence was implemented during the same MR examination. All examinations were performed on a 3-T MR system (MAGNETOM Skyra, Siemens Healthcare) with an anterior 18-element body coil and in-built posterior 32-element spine coil array. Before the examination, all patients were carefully trained to breathe shallowly and slowly to reduce respiratory motion artifacts. Raceanisodamine hydrochloride injection (Ningbo Dahongying Pharmaceutical Co.) at a dose of 10 mg was injected intramuscularly 15–20 min before MRI scanning to decrease esophageal peristalsis.

The MRI protocol (detailed parameters in Table 1) consisted of transverse T1-weighted images using a volumetric interpolated breath-hold examination (VIBE) sequence, and respiratory and electrocardiographically gated T2-weighted fast spin echo with fat suppression. The acquisition times of transverse T1WI and T2WI were 18 s and 3:45 min, respectively. IVIM was acquired using a prototype integrated slice-specific dynamic shim (iShim) sequence. This sequence first acquires 2D multigradient echo images for each imaging slice, with its FOV and orientation adapted from the respective imaging slice. The echo time difference of the first and last echo was chosen such that fat and water alias. Then, a phase difference image was calculated from these two echoes. Nine b values from 0 to 800 s/mm2 (0, 25, 50, 75, 100, 200, 400, 600, and 800 s/mm2) were applied [28]. The time for the acquisition of the field map was approximately 540 milliseconds per slice. The total acquisition time of IVIM was 3:30 min.
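To illustrate how D, D*, and f can be recovered from signal measured at these nine b values, the following is a minimal sketch of a segmented (two-step) bi-exponential fit. It is not the vendor software's algorithm: the segmented approach, the 200 s/mm2 threshold, the function names, and the synthetic noise-free signal are all assumptions made for illustration.

```python
import math

# b values in s/mm^2, as listed in the imaging protocol.
B_VALUES = [0, 25, 50, 75, 100, 200, 400, 600, 800]


def ivim_signal(b, s0, f, d_star, d):
    """Bi-exponential IVIM signal at diffusion weighting b."""
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))


def _linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def segmented_ivim_fit(b_values, signals, b_threshold=200):
    """Two-step fit. The threshold is a hypothetical choice, not from the study."""
    s0 = signals[b_values.index(0)]

    # Step 1: at high b the perfusion term is nearly gone, so
    # ln S(b) ~ ln(S0 * (1 - f)) - b * D is close to a straight line.
    high = [(b, s) for b, s in zip(b_values, signals) if b >= b_threshold]
    slope, intercept = _linear_fit([b for b, _ in high],
                                   [math.log(s) for _, s in high])
    d = -slope
    f = 1.0 - math.exp(intercept) / s0

    # Step 2: subtract the fitted diffusion term at low b; the residual
    # ~ S0 * f * exp(-b * D*) is again log-linear in b.
    low_b, low_log = [], []
    for b, s in zip(b_values, signals):
        if b < b_threshold:
            residual = s - math.exp(intercept) * math.exp(-b * d)
            if residual > 0:  # skip points where the log is undefined
                low_b.append(b)
                low_log.append(math.log(residual))
    slope_low, _ = _linear_fit(low_b, low_log)
    return {"D": d, "D_star": -slope_low, "f": f}


# Synthetic, noise-free example with hypothetical tissue values (units: mm^2/s).
synthetic = [ivim_signal(b, 1.0, 0.30, 30e-3, 1.3e-3) for b in B_VALUES]
fit = segmented_ivim_fit(B_VALUES, synthetic)
print(fit)
```

With real, noisy data the low-b step is the fragile one, which is in line with the poorer reliability generally attributed to D* and f.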

Table 1 MRI standard protocol

All IVIM images were analyzed independently by two radiologists (T.S. with 15 years and Q.Y. with 6 years of experience in MRI), who were blinded to clinical and pathological outcomes. IVIM raw data were transferred to a workstation and processed by MADC software in the FuncTool software package. The region of interest (ROI) of the esophageal lesion was manually delineated on the IVIM-DWI with a b value of 400 s/mm2, using axial T2-weighted imaging as a reference. The radiologists were instructed to include the entire tumor, except areas of cystic or necrotic degeneration with very high T2 signal. The ROI area had to be greater than 50 mm2. The tumor ROI was automatically transferred to each pseudocolor map of the IVIM parameters. The values of ADC, D, D*, and f were automatically calculated using MADC software. The changes in IVIM parameters (Δ parameter) before and after NAC were calculated as Δ parameter = post-NAC parameter − pre-NAC parameter.
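As a worked instance of the Δ parameter definition, the short sketch below applies it to the pre- and post-NAC values quoted in the Fig. 1 caption (ADC, D, and D* in units of 10−3 mm2/s; f is unitless). The function name and dictionary layout are illustrative assumptions.

```python
def delta_parameters(pre, post):
    """Delta parameter = post-NAC parameter - pre-NAC parameter, per parameter."""
    return {name: post[name] - pre[name] for name in pre}


# Values taken from the Fig. 1 caption (responder example).
pre_nac = {"ADC": 2.21, "D": 1.38, "D*": 10.7, "f": 0.32}
post_nac = {"ADC": 2.50, "D": 1.99, "D*": 32.5, "f": 0.40}

deltas = {k: round(v, 2) for k, v in delta_parameters(pre_nac, post_nac).items()}
print(deltas)  # {'ADC': 0.29, 'D': 0.61, 'D*': 21.8, 'f': 0.08}
```

A positive Δ means the parameter rose after chemotherapy; a negative ΔD, as in the Fig. 2 example, means diffusion became more restricted.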

Response evaluation

An experienced pathologist with 14 years of experience in tumor pathology diagnosis, who was blinded to the IVIM-DWI data, evaluated the pathologic response according to the AJCC (7th edition)—tumor regression grade (TRG) system. The diagnostic criteria were as follows: TRG 3 (poor response), minimal or no tumor kill, extensive residual cancer; TRG 2 (minimal response), residual cancer remaining, but with predominant fibrosis; TRG 1 (moderate response), only small clusters or single cancer cells; TRG 0 (complete response), no remaining viable cancer cells.

Statistical analysis

All statistical analyses were performed using SPSS version 22.0. The Kolmogorov-Smirnov test was used to assess whether the measurement data conformed to a normal distribution. Normally distributed continuous variables are presented as mean ± standard deviation (SD). Measurement data that did not conform to a normal distribution are presented as median (lower and upper quartiles). Categorical data are presented as frequencies and percentages.

Continuous and categorical clinicopathological factors were compared using the t test and chi-square test, respectively. The inter-observer agreement in measuring ADC, D, D*, and f was evaluated using the intraclass correlation coefficient (ICC) with 95% confidence interval (CI). The criteria were as follows: 0.00–0.20, poor correlation; 0.21–0.40, fair correlation; 0.41–0.60, moderate correlation; 0.61–0.80, good correlation; and 0.81–1.00, excellent correlation [29].
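The ICC criteria above can be expressed as a simple lookup. The sketch below is illustrative only: the function name is hypothetical, upper bounds are treated as inclusive (so a value between 0.20 and 0.21 falls in the higher band), and values outside 0 to 1 are rejected, which is an assumption rather than something the criteria specify.

```python
def interpret_icc(icc):
    """Map an ICC value to the agreement category listed in the criteria."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC outside the 0-1 range covered by the criteria")
    # (inclusive upper bound, label), checked in ascending order
    bands = [
        (0.20, "poor"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "good"),
        (1.00, "excellent"),
    ]
    for upper, label in bands:
        if icc <= upper:
            return label
    return "excellent"  # unreachable given the range check; kept for safety


# Pre-NAC ICC values as reported in the Results.
reported = {"ADC": 0.898, "D": 0.925, "D*": 0.918, "f": 0.848}
print({name: interpret_icc(value) for name, value in reported.items()})
```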

The paired Student’s t test (normal distribution) and the Wilcoxon rank sum test (non-normal distribution) were performed to evaluate the change in ADC, D, D*, and f between pre- and post-NAC. The differences in IVIM-DWI values (i.e., pre- and post-NAC ADC, D, D*, f, and ΔADC, ΔD, ΔD*, Δf) between TRG classifications were assessed using the independent-samples t test (normal distribution) or the Wilcoxon rank sum test (non-normal distribution). Values that differed significantly between TRG groups were further processed by receiver operating characteristic (ROC) curve analysis to calculate the best cutoff thresholds. Specificity, sensitivity, and the area under the curve (AUC) were computed. A statistically significant difference was identified when p < 0.05.
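The ROC step can be sketched as follows. The text does not state which criterion defined the "best" cutoff, so the Youden index (sensitivity + specificity − 1) used here is an assumption, as are the function names. The ΔD values are made-up toy numbers, not study data.

```python
def auc_mann_whitney(pos, neg):
    """AUC as the probability that a positive case scores above a negative one."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff_youden(pos, neg):
    """Scan midpoints between observed values; a case is called positive if value > cutoff."""
    values = sorted(set(pos) | set(neg))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for cutoff in candidates:
        sensitivity = sum(p > cutoff for p in pos) / len(pos)
        specificity = sum(n <= cutoff for n in neg) / len(neg)
        youden_j = sensitivity + specificity - 1
        if best is None or youden_j > best["youden_j"]:
            best = {"cutoff": cutoff, "sensitivity": sensitivity,
                    "specificity": specificity, "youden_j": youden_j}
    return best


# Hypothetical toy values (x 10^-3 mm^2/s), for illustration only.
responders = [0.65, 0.82, 0.40, 0.36, 0.55, 0.20]
non_responders = [0.25, -0.08, 0.31, 0.10, 0.05, 0.45]

print(auc_mann_whitney(responders, non_responders))
print(best_cutoff_youden(responders, non_responders))
```

Other cutoff criteria exist, for example the point closest to the top-left corner of the ROC plot, and they can select a different threshold on the same data.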

Results

Study population

A total of sixty-three patients were consecutively included in our study. According to the exclusion criteria, two patients were excluded due to MRI contraindications; four patients were excluded due to contraindications to NAC; six patients were excluded due to incomplete NAC; and eleven patients were excluded due to inferior quality of IVIM images, of whom five were excluded before NAC and six after NAC. The remaining forty patients were enrolled. Patients had a mean age of 61 ± 8 years (age range 44–74 years), and 31 of them were male. All tumors were squamous cell carcinoma. The clinicopathologic characteristics of the 40 patients included in our study are detailed in Table 2.

Table 2 Baseline clinicopathologic characteristics and their predictive effects (n = 40)

Comparison of clinicopathologic characteristics according to TRG

Twenty-two patients were identified as TRG 2 and eighteen as TRG 3 in pathologic evaluation. No patients were classified as TRG 0 or TRG 1. Patients were divided into responders (n = 22: TRG 2) and non-responders (n = 18: TRG 3) with reference to TRG. Baseline clinicopathological characteristics and their statistical association with TRG are summarized in Table 2. No significant differences in clinicopathologic characteristics were found between responders and non-responders (p > 0.05).

Comparison of ADC and IVIM parameters before and after NAC

The inter-observer reproducibility was excellent for pre-NAC parameters (ADC: ICC = 0.898, 95% CI = 0.815–0.945; D: ICC = 0.925, 95% CI = 0.864–0.960; D*: ICC = 0.918, 95% CI = 0.851–0.956; f: ICC = 0.848, 95% CI = 0.732–0.917) and post-NAC parameters (ADC: ICC = 0.843, 95% CI = 0.723–0.914; D: ICC = 0.867, 95% CI = 0.762–0.927; D*: ICC = 0.822, 95% CI = 0.687–0.901; f: ICC = 0.823, 95% CI = 0.690–0.902).

As detailed in Table 3, the value of ADC increased significantly after NAC (1.97 ± 0.50 vs 2.51 ± 0.49 × 10−3 mm2/s, p < 0.001), and this trend was consistent in D value (1.28 (1.10, 1.41) vs 1.77 (1.45, 1.92) × 10−3 mm2/s, p < 0.001) and f value (0.34 ± 0.14 vs 0.40 ± 0.15, p = 0.038). The change of D* value before and after treatment was not statistically significant (29.05 (16.80, 47.28) vs 32.20 (18.33, 50.13) × 10−3 mm2/s, p = 0.536).

Table 3 Mean values of pre- and post-NAC ADC, D, D*, and f for the overall population (n = 40)

Comparison of ADC, IVIM parameters, and Δ parameters between responder and non-responder groups

The post-NAC D (1.82 ± 0.27 vs 1.52 ± 0.39 × 10−3 mm2/s, p = 0.007) and ΔD (0.65 (0.35, 0.82) vs 0.25 (−0.08, 0.31) × 10−3 mm2/s, p < 0.001) in responders were significantly higher than those in non-responders. The pre-NAC D values of responders were marginally lower than those in non-responders (1.22 ± 0.22 vs 1.42 ± 0.38 × 10−3 mm2/s, p = 0.058). ΔADC values in responders were slightly higher than those in non-responders (0.66 ± 0.53 vs 0.39 ± 0.36 × 10−3 mm2/s, p = 0.072), but the difference was not statistically significant. There was no statistical difference in pre-/post-NAC parameters (ADC, D*, and f) and Δparameters (ΔD* and Δf) between responders and non-responders (Table 4). Typical IVIM images for different pathologic responses are presented in Figs. 1 and 2.

Table 4 Mean values of pre- and post-NAC parameters, Δparameters, and their predictive effects (n = 40)
Fig. 1

Typical IVIM pictures of a 68-year-old man with a mid-esophageal squamous cell carcinoma with TRG 2 (responder). a Axial diffusion-weighted image (b = 400 s/mm2) shows a hyperintense focal esophageal lesion. The ROI (green contours) of tumor was manually delineated on IVIM-DWI. b–e The corresponding ADC, D, D*, and f maps show the pre-NAC ADC value of 2.21 × 10−3 mm2/s, a D value of 1.38 × 10−3 mm2/s, a D* value of 10.7 × 10−3 mm2/s, and an f value of 0.32 of the lesion. f After NAC, we observed a reduction in tumor size. g We observed a rise in the ADC value (post-NAC ADC = 2.50 × 10−3 mm2/s, ΔADC = 0.29 × 10−3 mm2/s). h Post-NAC D value was increased to 1.99 × 10−3 mm2/s (ΔD = 0.61 × 10−3 mm2/s). i Post-NAC D* value was increased to 32.5 × 10−3 mm2/s (ΔD* = 21.8 × 10−3 mm2/s). j Post-NAC f value was increased to 0.40 (Δf = 0.08)

Fig. 2

Typical IVIM pictures of a 60-year-old woman with a mid-esophageal squamous cell carcinoma with TRG 3 (non-responder). a Axial diffusion-weighted image (b = 400 s/mm2) shows a hyperintense focal esophageal lesion. The ROI (green contours) of tumor was manually delineated on IVIM-DWI. b–e The corresponding ADC, D, D*, and f maps show the pre-NAC ADC value of 2.56 × 10−3 mm2/s, a D value of 0.99 × 10−3 mm2/s, a D* value of 16.4 × 10−3 mm2/s, and an f value of 0.63 of the lesion. f After NAC, we observed a slight reduction in tumor size. g We observed a rise in the ADC value (post-NAC ADC = 3.40 × 10−3 mm2/s, ΔADC = 0.84 × 10−3 mm2/s). h Post-NAC D value decreased slightly to 0.83 × 10−3 mm2/s (ΔD = −0.16 × 10−3 mm2/s). i Post-NAC D* value increased to 32.5 × 10−3 mm2/s (ΔD* = 16.1 × 10−3 mm2/s). j Post-NAC f value increased to 0.73 (Δf = 0.10)

Prediction of pathologic response to NAC in locally advanced ESCC

With an area under curve (AUC) of 0.722, post-NAC D exhibited specificity of 66.7%, and sensitivity of 77.3%, when 1.685 × 10−3 mm2/s was used as the cutoff value. With a cutoff value for ΔD above 0.350 × 10−3 mm2/s, responders may be detected with a specificity of 83.8% and a sensitivity of 77.3% (AUC = 0.859) (Fig. 3).
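For readers who want to relate these percentages to patient counts, the sketch below computes sensitivity and specificity from a 2 × 2 table. The counts are not reported in the text; they are back-calculated here, as an assumption, from the post-NAC D percentages and the group sizes (22 responders, 18 non-responders).

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Assumed counts consistent with the reported post-NAC D results:
# 17 of 22 responders above the cutoff, 12 of 18 non-responders at or below it.
sens, spec = sensitivity_specificity(tp=17, fn=5, tn=12, fp=6)
print(f"sensitivity = {100 * sens:.1f}%, specificity = {100 * spec:.1f}%")
```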

Fig. 3

Analysis of ROC curves, to find an optimal cutoff to distinguish responders from non-responders on the basis of the post-NAC D value and ΔD

Discussion

Our results demonstrated that post-NAC D and ΔD values were significantly higher in responders than in non-responders. According to ROC curve analysis, high post-NAC D and ΔD values may help to predict favorable pathologic response in locally advanced ESCC. Moreover, ΔD showed better performance (AUC = 0.859) than post-NAC D (AUC = 0.722) in predicting pathologic response.

In this study, patients with high post-NAC D (> 1.685 × 10−3 mm2/s) and high ΔD values (> 0.350 × 10−3 mm2/s) had a favorable response to NAC, which could be explained by the principle of IVIM introduced by Le Bihan et al. [17, 18]. Compared with conventional DWI, the IVIM-derived D values are calculated using the bi-exponential model and reflect the true water molecular diffusion. The enlarged extracellular spaces resulting from effective chemotherapy may have decreased the restriction of water molecule diffusion, which would then yield higher D values than those obtained before NAC. ΔD values showed high predictive performance in identifying treatment response in locally advanced ESCC, with sensitivity and specificity of 77.3% and 83.8%, respectively. Compared with other imaging modalities such as FDG-PET/CT and DCE-MRI, the diagnostic performance of the ΔD value (AUC = 0.859) is superior to that of the maximum standardized uptake value (SUVmax) (AUC = 0.76), Ktrans (AUC = 0.741), and Kep (AUC = 0.796) reported in prior studies (Table 5) [8, 10]. Similar to our results, a previous study reported that the percentage change in D value could be used to assess the response of ESCC to NT [26]. Unlike the study by Zheng et al. [26], which used the Response Evaluation Criteria in Solid Tumors (RECIST), the current study used TRG as the criterion to evaluate the efficacy of NAC. TRG is a grading system based on the percentage of viable residual neoplastic cells relative to necrosis or fibrosis, and it can assess the response of tumor tissue to chemotherapy drugs more accurately than RECIST, which is based on changes in tumor size. TRG has proven to be a reliable prognostic marker for patients with locally advanced esophageal carcinoma [27] and rectal cancer [30]. To our knowledge, our study is the first to apply the IVIM sequence with the iShim technique in esophageal cancer. Furthermore, the iShim technique can reduce geometric distortion and increase the SNR [31, 32], which can effectively improve the accuracy of IVIM parameter fitting and provide excellent inter-observer reproducibility of ADC and IVIM-based parameters. Our finding concerning the D value was consistent with previous studies, which found that an increase in D value can predict chemotherapeutic response in a human gastric cancer mouse model [33] and in locally advanced rectal cancer [34].

Table 5 Comparison of diagnostic performance between IVIM, DCE-MRI, and PET-CT

Previous studies reported that the ADC value was useful in assessing and predicting pathologic response to NT in esophageal carcinoma [15, 26, 35, 36]. A prior study with a small sample size (n = 20) reported that the ΔADC during NT for esophageal cancer seemed highly predictive of histopathologic response, with sensitivity and specificity of up to 82% and 100%, respectively (AUC = 0.92). However, the authors acknowledged that the diagnostic performance of ΔADC was likely overestimated to some extent [35]. Two other similar studies also demonstrated that the ADC/ΔADC values had high diagnostic performance in distinguishing between responders and non-responders, with accuracy rates of 71.4% and 87.5%, respectively [15, 36]. However, our study found that although the ADC value of ESCC increased significantly after NAC, there was no significant difference in pre-/post-NAC ADC and ΔADC values between the responder and non-responder groups. Unlike previous studies, the pathologic responses to NAC in our study were relatively similar, without a TRG 0 or TRG 1 group as controls, which may affect the diagnostic performance of D and ADC values in predicting the pathological response to NAC. The fact that differences were nonetheless observable even among minimal responders does suggest the value of the IVIM technique. The DWI-derived ADC value is calculated by the mono-exponential diffusion model, which assumes Brownian diffusion under ideal conditions and neglects the influence of microcapillary perfusion [37]. However, diffusion in living tissue is more complicated than Brownian motion. The ADC value is often higher than the value of true water molecular diffusion and cannot reflect the true tissue diffusion. In this study, the diagnostic performance of the D value was significantly better than that of the ADC value in evaluating and predicting the pathological response to NAC. Compared with the IVIM sequence, however, traditional DWI technology is stable and simple. Future large-sample studies are warranted to assess the predictive value of ADC in assessing the response of patients with ESCC to NAC.

Our results also confirmed that the f value of ESCC increased significantly after NAC. A previous study demonstrated a similar trend in f values [26]. This may be related to cellular apoptosis, inflammatory cell infiltration, and expansion of cellular spaces induced by chemotherapy, which may lead to a relative increase in microcapillary perfusion and an increase in f value [38]. However, a study on IVIM [33] showed that the f value in a human gastric cancer mouse model decreased significantly after treatment, which is inconsistent with our results. Moreover, no significant differences in pre-/post-NAC f and Δf values were found between the responder and non-responder groups, which is inconsistent with the results of the previous study [26]. A possible reason is that the f value is more susceptible to measurement and noise variations, which results in poor reproducibility and limits its clinical value. Therefore, the changes in f values after NT and their utility for evaluating therapeutic response remain controversial [26, 39].

Consistent with previous studies [14, 19, 26, 40], this study also found that the D* value and related parameters were unreliable in predicting therapeutic results, probably due to their intrinsic susceptibility to noise and poor measurement reproducibility [39]. Bi-exponential fitting problems and the number and magnitude of b values may be other contributing factors [40]. Although IVIM-DWI is more technically challenging to perform than other modalities, we adopted the iShim technique, intramuscular injection of raceanisodamine hydrochloride, and breathing training to acquire sufficient image quality for diagnosis, which can provide a useful reference for future related research.

Our study has several limitations. First, the sample size was relatively small. Second, in this study, we did not compare treatment response on IVIM-DWI to that obtained using other imaging modalities such as FDG-PET/CT or DCE-MRI and using conventional RECIST criteria. Future studies are necessary to compare the utility of various imaging techniques in assessing treatment response. Third, a total of eleven patients were excluded due to the inferior IVIM image quality before and after NAC. Therefore, the stability of IVIM image quality of esophageal cancer needs to be further improved. Finally, there is no uniform scanning specification for IVIM sequence, and no consensus has been reached on the number and magnitude of b values that ought to be applied in clinical studies [41, 42]. Therefore, further studies are required to optimize image acquisition and post-processing techniques for deriving sufficiently accurate IVIM parameters.

In conclusion, this pilot study demonstrated that IVIM-DWI may be a feasible, non-invasive, and valuable functional imaging technique for predicting the pathologic response of locally advanced ESCC to NAC. The post-NAC D and ΔD values were effective biomarkers for predicting pathologic response to NAC in locally advanced ESCC, and ΔD was the most valuable IVIM parameter.