Introduction

Concurrent chemoradiation (CCRT) is the recommended treatment for bulky and locally advanced cervical cancer (LACC) [1,2,3,4,5]. However, 30% of patients experience treatment failure [6,7,8] and 40% ultimately develop recurrence [9, 10]. It would therefore be beneficial to identify imaging biomarkers that predict poor response to standard treatment, so that such patients can be considered for treatment escalation, entry into clinical trials and a more intensive follow-up regimen.

Quantitative evaluation of diffusion-weighted imaging (DWI) is increasingly utilised in cervical cancer due to its non-invasive nature. Previous studies have found that the apparent diffusion coefficient (ADC) was significantly different between tumour grades [11] and International Federation of Gynecology and Obstetrics (FIGO) stages [12]. ADC has also been shown to be useful in evaluating treatment response as one previous study demonstrated that baseline ADC was significantly different between patients with different overall survival [13]. Furthermore, ADC measured at baseline and in the middle of CCRT were significantly lower in patients who suffered from recurrence [14].

Previous studies showed that the diffusional signal measured in cervical cancer was better ascribed to the intravoxel incoherent motion (IVIM) phenomenon [15, 16]. This biexponential model gives rise to three parameters: the pure diffusion coefficient (D), perfusion fraction (f) and pseudo-diffusion coefficient (D*). D aims to represent water diffusion without perfusion effects, f is a measure of the flowing blood fraction, while D* aims to represent perfusion-related diffusion in the microcapillary network. The f and D* were significantly different between histological subtypes of cervical cancer [17] and could monitor tumour changes during CCRT [18]. A change in D in the pelvic marrow between baseline and after CCRT could predict patients who would suffer from haematological toxicity as a result of CCRT [19].

One disadvantage of IVIM is that most acquisition protocols require more than eight b-values, leading to long acquisition times [20, 21]. Optimised subsampling, a process which seeks to find the most important b-values for IVIM estimation, has previously been suggested to reduce the number of b-values needed. The IVIM parameters estimated from optimised subsampling had good concordance with the full acquisition estimates and retained diagnostic capabilities [22, 23]. A shorter IVIM acquisition protocol could potentially allow for clinical integration.

The purpose of this study was to evaluate the association of IVIM parameters with treatment response to CCRT in LACC and to determine whether an optimised subsampled IVIM acquisition yielded similar findings.

Materials and methods

Patients

This retrospective study was approved by the local Institutional Review Board and was conducted in accordance with the Helsinki Declaration. Because the study used previously collected, anonymised human data without identifying information, the need for informed consent was waived.

Potential candidates for this retrospective study were drawn from our institute’s local database. Inclusion criteria were (a) cervical carcinoma of at least FIGO stage IB2 (based on the FIGO 2018 revision); (b) treatment with CCRT and subsequent intracavitary brachytherapy; and (c) two MRI examinations, one prior to treatment (pre-CCRT) and one after CCRT but before intracavitary brachytherapy (post-CCRT). Exclusion criteria were (a) poor image quality in either pre- or post-CCRT DWI; (b) prior pelvic surgery; (c) non-squamous cell carcinoma histology [24]; (d) small initial tumour volume of less than 1.5 cm3 as measured on T2-weighted (T2W) MRI; and (e) any previous history of malignant disease apart from cervical carcinoma. All patients were restaged using the FIGO 2018 revision criteria.

Concurrent chemoradiotherapy

Patients were treated by whole-pelvic irradiation with a total cumulative dose of 40 Gy over 4 weeks, delivered as a daily dose of 2 Gy using 6 or 10 MV photon beams. Each week, patients also received a cycle of cisplatin (40 mg/m2).

Imaging acquisition

Patients were asked to fast for at least 6 h prior to the examination, and were given 20 mg hyoscine butylbromide (Buscopan, Boehringer) to suppress bowel peristalsis. Patients were scanned on a 3T system (Achieva 3.0T TX, Philips Healthcare). This study examined the routine clinical axial and sagittal T2W sequences as well as DWI, all of which were acquired using a 16-channel phased array torso coil (Table 1). DWI was acquired during free breathing with 13 b-values [0, 10, 20, 30, 40, 50, 75, 100, 150, 300, 500, 800 and 1000 s/mm2] and had a scan time of 7 min and 16 s.

Table 1 Summary of MRI scan parameters

Intravoxel incoherent motion analysis

Biexponential IVIM analysis was performed using a non-negative least squares (NNLS) algorithm in MATLAB (MATLAB R2020a, Mathworks Inc) according to the equation:

$$ \frac{S_b}{S_0}=f{e}^{-b\left(D+D\ast \right)}+\left(1-f\right){e}^{- bD} $$

where Sb represents the mean signal intensity at diffusion weighting b, and S0 is the mean signal intensity when b = 0 s/mm2. Segmented fitting was used to estimate the IVIM parameters: a monoexponential fit first provided initial estimates of D and f, which were then used as starting values for biexponential curve fitting with a least squares estimator to refine D and f and to estimate D*. However, D* was not considered for analysis in this study due to its poor reproducibility [25, 26]. The curve fitting was also constrained to D < 3 × 10−3 mm2/s and f < 1.
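The two-step idea described above can be illustrated with a minimal Python sketch. This is not the MATLAB implementation used in the study. It makes several assumptions that are not in the text: the high b-value threshold `b_split` of 200 s/mm2, the log-linear form of the first step, the grid of candidate D* values, and the synthetic ground-truth values. It also simplifies the second step by holding D and f fixed and scanning for D* only, whereas the study refined D and f as well.

```python
import math

# The 13 b-values (s/mm2) listed in the imaging protocol.
B_VALUES = [0, 10, 20, 30, 40, 50, 75, 100, 150, 300, 500, 800, 1000]


def ivim_signal(b, f, d, d_star):
    """Normalised signal S_b / S_0 for the biexponential model in the text."""
    return f * math.exp(-b * (d + d_star)) + (1.0 - f) * math.exp(-b * d)


def segmented_ivim_fit(b_values, signals, b_split=200.0):
    """Return (D, f, D*) from a simplified two-step fit.

    b_split is an ASSUMED threshold above which the perfusion term is
    treated as negligible; it is not taken from the study.
    """
    s0 = signals[b_values.index(0)]

    # Step 1: straight-line fit of ln(S_b / S_0) against b for b >= b_split.
    # The slope approximates -D and the intercept approximates ln(1 - f).
    high = [(b, math.log(s / s0)) for b, s in zip(b_values, signals) if b >= b_split]
    mean_b = sum(b for b, _ in high) / len(high)
    mean_y = sum(y for _, y in high) / len(high)
    slope = (sum((b - mean_b) * (y - mean_y) for b, y in high)
             / sum((b - mean_b) ** 2 for b, _ in high))
    intercept = mean_y - slope * mean_b

    # Clamp to the upper bounds stated in the text (D up to 3e-3 mm2/s, f up to 1).
    d = min(max(-slope, 0.0), 3e-3)
    f = min(max(1.0 - math.exp(intercept), 0.0), 1.0)

    # Step 2 (simplified): hold D and f fixed and scan a log-spaced grid of
    # candidate D* values, keeping the one with the smallest squared error.
    normalised = [s / s0 for s in signals]
    candidates = [1e-3 * 10 ** (k / 50) for k in range(116)]  # about 0.001 to 0.2 mm2/s
    d_star = min(
        candidates,
        key=lambda ds: sum((ivim_signal(b, f, d, ds) - y) ** 2
                           for b, y in zip(b_values, normalised)),
    )
    return d, f, d_star


# Noise-free synthetic check with hypothetical ground-truth values.
truth = {"f": 0.15, "d": 1.0e-3, "d_star": 0.05}
synthetic = [1000.0 * ivim_signal(b, **truth) for b in B_VALUES]
d_est, f_est, d_star_est = segmented_ivim_fit(B_VALUES, synthetic)
print(d_est, f_est, d_star_est)
```

On noise-free synthetic data the sketch should recover values close to the assumed ground truth; with real, noisy voxel data the D* estimate in particular would be far less stable, which is consistent with the reproducibility concern noted above.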

First, all 13 b-values were used to calculate IVIM parameters and generate parametric maps. These served as the reference parameters, and the 13 b-value set is hereafter referred to as the full reference b-value distribution (BVD). A recent study demonstrated that an optimal subsampled BVD using only 6 b-values could reduce scan time while retaining diagnostic capabilities [23]. Thus, 6 b-values [0, 10, 30, 75, 300, 1000 s/mm2] were then used to calculate a second set of IVIM parametric maps. This abbreviated BVD would result in a scan time of 3 min and 18 s, representing a scan time reduction of 55%. The mean and median values of D and f were calculated from both sets of b-value distributions at both MRI examinations.
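As a check on the quoted reduction, the figure follows directly from the two scan times stated in the text (7 min 16 s = 436 s for the full BVD and 3 min 18 s = 198 s for the subsampled BVD):

$$ \frac{436\ \mathrm{s}-198\ \mathrm{s}}{436\ \mathrm{s}}=\frac{238}{436}\approx 0.546\approx 55\% $$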

Tumour delineation

One radiologist (R1), board-certified with > 10 years’ experience in abdominopelvic MRI, manually drew volumes of interest (VOIs) using the freehand selection tool in ImageJ (ImageJ 1.52a, National Institutes of Health) to encompass the whole primary tumour on T2W images and on b1000 DWI images, the latter with reference to the T2W images and D parametric maps, for both pre- and post-CCRT examinations (Fig. 1). The radiologist was given both sets of images and was aware of the sequence of the MRI examinations. The T2W VOIs were used to measure tumour volume, while the DWI VOIs were propagated to co-registered D and f parametric maps estimated with the full BVD and the optimised subsampled BVD (Fig. 2). The radiologist was also asked to measure the length of the tumour’s longest axis on T2W images for treatment response assessment.

Fig. 1

Representative pre-concurrent chemoradiotherapy (pre-CCRT) images. a T2-weighted (T2W) image used to aid tumour delineation. b Diffusion-weighted image (DWI), b = 1000 s/mm2. c Regions of interest were drawn by the senior radiologist to encompass the whole tumour area, and this was repeated on subsequent slices to include the entire tumour volume. The volumes of interest were propagated to co-registered pure diffusion coefficient (D) and perfusion fraction (f) parametric maps. Tumour delineation was also done on post-CCRT images (d–f)

Fig. 2

Representative pre-concurrent chemoradiotherapy (pre-CCRT) (a) diffusion-weighted image (DWI), b = 1000 s/mm2, overlaid with parametric maps of (b) pure diffusion coefficient (D) and (c) perfusion fraction (f) over the tumour, together with the corresponding (d–f) post-CCRT images and parametric maps. The cyan regions of interest (ROI) represent the first delineation by the senior radiologist on DWI, which was copied to the D and f parametric maps

To measure observer repeatability, a second radiologist (R2), with 3 years’ experience in abdominopelvic MRI, delineated VOIs on pre- and post-CCRT MRI examinations as previously described. Additionally, the patient order was randomised and R1 was asked to delineate another set of VOIs on pre- and post-CCRT MRI examinations after a 1-month interval for all patients (Fig. 3).

Fig. 3

Representative diffusion-weighted images (DWI), b = 1000 s/mm2, (a) before concurrent chemoradiotherapy (pre-CCRT) and (b) after CCRT (post-CCRT). Regions of interest (ROIs) were drawn twice by a senior radiologist and once by a junior radiologist on both (c) pre-CCRT and (d) post-CCRT images. The segmentations by the senior radiologist are denoted by the cyan ROIs for the first reading session and by the yellow ROIs for the second session. The delineations by the junior radiologist are denoted by red ROIs

Treatment response assessment

Patients were dichotomised by their treatment response based on the response evaluation criteria in solid tumours (RECIST) 1.1 using the longest axis measurements obtained from T2W images of the primary tumour [27]. Responders were patients with a reduction of at least 30% in the longest axis diameter between pre-CCRT and post-CCRT MRI examinations, corresponding to complete or partial response according to RECIST. Non-responders were patients with any of the following: a reduction of less than 30% in the longest axis diameter, an increase of 20% in the longest axis diameter, or a new lesion, corresponding to stable or progressive disease according to RECIST.
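The dichotomisation rule above can be written as a short Python sketch. The function name, argument names and units are hypothetical and only illustrate the 30% threshold described in the text; this is not the code used in the study.

```python
def classify_response(pre_diameter, post_diameter, new_lesion=False):
    """Dichotomise a patient by change in longest-axis diameter.

    pre_diameter and post_diameter are the longest-axis measurements on the
    pre- and post-CCRT T2W images, in the same (arbitrary) length unit.
    """
    if pre_diameter <= 0:
        raise ValueError("pre-treatment diameter must be positive")
    if new_lesion:
        # A new lesion counts as progressive disease, hence non-responder.
        return "non-responder"
    reduction = (pre_diameter - post_diameter) / pre_diameter
    # A reduction of at least 30% covers partial and complete response.
    return "responder" if reduction >= 0.30 else "non-responder"


print(classify_response(50.0, 30.0))  # 40% reduction
print(classify_response(50.0, 40.0))  # 20% reduction
```

Stable and progressive disease both fall below the 30% reduction threshold, so a single comparison is enough for the two-group split; the separate 20% increase criterion only matters if the two categories need to be told apart.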

Observer repeatability

Dice similarity coefficient (DSC), a statistic used to quantify the degree of spatial overlap between image segmentations ranging from 0 to 1 (no and complete overlap, respectively), was used to measure the degree of similarity of VOIs. A DSC of 0.70 is considered the threshold for good image segmentation repeatability [28]. Intraclass correlation coefficient (ICC) using a two-way mixed, single-score consistency model, a statistic often used to quantify the similarity of feature values, was used to measure the similarity of the calculated IVIM parameters. The feature values calculated from all three sets of VOIs were compared simultaneously to compute a single ICC value per feature. An ICC of 0.75 is considered the threshold for good measurement repeatability [29].
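Both repeatability statistics can be sketched in a few lines of Python. The sketch assumes that each VOI is represented as a set of voxel index tuples and that the ICC follows the standard two-way ANOVA formulation for a single-score consistency model, often written ICC(3,1); the function names and example values are hypothetical, and the software actually used for these calculations is not specified beyond the text above.

```python
def dice(voi_a, voi_b):
    """Dice similarity coefficient between two VOIs given as voxel index sets."""
    a, b = set(voi_a), set(voi_b)
    if not a and not b:
        return 1.0  # two empty segmentations are treated as identical
    return 2.0 * len(a & b) / (len(a) + len(b))


def icc_consistency(ratings):
    """Single-score consistency ICC from a patients x VOI-sets table.

    ratings is a list of rows, one row per patient, each holding one feature
    value per set of VOIs (for example R1-1, R1-2 and R2-1).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between patients
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between VOI sets
    ms_rows = ss_rows / (n - 1)
    ms_error = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Two small VOIs sharing two of four voxels each.
voi_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
voi_2 = {(0, 1, 0), (0, 1, 1), (0, 2, 0), (0, 2, 1)}
print(dice(voi_1, voi_2))

# Hypothetical feature values: three patients, three sets of VOIs that differ
# only by a constant offset per set, so consistency is essentially perfect.
print(icc_consistency([[1.0, 1.1, 0.9], [1.2, 1.3, 1.1], [0.8, 0.9, 0.7]]))
```

Because the consistency form removes systematic differences between VOI sets, a constant offset between observers does not lower the ICC, which is the behaviour the second example illustrates.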

Statistical analysis

All statistical analyses were performed using R (R 3.6.2, R Core Team). Fisher’s exact test was used to determine if FIGO stage was associated with treatment response. Then, two-sample Mann-Whitney U tests were used to determine if there were any significant differences in MRI features between treatment response groups; the non-responder group was set to 0 while the responder group was set to 1. Finally, receiver operating characteristic (ROC) analysis was applied to significant features to compute the optimal cut-offs, area under the curve (AUC), sensitivity and specificity.
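The ROC step can be illustrated with a minimal Python sketch. The study performed these analyses in R and does not state how the optimal cut-off was chosen; the Youden index used here is therefore an assumption, and the function names and the f values are hypothetical. The AUC is computed through its equivalence with the Mann-Whitney U statistic divided by the number of responder/non-responder pairs.

```python
def auc(non_responders, responders):
    """AUC as the fraction of pairs in which the responder value is higher.

    Ties count as half a pair, which matches U / (n1 * n0).
    """
    wins = 0.0
    for r in responders:
        for n in non_responders:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5
    return wins / (len(responders) * len(non_responders))


def youden_cutoff(non_responders, responders):
    """Return (cutoff, sensitivity, specificity) maximising sens + spec - 1.

    A value at or above the cut-off is called a responder (group 1).
    """
    best = None
    for c in sorted(set(non_responders) | set(responders)):
        sens = sum(r >= c for r in responders) / len(responders)
        spec = sum(n < c for n in non_responders) / len(non_responders)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical pre-CCRT perfusion fractions for the two response groups.
f_non = [0.08, 0.10, 0.11, 0.13]
f_resp = [0.12, 0.14, 0.15, 0.17, 0.18]
print(auc(f_non, f_resp))
print(youden_cutoff(f_non, f_resp))
```

Other cut-off criteria, such as the point closest to the top-left corner of the ROC curve, can give a different threshold on the same data, so the criterion should be reported alongside the cut-off.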

Results

Patient demographics

Forty-five patients examined between June 2014 and November 2018 were included in this study (Fig. 4); patient demographics may be found in Table 2.

Fig. 4

Diagram of patient selection

Table 2 Patient characteristics

Observer repeatability

Pre-CCRT DSC was good at 0.81 while post-CCRT DSC was moderate at 0.67.

Pre- and post-CCRT ICC values were good and similar between the reference and subsampled parameters. A full tabulation of ICC values can be found in Table 3.

Table 3 Values and intraclass correlation coefficient (ICC) of pure diffusion coefficient (D) and perfusion fraction (f) before concurrent chemoradiotherapy (pre-CCRT) and after treatment (post-CCRT) derived from the full reference b-value distribution (BVD) and subsampled BVD, and from the different observers. R1-1 and R2-1 correspond to the IVIM parameters as estimated from the first set of volumes of interest (VOIs) from the first and second radiologist, respectively, while R1-2 corresponds to the parameters estimated from the second set of VOIs of the first radiologist

Associations with treatment response

Of the 45 patients, 27 had partial response, 17 had stable disease and 1 had progressive disease. None achieved complete response after CCRT. FIGO stage was not associated with treatment response (p = 0.60).

Pre-CCRT reference and subsampled fMean and f50 (the mean and 50th percentile of f) were significantly higher in responders than in non-responders. No significant differences were observed in pre-CCRT reference and subsampled D between response groups. A full tabulation of pre-CCRT reference and subsampled parameters can be found in Table 4, and ROC performance metrics of significant features may be found in Table 5.

Table 4 Pure diffusion coefficient (D) and perfusion fraction (f) before and after concurrent chemoradiotherapy (pre-CCRT and post-CCRT, respectively) dichotomised by treatment response groups using the full 13 b-value distribution (BVD) and using an optimised subsample of 6 b-values. Responders were defined as patients whose tumour had at least 30% reduction in long axis diameter between MRI examinations
Table 5 Receiver operating characteristic (ROC) curve performance metrics of the mean and 50th percentile of perfusion fraction (f) before concurrent chemoradiotherapy (pre-CCRT) using the full 13 b-value distribution (BVD) and using an optimised subsample of 6 b-values

There were no significant differences in any post-CCRT parameters evaluated between responders and non-responders (Table 4).

Discussion

In this study, we found that repeatability was lower post-CCRT than pre-CCRT in both tumour delineation and parameter value estimation. We also found that pre-CCRT f was significantly higher in responders. Furthermore, pre-CCRT f estimated with an optimised subsampled BVD had repeatability metrics similar to those of the full reference BVD and demonstrated the same significant difference between treatment response groups.

At baseline, we found that the DSC of tumour delineations showed good agreement, which generally gave rise to good ICC in IVIM parameters. D appeared to have higher repeatability than f, which is in concordance with previous studies investigating renal tumours [30] and rectal cancer [31]. However, this is in conflict with one study investigating nasopharyngeal carcinoma that found D and f had similar repeatability [32]. It is possible that the good repeatability in D and f observed in this study was due to volumetric delineation, as a previous study found that whole-lesion VOIs had superior repeatability compared to single-slice regions of interest (ROIs) [33]. After treatment, we found that the DSC of tumour delineations was only moderate; despite this, the ICC of reference and subsampled post-CCRT D and f remained good. Post-CCRT assessment of cervical cancer is known to be challenging due to a host of radiation-induced changes, which could be a reason for the lowered repeatability of post-CCRT tumour delineations and IVIM parameter values [34, 35]. The smaller tumour volumes on post-CCRT scans may be another contributing factor, as there are fewer pixels over which to average IVIM parameter values [36].

Values of D were not significantly different between treatment response groups at either time point. Similar studies have found that pre-CCRT ADC and D had limited value in predicting short- and long-term treatment response, though mid-treatment ADC was shown to have some prognostic value [37,38,39,40]. However, this is in contrast to a recent study that demonstrated pre-CCRT D was significantly lower in responders [41].

In this study, we found that pre-CCRT f was significantly higher in responders compared to non-responders. A previous cervical cancer study had similar findings, where the authors found that f was significantly higher in patients with complete response compared to those with partial response [42]. Furthermore, in a study examining long-term prognosis, higher post-CCRT f was shown to predict good prognosis in cervical cancer [39]. Given the significant correlations between IVIM and dynamic contrast-enhanced (DCE)-MRI parameters [43], our result concurs with DCE-MRI findings that better perfused tumours have better locoregional control [44,45,46]. Elevated pre-treatment perfusion on MRI suggests better oxygenated tumours [47], a determinant of better locoregional control and also of improved disease-free and overall survival [48]. It is thought that hypoxia induces genetic instability, which leads to increased radioresistance in tumours and, thus, poor prognosis for tumours with low oxygenation measurements [49].

One limiting factor of IVIM is the long scan time due to the acquisition of substantially more b-values compared to conventional DWI. This study also examined the utility of an optimised subsampled BVD, which could potentially reduce the scan time needed for IVIM imaging. In terms of repeatability, the pre-CCRT ICC of the subsampled D and f were similar to those of reference D and f. This was in contrast with a previous study in renal tumours which found that the repeatability metrics of an abbreviated IVIM sequence were lower than those of a longer acquisition sequence [30]. However, in that study, the abbreviated IVIM sequence used a lower maximal b-value compared to the longer IVIM sequence, which may have adversely affected parameter estimation in the abbreviated IVIM sequence [23]. Post-CCRT ICC of subsampled D was also comparable to that of reference D; however, interestingly, post-CCRT ICC of subsampled f was good while that of reference f was only moderate. Encouragingly, similar to pre-CCRT reference f, we found that pre-CCRT subsampled f was also significantly higher in responders compared to non-responders. This implies that subsampled parameters had discriminative abilities similar to those of the reference parameters while requiring only approximately half the acquisition time.

There are several limitations in this study. Firstly, VOI segmentation was done manually. Fully or semi-automatic segmentation is a means to improve delineation consistency; however, there are substantial challenges to these approaches in MRI due to the heterogeneity in acquisition parameters and sequences used [50]. Secondly, this study only included a relatively small set of patients from a single centre. However, all patients had the same treatment regimen, schedule and imaging protocol. Thirdly, scan-rescan reproducibility could not be evaluated as this was a retrospective study. Additionally, other prognostic factors for treatment response, such as nodal status, lymphovascular invasion and parametrial involvement, were not considered; future studies combining other prognostic factors with IVIM-based features into a multivariable model could be of interest in improving the prediction of treatment response.

In conclusion, pre-CCRT f was significantly higher in responders compared to non-responders and had good observer repeatability. Furthermore, f estimated using an optimised subsampled BVD demonstrated the same association with treatment response and repeatability metrics as f estimated using the full reference BVD.