Introduction

MR imaging plays a crucial role in the clinical management of rectal cancer, with multiparametric imaging becoming part of the guidelines for diagnosis and follow-up [1, 2]. Within the context of locally advanced rectal carcinomas (LARC), the ability to recognize small tumor foci at the completion of neoadjuvant treatment is of particular importance in organ-sparing strategies [3]. In this regard, the main limitation of conventional imaging, based on T2-weighted (T2w) sequences, is its low accuracy in differentiating between residual tumor and radiation-induced fibrosis [4]. Diffusion-weighted imaging (DWI) offers a potential solution: tumor tissue restricts the diffusive motion of water molecules and therefore appears hyperintense, which benefits the detection of (small) tumor lesions on DWI [5, 6]. However, the interpretation of diffusion imaging involves a learning curve and is affected by potential pitfalls [7]. An important pitfall is that non-tumoral structures with an inherently long T2-relaxation time (such as fluid in the rectal lumen) may exhibit high signal on DWI because of T2 ‘shine-through effects’, thereby reducing tumor conspicuity and hampering diagnostic performance [7].

In this context, studies in a variety of tumor types have recently investigated whether ultrahigh b values could overcome this problem by further suppressing the signal caused by T2 shine-through effects, thereby improving the contrast between tumor and normal tissues, and hence tumor conspicuity, compared to standard high b value DWI (typically in the range of b = 800–1000 s/mm2) [8,9,10,11,12,13,14].

The aim of this study was to investigate the potential benefit of ultrahigh b = 2000 s/mm2 (b2000) DWI in rectal cancer by comparing tumor detectability and tumor conspicuity (the latter assessed both qualitatively and quantitatively) with those of standard b = 1000 s/mm2 (b1000) DWI.

Materials and methods

Patient selection

All procedures performed in this study involving human participants were in compliance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Patient-informed consent was waived.

Sixty-one patients for a total of 90 MR scans (44 primary staging and 46 restaging) of the rectum were retrieved from our institute’s picture archiving and communication system (PACS) between November 2016 and April 2019 to be potentially included in this retrospective study.

Inclusion criteria were:

  • for patients with primary staging MR: pathology proven solid (non-mucinous) rectal cancer;

  • for patients with restaging MR (after neoadjuvant treatment before surgery): the availability of data regarding the final clinical outcome response;

  • for all patients: availability of 3T DWI-MR scan of the rectum including b1000 and b2000 images (flowchart with inclusion and exclusion criteria presented in Fig. 1).

    Fig. 1

    Flowchart of patients considered for inclusion, excluded patients and patients finally included in the study cohort

Among them, 6 patients for a total of 9 MR scans (4 primary staging and 5 restaging MR scans) were excluded. In detail, 6 MR scans (3 primary staging and 3 restaging MR scans) of 3 patients were excluded due to the presence of mucinous tumor, as these tumors are known to show different signal characteristics on DWI. One primary staging MR scan was excluded because the exam was performed on a 1.5 T scanner; two further restaging MR scans were excluded due to the unavailability of data on the final clinical outcome and the presence of severe susceptibility artifacts in the pelvis (hip replacement), respectively.

Finally, a total of 55 patients (81 MR scans) met the inclusion criteria and constituted the final study population (mean age ± SD, 69 ± 12 years). Baseline demographic and clinical characteristics of the patients included in the study are reported in Table 1.

Table 1 Baseline demographic and clinical characteristics of participants

MRI protocol

A state-of-the-art 3T scanner (Achieva, Philips Medical Systems, Best, the Netherlands), equipped with a phased-array surface coil, was used to perform the MR exams. The MR protocol is described in detail in Table 2. Notably, we acquired transverse echo planar DWI including b1000 and b2000 as the two highest b values. The transverse T2w and DWI sequences were acquired in identical planes, perpendicular to the long axis of the tumor. Patients received 20 mg of scopolamine butylbromide (Buscopan, Boehringer Ingelheim, Ingelheim am Rhein, Germany) intravenously to reduce bowel motility [1].

Table 2 MR sequences and parameters used in the study

Image analysis

DICOM data of all patients were exported on an off-line dedicated workstation where the images were assessed using an open-source medical image viewer, Horos Version 3.3.5 (https://horosproject.org). The image analysis was based on four main points:

  • diagnostic accuracy (restaging MR scans) in treatment response assessment;

  • qualitative conspicuity analysis (restaging MR scans);

  • quantitative conspicuity analysis (test set—restaging MR scans);

  • quantitative conspicuity analysis (validation set—primary staging MR scans).

In the first point, we compared the diagnostic accuracy of b1000 and b2000 in the differentiation between complete response and non-complete response. The second and third points focused on non-complete responder patients: the conspicuity was qualitatively and quantitatively compared between b1000 and b2000. Finally, a further quantitative comparison of the tumor conspicuity was performed on an independent set of manual segmentations of primary staging scans to validate the results of the previous steps.

Diagnostic accuracy

An abdominal radiologist (Reader 1) with 6 years of experience in rectal MR and a third-year radiology resident (Reader 2) independently assessed the 41 restaging MR scans, blinded to patients’ clinical data. They were asked to assess the presence or absence of residual tumor considering b1000 and b2000. A complete response was defined as either a completely normalized rectal wall MR signal on T2w images without any focal high signal on DWI, or a well-defined fibrotic wall thickening at the previous tumor site on T2w images with no focal high signal on DWI. The presence of residual tumor (non-complete response) was defined as focal high DWI signal (possibly corresponding to a suspicious residual mass on T2w MRI in case of obvious residual tumor) in the location of the former tumor bed [7].

Data were anonymized and scored in two reading sessions: first, the readers reviewed one b value (randomly chosen), and two weeks later, they reviewed the other b value. Both T2w and DWI images as well as the primary staging MR exam (if present in our database) were available to the readers in each reading session.

The reference standard was the tumor regression grade (TRG), assessed according to the Mandard system on the surgical specimen: TRG1 (complete response), absence of residual cancer and fibrosis; TRG2–TRG5 (non-complete response), presence of various degrees of residual cancer and fibrosis [15]. Alternatively, a sustained complete clinical response (with repeated negative MRI examinations and endoscopy with or without biopsy) was considered a surrogate for a complete response in patients enrolled in watch-and-wait protocols [16].

Qualitative conspicuity analysis

The same two readers independently and qualitatively assessed the restaging MR scans of the non-complete responder patients identified in the previous step. In detail, they scored the tumor conspicuity on b1000 and b2000 images using a 3-point scale: (1) hyperintense lesion, but only slightly demarcated from the background; (2) hyperintense lesion with good background suppression, tumor clearly demarcated from the background; (3) markedly hyperintense lesion with excellent background suppression. The T2w images were at the readers’ disposal for anatomical reference during the evaluation.

Data were scored in two reading sessions, as described for the diagnostic accuracy assessment.

Quantitative conspicuity analysis

Two weeks after the qualitative assessment, both readers were asked to manually segment the residual tumor of the 19 non-complete responder patients (test set) on b1000 and b2000. An independent set of 40 primary staging rectal MR scans (validation set) was also manually segmented to validate the results obtained in the non-complete responder set.

To quantitatively assess the tumor conspicuity, we compared the signal intensity distribution of the tumor with that of the immediately surrounding tissue. Tumor delineation was performed using an open-source medical image computing platform, 3DSlicer Version 4.8 (www.3dslicer.org). Both readers provided a manual delineation of the whole tumor volume on DWI images, using the T2w images for anatomical reference. Subsequently, the region of interest of the tumor (ROI-T) was dilated by four voxels using the corresponding morphological binary operator (ROI-T “dilated”). The tumor surrounding tissue (ROI-TST) was computed by subtracting ROI-T from ROI-T “dilated” (Fig. 2). A circular 5 mm2 region of interest was also placed in the adipose tissue of the gluteal region or the ischiorectal fossa to normalize the data (NORM-T and NORM-TST, respectively).
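The construction of ROI-TST described above can be sketched in a few lines. This is a minimal illustration, not the 3DSlicer implementation used in the study: the face-connected (6-neighbour) structuring element, the representation of a mask as a set of voxel coordinates, and the function names `dilate` and `surrounding_tissue` are all assumptions made for the example.

```python
# Face-connected (6-neighbour) offsets. The structuring element is an
# assumption: the text only states that ROI-T was dilated by four voxels.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dilate(mask, shape, iterations):
    """Binary dilation of a set of (z, y, x) voxels, clipped to the volume."""
    current = set(mask)
    for _ in range(iterations):
        grown = set(current)
        for z, y, x in current:
            for dz, dy, dx in NEIGHBOURS:
                nz, ny, nx = z + dz, y + dy, x + dx
                if 0 <= nz < shape[0] and 0 <= ny < shape[1] and 0 <= nx < shape[2]:
                    grown.add((nz, ny, nx))
        current = grown
    return current


def surrounding_tissue(roi_t, shape, margin_voxels=4):
    """ROI-TST: the dilated tumor mask minus the tumor mask itself."""
    return dilate(roi_t, shape, margin_voxels) - set(roi_t)


# Toy example: a single-voxel "tumor" in the middle of an 11 x 11 x 11 volume.
shape = (11, 11, 11)
roi_t = {(5, 5, 5)}
roi_tst = surrounding_tissue(roi_t, shape)
print(len(roi_t), len(roi_tst))
```

The resulting shell never overlaps the tumor mask, so intensity statistics computed on ROI-T and ROI-TST come from disjoint voxel sets.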

Fig. 2

Regions of interest (ROI) selected in an MRI study of a 48-year-old woman with rectal cancer. The ROIs representing the rectal tumor delineation (a, c) and the tumor surrounding tissue (b, d) are shown in the axial plane on b = 1000 s/mm2 (a, b) and b = 2000 s/mm2 (c, d), respectively. The whole tumor (e, g) and the whole tumor surrounding tissue (f, h) segmentations are shown in the sagittal plane on b = 1000 s/mm2 (e, f) and b = 2000 s/mm2 (g, h), respectively

Statistical analysis

Distribution normality was tested via the Shapiro-Wilk test. A receiver-operating characteristic curve (ROC) with area under the curve (AUC) analysis was used to assess the diagnostic accuracy of b1000 and b2000 tumor detectability for both readers; a comparison between ROC curves was also performed for each reader. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were also calculated. The qualitative scores and the quantitative features, including mean signal intensity, variance, 10th percentile and 90th percentile, were compared using the Wilcoxon test. The interobserver agreement (IOA) for the qualitative and quantitative data was calculated using Cohen’s Kappa and the Intraclass Correlation Coefficient (ICC), respectively.
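As an illustration of the quantitative features named above, the following sketch computes the mean, variance, 10th percentile and 90th percentile of the voxel intensities in an ROI. Several details are assumptions, since the text does not specify them: normalization is taken to be division by the mean of the fat reference ROI, the sample variance is used, percentiles use linear interpolation between ranks, and the intensity values are invented for the example.

```python
import math
import statistics


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("empty ROI")
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return ordered[lo]
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def roi_features(roi_values, fat_values=None):
    """First-order features of an ROI.

    If fat_values is given, intensities are first divided by the mean of
    the fat reference ROI (assumed normalization scheme).
    """
    if fat_values is not None:
        reference = statistics.fmean(fat_values)
        roi_values = [v / reference for v in roi_values]
    return {
        "mean": statistics.fmean(roi_values),
        "variance": statistics.variance(roi_values),  # sample variance
        "p10": percentile(roi_values, 10),
        "p90": percentile(roi_values, 90),
    }


# Hypothetical signal intensities (arbitrary units), for illustration only.
tumor = [100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200]
fat = [48, 50, 52]
raw = roi_features(tumor)
norm = roi_features(tumor, fat)
print(raw)
print(norm)
```

Computing the same features on ROI-T and ROI-TST at each b value gives the paired samples that a Wilcoxon test would then compare.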

ICC was calculated using MedCalc software, version 16.8.4 (MedCalc Software, Ostend, Belgium). All other statistical analyses were performed using IBM SPSS Statistics software, version 20 (IBM, Armonk, NY). A p value ≤ 0.05 was considered statistically significant.
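For readers unfamiliar with Cohen's Kappa, used here for interobserver agreement on the 3-point conspicuity scores, a minimal sketch follows. It implements the unweighted form; whether the study used a weighted or unweighted kappa is not stated, and the scores below are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases with identical scores.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical conspicuity scores (1-3) from two readers on ten lesions.
reader_1 = [1, 1, 2, 2, 2, 3, 3, 3, 3, 2]
reader_2 = [1, 1, 2, 2, 3, 3, 3, 3, 3, 2]
kappa = cohen_kappa(reader_1, reader_2)
print(round(kappa, 3))
```

Because the conspicuity scale is ordinal, a weighted kappa that penalizes a 1-versus-3 disagreement more than a 2-versus-3 disagreement would be a reasonable alternative.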

Results

Patient characteristics

Of the 81 MR scans, 40 (49%) were primary staging scans and 41 (51%) were restaging scans acquired at the end of neoadjuvant treatment (8 weeks after completion of neoadjuvant treatment). At baseline, all 40 patients had biopsy-proven rectal tumors. In 39/41 (95%) patients, the final outcome was assessed on the surgical specimen, while in 2/41 (5%) patients a sustained complete clinical response was considered a surrogate for a complete response. In these 2 patients, the follow-up (mean ± SD) was 23 ± 11 months. Finally, 22 (54%) complete responders and 19 (46%) non-complete responders (5 TRG2, 7 TRG3, 7 TRG4) were included.

Diagnostic accuracy

Results of the ROC curve analysis as well as sensitivity, specificity, PPV and NPV for each reader are shown in Table 3. On the 41 restaging MR scans performed after completion of chemo-radiotherapy (CRT), Reader 1 achieved an AUC of 0.93 for both b1000 and b2000, while the AUC of Reader 2 was 0.85 for b1000 and 0.88 for b2000. For both readers, the comparison of ROC curves did not show any significant difference between b1000 and b2000 (p > 0.05). Interestingly, in one case Reader 2 correctly identified the residual tumor only on b2000 (Fig. 3). Two patients with residual tumors were misdiagnosed by both readers: in the first case, the reason was a focal distortion induced by intraluminal air exactly at the former tumor bed; in the second case, there was a minimal residual tumor (pT1). One and three patients were erroneously scored as non-complete responders by Reader 1 and Reader 2, respectively. In these cases, susceptibility artifacts caused by air in the rectal lumen projecting over the rectal wall were responsible for the misdiagnosis.
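To make the relationship between these measures concrete, the sketch below derives sensitivity, specificity, PPV and NPV from a 2 x 2 table, with residual tumor as the positive class. The counts are hypothetical: they only respect the cohort size of 19 non-complete and 22 complete responders and are not taken from Table 3. The AUC line additionally assumes a purely binary read, for which the ROC curve has a single operating point; it would not apply if the readers had used a graded confidence score.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Accuracy measures from a 2 x 2 table (positive = residual tumor)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Trapezoidal AUC of a single-point ROC curve (binary read only).
        "auc": (sensitivity + specificity) / 2,
    }


# Hypothetical counts: 19 non-complete responders (17 detected, 2 missed)
# and 22 complete responders (19 correctly called, 3 false positives).
metrics = diagnostic_metrics(tp=17, fn=2, fp=3, tn=19)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a high proportion of complete responders, each false-positive read lowers the specificity only modestly but has a visible effect on the PPV, which is worth keeping in mind when comparing readers.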

Table 3 Receiver-operating characteristic curve (ROC) with area under the curve (AUC) analysis for the tumor detectability; sensitivity, specificity, PPV and NPV are also shown for each reader
Fig. 3

T2-weighted image (a), b = 1000 s/mm2 diffusion-weighted image (b) and b = 2000 s/mm2 diffusion-weighted image (c) selected from the MR scan of a 47-year-old man, performed at the end of neoadjuvant treatment for rectal cancer. Reader 2 correctly detected the residual tumor on the anterior and right lateral rectal wall only on b2000 DWI. The corresponding hyperintensity on b1000 (white asterisk in b) was misinterpreted as T2 shine-through

Qualitative conspicuity analysis

Results of the qualitative analysis are shown in Table 4. The mean overall conspicuity scores were significantly higher for b2000 compared to the b1000 images (2.29 vs 1.65 Reader 1, p = 0.01; 2.18 vs 1.69 Reader 2, p = 0.04). Representative imaging examples are provided in Figs. 4-5. Interobserver agreement was equally good for both b values (k = 0.86 for b = 1000 s/mm2, k = 0.86 for b = 2000 s/mm2).

Table 4 Descriptive statistics and Cohen’s Kappa for the qualitative analysis
Fig. 4

b = 1000 s/mm2 diffusion-weighted image (a), b = 2000 s/mm2 diffusion-weighted image (b) and fused T2-weighted and b = 2000 s/mm2 diffusion-weighted image (c) selected from the MR scan of a 65-year-old woman, performed at the end of neoadjuvant treatment for rectal cancer. The patient was a non-complete responder. The two small residual tumor foci (white arrow in c) near the radiation therapy-induced fibrosis (*) were clearly depicted on b = 2000 s/mm2 thanks to the better background suppression, and they were pathologically confirmed (d)

Fig. 5

T2-weighted image (a), b = 1000 s/mm2 diffusion-weighted image (b) and b = 2000 s/mm2 diffusion-weighted image (c) selected from the MR scan of a 79-year-old non-complete responder, performed at the end of neoadjuvant treatment for rectal cancer. The residual tumor (white arrow in b, c) on the anterior wall of the rectum was correctly identified by both readers. However, thanks to the better background suppression, both readers rated the conspicuity as better on b2000 than on b1000 (qualitative score 3 vs 2)

Quantitative conspicuity analysis

Results of the quantitative analysis are shown in Table 5, Table 6 and Fig. 6.

Table 5 Comparison of mean signal intensity between b = 1000 s/mm2 and b = 2000 s/mm2 for both readers. In the tissue surrounding the tumor, except for the non-normalized ROI (ROI-TST) of Reader 2, the mean signal intensity was significantly lower when using b2000. Conversely, the mean signal intensity of the tumor was significantly higher on b1000
Table 6 Variance, 10th percentile and 90th percentile for b = 1000 s/mm2 and b = 2000 s/mm2 for both readers. In the whole tumor, b2000 revealed significantly lower values in variance, 10th percentiles and 90th percentiles. In the same way, variance and 90th percentiles were significantly lower in the healthy surrounding tissues. Regarding 10th percentiles in the same area, significant differences were shown (lower values in b2000) only for the Reader 1 in the validation set
Fig. 6

Box plots comparing the mean signal intensity, variance, 10th percentile and 90th percentile of b = 1000 s/mm2 and b = 2000 s/mm2 considering the whole tumor delineation and the tissue surrounding the tumor. In the tissue surrounding the tumor, the mean signal intensity was significantly lower when using b2000. Moreover, in the whole tumor, the variance, 10th percentiles and 90th percentiles were significantly lower on b2000, meaning that the signal intensity was more homogeneous (variance) and less affected by outliers (10th percentiles and 90th percentiles) compared to b1000

In the tissue surrounding the tumor, the mean signal intensity was significantly lower when using b2000 (p < 0.05). Conversely, the mean signal intensity of the tumor was significantly higher on b1000 (p < 0.05). When considering the whole tumor, the variance, 10th percentiles and 90th percentiles were significantly lower on b2000 (p < 0.05), meaning that the signal intensity was more homogeneous (variance) and less affected by outliers (10th percentiles and 90th percentiles) compared to b1000. In the same way, variance and 90th percentiles were significantly lower in the healthy surrounding tissues when using b2000 (p < 0.05). Regarding the 10th percentiles in the same area, significant differences (lower values on b2000) were found only for Reader 1 in the validation set. The ICC for the signal intensity in the tumor bed was excellent for both b values, and improved with b2000 (from 0.92 to 0.97).

Discussion

The aim of this study was to investigate the potential benefit of performing ultrahigh b value DWI in rectal cancer by comparing the tumor detectability and the tumor conspicuity on b2000 with those on the more commonly used b1000. Concerning treatment response assessment, there were no significant differences between b1000 and b2000 in terms of tumor detectability for either reader. However, our results demonstrated that b2000 has the potential to improve the conspicuity of rectal tumors, since both readers assigned significantly higher qualitative scores to the b2000 images of restaging MR scans. Interestingly, in one case the better background suppression of b2000 helped the less experienced reader to detect a residual tumor that had gone unrecognized on b1000. Interobserver agreement was comparably good for both b values.

The quantitative conspicuity analysis of the whole tumor and the tissues surrounding the tumor showed significant differences in mean signal intensity, variance, and 10th and 90th percentiles between the two b values, in favor of the b2000 images, thus validating the results of the qualitative conspicuity assessment. The reduced signal intensity in the tumor surrounding tissues obtained with b2000 images resulted in a higher conspicuity. Owing to the better background suppression, the tumor was more prominently visible, which facilitated the delineation of the tumor itself. Higher conspicuity was further confirmed by the higher qualitative lesion conspicuity scores reported by both readers. Furthermore, the variance and the 10th and 90th percentiles of the tumor and of the adjacent tissues were lower on b2000, suggesting that the images were less affected by noise and outliers compared to b1000.

A potential downside of the b2000 images was that, in addition to the reduced signal in the tumor surrounding tissues, the mean signal intensity in the tumor itself was also lower compared to b1000. This was an expected finding, since increasing the diffusion weighting reduces the signal-to-noise ratio, with a consequent overall decrease in mean signal intensity [17]. From a technical point of view, to overcome this problem we acquired b2000 using an adequate number of signal averages and an adequate repetition time (Table 2). Moreover, we balanced the signal reduction related to the thin (3 mm) slice thickness by maintaining the isotropy of the voxel (3 mm on each side). Another downside of the b2000 images was that the overall reduced signal also results in less anatomical detail, which may lead to the loss of potentially valuable information and could pose a diagnostic challenge. However, b2000 images are usually acquired together with a set of lower b values, which limits the weight of the above-mentioned limitations. Moreover, based on our results it appears that the beneficial effect of the improved lesion-to-background ratio outweighs the potential downsides of a higher b value, particularly for the less experienced reader. In other words, for readers with less experience, the near-complete suppression of background signal can make the detection of a bright tumor lesion more straightforward. Finally, an additional downside of increasing the diffusion weighting is that images typically become more susceptible to distortions, “ghosting artifacts” and increased spatial blurring due to eddy currents [18].
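The trade-off discussed above, lower absolute tumor signal but better lesion-to-background contrast, can be illustrated with the standard mono-exponential diffusion model S(b) = S0 · exp(−b · ADC). The sketch below is a simplification under stated assumptions: the ADC and S0 values are hypothetical round numbers chosen for illustration (a background with long T2, hence higher S0, and less restricted diffusion), noise is ignored, and the mono-exponential model itself is only approximate at ultrahigh b values.

```python
import math


def dwi_signal(s0, adc, b):
    """Mono-exponential DWI signal: S(b) = S0 * exp(-b * ADC)."""
    return s0 * math.exp(-b * adc)


# Hypothetical tissue parameters (ADC in mm2/s, S0 in arbitrary units).
TUMOR = {"s0": 1.0, "adc": 1.0e-3}       # restricted diffusion
BACKGROUND = {"s0": 2.0, "adc": 2.0e-3}  # long T2 (shine-through), freer diffusion


def contrast_ratio(b):
    """Tumor-to-background signal ratio at a given b value (s/mm2)."""
    return dwi_signal(b=b, **TUMOR) / dwi_signal(b=b, **BACKGROUND)


for b in (1000, 2000):
    tumor_signal = dwi_signal(b=b, **TUMOR)
    background_signal = dwi_signal(b=b, **BACKGROUND)
    print(b, round(tumor_signal, 3), round(background_signal, 3),
          round(contrast_ratio(b), 2))
```

Under these assumptions both signals fall when moving from b1000 to b2000, but the background falls faster, so the ratio between the two rises. In practice the usable gain is bounded by the noise floor, which this sketch does not model.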

Our results are in line with those of Hausmann et al., who recently investigated the value of ultrahigh b value (b2000) DWI in a comparable population of rectal cancer patients (n = 26 patients, 31 MR exams). They concluded that b2000 results in a high signal-to-background ratio between tumor and healthy tissue, which could be helpful in detecting and demarcating tumoral tissue [12]. However, the authors did not investigate the potential differences in terms of detectability and conspicuity in comparison to ‘standard’ high b value images (b1000). Moreover, they provided a single-ROI measurement within the tumor instead of a whole tumor volume segmentation. In this regard, several authors have underlined that the choice of ROI placement strongly influences quantitative measurements [19,20,21]. For instance, Goh et al. demonstrated that ROI placement substantially influenced the resulting perfusion values when using perfusion computed tomography in colon cancer patients [19]. In another study, Lambregts et al. concluded that ROI size and positioning influenced the apparent diffusion coefficient (ADC) measurements in rectal cancer patients and that the whole tumor volume provides the most reproducible results [20]. In light of these studies, we opted for whole tumor volume segmentation.

Other studies have investigated the usefulness of ultrahigh b value DWI (b2000) in different tumor types, mainly prostate cancer [8, 10, 17, 18, 22, 23]. Most of them were conducted using a 3T scanner and, in accordance with our results, showed that the conspicuity and detectability of prostate cancer significantly improved from b1000 to b2000 [8, 10, 17, 22].

Our study has some limitations. First, our investigation was intended as a technical feasibility study focusing on tumor visibility and conspicuity as the main outcomes. As such, other more clinical-diagnostic outcomes such as staging accuracy were beyond the scope of this study. Furthermore, there is no consensus on the role of DWI in staging [1]. Second, our study cohort was relatively small. In this regard, we would have particularly liked to have had access to a higher number of restaging scans. In fact, response evaluation after CRT represents, from a clinical point of view, the most interesting timepoint, for which the benefit of DWI is known to be most evident and DWI is now clinically recommended [1]. However, the introduction of an independent set of primary staging MR exams (“validation set”), validating the quantitative results of the restaging MR exams, gave more strength to our results and represented an added value of our study. Third, the proportion of complete responders (54%) in our study was higher than that generally encountered in daily clinical practice (10–24%) [3]. This may have affected the specificity and the PPV, especially for the second reader, who misinterpreted three cases as non-complete responders. There were three potential reasons for this high percentage. First of all, the study cohort was relatively small. Second, the mean time to surgery in our study population was > 13 weeks, which could have favorably influenced the response rate. In this regard, a recent study by Macchia et al. revealed an improvement in the complete response rate in patients who underwent delayed surgery [26]. Moreover, we excluded mucinous tumors, which usually show a low complete response rate, from our study population [27, 28]. Fourth, due to the retrospective nature of this study, our results should be confirmed by prospective studies with a larger number of patients.

In conclusion, the use of DWI with an ultrahigh b value (b = 2000 s/mm2) has the potential to improve tumor conspicuity compared to the more commonly used b = 1000 s/mm2 in patients with rectal cancer, while maintaining a diagnostic accuracy comparable to that of standard b1000. Improved conspicuity could be most beneficial in the restaging setting, where the differentiation between a complete response and a small tumor remnant is of major clinical importance in the light of new conservative treatment strategies (i.e. watchful waiting) [24, 25]. Further validation in a larger patient population, particularly in the restaging setting, is warranted and, in addition to the outcomes addressed in the current study, should also include more clinical-diagnostic endpoints such as the impact on diagnostic staging performance.