Introduction

Over the past two decades, magnetic resonance imaging (MRI) has been demonstrated to provide accurate information for the evaluation of rectal cancer; it is now a recommended imaging technique at both primary staging and follow-up after chemoradiotherapy (CRT) [1, 2]. Although staging is usually performed according to the American Joint Committee on Cancer (AJCC) guidelines, some other features, such as circumferential resection margin or extramural venous infiltration (EMVI), should also be reflected in reports [3,4,5].

EMVI is defined histologically as the presence of tumor cells beyond the muscularis propria in an endothelium-lined vessel; it is considered as a T3 stage, but is not specifically assessed in the staging [2, 6]. This is despite the fact that different studies have demonstrated the role EMVI plays as an independent predictor of lymph node metastasis, disease-free/overall survival, local recurrence and synchronous/metachronous distant metastases [7,8,9,10,11,12,13,14]. Stage II tumors with positive EMVI have been reported to have outcomes similar to those of stage III tumors [7]. Previous studies have reported a moderate to high accuracy in the detection of EMVI using high-resolution T2w (HRT2w) MRI sequences, but with a wide range of documented sensitivities and with better results at initial staging than after CRT [8, 9, 11, 15].

There are few references to the performance of diffusion-weighted imaging (DWI) concerning EMVI, most likely due to the possible reduction in usefulness resulting from its lower resolution, even though larger vessels could be assessed [16]. However, the examination of small vessels (< 3 mm) may be difficult even using HRT2w [8, 17, 18]. Furthermore, DWI could allow a more accurate evaluation of the tumor contour, thanks to better distinction of reactive tissue or microvessels adjacent to the tumor with comparable signal intensity, and differentiation of fibrosis and viable tumoral remnants [16, 19].

On the other hand, adding DWI to T2w sequences has been shown to improve rectal cancer detection and delimitation; radiologists without previous experience could benefit from this to a larger extent. Such an effect would have remained undetected in many previous studies, which included only experienced readers [20, 21]. In that case, using DWI might be helpful during the early stages of the learning curve or for less-specialized radiologists or centers.

Hence, the aim of the study was to assess the potential changes in MRI detection of EMVI in rectal cancer, both in primary staging and post-CRT follow-up, produced by the use of DWI added to HRT2w, compared to the use of HRT2w alone. As a secondary objective, while these potential improvements were analyzed in experienced radiologists, the performance of radiologists without prior experience in rectal cancer staging and radiology residents was also considered.

Methods

Patient population

This cross-sectional study was approved by the Research Ethics Committee of our center. Written informed consent was waived owing to its retrospective nature.

One hundred consecutive patients with MRI for rectal cancer staging who underwent surgery (Fig. 1), whether after primary staging or post-CRT follow-up, were enrolled in the study between January 2011 and July 2016. All patients satisfied the following inclusion criteria: (1) rectal cancer diagnosis, proven by colonoscopy and biopsy; (2) correctly performed rectal MRI, using an identical technique to that of all the other MRIs, with no significant artifacts and fully available for review; (3) post-CRT follow-up MRI when neoadjuvant treatment was necessary; and (4) surgery after MRI with complete surgical specimen (total mesorectal excision/abdominoperineal resection). At primary staging, 9 weeks were the maximum permitted interval until surgery (average 34.7 days). In post-CRT cases, the post-CRT follow-up MRI was the one used in the study, including those with surgery between 6 and 10 weeks after the end of treatment (average 64.1 days). Cases with an interval of 5 weeks or less from the end of CRT to the follow-up MRI were rejected (average 40.7 days).

Fig. 1 Workflow chart of the study

Neoadjuvant treatment

The need for CRT was decided by the Hospital Committee for Colorectal Tumors. The neoadjuvant therapy schedule consisted of 825 mg/m² of capecitabine (oral, twice daily) on an outpatient basis, concurrently with radiotherapy (long cycle, normofractionated pelvic radiation at a total dose of 50.4 Gy over 25 sessions). In general, CRT was not indicated for stage IIA tumors with at most millimetric infiltration and no factors of poor prognosis.

Pathological analysis

The standard of reference was pathological staging. Surgical specimens were fixed in formalin for 24 h. Representative slides were obtained from the tumor borders and area, using hematoxylin–eosin staining and immunohistochemical markers. All histopathologic slides were reviewed by two experienced pathologists (7 and 10 years of experience in the interpretation of colorectal cancer specimens) to give the histological stage, following the guidelines of the AJCC [3]. The presence of EMVI was determined according to a standard definition: tumor present within an extramural endothelium-lined space that is either surrounded by a rim of smooth muscle or contains red blood cells [22].

MRI protocol

All patients underwent 1.5-T rectal MRI in the same center (MAGNETOM Avanto; Siemens Healthcare, Berlin, Germany), using a 16-channel phased-array body surface coil. Patients were previously administered 20 mg of intramuscular butylscopolamine bromide and 50 ml of rectal gel. The MRI protocol is shown in Table 1, used for both primary staging and follow-up MRIs. Contrast materials were not used.

Table 1 MRI acquisition protocol

Image assessment

The images of every case were reviewed by ten radiologists with different degrees of experience in staging rectal cancer using MRI, independently and blinded to any information except for the presence of biopsy-proven malignancy. Three of them (ER) had 3–6 years of prior experience (approximately 40 cases per year). Another three (NER) had 2–7 years of experience with MRI, though not in abdominal pathology. The remaining four (RR) were radiology residents with general knowledge of MRI. The NER and RR received baseline training (2 h) before the start of the study, consisting of review and discussion of several cases from our center and imaging examples.

Radiologists were asked to assess the presence or absence of EMVI in each patient, reading every case twice. In a first session, the radiologists based their assessment only on the HRT2w set of images. The likelihood of EMVI was determined according to Smith et al. [9, 23]: the presence of intermediate signal intensity within vessels (similar to that of the main tumor), obvious irregular vessel contour and/or nodular expansion of vessels by definite tumor signal (Fig. 2). The proposed scale was adapted to a three-point one (1, negative; 2, doubtful; 3, positive).

Fig. 2 Primary staging MRI from a 73-year-old man with rectal adenocarcinoma: axial (a) and sagittal-to-tumor (b) high-resolution T2w sequences. A large mass (star) almost completely occupies the middle third of the rectum, with signs of profuse perirectal fat infiltration. Focal expansion of vessels with similar intensity to that of the tumor and irregular contours was visible (arrows). The histological results confirmed the presence of extramural venous infiltration

After a minimum 1-month washout period, with the aim of preventing memory bias, radiologists analyzed all MRIs a second time, this time using both HRT2w and DWI images presented side by side. The suspicion criterion for EMVI in DWI was pre-defined as the presence of high signal intensity on a high b-value sequence and moderate to high hypointensity on the ADC map (restricted diffusion), neighboring the main tumor and coincident with the location of the vessel in the HRT2w sequence (Figs. 3, 4) [16]. The scoring for the combined image set used the same scale as for the first session. In this session, the radiologists first provided an initial score for the HRT2w sequences. Then, if the DWI suspicion criterion was present, the final score was made one point higher; otherwise, the initial score was considered the final one.
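
The combined-reading scoring rule described above can be written as a short function. This is only an illustrative sketch: the function name `final_emvi_score` is hypothetical, and capping the upgraded score at 3 is an assumption based on the three-point scale, since the text does not state how an initial score of 3 was handled.

```python
def final_emvi_score(hrt2w_score: int, dwi_criterion_present: bool) -> int:
    """Return the combined HRT2w + DWI score on the three-point scale.

    Scale: 1 = negative, 2 = doubtful, 3 = positive.
    """
    if hrt2w_score not in (1, 2, 3):
        raise ValueError("score must be 1, 2 or 3")
    if dwi_criterion_present:
        # Assumption: the upgraded score cannot exceed 3, the top of the scale.
        return min(hrt2w_score + 1, 3)
    # Without the DWI suspicion criterion the initial score stands.
    return hrt2w_score


if __name__ == "__main__":
    for score in (1, 2, 3):
        print(score, final_emvi_score(score, True), final_emvi_score(score, False))
```

Under this rule a doubtful HRT2w reading (2) becomes positive (3) only when the DWI criterion is met, and DWI can never lower a score.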

Fig. 3 Primary staging MRI from a 77-year-old man with rectal adenocarcinoma: sagittal-to-tumor high-resolution T2w sequence (a) and axial diffusion-weighted imaging (b) and ADC map (c). Circumferential wall thickening on the lower third of rectum (arrows), showing signs of restricted diffusion. An expanded tubular structure adjacent to the posterior wall of the rectum (arrowheads) presented similar intensity in T2w sequence, suggestive of extramural venous infiltration that was histologically confirmed. Diffusion and ADC showed signs of restriction, also similar to those of the main tumor

Fig. 4 Post-CRT follow-up MRI from a 63-year-old man with rectal adenocarcinoma. Tubular (white arrow) and nodular (arrowhead) expansion of vessels with moderate intensity was visible (arrows) in the axial high-resolution T2w sequence (a). They showed moderate hyperintensity in the high b-value diffusion sequence (b) with slight hypointensity in the ADC map (c). The histological results confirmed the persistence of extramural venous infiltration

Statistical analysis

Statistical analysis was conducted with the IBM SPSS Statistics package 24.0 (IBM Corp, Armonk, NY, USA) and Epidat 4.1 (SERGAS, Galicia, Spain). The results were grouped according to the readers' degree of experience in the different phases, and mean values were calculated. Only cases with a score of 3 on the confidence scale were considered positive for the diagnosis of EMVI and used in the analysis.
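
As an illustration of the dichotomization and pooling just described, the sketch below cross-tabulates one reader's three-point scores against pathology and averages a per-reader metric within a group. It is not the SPSS or Epidat procedure actually used; the function names and the example scores are hypothetical.

```python
from collections import Counter
from statistics import mean

POSITIVE_SCORE = 3  # only a score of 3 counts as EMVI-positive


def confusion_counts(scores, pathology):
    """Cross-tabulate dichotomized reader calls against pathology.

    scores: three-point scores (1, 2 or 3), one per case.
    pathology: booleans, True when EMVI was present in the specimen.
    """
    if len(scores) != len(pathology):
        raise ValueError("scores and pathology must have the same length")
    counts = Counter(tp=0, fp=0, fn=0, tn=0)
    for score, truth in zip(scores, pathology):
        call = score == POSITIVE_SCORE  # scores 1 and 2 are treated as negative
        if call and truth:
            counts["tp"] += 1
        elif call:
            counts["fp"] += 1
        elif truth:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return dict(counts)


def group_mean_sensitivity(readers_scores, pathology):
    """Average the per-reader sensitivity over all readers of one group.

    Assumes at least one pathology-positive case, otherwise sensitivity
    is undefined and a ZeroDivisionError is raised.
    """
    per_reader = []
    for scores in readers_scores:
        c = confusion_counts(scores, pathology)
        per_reader.append(c["tp"] / (c["tp"] + c["fn"]))
    return mean(per_reader)


if __name__ == "__main__":
    # Hypothetical data: five cases read by two readers of the same group.
    pathology = [True, True, False, False, False]
    reader_a = [3, 2, 1, 3, 1]
    reader_b = [3, 3, 2, 1, 1]
    print(confusion_counts(reader_a, pathology))
    print(group_mean_sensitivity([reader_a, reader_b], pathology))
```

Treating the doubtful score (2) as negative favors specificity over sensitivity, which is worth keeping in mind when reading the sensitivity figures.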

Diagnostic accuracy (by means of the area under the ROC curve, AUC), sensitivity and specificity, positive and negative predictive values (PPV/NPV) and likelihood ratios were all calculated for each group in every reading and category (primary staging or post-CRT follow-up). Fisher’s exact test was used to evaluate statistical significance, and the McNemar test to analyze the differences between the image sets (p ≤ 0.05). Subsequently, overstaging and understaging rates and intragroup agreement (Fleiss’s kappa) were obtained for every reading.
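
For readers who want to reproduce this kind of analysis outside SPSS or Epidat, the following minimal sketch computes the listed measures from a 2×2 table, an exact two-sided McNemar p-value, and Fleiss's kappa. Several points are assumptions rather than details from the study: the AUC is taken as (sensitivity + specificity) / 2, which holds for a single dichotomized cut-off; the McNemar test is computed in its exact binomial form from the discordant pairs; and all function names are hypothetical. Fisher's exact test is not covered here.

```python
from math import comb


def _ratio(num, den):
    """Return num / den, or None when the denominator is zero."""
    return num / den if den else None


def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV, likelihood ratios and binary AUC."""
    sens = _ratio(tp, tp + fn)
    spec = _ratio(tn, tn + fp)
    metrics = {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "plr": None,
        "nlr": None,
        "auc": None,
    }
    if sens is not None and spec is not None:
        metrics["plr"] = _ratio(sens, 1 - spec)  # undefined when specificity is 1
        metrics["nlr"] = _ratio(1 - sens, spec)
        # Assumption: one cut-off only, so the ROC curve has a single point.
        metrics["auc"] = (sens + spec) / 2
    return metrics


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def fleiss_kappa(table):
    """Fleiss's kappa; table[i][j] = raters who put case i in category j."""
    n_cases = len(table)
    n_raters = sum(table[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in table):
        raise ValueError("every case needs the same number (>= 2) of ratings")
    # Mean observed agreement across cases.
    p_bar = sum(
        (sum(x * x for x in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ) / n_cases
    # Chance agreement from the marginal category proportions.
    total = n_cases * n_raters
    p_e = sum(
        (sum(row[j] for row in table) / total) ** 2 for j in range(len(table[0]))
    )
    if p_e == 1:
        return None  # undefined when every rating falls in one category
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    print(diagnostic_metrics(tp=8, fp=10, fn=2, tn=80))
    print(mcnemar_exact_p(b=1, c=9))
    print(fleiss_kappa([[3, 0], [0, 3], [2, 1]]))
```

With few positive cases, as in this sample, a single reclassified case shifts sensitivity and PPV considerably, so the exact form of the McNemar test is the safer choice over the chi-square approximation.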

Results

A total of 54 of the 100 MRI records corresponded to primary staging (neoadjuvant treatment was not indicated) and the remaining 46 to post-CRT follow-up MRI with later surgery. Demographic and histological staging data of the sample are shown in Table 2. Three cases of adenocarcinoma presented just residual cell clusters after CRT, one in a lymph node. In four primary staging cases, no evidence of remaining tumor tissue was found in the surgical specimen, with adenocarcinoma in an initial resected polyp. Based on the surgical specimen, EMVI was present in ten cases in the whole sample (10%), seven of them at primary staging (13% of the subgroup) and three at post-CRT follow-up (6.5%). At primary staging, out of the 19 cases with positive nodes, four were related to histologically positive EMVI (21% of cases), while the post-CRT group revealed malignant nodes in 18 cases, two of them EMVI positive (11%).
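
The percentages in the preceding paragraph follow directly from the reported counts. The short check below recomputes them; the helper name `percent` and the dictionary labels are only illustrative.

```python
def percent(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100 * part / whole


# Counts as reported in the Results paragraph above.
reported = {
    "EMVI, whole sample": (10, 100),
    "EMVI, primary staging": (7, 54),
    "EMVI, post-CRT follow-up": (3, 46),
    "EMVI among node-positive cases, primary staging": (4, 19),
    "EMVI among node-positive cases, post-CRT follow-up": (2, 18),
}

if __name__ == "__main__":
    for label, (part, whole) in reported.items():
        print(f"{label}: {part}/{whole} = {percent(part, whole):.1f}%")
```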

Table 2 Demographic and histological staging data of the sample

The results for accuracy are presented in Table 3, while Table 4 shows sensitivity, specificity, likelihood ratios, predictive values and intragroup agreement results. All the results for the ER group showed statistical significance by themselves (p < 0.05), as did those of the DWI reading by the other groups, except for the post-CRT follow-up by the NER and the accuracy results. In the comparison between the two readings, both primary staging and post-CRT follow-up by the ER showed significant differences (p < 0.01 and p = 0.033, respectively, according to the McNemar test).

Table 3 Diagnostic accuracy by means of the area under the ROC curve (AUC) for the MRI series (Global EMVI) and for the subgroups of surgical patients without neoadjuvant treatment (Primary staging) or after neoadjuvant treatment (Post-CRT)
Table 4 Results for sensitivity, specificity, intragroup agreement (Fleiss’ Kappa), positive and negative predictive values and likelihood ratios

The ER group demonstrated a marked improvement when DWI was added to the post-CRT evaluation, with increases of 0.01 in AUC, 5.25 in positive likelihood ratio (PLR), 17.5% in PPV, 11.1% in sensitivity and 8.6% in specificity. An improvement was also seen at primary staging, to a lesser extent, in specificity (11.3%) and PLR (1.35), despite a decrease in sensitivity (9.5%). Both categories showed an important decrease in overstaging (7.9% and 9.9%, respectively), with only slight changes in understaging (Table 5) or intragroup agreement.

Table 5 Results for overstaging and understaging rates. The distribution of the groups and pooling by experience is the same as that in Table 3

The NER group presented some increase in accuracy at primary staging with the addition of DWI, mainly associated with PPV (9.5%), positive likelihood ratio (1.32) and AUC (0.04), with minimal changes in the post-CRT follow-up. Again, overstaging decreased in both categories (around 2.8%). The intragroup agreement varied markedly, with a kappa decrease of 0.22 when DWI was added to primary staging and an increase of 0.28 in post-CRT follow-up.

The RR group showed small changes at primary staging, with a slight improvement in AUC (0.025) and sensitivity (7.2%). In the post-CRT category, a marked increase in sensitivity stood out (16.6%), despite a decrease in specificity (8.1%) and AUC (0.043). Overstaging increased in both categories but more markedly post-CRT (7.9%), with a slight decrease in understaging. On the other hand, intragroup agreement improved substantially with the addition of DWI, also in both cases (0.14 and 0.24).

Discussion

In this study, we evaluated MRI detection of EMVI in rectal cancer, both in primary staging and post-CRT follow-up, comparing changes resulting from the addition of DWI. Our results show a significant improvement in the performance of the ER group with the additional use of DWI, especially in the post-CRT follow-up and associated with the accuracy and positive predictive parameters. Since DWI presents lower image resolution, it may be hard to understand how it could make a difference when considering such millimetric structures. Benefits in the detection and delimitation of viable tumors, secondary to the use of DWI, have been reported in previous studies [21, 24, 25]. Furthermore, it has been suggested that using DWI could allow a more accurate evaluation of the tumor contour, with better distinction of reactive tissue or microvessels adjacent to the tumor that have a signal intensity comparable to it [19]. This could explain the observed reduction in overstaging. Bearing in mind that assessment on post-CRT MRI might be hindered by the effects of treatment, it seems reasonable that DWI could be more helpful in these cases [24, 26, 27].

Previous work has shown a histological incidence of vascular infiltration in surgical specimens of 21–53.5% at primary staging and of 16.6–21% post-CRT, higher than that found in our sample [9, 27,28,29,30]. This may be related to the fact that we did not include intramural vascular infiltration cases in our study, unlike some previous works [17]. Meanwhile, our MRI detection prevalence was within the documented ranges of 23.7–47.6%, with a sample range of 22–28% (16.6–38.8% at primary staging; 15.2–28.2% at post-CRT follow-up) [31]. For primary staging, published values of EMVI accuracy, sensitivity and specificity are 0.65–0.94, 43–100% and 53–100%, respectively [8, 18, 28,29,30, 32, 33], while those corresponding to post-CRT follow-up are 0.78–0.83, 29–76.2% and 79.7–100%, respectively [27]. Reported values of PPV and NPV for primary staging were 36–53% and 84–94%, respectively [18]. Our results were mostly within the documented ranges, but comparison with previous work is hindered by some methodological differences: consensus readings between two radiologists, added contrast-enhanced sequences or samples restricted to specific tumoral stages were sometimes present [27,28,29].

We could only find two studies of EMVI including DWI. In a recently published study using 3.0-T MRI, Ahn et al. [18] reported no significant added value of DWI in the diagnostic performance for EMVI of two radiologists at primary staging (AUC 0.72 and 0.82, with almost no changes between readings), which differs from our results, perhaps owing to methodological differences. Their sensitivity did not match ours either, with lower values in our study, probably related to our significantly lower overstaging. In the second study, the use of gadolinium-enhanced T1w sequences along with DWI and the absence of histological correlation of primary staging MRI both limit comparison with our results [34]. Those authors reported a moderate increase in sensitivity with the addition of DWI and contrast-enhanced T1w, both at primary and post-CRT staging (43–50% to 57%; and 29% to 43–57%, respectively). Despite the coincident greater improvement in the post-CRT follow-up, unlike us they reported almost no changes in accuracy and specificity.

To the best of our knowledge, no previous study included inexperienced radiologists in the MRI assessment of EMVI; their results might represent the early stages of the learning curve. Although the NER group presented a similar trend to that of the ER, in the former it was present only in the primary staging category. We hypothesize that the previously reported increase in viable tumor detection and delimitation with the use of DWI could be related to this finding [21, 24, 25]. In the post-CRT follow-up, the results did not improve, despite the increase in the kappa value. However, the interpretation of post-CRT follow-up MRI may sometimes be challenging, hindering the identification of vessels within the fibrotic or inflammatory aftermath, or leading to overstaging of peritumoral high signal intensity in DWI [16, 18, 26]. This could have led to the absence of changes, despite the marked increase in intragroup agreement due to the better visualization of viable tumor. On the other hand, the RR demonstrated small changes, with increased sensitivity in both categories, associated with a rise in overstaging. Since they had less experience in the use of DWI, misinterpretations were more likely; factors such as edema, desmoplastic reaction or inflammation may have led to overstaging, which lowers the accuracy rate [16, 35]. Nonetheless, the results of the inexperienced radiologists must be interpreted cautiously: the small number of positive EMVI cases in the sample (particularly post-CRT) may have yielded aberrant changes between readings.

Our study had some limitations to consider. First, the retrospective nature could have led to a patient selection bias. Second, the experience gained through participating in the study could have influenced the results, in particular for the less experienced radiologists. To avoid learning bias as far as possible, the observers were not provided with feedback on their results. Moreover, randomized reviews and washout periods between readings prevented memory bias. Third, the different angulation and slice thickness of HRT2w and DWI hindered comparison; in order to increase the accuracy of the study, the same characteristics for both sequences would have been preferable. Furthermore, the DWI slice thickness also hindered proper assessment of small vessels (< 3 mm); in any case, their identification and characterization may be challenging due to insufficient spatial resolution or partial volume artifacts [11, 18]. Finally, the statistical analysis was restricted by the small number of positive results, a problem present in most studies of this topic owing to the limited number of cases. This should be borne in mind during interpretation, especially in terms of sensitivity and PPV values, as it limits the clinical significance of the findings.

Conclusions

According to the results of the study, adding DWI to HRT2w sequences improved the diagnostic performance of experienced radiologists and reduced overstaging, especially in the post-CRT follow-up MRI. For the inexperienced radiologists and the residents, this addition brought about fewer changes: some improvements in primary staging for the former, and increased sensitivity in both primary staging and post-CRT staging for the latter.