Introduction

Detecting and localising cancer foci within the prostate is critical. Indeed, the precise mapping of prostate cancer could increase biopsy sensitivity, allow a more rational selection and follow-up of patients under active surveillance, improve treatment planning and allow the development of focal therapies aimed at destroying tumour foci while preserving the rest of the gland [1, 2].

Multiparametric magnetic resonance imaging (mp-MRI) combining T2-weighted (T2w) imaging with diffusion-weighted (Dw) imaging, dynamic contrast-enhanced (DCE) imaging and/or magnetic resonance spectroscopy (MRS) has yielded promising results in prostate cancer detection and localisation [3–6].

However, there is still a lack of consensus with regard to the use of prostate mp-MRI and some urologists think it is not yet ready for routine use [7]. As a result, many groups have tried to evaluate the factors that influence mp-MRI performance, particularly the characteristics of tumours that can be missed on mp-MRI and the causes of false positive (FP) findings. There is an increasing body of evidence suggesting that prostate cancer location and histological characteristics (such as Gleason score, volume and architecture) influence its ability to be detected on mp-MRI [8–16]. Conversely, the impact of imaging factors, and particularly of the field strength and the coils used, on mp-MRI cancer detection has been less studied and remains a matter of debate [17–21]. Similarly, potential causes of FP findings have been specifically evaluated only in a few studies [9, 22].

In 2008, we started a prospective database collecting mp-MRI and histopathological findings in patients treated by radical prostatectomy. Its primary purpose was to assess, using precise MR pathological correlations, the mp-MRI prostate cancer detection rates as a function of tumour Gleason score, volume, location and histological architecture. The secondary objective was to define the histological conditions explaining FP findings on mp-MRI. We report here the results obtained in 175 patients.

Materials and methods

Study population

At our institution, prostate MRI is part of the usual preoperative work-up before prostatectomy. Since September 2008, all patients treated by prostatectomy who had undergone a preoperative prostate MRI at our institution have been offered the option of having their data entered in a radio-pathological correlation database (CLARA-P database). Formal approval by our Institutional Review Board (IRB) was not required for this observational study. Nevertheless, the IRB reviewed the informed consent form and all included patients gave written informed consent. The database was also registered with the appropriate administrative authority (Commission Nationale de l’Informatique et des Libertés, no. 08-06), as required by our national law. In total, 175 consecutive patients were included. Their mean age and prostate-specific antigen (PSA) level were 61.3 years (45–73 years) and 8.75 ng/ml (0.9–60 ng/ml) respectively.

MR protocol

All mp-MRI examinations included T2w, Dw and DCE imaging. Patients were imaged in two departments of radiology at our institution. Department 1 had 1.5-T MRI until March 2010 (device A, n = 71) and 3-T MRI (device B, n = 46) thereafter, and used a pelvic phased array (PPA) coil only for prostate imaging; Department 2 used a 3-T MRI (device C, n = 58) and combined endorectal and PPA coils throughout the study. Patients were not selected for allocation to a particular MRI device.

MR image acquisition parameters are detailed in Table 1. T2w images were obtained in the axial and sagittal planes and Dw and DCE images in the axial plane only. For Dw imaging, maximum b values ranged from 600 to 2,000 s/mm². These b values were mainly chosen to keep an acceptable signal-to-noise ratio on apparent diffusion coefficient (ADC) maps, depending on the field strength and coil configuration. For DCE imaging, an intravenous injection of 0.2 ml/kg of gadoterate meglumine (Dotarem; Guerbet, Roissy, France) was performed at 3 ml/s in all cases. Temporal resolution was adapted to the field strength and coil configuration.

Table 1 MRI parameters at 1.5 T and 3 T

T2w, Dw and DCE axial images were acquired with the same slice thickness and position in order to allow direct comparison between sequences.

Preparation of the prostatectomy specimens

After 24 h in formaldehyde solution and conisation of the apex and bladder neck for margins analysis, the prostate was strictly cut from apex to base in an axial plane using a specifically designed machine that ensured that the blocks were evenly spaced. The blocks obtained were put in formaldehyde for a further 24- to 48-h fixation period, processed and paraffin embedded as whole-mounts.

Whole-mount sections were usually obtained every 1–1.5 mm and their precise location within the blocks was monitored. Then, they were stained with haematoxylin-eosin.

Histopathological analysis

One uropathologist with 10 years of experience at the start of the database in 2008 reviewed the whole-mount sections. She was blinded to all MR data.

Tumours were localised using the same dedicated diagram as radiologists. Gleason score was assessed for all tumour foci. Tumour architecture was classified into four categories: dense cancers (continuous malignant tissue with minimal intervening benign glands), infiltrative cancers (rare and spaced tumoral glands intermixed with benign glands with no dense tumoral component), mixed cancers (dense and infiltrative tumoral components) and lobulated cancers (well-differentiated transition zone cancers).

Only individual tumours with minimum in-plane dimensions of 2 × 2 mm², visible on at least two section levels and with a Gleason score ≥5 were delineated on each section; others were not taken into consideration. Malignant regions that were less than 1 mm apart from each other in the same plane, with the same architecture and Gleason score, were considered to be parts of the same tumour. Tumour limits (including infiltrative components) were outlined on the whole-mount sections by the pathologist. Then, 3-mm spaced whole-mount sections were selected for MR-pathological correlation and digitised alongside a ruler for calibration. The volume of all tumours was calculated on digitised sections using dedicated in-house software (Matlab; Mathworks, Natick, MA, USA).
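The in-house software is not described in detail. As a minimal sketch only, and assuming that volume is approximated by summing the outlined tumour areas on consecutive digitised sections and multiplying by the section spacing (an assumption, not the documented method), the calculation could look as follows. The function name `tumour_volume_cc` and the example areas are hypothetical.

```python
def tumour_volume_cc(section_areas_mm2, spacing_mm=3.0):
    """Approximate a tumour volume in cc (cm^3).

    Assumption (not stated in the paper): volume = sum of outlined
    areas on consecutive sections * distance between sections.
    No correction for tissue shrinkage is applied.
    """
    if spacing_mm <= 0:
        raise ValueError("spacing_mm must be positive")
    if any(area < 0 for area in section_areas_mm2):
        raise ValueError("areas must be non-negative")
    volume_mm3 = sum(section_areas_mm2) * spacing_mm
    return volume_mm3 / 1000.0  # 1 cc = 1000 mm^3


if __name__ == "__main__":
    # Hypothetical outlined areas (mm^2) on three sections 3 mm apart.
    print(tumour_volume_cc([20.0, 35.0, 15.0]))
```

A section-by-section sum of this kind depends directly on how precisely the spacing between sections is known, which is why the position of the whole-mount sections within the blocks was monitored.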

MR image analysis

MR images of the patients included in the database were all independently reviewed by the same two uroradiologists. These readers had 11 years and 1 year of experience in prostate imaging at the start of the database in 2008. Although patients were formally included in the database at the time of the prostatectomy, mp-MRI interpretation for the purposes of the study was done on average 3 months later in order to avoid any recall from routine management of the patients. Additionally, the readers were blinded to any clinical, biological and histopathological data concerning the patients, except the fact that they had undergone a prostatectomy.

First, readers noted all suspicious focal abnormalities (FAs) and localised them using a 27-regions-of-interest diagram, as recommended by a European consensus panel [20]. In the peripheral zone (PZ), all FAs showing low-signal intensity on T2w images and/or ADC maps, and/or showing early enhancement on DCE images were taken into consideration. In the transition zone (TZ), only homogeneous low-signal intensity areas on T2w images, with ill-defined margins, no visible capsule and no cystic component [23–25] were interpreted as suspicious. ADC maps and raw DCE images were assessed only visually. Quantitative ADC values were not used to diagnose cancer. No enhancement parametric map was generated.

Second, the degree of suspicion of a given FA was assessed using a five-level subjective suspicion score (SSS: 0, definitely benign; 1, likely benign; 2, indeterminate; 3, likely malignant; 4, definitely malignant), as recommended by several European guidelines [18, 20]. No clear diagnostic criteria have yet been assigned to the five categories of the SSS. In the current study, areas with normal signal on all sequences were assigned a default SSS of 0/4 (definitely benign) and were not taken into account in the analysis. FAs, which were by definition areas with abnormal signal on at least one sequence, received an SSS of at least 1/4. Nodules with a typical malignant appearance on all three sequences generally received a 4/4 score (definitely malignant). The distribution of the other combinations of signal abnormalities in the 1–3/4 categories was left to the readers’ judgement.

Finally, readers specified the presence and degree (none, minimal, marked) of post-biopsy blood artefact in the immediate vicinity of each FA, as shown on the first (unenhanced) dynamic acquisition.

MR histological correlation

Histopathological and mp-MRI findings were then correlated by the readers and the uropathologist. First, the readers and the uropathologist reviewed the MR images together and the readers disclosed the FAs they had noted and their location. Then, the uropathologist decided which FAs matched histological cancers and which did not, using side-by-side comparison and taking into consideration as many landmarks (e.g. cysts, hyperplasia nodules, ejaculatory ducts) as possible. Matching FAs were considered true positives only if their largest diameter was within 50–150 % of the largest diameter of the corresponding histological cancer. Otherwise, the FA was considered an FP and the cancer a false negative (FN). In the case of FPs, the pathologist reviewed the whole-mount sections a second time to assess if a benign condition could explain the MR abnormality.
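The size criterion described above can be sketched as a simple rule. This is an illustration only: the function name `classify_matched_fa` is hypothetical, and treating the 50 % and 150 % bounds as inclusive is an assumption, since the text does not say how exact boundary values were handled. The spatial matching itself was done visually by the uropathologist and is not modelled here.

```python
def classify_matched_fa(fa_diameter_mm, tumour_diameter_mm,
                        low=0.5, high=1.5):
    """Apply the size criterion to an FA already matched to a cancer.

    Returns ("TP", None) when the largest FA diameter lies within
    50-150 % of the largest histological diameter; otherwise returns
    ("FP", "FN"): the FA counts as a false positive and the cancer
    as a false negative. Inclusive bounds are an assumption.
    """
    if fa_diameter_mm <= 0 or tumour_diameter_mm <= 0:
        raise ValueError("diameters must be positive")
    ratio = fa_diameter_mm / tumour_diameter_mm
    if low <= ratio <= high:
        return ("TP", None)
    return ("FP", "FN")


if __name__ == "__main__":
    # Hypothetical diameters in mm.
    print(classify_matched_fa(12.0, 10.0))  # ratio 1.2
    print(classify_matched_fa(4.0, 10.0))   # ratio 0.4
```

Under this rule a correctly located FA that grossly under- or overestimates tumour size is penalised twice, once as an FP and once as an FN.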

Statistical analysis

Generalised linear mixed models (GLMM) were used to assess the influence of the following factors on FN and FP findings: tumour Gleason score, tumour histological volume, tumour architecture, tumour location (PZ vs TZ), field strength (1.5 T vs 3 T), coils used for imaging (PPA alone vs combined endorectal-PPA coils), patient age, preoperative PSA value and tumour pT stage. Individual random effects were used in order to take into account the intra-patient correlation. Model selection was performed using the likelihood ratio and the Akaike information criterion. Each regression coefficient was also tested using the Wald test. All GLMM models were calculated using the R package lme4. Odds ratios (ORs) were computed to quantify the predictive ability of factors. Agreement between the readers’ SSS was measured with the kappa coefficient. P < 0.05 was considered statistically significant and all confidence intervals (CI) were computed at the 95 % level.
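For readers unfamiliar with the kappa coefficient, the following minimal sketch computes an unweighted Cohen's kappa for two raters scoring the same items. Whether the study used a weighted or unweighted kappa, and how FAs noted by only one reader were paired, is not stated; the unweighted form, the function name `cohen_kappa` and the example scores are assumptions for illustration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters on the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed proportion of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical SSS values (1-4) given by two readers to six FAs.
    reader_1 = [1, 2, 3, 4, 4, 2]
    reader_2 = [1, 2, 3, 4, 3, 2]
    print(round(cohen_kappa(reader_1, reader_2), 3))
```

Because the SSS is ordinal, a weighted kappa that penalises a 1-versus-4 disagreement more than a 3-versus-4 disagreement would be a reasonable alternative.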

Results

MR findings

Three hundred and sixty-nine FAs were described by reader 1, and 322 by reader 2. The proportions of FAs with an SSS of 1/4, 2/4, 3/4 and 4/4 were 22 % (81/369), 28 % (103/369), 25 % (93/369) and 25 % (92/369) for reader 1 and 9 % (30/322), 30 % (97/322), 31 % (100/322) and 30 % (95/322) for reader 2. The scoring concordance was moderate (kappa = 0.4, P = 0.02). Minimal and marked post-biopsy haemorrhage was visible in the immediate vicinity of 29 and 13 FAs respectively for reader 1, and 31 and 30 FAs for reader 2.

Histological findings

The pathological T stage was pT0 in 3 (2 %) patients, pT2a in 13 (7 %), pT2b in 5 (3 %), pT2c in 78 (44.5 %), pT3a in 57 (32.5 %) and pT3b in 19 (11 %).

Pathological examination found 362 tumours (mean, 2.07 per patient). The Gleason score was 5 in 21 (6 %) tumours, 6 in 181 (50 %), 7 in 113 (31 %), 8 in 26 (7 %) and 9 in 21 (6 %). Sixty-six (18 %) tumours were in the TZ and 296 (82 %) in the PZ. The histological architecture was dense in 107 (30 %) tumours, mixed in 201 (55 %), infiltrative in 25 (7 %) and lobulated in 29 (8 %). Tumour volume was <0.05 cc (cm³) in 46 (12.5 %) tumours, between 0.05 cc and 0.5 cc in 141 (39 %), between 0.5 cc and 2 cc in 100 (27.5 %) and >2 cc in 75 (21 %). The mean and median tumour volumes were 1.4 cc and 0.5 cc respectively (range, 0.04–15.7 cc).

MR histological correlation

Factors influencing tumour detection

Overall mp-MRI tumour detection rates were 59 % (214/362) and 53 % (192/362) for readers 1 and 2 respectively (Figs. 1 and 2).

Fig. 1 Multiparametric axial MR images obtained on scanner A at 1.5 T (a T2w image; b ADC map; c DCE image) and histopathological whole-mount section (d) in a 62-year-old patient with a PSA level of 5.3 ng/ml. Both readers attributed a maximal SSS of 4 to the nodular focal abnormality in the right peripheral zone (a–c, arrow). Whole-mount section showed a Gleason 7 (3 + 4) cancer with a mixed architecture (d, green outline). Note that only the dense component (d, arrowheads) was visible at MRI, leading to a substantial underestimation of tumour volume

Fig. 2 Multiparametric axial MR images obtained on scanner B at 3 T (a T2w image; b ADC map; c DCE image) and histopathological whole-mount section (d) in a 53-year-old patient with a PSA level of 7 ng/ml. Readers 1 and 2 gave an SSS of 3 in the left peripheral zone (a–c, arrow). Reader 2 noted an additional abnormality in the right PZ with an SSS of 2 (a–c, arrowheads). Whole-mount section showed a Gleason 7 (3 + 4) cancer with a mixed architecture (d, red outline). In the right PZ, only high-grade prostate intraepithelial neoplasia was visible at pathology

Table 2 shows, for both readers, the tumour detection rates expressed separately as a function of the histological characteristics of the tumour, the imaging field and the coils used.

Table 2 Tumour detection rates on mp-MRI as a function of imaging and histological parameters

At multivariate analysis, detection rates were significantly influenced, for both readers, by tumour Gleason score, histological volume, histological architecture and location (P < 0.0001 for all characteristics and readers; Table 3). Field strength, coils used, patient age, preoperative PSA value and pT stage did not significantly influence tumour detection.

Table 3 Influence of tumour location, volume, architecture and Gleason score on tumour detection at mp-MRI (multivariate analysis)

Tables 4 and 5 show, for both readers, the breakdown of tumour detection rates as a function of tumour Gleason score and histological volume, at 1.5 T and 3 T (Table 4) and with and without an endorectal coil (Table 5). Overall detection rates for tumours of <0.5 cc, 0.5–2 cc and >2 cc were 33–45/155 (21–29 %), 15–19/35 (43–54 %) and 8–9/12 (67–75 %) for Gleason ≤6, 17/27 (63 %), 42–45/51 (82–88 %) and 34/35 (97 %) for Gleason 7 and 4/5 (80 %), 13/14 (93 %) and 28/28 (100 %) for Gleason ≥8 cancers respectively.

Table 4 Tumour detection rate at multiparametric MRI as a function of the tumour Gleason score, the tumour histological volume and the imaging field
Table 5 Tumour detection rate at multiparametric MRI as a function of the tumour Gleason score, the tumour histological volume and the type of coils used

Factors influencing false positive findings

FP rates were 42 % (155/369) and 40 % (130/322) for readers 1 and 2 respectively.

The main causes of FP findings included active or chronic prostatitis, post-inflammatory glandular atrophy and high-grade prostate intra-epithelial neoplasia (Table 6). There was at least one retrospective explanation for 111/155 (72 %) and 103/130 (79 %) FPs for readers 1 and 2 respectively. At multivariate analysis, the field strength and coils used did not significantly influence the FP rate for reader 1. For reader 2, the use of combined PPA-endorectal coils was significantly associated (P < 0.003) with a lower rate of FPs.

Table 6 Causes of multiparametric MRI false positive findings

Value of the SSS

The SSS was a significant predictor of the malignant nature of FAs for both readers (P < 0.005, Table 7). Compared with FAs with an SSS of 1/4, FAs with an SSS of 2/4, 3/4 and 4/4 had ORs for being malignant of 1.96 (95% CI, 1–3.9), 5.5 (2.7–11) and 123 (28.9–1,145) for reader 1 and 4.7 (1–43.5), 34 (7.9–322) and 506 (71–7,986) for reader 2.
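The ORs reported here come from mixed models with patient-level random effects and cannot be reproduced from a simple table. Purely to illustrate what an OR and its confidence interval express, the sketch below computes an unadjusted OR with a Wald-type 95 % CI from a 2 × 2 table. The function name `odds_ratio_wald` and the counts in the example are hypothetical and are not taken from Table 7.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald-type confidence interval.

    a: malignant FAs in the category of interest
    b: benign FAs in the category of interest
    c: malignant FAs in the reference category
    d: benign FAs in the reference category
    This ignores intra-patient correlation, unlike a mixed model.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2 x 2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    or_value, lower, upper = odds_ratio_wald(30, 10, 10, 30)
    print(f"OR = {or_value:.2f} (95% CI, {lower:.2f}-{upper:.2f})")
```

When one cell of the table is very small, as is likely for benign FAs scored 4/4, the standard error of log(OR) becomes large, which is consistent with the very wide intervals reported for the highest score.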

Table 7 Percentage of malignant FAs as a function of the subjective suspicion score (SSS)

The SSS was also a significant predictor of tumour aggressiveness among cancers detected (P < 0.00001, Table 8). Compared with malignant FAs with an SSS of 1–2/4, malignant FAs with an SSS of 3/4 and 4/4 had ORs for having a Gleason score ≥7 of 3.74 (95% CI, 1.6–7) and 24.3 (8–52) for reader 1 and 1.5 (0.6–3.3) and 7.8 (2.6–20.6) for reader 2.

Table 8 Percentage of Gleason ≥7 tumours in malignant FAs as a function of the subjective suspicion score (SSS)

Discussion

The purpose of our study was to evaluate the influence of imaging parameters and tumours’ histological characteristics on mp-MRI prostate cancer detection. We found four significant predictors of tumour detection: the tumour location, histological architecture, Gleason score and histological volume.

The fact that TZ tumours were significantly less detected comes as no surprise, as they are reputed to be difficult to distinguish from benign hyperplasia nodules, even if specific T2w image features have been recently described [23–25].

The influence of the Gleason score on prostate cancer detection on mp-MRI has already been assessed in many studies. MR parameters such as tumour-muscle signal intensity ratio on T2w imaging [26], ADC value [27–32] or (choline + creatine)/citrate ratio on MRS [33, 34] have been found to be correlated with tumour Gleason score and/or aggressive behaviour. Girouin et al. [8] reported detection rates for Gleason 6, 7 and ≥8 cancers of 11.8–16.9 %, 57.4–67.2 % and 64.1–73.1 % respectively on T2w imaging and 29.7–33.9 %, 77.5–80.7 % and 96.1–96.7 % on DCE imaging. Two recent studies also found that mp-MRI including T2w, Dw and DCE imaging, with [10] or without [11] MRS, had a significantly better sensitivity for cancers with higher Gleason scores. Our results (Tables 4 and 5) strongly suggest that mp-MRI is particularly sensitive for cancers containing grades 4–5 carcinoma. In this respect, it would have been interesting to assess detection rates of Gleason 3 + 4 and 4 + 3 tumours, which seem to have significantly different prognoses [35, 36]. This analysis was not possible because of the small number of cases involved, but it will become possible as our database grows.

The impact of tumour volume on tumour detection on mp-MRI has seldom been evaluated [13], maybe because the calculation of the histological tumour volume is difficult [37]. Although the tumour area on each whole-mount section can be easily measured, the thickness of histological blocks, and thus the spacing between whole-mounts, is usually not precisely monitored. We paid particular attention to this point by monitoring the distance between whole-mount sections in order to estimate tumour volume as accurately as possible. We did not use any correction factor for tissue shrinkage because these correction factors remain controversial and fall within the wide range of 1.14–1.5 in the existing literature [38, 39]. Tissue shrinkage probably varies from one laboratory to another, and possibly from one prostate to another. Only an estimation of the shrinkage on a patient-by-patient basis could provide precise evaluation of tumour volume. We are currently trying to develop a precise method for MR-pathological co-registration that remains compatible with routine workflow and therefore allows estimation of tissue shrinkage for each individual patient [40]. In the meantime, our results may slightly overestimate mp-MRI performance because tissue shrinkage is not taken into account.

Tumour architecture is a known predictor of detection at mp-MRI [11, 14], with dense tumours better detected, probably because T2w and Dw imaging are particularly sensitive to increased cellularity [15]. It remains unclear, however, whether tumour architecture is associated with tumour aggressiveness or not.

The influence of the imaging field strength and coils on tumour detection remains unclear. The use of high-field strength and/or the use of the endorectal coil provide an excellent signal-to-noise ratio and are associated with improved image quality or depiction of prostate anatomy [17, 41, 42]. Local staging seems improved by the use of the endorectal coil at 1.5 T and at 3 T [17, 41]. The comparison of the staging accuracy at 1.5 T with an endorectal coil and at 3 T with PPA coils showed similar results [43, 44]. As a result, the use of the endorectal coil remains the state of the art at 1.5 T, whereas it is usually considered optional at 3 T [18]. Nevertheless, data on the influence of the use of high-field strength or the endorectal coil on tumour depiction are scarce and limited to T2w imaging [17, 44, 45]. To our knowledge, no randomised study or direct comparison in the same patients has ever assessed the tumour detection obtained with mp-MRI (i.e. including not only T2w but also Dw, DCE imaging and/or MRS) at 1.5 T versus 3 T or with versus without an endorectal coil. In this study, we aggregated data from three different MR systems using different field strengths and coil configurations. Although this induces heterogeneity in the series, it also more closely reflects current routine practice and allows the assessment of the impact of technical imaging factors on tumour detection through data stratification. Our results strongly suggest that field strength and coil configuration have little influence on tumour detection compared with other factors such as histological characteristics. This is in line with the excellent results in tumour detection published at 1.5 T with PPA alone [3, 46]. We believe that this is because tumours that are not detected at current mp-MRI are overlooked not because of a lack of signal but rather because of a lack of contrast with neighbouring normal tissue.

We characterised all visible FAs using a suspicion score based on a five-level scale, as recommended by several European guidelines [18, 20]. This score was a highly significant predictor of malignancy for both readers, and could be a way of addressing the issue of the lack of specificity of mp-MRI. Indeed, 78 % and 76 % of the FPs of readers 1 and 2 respectively had an SSS of 1–2/4 (Table 7). We displayed our results by taking into account all FAs, whatever their SSS. Had we chosen a threshold of 1/4 or 2/4 for diagnosis of malignancy, the FP rate would have been largely improved, but at the expense of a slight decrease in the tumour detection rate (Table 7). However, we do not think that routine mp-MRI results should be binary (benign/malignant), but rather that the SSS assigned to every visible FA should be disclosed to the corresponding urologist in order to guide patient management more precisely. Indeed, we found the SSS to be a significant predictor not only of malignancy but also of tumour aggressiveness (Table 8). In other words, an FA that has a typical appearance of cancer on all sequences (SSS of 4/4) has a high probability not only of being malignant but also of being a Gleason ≥7 tumour. This finding might have important consequences for the management of candidates for repeat biopsy, active surveillance or focal therapy.

Nevertheless, the SSS remains subjective and the good results obtained in a department of uroradiology might not be reproduced in non-specialised institutions. Therefore, suspicion scores based on more objective criteria are needed [47]. The recently published PiRads scoring system [18] might be less subjective, but this remains to be assessed.

Our study has some limitations. Firstly, we included only patients who underwent prostatectomy, which inevitably induced a selection bias. However, this bias was limited by the fact that we took into account all cancer foci in the specimens and not only the index tumour, which was usually the easiest to detect. Secondly, MR-histopathological correlation remains difficult because of the difference in section angle between whole-mounts and MR images. Despite all the care taken to ensure a match between histological cancer foci and MR FAs, mismatches might have occurred. Thirdly, we did not use MRS. Although this technique has yielded interesting results [4, 34, 48], its additional value to mp-MRI remains questionable [49, 50]. Furthermore, its long acquisition time makes it difficult to use in daily practice. As a result, MRS is not routinely performed at our institution.

In conclusion, prostate cancer location, histological volume, Gleason score and histological architecture were independent significant predictors of tumour detection on mp-MRI, whereas the use of a high field strength and/or endorectal coil was not. The use of a five-grade subjective score can significantly stratify the risk of malignancy and aggressiveness of an abnormality seen on mp-MRI. We believe that these findings will be useful in determining the role of mp-MRI before prostate biopsy or in the management of candidates for active surveillance or focal therapy.