Prostate cancer (PCa) is the most commonly diagnosed malignancy of the male genitourinary tract [1]. With the advent of the PSA era, remarkable improvements in its early detection and management have led to a significant increase in the survival rate. On the other hand, the discovery of PCa at its earliest stages has raised questions about the clinical significance of some of these tumors. The balance between the need for early diagnosis and the avoidance of overdiagnosis is a delicate issue, especially considering that the current recommendations for PCa detection have not changed for several years: PSA level testing, digital rectal examination (DRE), and prostate biopsy (PBx) guided by transrectal ultrasound. These methods, however, have limited accuracy for the majority of early cancers [2] and do not help to identify clinically significant PCa.

In recent years, prostatic multiparametric magnetic resonance imaging (mp-MRI) based on T2-weighted imaging has achieved an important role in the detection and staging of PCa [3]. However, mp-MRI accuracy can be hindered by several factors, including tumor size below 5 mm, apical or basal location, prostatitis, calcifications, scars, and hemorrhagic foci. Furthermore, the variability in mp-MRI technical features and in the definitions of PCa clinical significance hampers the comparison of mp-MRI outcomes and their generalization to clinical practice [4]. Even with these limitations in mind, mp-MRI, when properly used, is considered to have promising potential for early PCa detection. With the advent of MRI-targeted biopsies, the biopsy detection rate, usually estimated at around 33% for standard ultrasound-guided extended biopsy protocols, showed a marked improvement, both in men with a previous negative PBx and in biopsy-naive men [5, 6].

At present, the most reliable mp-MRI predictors of PCa are functional sequences including diffusion-weighted imaging (DWI) with related apparent diffusion coefficient (ADC), which highlights cell proliferation, and dynamic contrast-enhanced (DCE) imaging for neo-angiogenesis. DWI-calculated ADC values of detected lesions were found to have a moderate inverse correlation with the Gleason score (GS), which is the single most important prognostic parameter denoting PCa aggressiveness [7–11]. However, no definitive recommendations were given about the role of mp-MRI in the pre-PBx setting.

We tried to shed light on these issues, of great interest to both radiologists and urologists, by exploring in a retrospective study how negative and positive biopsies are related to mp-MRI parameters. Our aim was to identify the parameters that could help clinicians decide whether a PBx is actually necessary and, if so, to target the biopsy at the most significant lesions.

Materials and methods

Study population

The study population was drawn from 525 consecutive patients who underwent prostatic MRI in our facility between January 2011 and December 2014. All patients were suspected of having PCa because of elevated PSA, positive DRE, and/or transrectal ultrasound findings.

We considered only patients whose suspicious lesion was identified by MRI in the peripheral zone and who underwent PBx within 3 months after MRI.

Exclusion criteria were findings of multifocal bilateral neoplastic areas at subsequent PBx (in order to obtain a good correlation between the mp-MRI suspected area and biopsy), and a recent previous PBx (to avoid the risk that post-biopsy hemorrhage might cause artifacts on DWI and on the ADC map). The final number of patients thus amounted to 175.

All patients signed an informed consent form, and the study was conducted in accordance with good clinical practice and the Helsinki Declaration of 1975 and its subsequent revisions.

MRI protocol

MRI was performed on a 1.5 T superconductive magnet system (Achieva version 2.6; Philips Medical Systems, Best, the Netherlands), consistent with the Consensus Conference of the European Association of Urology [12]. We followed the ESUR (European Society of Urogenital Radiology) prostate MR guidelines of 2012 [13], using the cardiac 32-channel surface coil (pelvic phased-array coil, PPA) instead of the endorectal coil (ERC), which has recognized drawbacks in terms of cost and patient acceptability. If tolerated, a 1 mg intravenous injection of hyoscine butylbromide (Buscopan, Boehringer Ingelheim, Ingelheim am Rhein, Germany) was administered to the patient to reduce peristaltic motion.

The ADC map was automatically reconstructed on a pixel-by-pixel basis (b = 0 and 1500 s/mm2). The examination was completed with high-temporal-resolution DCE imaging with a gradient-echo (GRE) axial fast 3D T1-weighted fat-suppressed sequence (started after) and intravenous administration of 0.1 mmol/kg of gadobenic acid (Gd-BOPTA, Multihance®, Bracco Corporates, Italy) at a rate of 3.0 mL/s, followed by a 20-mL saline flush at the same infusion rate. Approximately 40 sets of images were acquired to monitor the time course of the contrast agent uptake and clearance within the prostate.

The main features of the study protocol are summarized in Table 1.

Table 1 mp-MRI protocol of our study

Imaging analysis

Image evaluation was performed using a medical imaging workstation (MR Extended Work Space 2.6.3.2 2009 software, Philips Medical Systems, Eindhoven, The Netherlands).

Two observers (R.F. and G.B., with 5 and 3 years of experience in prostate MR imaging, respectively) reviewed and qualitatively evaluated all images (DWI, T2W, and DCE) in a randomized and blinded way; for DWI images, they also measured the corresponding ADC values both in the suspicious lesion (SL) peripheral area and in the contralateral peripheral zone with no suspicious lesions (NSL). We considered only tumors in the peripheral prostate because ADC is significantly different for tumors occurring in the transitional zone compared with the peripheral zone.

A lesion was considered suspicious when it had (i) clear definition (≥5 mm); (ii) low signal intensity on T2W with restricted diffusion; (iii) hypointensity on the ADC map; and (iv) intense early enhancement and rapid washout on DCE. These features correspond to a lesion rated as most probably malignant (PI-RADS 5) in the PI-RADS classification [14].

Suspect hypointense areas on T2W were searched for on the DWI sequences; when confirmed, a region of interest (ROI) was drawn manually on the ADC map over the center of the suspect zone, excluding the tumor edges. We used the ROI drawn on the T2W dataset rather than on the DWI one because of the significantly better spatial resolution of the former. The suspicious lesion ROI was automatically transferred from the T2W images to DCE-MRI using image-registration software. The area of the ROIs ranged from 5 to 18 mm2 because of the wide range in size of prostate glands and PCas. The information on the contralateral peripheral parenchyma, cancer-free according to the histological report, was obtained by drawing a ROI of 5 to 20 mm2 on the same image as the suspicious lesion.

Based on the brightness of the color-coded value of each parameter, the outcome of the mp-MRI exam was summarized for each patient as follows: T2W = 0 or 1, with 1 standing for a predominantly and distinctly hypointense area compared with the hyperintense glandular tissue of the peripheral zone; DWI = 0 or 1, with 1 standing for a strong focal signal increase. When DWI = 1 and/or T2W = 1, the ADC value and DCE = 0 or 1 were also recorded, with DCE = 1 standing for a curve with immediate washout after peak enhancement [15].

Prostate biopsy protocol

All ultrasound-guided biopsies were performed by experienced urologists. An extended 12-core template was followed, as recommended by European Association of Urology guidelines [16]. Dedicated uro-pathologists were responsible for all histopathologic reports. The Gleason score was attributed according to the 2005 International Society of Urological Pathology (ISUP) consensus [17]; in case of more than one positive core, the Gleason score considered was the highest found.

Statistical analysis

Continuous variables were checked for normality with the Kolmogorov–Smirnov–Lilliefors test, the D’Agostino test, and the Shapiro–Wilk W test. Since no evidence against normality was found, they were reported as mean ± standard deviation. Statistically significant differences between k = 2 independent distributions were investigated with Student’s t test and double-checked with the Mann–Whitney test (age, PSA, and ADC in positive and negative biopsies); for k > 2, ANOVA and the Kruskal–Wallis test were used (ADC for different GS grades). The relation between paired continuous variables was explored with Spearman’s rank correlation coefficient ρ and its p value for the null hypothesis ρ = 0, written p (ρ = 0).
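As an illustration of the rank correlation mentioned above, the following is a minimal sketch in plain Python, not the software actually used in the study. It computes Spearman's ρ as the Pearson correlation of average ranks, which handles the ties that are unavoidable when one variable is the Gleason score. The function names and the sample values are hypothetical and are not study data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical (Gleason score, lesion ADC) pairs, NOT study data.
gleason = [6, 6, 7, 7, 8, 9]
adc_sl = [0.85, 0.80, 0.75, 0.78, 0.65, 0.60]
rho = spearman_rho(gleason, adc_sl)  # negative: ADC falls as GS rises
```

The p value for ρ = 0 is deliberately left out of the sketch, since it requires a t or permutation distribution that the standard library does not provide.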

Binary variables, reported as counts and percentages, were arranged in 2 × 2 tables studied with the Chi-square test with Yates’ correction and Fisher exact test, estimating also the odds ratio (OR) with its 95% confidence interval (CI).
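The odds ratio and its confidence interval can be sketched as follows. This is only an illustrative computation: the Woolf (logit) interval is an assumption, since the text does not state which interval method the statistical packages applied, and the function name is hypothetical. The counts are those given later in the Results for the ADCSL cut-off of 0.90 × 10−3 mm2/s, so the output may differ from the values in Table 4.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (logit) 95% CI for the 2 x 2 table [[a, b], [c, d]].

    Assumed layout: rows = test result, columns = biopsy outcome.
    """
    if min(a, b, c, d) == 0:
        # The logit interval is undefined when any cell is empty.
        raise ValueError("Woolf interval needs all four cells > 0")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts taken from the Results text:
#                 positive biopsy   negative biopsy
# ADC_SL <= 0.90        65                27
# ADC_SL >  0.90         6                34
or_value, ci_low, ci_high = odds_ratio_ci(65, 27, 6, 34)
significant = ci_low > 1 or ci_high < 1  # 95% CI not including 1
```

A table with an empty cell, such as DWI = 0 and T2W = 0 with no positive biopsies, cannot be handled by this interval without a continuity correction, which is why the sketch raises an error in that case.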

Statistical significance was set at two-tailed p < 0.05 and OR 95% CIs not including 1.

Inter-reader agreement was assessed in different ways for the qualitative and the quantitative analysis. For the former we used Cohen’s kappa coefficient (<0.20, poor agreement; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, good; 0.81–1.00, excellent). Agreement on ADC readings was determined with Student’s t test for two correlated distributions and the non-parametric Wilcoxon test: p > 0.05 implies the absence of statistically significant differences between the two readers, and the closer p is to 1, the better the agreement (p = 1 meaning no differences, that is, perfect agreement on the ADC value).
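For the qualitative part, Cohen's kappa can be computed directly from the two readers' scores. The sketch below is a minimal plain-Python version under the assumption that each reader supplies one 0/1 score per patient; the function names and the example ratings are hypothetical and are not study data. The labels follow the agreement scale quoted above.

```python
def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(reader_a)
    categories = set(reader_a) | set(reader_b)
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance from each reader's marginal frequencies.
    expected = sum(
        (reader_a.count(c) / n) * (reader_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Map kappa to the scale used in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "excellent"


# Hypothetical DWI scores (0 or 1) from two readers, NOT study data.
reader_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
reader_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohen_kappa(reader_1, reader_2)
```

In this toy example the readers agree on 9 of 10 cases while chance agreement is 0.5, so kappa is well below the raw agreement rate.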

ADC discrimination ability was assessed by the receiver operating characteristic (ROC) curve, and the quality of discrimination was measured by the area under the curve (AUC), 0.5 being null (chance) and then increasing from 0.6 (poor) to 1 (excellent). The disease/no-disease threshold was set at the value that maximized the harmonic mean (HM) of sensitivity and specificity, Youden’s index J, Cohen’s kappa, and diagnostic accuracy, and that also minimized the distance D of the curve from the upper left vertex (0,1).
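The threshold criteria listed above can be sketched in a few lines. The following plain-Python example is illustrative only: it assumes that a lesion is called test-positive when its ADC is at or below the candidate cut-off, it covers J, HM, and D but not kappa or accuracy, and it uses hypothetical ADC values rather than study data. All function and key names are assumptions.

```python
import math


def roc_table(adc, positive):
    """Sensitivity, specificity, and summary indices for every candidate cut-off.

    A case is test-positive when ADC <= cut-off (lower ADC suggests tumor).
    """
    n_pos = sum(positive)
    n_neg = len(positive) - n_pos
    rows = []
    for t in sorted(set(adc)):
        tp = sum(1 for v, p in zip(adc, positive) if p and v <= t)
        tn = sum(1 for v, p in zip(adc, positive) if not p and v > t)
        se, sp = tp / n_pos, tn / n_neg
        rows.append({
            "threshold": t,
            "sensitivity": se,
            "specificity": sp,
            "youden_j": se + sp - 1,
            "harmonic_mean": 2 * se * sp / (se + sp) if se + sp else 0.0,
            "distance": math.hypot(1 - se, 1 - sp),  # distance from (0, 1)
        })
    return rows


def auc(adc, positive):
    """AUC as the probability that a positive case has lower ADC than a negative one."""
    pos = [v for v, p in zip(adc, positive) if p]
    neg = [v for v, p in zip(adc, positive) if not p]
    wins = sum(1.0 if a < b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ADC values (x 10^-3 mm2/s) and biopsy outcomes, NOT study data.
adc_values = [0.60, 0.70, 0.75, 0.85, 0.95, 0.88, 0.92, 1.00, 1.10, 1.20]
biopsy_pos = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

table = roc_table(adc_values, biopsy_pos)
best_j = max(table, key=lambda r: r["youden_j"])
best_hm = max(table, key=lambda r: r["harmonic_mean"])
best_d = min(table, key=lambda r: r["distance"])
area = auc(adc_values, biopsy_pos)
```

In this toy dataset the three criteria select the same cut-off; on real data they can disagree, which is presumably why several indices were inspected together.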

Open source software (www.openepi.com and www.vassarstats.net) and StatPlus:mac version v6 (AnalystSoft, Walnut, CA, US) were used along with Excel spreadsheets. All statistical procedures were run on at least two different packages.

Results

The consensus on the imaging analysis between the two readers was very good both in the qualitative analysis (Cohen’s concordance coefficient k = 0.94 ± 0.2) and in the measured ADC values (p = 0.89).

Each patient was described by the parameters of his mp-MR examination (T2W, DWI, ADC values, and DCE) and the outcome of his biopsy (negative/positive; in the latter case with the reported GS). Table 2 reports the descriptive parameters of patients. Patients negative at PBx were subsequently followed up for 19.4 ± 1.7 months with regular clinical visits including PSA measurement and DRE.

Table 2 Descriptive parameters of patients included in the study

DWI = 0 and T2W = 0 characterized 43/104 (41%) patients with negative biopsies and no patients with positive biopsies (p < 0.0001).

DWI = 1 characterized 61/104 (59%) patients with negative biopsies and all 71 (100%) patients with positive biopsies. These 132 patients constituted the sample considered for ADC and DCE analysis.

Table 3 reports the average ADC values measured in the SL and NSL peripheral areas. The former were significantly lower for patients with positive biopsies than for patients with negative biopsies, whereas the latter were quite similar; in both cases, ADCSL was significantly lower than ADCNSL. The computation of the ratio ADCnorm = ADCSL/ADCNSL had the double purpose of evening out biological differences within patients and allowing comparisons with other studies.

Table 3 Patients with DWI = 1 (N = 132): apparent diffusion coefficient (×10−3 mm2/s) in negative and positive biopsies

The ROC curve procedure assessed the ability of ADCSL to discriminate between suspicious lesions confirmed as tumors and those negative at biopsy. The top panel of Fig. 1 shows the ROC curves relative to ADCSL and ADCnorm, which share the value of AUC = 0.80 (95% CI 0.70–0.90). The bottom panel of Fig. 1 shows that both ROC curve parameters HM and k have their maximum at ADCSL = 0.90 × 10−3 mm2/s, which is thus identified as the threshold above which there is the highest degree of association with negative biopsies. The corresponding diagnostic parameters are 60% sensitivity, 92% specificity, and 75% diagnostic accuracy.

Fig. 1 ROC procedure to estimate the ADC ability to discriminate between negative and positive biopsies. Top panel: ROC curves for ADCSL and ADCnorm (AUC = 0.80). Bottom panel: determination of the threshold (ADCSL = 0.9 × 10−3 mm2/s)

The domain characterized by ADCSL > 0.90 × 10−3 mm2/s includes 40 patients, 34 (85%) with negative biopsies and 6 (15%) with positive biopsies, 4 of them GS6, and 2 GS 7(3 + 4). The complementary domain ADCSL ≤ 0.90 × 10−3 mm2/s includes 92 patients, 65 (71%) with positive biopsies and 27 (29%) with negative biopsies (p < 0.0001).
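The percentages in the paragraph above follow directly from the four counts. The short sketch below simply redoes that arithmetic; the variable and function names are hypothetical, and the counts are the ones quoted in the text.

```python
# Counts quoted in the text for the ADC_SL cut-off of 0.90 x 10^-3 mm2/s.
above = {"negative": 34, "positive": 6}   # ADC_SL > 0.90
below = {"negative": 27, "positive": 65}  # ADC_SL <= 0.90


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100 * part / whole


n_above = sum(above.values())
n_below = sum(below.values())
total = n_above + n_below

# Share of negative biopsies among lesions above the cut-off.
negative_share_above = percent(above["negative"], n_above)
# Share of positive biopsies among lesions at or below the cut-off.
positive_share_below = percent(below["positive"], n_below)
# Cases correctly sorted by the cut-off, over all cases.
accuracy = percent(above["negative"] + below["positive"], total)
```

With these counts the two shares round to 85% and 71%, and the accuracy to 75%, matching the figures reported in the text.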

The incidences of the mp-MRI parameters in negative and positive biopsies are reported and compared in Table 4. Their relative OR 95% CIs are plotted in the forest plot of Fig. 2. No 95% CI includes 1, evidencing a significant association either with negative biopsies or with positive biopsies; the farther the OR 95% CI lies from 1, the stronger the association.

Table 4 Incidence of mp-MRI parameters

Fig. 2 Forest plot with mp-MRI predictors of negative or positive biopsies. The intervals represent the odds ratio 95% CIs, on a logarithmic scale. OR intervals <1: significant association with negative biopsies; OR intervals >1: significant association with positive biopsies

A similar procedure identifies 0.6 as the corresponding threshold for ADCnorm.

It is well known that both the average ADCSL and ADCnorm in positive biopsies are influenced by the incidence of the different Gleason scores. In our study, GS ranged from 6(3 + 3) to 9(5 + 4), as detailed in Tables 1 and 5. Figure 3 shows that ADCSL has a significant moderate inverse correlation with GS: Spearman’s rank correlation coefficient is ρ = −0.40, with p (ρ = 0) = 0.001, consistent with the statistically significant decrease of ADCSL and ADCnorm with increasing GS (p = 0.014).

Table 5 Patients with positive biopsies (N = 71): mp-MRI parameters for different GS

Fig. 3 ADCSL plotted as a function of the Gleason score (Spearman’s ρ = −0.40)

The previous results are confirmed by the ROC curve procedure: AUC = 0.72 for the ability of ADCSL and ADCnorm to identify GS > 6 levels, with thresholds ADCSL ≤ 0.70 × 10−3 mm2/s and ADCnorm ≤ 0.50.
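To make the two ADCnorm cut-offs concrete, the sketch below computes the normalized ratio and sorts a value into the bands delimited by 0.60 and 0.50. It is a descriptive illustration of the reported thresholds only, not a validated decision rule; the function names and band labels are hypothetical. The example values are the ADCnorm figures quoted in the Fig. 4 caption.

```python
def adc_norm(adc_sl, adc_nsl):
    """Ratio of lesion ADC to contralateral non-suspicious ADC (same units)."""
    if adc_nsl <= 0:
        raise ValueError("ADC_NSL must be positive")
    return adc_sl / adc_nsl


def adc_norm_band(value, benign_cut=0.60, high_grade_cut=0.50):
    """Place an ADCnorm value relative to the two reported cut-offs."""
    if value > benign_cut:
        return "above 0.60"
    if value <= high_grade_cut:
        return "at or below 0.50"
    return "between 0.50 and 0.60"


# ADCnorm values quoted in the Fig. 4 caption for cases B, C, and D.
fig4_cases = {"B (GS 6)": 0.54, "C (GS 7)": 0.48, "D (GS 9)": 0.38}
bands = {case: adc_norm_band(value) for case, value in fig4_cases.items()}
```

For these three cases the GS 6 lesion falls between the two cut-offs, while the GS 7 and GS 9 lesions fall at or below 0.50.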

Figure 4 shows the sequences relative to one case with negative mp-MRI and three cases with positive mp-MRI and positive biopsies, corresponding to three GS values.

Fig. 4 mp-MRI sequences of four typical cases (T2W, DWI, and ADC map): A patient with negative MRI diagnosis (ADCNSL = 1.61 × 10−3 mm2/s); B patient with positive mp-MRI (left posterolateral base nodule) and positive biopsy with GS = 6 (ADCSL = 0.85 × 10−3 mm2/s; ADCnorm = 0.54); C patient with positive mp-MRI (left posterolateral paramedian focal lesion) and positive biopsy with GS = 7(3 + 4) (ADCSL = 0.75 × 10−3 mm2/s; ADCnorm = 0.48); and D patient with positive mp-MRI (left posterolateral middle wide lesion) and positive biopsy with GS = 9(5 + 4) (ADCSL = 0.60 × 10−3 mm2/s; ADCnorm = 0.38)

Discussion

Our study showed that DWI = 0 and T2W = 0 were associated with negative biopsies in all patients with these findings. DWI evaluates differences in water molecule diffusion both qualitatively, by visual assessment of relative signal intensity, and quantitatively, by calculating the ADC. Because prostate cancer contains many tightly packed glandular elements with little central space for mucin or fluid storage, ADC values, which reflect the restriction of water displacement, are reported to be significantly lower than in normal prostates [11]. Although ADC reflects both capillary perfusion and diffusion characteristics, the use of a large b value can reduce the influence of capillary perfusion. With a high b value, the diffusion properties of water protons, together with increased signal-to-noise ratio (SNR), stronger diffusion weighting, and diminished T2 shine-through, may reflect the cellular density and structural changes in prostate cancer more strongly than with a standard b value. On the other hand, a b value >2000 s/mm2 induces a decrease in SNR and an increase in susceptibility artifacts and image distortion [18].

The finding of NPV = 100%, with a 95% CI of 0–5% for the probability of a positive biopsy, is in agreement with recent works [19–21]. If confirmed in larger datasets, this result would be an important achievement because, to date, a PBx is advised in the presence of clinical suspicion of PCa irrespective of MRI findings. However, this association should be tested in more extreme situations, such as patients with very high PSA levels or a frankly positive DRE, to verify whether it is still safe to avoid a PBx in these cases as well.

As for ADC, in the presence of signal increment, DWI-calculated ADCSL values in suspicious areas were significantly different between positive and negative PBx, with no overlap of the respective 95% CIs: 0.87–0.95 × 10−3 mm2/s for negative biopsies and 0.68–0.76 × 10−3 mm2/s for positive biopsies. In contrast, in the NSL zones, the ADC values were similar for negative and positive biopsies, with an average value of 1.56 ± 0.21 × 10−3 mm2/s, consistent with other studies with similar b values: 1.48 ± 0.29 for Woodfield [22], 1.60 ± 0.25 for Hambrock [23], and 1.79 ± 0.28 for Tamada with b = 0–1000 and 1.32 ± 0.24 with b = 0–2000 [18].

The ROC curve procedure allowed us to associate the region ADCSL > 0.9 × 10−3 mm2/s with an 85% probability of a negative biopsy, presumably with a scant probability of histological presence of cancer. For ADCnorm = ADCSL/ADCNSL, the corresponding threshold was 0.60. To test its general applicability, we considered results obtained in other studies. Again, the wide variability in mp-MRI technical features (field strength, coils, highest b values) and clinical factors (GS distribution, cancer and cancer-free zones considered) allowed only rough comparisons [4]. Recent studies with technical characteristics not too different from ours reported ratios of ADC between cancerous and non-cancerous regions of ≈0.54 for Tanimoto [24], ≈0.50 for Woodfield [22], ≈0.53–0.55 for Tamada [18], and ≈0.49 for Boesen [25]. Despite the heterogeneity of these studies, all these ratios for cancerous regions fall below 0.60, consistent with the threshold ADCnorm > 0.60 as an indicator of benignity.

In patients where PCa was found at biopsy, our study confirmed the existence of a moderate inverse correlation between ADCSL and GS, measured by ρ = −0.40, which lies between the values of −0.50 by Tamada [18] and −0.42 by Boesen [25] on one side and −0.38 by Thörmer [26] and −0.37 by Oto [3] on the other. Also, a significant ADC decrease was detected from GS = 6 to GS ≥ 8.

The ADCnorm ≤ 0.50 threshold for GS > 6 suggested by the ROC curve accounts for the data by Woodfield [22] (≈0.45), Boesen [25] (≈0.46), Thörmer [26] (≈0.46), and Tamada [18] (≈0.49–0.51). It is important to note that all these studies refer to GS measured after prostatectomy.

DCE = 0 was a significant predictor of negative biopsies, although to a lesser degree than DWI = 0 and ADC > 0.90 × 10−3 mm2/s. In positive biopsies, DCE = 1 did not contribute significantly to the GS stratification (p = 0.40).

Our study is affected by the limitations deriving from its retrospective nature.

Needle biopsies are known to underestimate GS in approximately 25% of the cases compared with prostatectomy specimens because of biopsy sampling errors and tumor heterogeneity [3]. Furthermore, in our series, biopsies were performed with visual registration, the so-called “cognitive” approach, because at the time of study enrollment an mp-MRI/ultrasound fusion device was not available at our institution. Therefore, we have to consider a non-negligible percentage of human error typical of the cognitive approach. As for standard extended PBx, a false-negative rate of 20% has to be taken into account, considering that, even in positive cases, some cancer foci may be missed by the biopsy.

All that considered, our results suggest that mp-MRI has the potential for becoming a first-line examination in patients with clinical suspicion of peripheral PCa. Negative MRI findings, including DWI = 0, T2W = 0, and ADCnorm values >0.60, might spare unnecessary PBx, at least in patients not at very high clinical PCa risk. On the other hand, a strong MRI suspicion should not be ignored, particularly if ADCnorm falls below 0.5, prompting a targeted biopsy if possible.