Introduction

Prostate cancer is the most common cancer and the second most common cause of cancer deaths among men in Western countries [1]. There are numerous therapeutic options including active surveillance, radical prostatectomy, several kinds of radiation therapy, hormonal therapy or combined approaches [2]. Optimising treatment strategies requires a careful establishment of an individual prognosis to avoid unnecessary therapy-induced morbidity or treatment failure. Fundamental to this effort is an ability to achieve a reasonable degree of accuracy for preoperative staging.

Magnetic resonance imaging (MRI) is able to clearly demonstrate zonal anatomy and adjacent anatomical structures on T2-weighted (T2W) images [3]. Overall staging of prostate cancer with T2W MRI at 1.5 T has shown a large range of sensitivities (51–89 %) and specificities (67–87 %) [4–14]. The reported ranges of sensitivity and specificity for detection of extracapsular extension (ECE) are 23–91 % and 84–97 %, respectively [4, 9, 12, 15, 16]. This wide performance range is a negative factor for the broad acceptance of MRI as a routine component of the clinical staging of prostate cancer and is clearly influenced by the experience of the reader [17–19].

Dynamic contrast-enhanced (DCE) MRI frequently uses a model-based post-processing approach to analyse images and generate parameters relating to microvascular characteristics of prostate tissue [20–24]. Results of such processed DCE MRI have correlated with markers of tumour angiogenesis [25–27]. Hence, the additional information gained from DCE MRI has been applied to help differentiate benign from malignant tissue, including within the prostate [28–31]. Most of the studies regarding DCE MRI of the prostate emphasise high temporal resolution at the expense of spatial resolution, typically using thick slices or incomplete coverage of the gland. In so doing, critical features needed for staging may not be visualised, potentially sacrificing diagnostic accuracy [11, 22, 32–35]. The incremental value of high spatial resolution DCE MRI for staging has been demonstrated at 1.5 T [4].

With 3-T MRI using an endorectal coil (ERC), images of the prostate gland can be obtained with higher signal-to-noise ratio and higher spatial resolution than had been previously achieved at 1.5 T [36, 37]. Initial reports of ERC 3-T MRI of the prostate in volunteers or small patient cohorts in a research setting have been promising [31, 36, 38–42]. Staging accuracies of up to 92 % were achieved with experienced readers in small pilot, technique-developing and retrospective, non-blinded studies [39–41, 43]. However, the accuracy of ERC 3-T MRI for determining the presence of ECE before prostatectomy across different levels of reader experience needs to be defined in a sizable cohort.

The purpose of our study was to determine the value of combined high spatial resolution DCE and T2W ERC 3-T MRI in a clinical setting for the assessment of ECE and local staging in prostate cancer patients, using the histopathology of radical prostatectomy as the reference standard.

Materials and methods

Subjects

Subjects provided informed consent and were prospectively enrolled in this study, which was approved by the institutional review board and compliant with the Health Insurance Portability and Accountability Act. Patients were eligible if they had undergone ERC 3-T MRI of the prostate and subsequent prostatectomy for treatment of prostate cancer within 3 months of the MRI. Between January 2004 and May 2007, 108 patients fulfilling these entry criteria were identified. The demographics of the patients were as follows: mean age, 58.5 years (range 47–72); mean total PSA, 10.6 ng/mL (range 2–117); median Gleason score, 7 (range 6–9). The distribution of Gleason scores was as follows: Gleason 3 + 3, n = 41 (38 %); Gleason 3 + 4, n = 45 (42 %); Gleason 4 + 3, n = 20 (19 %); Gleason 4 + 4, n = 1 (1 %); Gleason 4 + 5, n = 1 (1 %). The patients were primarily referred for evaluation of biopsy-proven cancer (n = 92), or for a high PSA and a negative biopsy history with subsequent positive biopsies that warranted radical prostatectomy (n = 16).

MRI technique

All patients underwent a standard MR protocol that was routine at our institution using a 3-T MRI system (Genesis Signa LX Excite, GE Healthcare, Waukesha, WI, USA) with a gradient strength of 30 mT/m and slew rate of 120 mT/m/s. A minimum of 3 weeks was required between the date of the MRI and the previous biopsy to reduce the influence of post-biopsy changes on diagnostic accuracy [44]. Patients were asked to refrain from ejaculation for 3 days preceding the examination to maximise distension of the seminal vesicles [45]. A sodium phosphate enema was administered within 2 h of the study in order to minimise faecal residue in the rectum. A 1-mg glucagon intramuscular injection was administered to reduce peristaltic motion. The ERC (MRInnervu, Medrad, Pittsburgh, PA, USA) was inserted into the rectum and connected to a pelvic eight-channel phased-array coil (GE Healthcare, Waukesha, WI, USA) with a coupling device to combine the surface phased-array coil with the ERC. The coil position was verified with sagittal T2W localiser images and adjusted, if needed, so that the coil was optimally situated with respect to the prostate gland. In the first 59 patients the ERC balloon was inflated with 80 mL of air, or less depending on individual tolerance; in the latter 49 patients the ERC balloon was filled with 80 mL of barium to minimise susceptibility artefacts from the air–tissue interface [46].

Imaging parameters

Transverse and coronal fast spin-echo T2W images were obtained from below the prostatic apex to above the seminal vesicles using the following parameters: repetition time/echo time (effective) 4,500–7,600/165 ms, 2.0- to 2.6-mm section thickness and no intersection gap, 4 averages, 14-cm field of view, 320 × 192 matrix, and no phase wrap. The true maximum voxel size was 0.44 mm × 0.73 mm × 2.6 mm = 0.82 mm³. The frequency encoding direction was anterior–posterior. Images were acquired before, during and after contrast medium injection. DCE images were obtained after bolus injection of gadopentetate dimeglumine (Magnevist, Berlex Laboratories, Wayne, NJ, USA) at a dose of 0.1 mmol/kg of body weight administered with a mechanical injection system (Spectris, Medrad, Pittsburgh, PA, USA) at a flow rate of 4 mL/s. The imaging parameters of the 3D GRE sequence were repetition time/echo time 7.1/2.1 ms, flip angle 18°, 14-cm field of view, 256 × 224 matrix, section thickness 2.0–2.8 mm, and no phase wrap, which yielded a temporal resolution of 91–95 s. Fat suppression was not used. Two pre-contrast and five post-contrast sequential acquisitions were obtained. The true maximum voxel size was 0.55 mm × 0.63 mm × 2.8 mm = 0.96 mm³.
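The voxel dimensions quoted above follow from dividing the field of view by the acquisition matrix and multiplying by the section thickness. The short Python sketch below reproduces that arithmetic. It assumes a square 14-cm field of view in both in-plane directions (the text gives a single field-of-view value), and the function name `voxel_size_mm` is illustrative only.

```python
def voxel_size_mm(fov_mm, matrix, thickness_mm):
    """Return (dx, dy, dz, volume) in mm and mm^3.

    Assumes the same field of view in both in-plane directions,
    which is an assumption made for this sketch.
    """
    n_freq, n_phase = matrix
    dx = fov_mm / n_freq   # in-plane size along the frequency-encoding direction
    dy = fov_mm / n_phase  # in-plane size along the phase-encoding direction
    return dx, dy, thickness_mm, dx * dy * thickness_mm


# Maximum voxel sizes, using the thickest sections reported for each sequence
t2w = voxel_size_mm(140.0, (320, 192), 2.6)
dce = voxel_size_mm(140.0, (256, 224), 2.8)

for name, (dx, dy, dz, vol) in (("T2W", t2w), ("DCE", dce)):
    print(f"{name}: {dx:.2f} x {dy:.2f} x {dz:.1f} mm, volume {vol:.2f} mm^3")
```

The computed in-plane dimensions round to the values given in the text, and the volumes come out close to the quoted figures, with any small difference attributable to rounding.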

The first pre-contrast acquisition was used to ensure relevant anatomical coverage; the second was used as part of a continuous series of pre- and post-acquisitions in which the instrument settings (gain and attenuation values) were identical. Contrast medium injection was initiated during the last 5–7 s of the second pre-contrast acquisition.

Dynamic CE images were processed at pixel resolution using a three-time-point (3TP, non-commercial software) model to analyse the time evolution of contrast enhancement. This reflects microvascular permeability and extracellular volume fraction in each pixel that are translated into a colourised parametric map (Fig. 1).

Fig. 1 Large bilateral tumour (T) predominantly involving the right posterolateral peripheral zone shown as low signal infiltrative lesion on the axial T2-weighted fast spin echo image (a) and in red in the anatomically corresponding colour-coded DCE map (b). The colour red indicates high microvascular permeability and low to medium extracellular volume fraction, parameters suggestive of malignancy. Note the well-defined capsule (arrows in a) and the adjacent neurovascular bundle (dashed circle) without signs of involvement. MRI-based T2c stage was confirmed by histopathological assessment (not shown)

Imaging diagnostic criteria

The following criteria suggesting ECE on MRI are widely accepted in the literature and used in routine clinical practice: neurovascular bundle asymmetry, tumour envelopment of the neurovascular bundle, angulated contour of the prostate gland (Fig. 2), irregular/spiculated margin, or obliteration of the rectoprostatic angle. Seminal vesicle invasion was indicated when focal low signal intensity of the seminal vesicle, irregular or nodular thickening of the vesicle wall, enlargement with a low-signal-intensity mass, direct tumour extension from the base to the undersurface of the seminal vesicle or expanded low-signal-intensity ejaculatory duct with low-signal-intensity seminal vesicle was seen [47]. Absence of these findings suggested organ-confined disease (Fig. 3).

Fig. 2 Multifocal bilateral tumours (T and t) shown on axial T2-weighted fast spin echo image (a) and early (“wash-in phase”) post-contrast axial T1-weighted gradient echo (b). Note that the smaller right-sided tumour (t) is better appreciated on T2-weighted images (a), whereas the dominant left-sided tumour (T) with extracapsular extension expressed by bulging (arrow) and irregular contours of the capsule is better depicted by the DCE images (b), reinforcing the complementary role of both pulse sequences. T3a stage suggested by MRI was confirmed by histopathology (areas delineated by the blue dots in c)

Fig. 3 Unilateral organ-confined tumour (T) in the left posterolateral peripheral zone shown as diffuse low signal infiltrative lesion on axial T2-weighted fast spin echo image (a) with “rapid wash-in” on the early post-contrast axial T1-weighted gradient echo (b), T2b stage confirmed by histopathology (area delineated by the blue dots in c)

Dynamic CE images were interpreted on the basis of the source images (suspicious areas defined as those with increased relative peak enhancement and subsequent decrease in signal intensity—“rapid wash-in and wash-out pattern” [32]) and a colour-coded map highlighting high microvascular permeability and low to medium extracellular volume fraction [4, 48].
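As an illustration of the “rapid wash-in and wash-out pattern” criterion applied to the source images, the Python sketch below classifies a single signal-intensity curve. It is not the 3TP software used in the study. The function name `enhancement_pattern` and both thresholds (`min_peak_gain`, `min_washout`) are hypothetical values chosen only to make the example concrete, and the example curves are invented.

```python
def enhancement_pattern(signal, min_peak_gain=0.5, min_washout=0.1):
    """Classify one signal-intensity curve from a dynamic series.

    signal: [pre-contrast, post 1, ..., post N], all acquired with
            identical gain and attenuation settings.
    The two thresholds are hypothetical, not values from the study.
    Returns (relative peak enhancement, relative wash-out, suspicious).
    """
    baseline = signal[0]
    post = signal[1:]
    peak = max(post)
    # Relative peak enhancement: rise from baseline to the highest post-contrast value
    peak_gain = (peak - baseline) / baseline
    # Relative wash-out: fall from the peak to the final acquisition
    washout = (peak - post[-1]) / peak
    suspicious = peak_gain >= min_peak_gain and washout >= min_washout
    return peak_gain, washout, suspicious


# Invented example curves (one pre-contrast and five post-contrast values)
rapid_in_and_out = enhancement_pattern([100, 220, 200, 180, 170, 160])
persistent_rise = enhancement_pattern([100, 110, 120, 130, 140, 150])
print(rapid_in_and_out)
print(persistent_rise)
```

In practice such a rule would be applied pixel by pixel, and the readers in this study combined the source images with the colour-coded map rather than relying on fixed cut-offs.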

Histopathological evaluation

Whole-mount histopathological preparation of the excised prostate gland was performed in 47 patients (44 %). The excised gland was fixed in 10 % buffered formaldehyde, embedded in paraffin, and sectioned in 3- to 4-mm consecutive intervals in planes that closely paralleled the MR images. Haematoxylin–eosin staining was performed on tissue slices of 5 μm for standard histological analyses.

In the remaining 61 patients, routine non-whole-mount sections of the prostate gland underwent histopathological analysis, with the tissue preparation otherwise identical to that of the whole-mount processing. All of the specimens were prospectively classified according to stage by one pathologist with 12 years of experience in prostate evaluations, who was unaware of the MRI findings, using the 2002 TNM staging classification proposed by the American Joint Committee on Cancer (AJCC) [49].

After completion of the initial MRI–histopathology correlation analysis, the pathologist reanalysed 6 patients classified as having T3 disease at histopathology but designated as T2 disease on MRI. The pathologist quantified the distance of ECE and designated cases as having “minimal” T3 disease when the tumour extended no further than 1 mm from the prostate gland/periprostatic soft tissue border, determined with a metric ruler. The 1-mm cut-off point was selected based on the expected lower limits of detection inferred from the in-plane spatial resolution employed in our protocol. The overall pathology report and histopathological TNM stage were reviewed and compared with the corresponding MRI results and MRI-based TNM stages, as previously described.

MRI–histopathology comparisons

The MRI-based stage indicating presence or absence of ECE and/or seminal vesicle invasion in the pre-surgical imaging was ascribed by radiologists blinded to the histopathology results during the initial study interpretation. Conversely, prostatectomy histological findings were reviewed and entered by a pathologist blinded to the MRI results. Both data sets were assessed regarding disease location and stage to provide the diagnostic performance discussed herein. Among the 108 cases, MRI stage was not prospectively assigned in the original report in 20 patients. These 20 original reports were reviewed by radiologists blinded to histological findings and an MRI-based stage was retrospectively generated. In 4 of these 20 reports, an MRI-based stage could not be determined owing to ambiguity in the original clinical MRI report. The images from these 4 cases were then reviewed by another radiologist blinded to histological findings and an MRI stage was assigned.

The name of the attending radiologist on the MRI report was noted for each patient. Six different radiologists read prostate MRI during the course of the 108 cases based on clinical rotation assignments. The six radiologists were divided into two groups according to their experience of prostate MRI. Two readers with 20 years and 15 years of experience in MRI of the prostate were designated the “experienced readers” group and, in total, interpreted 56 cases of this series. Four other readers, all with less than 3 years’ experience and between 20 and 50 cases of experience interpreting MRI of the prostate, were designated the “less experienced readers” group and, in total, interpreted 52 cases of this series.

Statistical analysis

The analysis for this study consisted of two parts: diagnostic accuracy analysis based on staging data, and diagnostic accuracy analysis based on ECE data.

For the analysis of diagnostic accuracy using staging data, three different stages (T2, T3a, T3b) were used to calculate the area under the receiver-operating characteristic curve (AUC) using the Wilcoxon–Mann–Whitney index of diagnostic accuracy [50, 51]. The 95 % confidence interval was also provided. The kappa statistic was calculated to examine the concordance between the assessment of the two reader groups (experienced and less experienced) and that of the histopathology data. The staging analysis was done separately for each of the two reader groups in order to assess for variability between the groups. In this analysis, the upper confidence limit was truncated at 1.0 if it exceeded 1.0.
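The Wilcoxon–Mann–Whitney index can be read as the probability that a randomly chosen diseased case receives a higher rating than a randomly chosen non-diseased case, with ties counted as one half. The Python sketch below illustrates that calculation under stated assumptions: the MRI stages are coded ordinally as T2 = 0, T3a = 1 and T3b = 2, the reference standard is dichotomised (pT3 versus pT2), and the ratings shown are invented for illustration rather than taken from the study. The analysis in this study was carried out in SAS, so this is a sketch of the principle, not the authors' code.

```python
def mann_whitney_auc(ratings_pos, ratings_neg):
    """AUC as the Wilcoxon-Mann-Whitney index.

    ratings_pos: ordinal ratings for cases positive on the reference standard
    ratings_neg: ordinal ratings for cases negative on the reference standard
    Each positive/negative pair scores 1 if the positive case is rated
    higher, 0.5 if the ratings tie, and 0 otherwise.
    """
    score = 0.0
    for p in ratings_pos:
        for n in ratings_neg:
            if p > n:
                score += 1.0
            elif p == n:
                score += 0.5
    return score / (len(ratings_pos) * len(ratings_neg))


# Invented ratings (0 = T2, 1 = T3a, 2 = T3b on MRI), for illustration only
mri_for_pt3 = [1, 2, 0, 1]  # cases with pT3 disease at histopathology
mri_for_pt2 = [0, 0, 1, 0]  # cases with pT2 disease at histopathology

auc = mann_whitney_auc(mri_for_pt3, mri_for_pt2)
print(auc)  # 0.78125
```

The pairwise form is quadratic in the number of cases, which is not a concern for a cohort of this size.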

The ECE analysis was carried out by first grouping the histopathology stages into two categories (ECE = pT3a, and seminal vesicle invasion = pT3b, versus pT2). As the number of cases with pT3b (n = 6) was too small for a meaningful statistical analysis, ECE was defined as having a histopathology stage of pT3a or pT3b. As ECE is binary, the diagnostic accuracy measures used were sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). These estimates along with the corresponding 95 % confidence intervals were calculated on the basis of the algorithm provided by Zhou et al. [52].

The confidence intervals were computed using the score confidence interval [53, 54] as most of these estimates were close to 1, and the upper bound of the confidence interval could exceed 1. The calculation of the PPV and NPV was done using the observed sample prevalence estimate.
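The binary accuracy measures and a score-type interval can be sketched as follows. The Python example assumes that the score confidence interval referred to above is the Wilson score interval, which stays within [0, 1] even when the estimate is close to 1. PPV and NPV are taken directly from the 2 × 2 table, which corresponds to using the observed sample prevalence. The counts in the example are hypothetical and are not the study data.

```python
import math


def wilson_interval(successes, total, z=1.959964):
    """Wilson score interval for a proportion (z = 1.959964 gives 95 %)."""
    p = successes / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return centre - half, centre + half


def diagnostic_measures(tp, fp, fn, tn):
    """Return {measure: (estimate, (lower, upper))} from 2 x 2 table counts."""
    counts = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "PPV": (tp, tp + fp),  # uses the observed sample prevalence
        "NPV": (tn, tn + fn),  # uses the observed sample prevalence
    }
    return {name: (k / n, wilson_interval(k, n)) for name, (k, n) in counts.items()}


# Hypothetical counts, for illustration only
results = diagnostic_measures(tp=18, fp=2, fn=2, tn=18)
for name, (estimate, (lower, upper)) in results.items():
    print(f"{name}: {estimate:.2f} (95 % CI {lower:.2f} to {upper:.2f})")
```

A simple Wald interval for a proportion of 0.90 from 20 cases would extend above 1, which is the situation the score interval is intended to avoid.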

A statistical analysis was performed first with, and then without the 6 patients with minimal ECE (as previously defined). All statistical analyses were carried out using SAS software (SAS Institute, 2000) [55].

Results

Cohort characteristics

Of the 108 individuals enrolled in the study, histological evaluation revealed 76 patients with pT2 disease, 26 with pT3a disease and 6 with pT3b disease (30 % with any pT3 disease). Minimal T3 disease was identified in 6 patients, 4 with pT3a disease and 2 with pT3b disease.

Predicting any extracapsular extension (histopathology stages pT3a or pT3b)

The sensitivity for ECE detection was 75 %, 64 % and 83 % for all readers, experienced readers, and less experienced readers, respectively. The specificity was 92 %, 95 % and 88 % for each of the three groups, respectively. NPV was 91 %, 92 % and 90 % and PPV was 79 %, 77 % and 80 % (Table 1). The correct stage was predicted by the experienced and less experienced readers in 84 % and 85 % of cases, respectively (Table 2). Less experienced readers showed a tendency to overestimate the tumour stage, overstaging all of the incorrectly assessed cases (10 %). The experienced readers were again more consistent than the less experienced readers, with over- and understaging in 7 % and 9 %, respectively (Table 2). Predictive accuracy showed an AUC of 86 % for all readers, 80 % for the experienced readers, and 91 % for less experienced readers.

Table 1 Diagnostic accuracy estimates of any extracapsular extension (pT3)
Table 2 Diagnostic accuracy estimates of staging and 95 % confidence intervals with any ECE (pT3)

Predicting extracapsular extension when minimal disease is excluded

The results of the data set analysis excluding the 6 cases of minimal ECE (<1 mm) are shown in Tables 3 and 4. Excluding minimal ECE in general led to better results for experienced and less experienced readers, with a stronger effect on the experienced readers. The overall AUC for both reader groups combined was improved by 4 % (to 90 %), compared with the results obtained when minimal ECE was not excluded; for the experienced readers the AUC was improved by 5 % (to 85 %), whereas for the less experienced readers the AUC was reduced by 3 % (91 %).

Table 3 Accuracy of MRI stage in predicting ECE when minimal disease is excluded
Table 4 Accuracy of MRI stage in predicting extensive ECE when minimal disease is excluded

Inter-reader variability results (Table 5) were favourable for both experienced and less experienced readers (kappa value above 0.8). Higher variability, as shown by lower kappa values, was observed when minimal ECE data were included in the analysis.

Table 5 Inter-reader variability of MRI concordance with histopathology via kappa statistic and 95 % CI
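For reference, the kappa statistic used for Table 5 compares the observed agreement between MRI stage and histopathological stage with the agreement expected by chance. The Python sketch below computes an unweighted Cohen's kappa. Whether the study used a weighted or unweighted form is not stated, so the unweighted form is an assumption, and the stage lists are invented for illustration.

```python
from collections import Counter


def cohen_kappa(stages_a, stages_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(stages_a)
    # Observed proportion of cases given the same stage by both methods
    observed = sum(a == b for a, b in zip(stages_a, stages_b)) / n
    # Agreement expected by chance from the marginal stage frequencies
    freq_a = Counter(stages_a)
    freq_b = Counter(stages_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Invented stage assignments, for illustration only
mri_stage = ["T2", "T2", "T2", "T3a", "T3a", "T3b", "T2", "T3a"]
path_stage = ["T2", "T2", "T2", "T3a", "T2", "T3b", "T2", "T3a"]

kappa = cohen_kappa(mri_stage, path_stage)
print(round(kappa, 3))  # 0.784
```

In this invented example a single discordant case out of eight lowers kappa from 1.0 to about 0.78, which shows how a handful of discordant cases can shift kappa noticeably in small samples.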

Discussion

Most prostate MRI staging studies with high levels of performance are generated from retrospective interpretations using experienced readers [17]. At 1.5 T the positive impact of reader experience on staging performance has been fairly consistent with T2W alone or in combination with DCE MRI or MR spectroscopy [19, 56, 57]. In addition to employing experienced readers, such research studies often involve interpreting the images in a convenient and quiet setting, with limited simultaneous demands.

For our study, we assessed staging performance using MRI in routine clinical practice with multiple radiologists varying in their experience. Image interpretation occurred in the context of large daily reader volumes and the numerous distractions inherent to a busy reading environment. Such factors risk a negative impact on diagnostic performance, vulnerable to a variety of involuntary interpreter errors [58, 59]. As a reflection of standard clinical practice, the radiologists in our study had access to clinical data while interpreting the examinations.

Despite these considerations, staging accuracies based on AUC data of at least 80 % with minimal ECE included and at least 85 % with minimal ECE excluded were achieved across all readers. The overall staging accuracy of 90 % and an NPV of 95 % achieved when minimal ECE was excluded rival the high performance levels reported in retrospective, experienced reader studies [17, 39, 57]. Furthermore, 30 % of our cohort had locally advanced disease, minimising the chances that accuracy was achieved by virtue of a population enriched by gland-confined disease [60].

Our results show marked improvement compared with those achieved by Brajtbord et al. [60] even though many of their reasons for low sensitivity and specificity were present in our study, such as (1) our inclusion of less experienced, non-genitourinary radiologists, (2) a reasonably sized locally advanced disease cohort, and (3) using binary distinctions regarding the stage of disease. In that manuscript, details of the MR technique used were not provided—details that, in our opinion, are a critical performance determinant. We suspect that our use of high spatial resolution may be an important distinction contributing to our superior results. Such spatial resolution is readily achievable with commercially available MR systems.

The consistently high concordance with histopathology (kappa above 0.8) achieved when minimal ECE was excluded, regardless of reader experience, supports the possibility of achieving similar results when the technique is disseminated to a more general population of radiologists.

The evaluation of ECE as a distinct, secondary endpoint and identification of cases with minimal ECE are important aspects of the results achieved herein. Focal capsular extension has been well established as a distinct category with favourable prognosis and post-treatment recurrence-free outcomes more similar to organ-confined cancer than to cancer with more extensive extracapsular extension [61]. While we recognise that focal capsular penetration is a different concept, we chose to consider minimal ECE based on the limits of expected detectability given the image resolution obtainable with our imaging parameters. As discussed by Langlotz et al. [62], a decision has to be made by the radiologist regarding a cut-off point. That will have an impact on sensitivity and specificity. The key goal for prostate MRI is to achieve high specificity ensuring that few, if any, patients will be unnecessarily deprived of potentially curative therapy. Sensitivity becomes secondary because even a low sensitivity is an improvement over the sensitivity of clinical examination [62]. Long-term outcomes of such cases will be important to assess.

In a study using MRI with inferior spatial resolution to ours, Manzone and colleagues [63] showed that there was no significant difference in disease recurrence rates between those patients whose MRI showed definitely localised disease and those with possible ECE. These data included false-negative cases of pathology-proven ECE that were designated as ‘definitely’ localised by MRI, i.e. disease that was below the detection limits of their technique. As those authors have pointed out, ECE that is below the detection limits of ERC MRI may not be determinative of patient outcome [63], a sentiment derived from an MRI protocol with spatial resolution inferior to ours.

Therefore, the reduced staging accuracy and lower kappa values seen when minimal ECE is incorporated into any analysis may undermine the practical value of MRI. Such an analysis can appear to yield unfavourable results, despite a limited impact on clinical decisions. In this regard, further outcome studies evaluating minimal ECE with refinement of the criteria for such a determination will be useful.

With a combined approach of T2W and DCE MRI at 3 T, a preliminary study by Futterer et al. [57] suggested that inexperienced readers could improve their staging results. On the other hand, a previously published study by Mullerad et al. [19], using an imaging protocol at 1.5 T without DCE MRI, showed a larger difference between experienced and inexperienced readers than in our study.

A previously published 1.5-T staging study using similar DCE and T2W MRI in a retrospective research setting with two experienced readers demonstrated an overall staging accuracy of 95 % [4], similar to what was achieved herein at 3 T. Indeed, determining the relative contributions of field strength, acquisition strategy, and reader experience to improvements in staging results is challenging.

It is possible that the superior spatial resolution and detail attainable at 3 T [36, 37, 40] might be less critical than the additional value of DCE imaging to achieve improved staging results. Because we used the prospective clinical interpretations that were derived from all available data, we were unable to analyse the separate performance characteristics of the imaging strategies employed. On the other hand it is worth noting that the use of thinner sections, as we have employed, results in more sampling of the gland and adjacent structures compared with the traditional 3- to 4-mm section approach.

This study was performed as part of a large clinical imaging trial in which the protocol was fixed and did not include diffusion-weighted imaging (DWI). While DWI can improve detection and thereby have an impact on staging accuracy [64], its inherently low spatial resolution [65] renders it unlikely to yield substantial improvements for the most challenging cases, those with subtle ECE.

We recognise limitations in our study. There was potential for interpretation bias as the clinical radiologists had access to the clinical information (e.g. PSA or biopsy histopathology) at the time of prospective interpretation. While this represents a potential weakness from a scientifically rigorous attempt to predict staging accuracy based on MR data alone, our study design sought a practical reflection of clinical practice. In so doing, it is hoped that the results reported herein can be reproduced across a diverse set of radiological practices.

Interpretation bias was further reduced by the radiologist assigning MRI stage without having any knowledge of prostatectomy histopathology stage. Other limitations consist of each reader interpreting a different set of studies, hindering a direct comparison of diagnostic performance between experienced and less experienced readers, as well as assessment of interobserver agreement. Another relative limitation of this study is that a small subset of MRI interpretations (n = 20) was retrospectively categorised. It would have been preferable to have a prospective stage designation in all cases. However, the subsequent classification was ascribed without knowledge of the histopathology results, mimicking the cases that were prospectively evaluated and minimising the chances of bias.

While our designation of experienced and less experienced readers is somewhat arbitrary, it is similar to those of previously published studies [19, 40]. Heijmink et al. [40] defined an experienced reader as having read more than 150 prostate MRIs, and Mullerad et al. [19] defined experienced readers as uroradiologists with more than 3 years’ experience in prostate MRI. It is likely that a learning phase is necessary and that interpretations by completely inexperienced readers may produce less consistent results. Of note, it is possible that, in some cases, readers may have involved other radiologists for an opinion regarding staging; however, given the pressures for turnaround times and that individuals were assigned to unique schedules, this concern may be minimal. Finally, a revised edition of the TNM classification was published by the AJCC [66], but no changes were made to the consequences of the findings discussed in this study and the corresponding pathological staging.

In conclusion, prostate ERC 3-T MRI with a combined approach of high spatial resolution DCE and high spatial resolution T2W imaging is an accurate non-invasive staging tool for pretherapeutic assessment of ECE across readers of varied experience in a clinical setting, particularly when minimal ECE is not considered. Such results may assist clinicians in their efforts to more objectively and rationally select appropriate treatment for the individual patient with prostate cancer.