Introduction

Prostate cancer (PCa) diagnosis and assessment have improved significantly since the introduction of multiparametric MRI (mp-MRI) into its management [1, 2]. Combining multiple MRI sequences such as T2-weighted imaging (T2WI), diffusion-weighted imaging (DWI), spectroscopy, and dynamic contrast-enhanced (DCE) imaging provides complementary information for detection, localization, staging, and grading of the disease, thereby improving accuracy, sensitivity, and specificity [1, 3, 4].

T2WI provides anatomical images of the prostate at high spatial resolution. DWI probes the restriction of free motion of water molecules in tissue to derive the apparent diffusion coefficient (ADC), which provides information about cell density and tissue microenvironment [5, 6]. DCE-MRI enables pharmacokinetic modelling of the dynamic uptake and extravasation of a contrast agent, and can be used to derive parameters such as the volume transfer constant (Ktrans), extravascular-extracellular volume fraction (Ve), and blood plasma volume fraction (Vp) [7–9], from which tissue pathophysiology (e.g. perfusion, vascular permeability) can be inferred [10].

The Gleason grading system [11, 12] is regarded as the gold standard for PCa diagnosis and aggressiveness assessment. However, PCa is a heterogeneous and multifocal disease; Gleason grading based on prostate biopsies, which only sample a small portion of the gland, may therefore not provide a complete representation of the disease and tends to have low specificity [13, 14], whereas prostatectomy Gleason grading is usually confined to intermediate- and high-risk patients. Furthermore, Gleason grading is qualitative and subjective, and hence prone to intra- and inter-observer variation. Despite the improvements offered by mp-MRI incorporation, the separation of Gleason score (GS) 3+4 cancers from GS 4+3 cancers using conventional MRI parameters tends to be less effective due to overlapping values [15, 16]. These two PCa patterns account for the largest group at biopsy or prostatectomy, and have significantly different prognoses despite sharing the same GS of 7 [17–20]. Hence, misclassification could have serious clinical consequences. Accurate classification of these two GS patterns thus requires objective, sensitive, and reproducible quantitative image analysis methods.

Texture analysis is a promising technique known to have such attributes. It comprises the use of mathematical parameters, derived as functions of the spatial distribution of pixel intensities, to characterize the fundamental textures of objects in an image [21]. The applicability of texture analysis in medical image analysis has been demonstrated in several studies [22–27], predominantly for characterizing disease types or distinguishing diseased from healthy tissues. Yet, unlike conventional quantitative MRI parameters such as ADC, Ktrans, and Ve [28, 29], little is known about the pathophysiological semantics of textural features. This knowledge gap poses a challenge in the development of texture-based computer-aided diagnosis tools and their integration into clinical practice. The aim of this study was to bridge this gap by evaluating how grey-level co-occurrence matrix (GLCM) based textural features [30] derived from T2W MRI relate to GS cancer patterns (GS 3+4 vs. GS 4+3) and to MRI-based physiological parameters (ADC, Ktrans, and Ve). Furthermore, the study evaluated the utility of these features and parameters as potential markers for distinguishing between these two PCa patterns.

Materials And Methods

Patient Cohort

The cohort of 23 PCa patients used in this retrospective study was obtained from a previous prospective study (conducted in 2011) of 30 patients approved by St. Olavs University Hospital, Trondheim, Norway, and the Regional Committee for Medical and Health Research Ethics, Central Norway. Each patient gave written informed consent for data usage. Inclusion in the prospective study required that the patient was initially confirmed by core needle biopsy to have PCa, and was scheduled to undergo pre-operative mp-MRI involving T2W, DW and DCE-MRI at least 4 weeks after the last biopsy, followed by radical prostatectomy within 12 weeks after the mp-MRI.

MR Image Acquisition

All MRI examinations were performed in the feet-first supine position on a 3T MRI system (Magnetom Trio; Siemens Medical Solutions, Erlangen, Germany) using standard body and spine phased array matrix coils for signal detection. Glucagon (Novo Nordisk, Bagsværd, Denmark) was injected intramuscularly prior to imaging to suppress bowel movements.

The T2W images were acquired with a turbo spin-echo sequence (TR/TE: 4000/101 ms; flip angle: 150°; FOV: 200×200; matrix: 320×320; slice thickness: 3 mm; interslice gap: 0.6 mm), and the DW images with a single-shot echo-planar imaging sequence, using four b-values: 0, 100, 400, and 800 s/mm² (TR/TE: 3300/60 ms; flip angle: 90°; FOV: 260×211; matrix: 160×130; slice thickness: 3.6 mm). A 3D fast low-angle shot sequence (time-resolved imaging with stochastic trajectories) was used for the DCE-MRI (TR/TE: 3.85/1.42 ms; flip angle: 12°; FOV: 260×260; matrix: 160×160; slice thickness: 3.6 mm). It included the acquisition of pre-contrast T1W image scans using four variable flip angles (2°, 5°, 10°, and 20°), followed by 70 dynamic scans at 4.22 s temporal resolution. After the third dynamic scan, the gadolinium-based contrast agent gadoterate meglumine (Dotarem®; Amersham Health, Oslo, Norway) was administered intravenously by bolus injection (0.1 mmol/kg; rate: 2 mL/s) using a Spectris™ MR injection system (Medrad Inc. PA, USA), followed by a 20-mL normal saline flush. The images were oriented along the longest axis of the prostate, perpendicular to the urethra, to best match routine histologic sectioning of the prostate, and covered the entire prostate volume.

Histopathology and Regions of Interest Delineation

After radical prostatectomy, the excised gland was fixed in formalin and serially sectioned from the apex to the base into 4-mm axial slices, from which 3.5-μm sections were stained with haematoxylin-erythrosine-saffron (HES). The HES-stained slides were examined by an experienced pathologist who outlined cancer foci, described cancer location, and graded all foci in accordance with the Gleason scoring system [11, 12].

For each patient, the histopathology slide with the index tumour, defined as the largest histopathologically defined, clinically significant (volume ≥ 0.5 cm³) cancer focus with the highest GS, was selected. The best corresponding axial T2W image slice was then identified with the guidance of anatomical landmarks (Fig. 1a). To support region of interest (ROI) delineation in the T2W image, both images (histopathology and T2W) were overlaid with a radial grid using the same anatomical landmarks as reference, while taking into account the offsets in orientation and distance between the urethra and the base for the two images. Subsequently, an ROI replicating the index tumour was manually outlined in the T2W image based on the location in the histology section and its appearance on the T2W image.

Fig. 1

(a) Histopathology slide spatially matched to T2-weighted (T2W) image slice based on anatomical landmarks (e.g. urethra, ejaculatory ducts, size/shape of the peripheral zone, apex/base proximity), (b) cancer region of interest delineated in T2W image based on histology and rigidly registered to diffusion-weighted (DW) and dynamic contrast-enhanced (DCE) images, and (c) textural features, apparent diffusion coefficient (ADC), volume transfer constant (Ktrans), and extravascular-extracellular volume fraction (Ve) computation.

The delineated T2W image ROI was then transformed (using Elastix toolbox [31]) to the corresponding DW and DCE images (Fig. 1b) and their parametric maps (Fig. 1c) via intensity-based rigid registration using the Mattes mutual information similarity metric. To optimize the registration, possible intensity-based artefacts due to artificial global intensity inhomogeneity were corrected for according to Cohen et al [32] prior to registration. The resulting registration transformations were then applied to the original/unfiltered images or ROIs. The quality of the inter-protocol ROI registration was assessed visually in the three images together with the ADC and Ktrans maps.

Quantitative Analysis

Haralick textural features [30] and MRI-based physiological parameters consisting of ADC, Ktrans, and Ve were computed from the respective delineated ROIs as follows:

Two-dimensional GLCM-based texture analysis [30] was performed on the T2W image ROIs (mean area: 212 pixels, range: 116–416 pixels) by first rescaling (histogram equalization) the intensities within each ROI to a range of 0-255 grey levels. The GLCM—a square matrix in which each element GLCM(i,j) represents the number of times that a pair of pixels with grey levels i and j co-occur at a specified distance and direction in relation to each other [30]— was computed using 64 bins and at one pixel distance for each direction of 0°, 45°, 90°, and 135°. Haralick textural features were then calculated from each GLCM, and the average of each parameter over the four directions was obtained. A single value was computed for each textural feature for a defined ROI. Haralick et al [30] originally proposed fourteen GLCM textural features, most of which are highly correlated and reflect similar information. We therefore selected four distinct textural features—angular second moment (ASM), contrast, correlation, and entropy—based on the literature [26, 30, 33] for analysis. ASM measures homogeneity of an image; contrast measures local variations; correlation measures linear dependency of the grey-levels; and entropy measures randomness or complexity [30]. The mathematical expressions for these features are given in the appendix.
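The GLCM construction and the four selected features can be illustrated with a minimal sketch. It is not the MATLAB code used in the study: the function names are hypothetical, the grey-level binning is a plain linear rescaling rather than histogram equalization, the GLCM is made symmetric, the entropy uses the natural logarithm, and correlation is set to 0 when the GLCM has zero variance. These choices are assumptions, since the appendix expressions are not reproduced here. Pixels outside the ROI are marked as `None`.

```python
import math

# (row, column) offsets at one-pixel distance for 0, 45, 90 and 135 degrees
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def quantize(roi, levels=64):
    """Linearly bin ROI intensities into `levels` grey levels (assumption:
    simple rescaling stands in for the histogram equalization of the paper)."""
    flat = [v for row in roi for v in row if v is not None]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1
    return [[None if v is None else min(int((v - lo) / span * levels), levels - 1)
             for v in row] for row in roi]

def glcm(q, offset, levels):
    """Symmetric, normalized co-occurrence matrix for one direction."""
    dr, dc = offset
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(len(q)):
        for c in range(len(q[r])):
            r2, c2 = r + dr, c + dc
            if not (0 <= r2 < len(q) and 0 <= c2 < len(q[r2])):
                continue
            i, j = q[r][c], q[r2][c2]
            if i is None or j is None:  # skip pairs leaving the ROI
                continue
            counts[i][j] += 1
            counts[j][i] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts] if total else counts

def haralick_subset(p):
    """ASM, contrast, correlation and entropy of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n) if p[i][j] > 0]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    corr = 0.0  # convention chosen here for a constant region
    if sd_i > 0 and sd_j > 0:
        corr = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells) / (sd_i * sd_j)
    return {
        "asm": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "correlation": corr,
        "entropy": -sum(v * math.log(v) for _, _, v in cells),
    }

def texture_features(roi, levels=64):
    """Average each feature over the four directions, one value per ROI."""
    q = quantize(roi, levels)
    per_dir = [haralick_subset(glcm(q, off, levels)) for off in OFFSETS.values()]
    return {k: sum(f[k] for f in per_dir) / len(per_dir) for k in per_dir[0]}
```

A perfectly uniform ROI gives ASM = 1 and entropy = 0 in this sketch, whereas a more disordered ROI lowers ASM and raises entropy, which matches the interpretation of the two features given above.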

ADC maps were calculated voxel-wise from the DW images using b-values of 100, 400, and 800 s/mm² by fitting the DWI signal as a function of b-value to the monoexponential decay model. The median ADC value for each ROI was calculated.
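As an illustration of the monoexponential model S(b) = S0·exp(−b·ADC), the following sketch estimates ADC for one voxel by ordinary least squares on the log-transformed signal. The log-linear fit is an assumption (the fitting procedure used in the study is not specified), and the function names and signal values are hypothetical.

```python
import math
from statistics import median

def fit_adc(b_values, signals):
    """Log-linear least-squares fit of S(b) = S0 * exp(-b * ADC).

    Returns (adc, s0); adc is in mm^2/s when b is given in s/mm^2.
    """
    ys = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(ys) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, ys))
    slope = sxy / sxx
    return -slope, math.exp(mean_y - slope * mean_b)

def roi_median_adc(b_values, voxel_signals):
    """Median ADC over the voxels of an ROI (non-positive signals are skipped)."""
    return median(fit_adc(b_values, s)[0]
                  for s in voxel_signals if all(v > 0 for v in s))

# Synthetic example: b0 is left out, as in the study
b = [100, 400, 800]
signal = [1000.0 * math.exp(-bv * 1.2e-3) for bv in b]
adc, s0 = fit_adc(b, signal)
```

For noise-free synthetic data the fit recovers the ADC used to generate the signal; with real data the estimate depends on noise and on the chosen b-values.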

The extended Tofts model [9] (two-compartment with intravascular tracer contribution) was applied to the DCE-MRI time series images for pharmacokinetic modelling. Prior to parameter estimation, the DCE and the pre-contrast T1W images were rigidly co-registered [31] in time as described above to correct for possible patient motion during imaging. Pre-contrast T1-maps were calculated (from the pre-contrast images) for the conversion of signal intensity into contrast agent concentration. Voxels with T1 values outside [500–3000 ms] were excluded [34]. Fitting of image data to the model was done on a voxel-by-voxel basis from which maps of Ktrans, Ve, and Vp were computed. The population-averaged arterial input function (AIF) reported by Parker et al [35] was used, and the time delay of the tissue enhancement curve relative to the AIF was accounted for. The contrast agent relaxivity was 3.5 s⁻¹ mM⁻¹. Voxels for which Ktrans < 0, Ktrans > 10, Ve < 0, or Ve > 1 were also discarded, and the median values computed for each ROI. Vp was not used for further analysis.
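The forward form of the extended Tofts model, Ct(t) = Vp·Cp(t) + Ktrans·∫ Cp(τ)·exp(−(Ktrans/Ve)·(t−τ)) dτ, can be sketched as below together with the voxel exclusion rule stated above. This is only the forward model with trapezoidal integration, not the fitting procedure of the study. The function names are hypothetical, time in minutes and Ktrans in min⁻¹ are assumed units, and the constant plasma curve is a toy input, not the Parker AIF.

```python
import math

def extended_tofts(t, cp, ktrans, ve, vp):
    """Tissue concentration curve from a plasma curve `cp` sampled at times `t`.

    Assumed units: t in minutes, ktrans in 1/min; ve and vp are fractions.
    The convolution integral is evaluated with the trapezoidal rule.
    """
    kep = ktrans / ve
    ct = []
    for n in range(len(t)):
        integral = 0.0
        for m in range(1, n + 1):
            f_prev = cp[m - 1] * math.exp(-kep * (t[n] - t[m - 1]))
            f_curr = cp[m] * math.exp(-kep * (t[n] - t[m]))
            integral += 0.5 * (f_prev + f_curr) * (t[m] - t[m - 1])
        ct.append(vp * cp[n] + ktrans * integral)
    return ct

def keep_voxel(ktrans, ve):
    """Exclusion rule from the text: discard Ktrans < 0, Ktrans > 10,
    Ve < 0 or Ve > 1."""
    return 0 <= ktrans <= 10 and 0 <= ve <= 1

# Toy example: 70 time points at 4.22 s spacing and a constant plasma curve
times = [i * 4.22 / 60 for i in range(70)]
plasma = [1.0] * 70
tissue = extended_tofts(times, plasma, ktrans=0.25, ve=0.3, vp=0.02)
```

For a constant plasma concentration the integral has the closed form Ve·(1 − exp(−Ktrans·t/Ve)), which gives a simple check on the numerical integration.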

Median T2W signal intensity (T2WSI) was also computed for each ROI. First, the image intensities (original/non-scaled) were corrected for nonstandardness by aligning the image intensity histogram of each patient image to match the mean intensity histogram across the different patients [36], thereby ensuring that the image intensities are comparable and have consistent tissue-specific meaning.

All the analyses were performed using MATLAB R2014a (Mathworks, Natick, MA, USA).

Statistical Analysis

Point-biserial correlation coefficients (rpb) and Spearman correlation coefficients (ρ) were calculated to quantify the relationships between the textural features and GS and between the textural features and MRI-based physiological parameters (ADC, Ktrans, and Ve), respectively.
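Both coefficients reduce to a Pearson correlation: the point-biserial coefficient is the Pearson correlation between a binary variable and a continuous one, and the Spearman coefficient is the Pearson correlation of the ranks. The sketch below illustrates this with hypothetical function names. Coding GS 3+4 as 0 and GS 4+3 as 1 is an assumption, and the values are toy numbers, not study data.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def point_biserial(groups, values):
    """Point-biserial r: Pearson correlation with a 0/1 group variable."""
    return pearson([float(g) for g in groups], values)

# Toy example (assumed coding: 0 = GS 3+4, 1 = GS 4+3)
gs_group = [0, 0, 1, 1]
feature = [1.0, 2.0, 3.0, 4.0]
r_pb = point_biserial(gs_group, feature)
```

With this coding, a negative point-biserial coefficient means that the feature tends to be lower in GS 4+3 than in GS 3+4 cancers.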

Each computed parameter was tested for significant differences between GS 3+4 and GS 4+3 cancers using two-tailed Mann-Whitney U tests. A generalized linear model with a logistic regression model component was used to perform receiver operating characteristic (ROC) analysis to classify the two cancer patterns. The nonparametric approach by DeLong et al [37] for comparing the areas under two or more correlated ROC curves was further used to evaluate the classification capabilities of the parameters. p-values < 0.05 were considered to be statistically significant. The p-values were adjusted for multiple testing using Benjamini and Hochberg’s approach [38]. The statistical analyses were performed in MATLAB.
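Two of the steps above can be illustrated compactly: the Benjamini-Hochberg adjustment of a set of p-values, and the link between the Mann-Whitney U statistic and the AUC of a single parameter (AUC = U / (n1·n2), with ties counted as one half). This is a simplified sketch with hypothetical function names and made-up numbers. It does not reproduce the logistic regression model or the DeLong comparison used in the study.

```python
def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda k: pvals[k])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone
    for rank in range(m, 0, -1):
        k = order[rank - 1]
        running_min = min(running_min, pvals[k] * m / rank)
        adjusted[k] = running_min
    return adjusted

def auc_from_ranks(group_a, group_b):
    """AUC as the probability that a value from group_b exceeds one from
    group_a, i.e. the Mann-Whitney U statistic divided by n_a * n_b."""
    u = 0.0
    for b in group_b:
        for a in group_a:
            if b > a:
                u += 1.0
            elif b == a:
                u += 0.5
    return u / (len(group_a) * len(group_b))

# Made-up numbers for illustration only
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
auc = auc_from_ranks([1, 2, 3], [2, 4, 5])
```

An AUC below 0.5 in this sketch simply indicates that the parameter is lower in the second group; the discriminative ability is then 1 minus that value.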

Results

Patient Characteristics

Seven of the 30 patients were excluded from this retrospective study because of unavailable histopathology (n = 5) or poor MR image quality or histopathology-MR matching (n = 2). A total of 32 clinically significant (volume ≥ 0.5 cm³) cancer foci according to the histopathology were present in the remaining 23 patients (details of patient and tumour exclusion are given in Fig. 2). The 23 corresponding index tumours were analysed (Table 1). Because of incomplete DCE data acquisition, data sets for 20 patients were included in the statistical analysis of the DCE pharmacokinetic parameters (Ktrans and Ve).

Fig. 2

Flowchart of patient and tumour inclusion and exclusion. PZ = peripheral zone; TZ = transition zone. One cancer focus per patient, i.e. the index tumour (the largest histopathologically defined, clinically significant (volume ≥ 0.5 cm³) cancer focus with the highest Gleason score), was considered for analysis. Excluded multifocal cancers: two 3+3, three 3+4 and four 4+3 (two from the same patient) Gleason score cancers.

Table 1 Clinical and demographic characteristics of the patient cohort.

T2W MRI-derived Textural Features Correlate Significantly with GS and ADC

T2W image textural features ASM and entropy correlated significantly (p < 0.05) with GS after multiple test correction. ASM correlated moderately negatively (rpb = -0.52), and entropy moderately positively (rpb = 0.49). ASM, entropy, and contrast also correlated significantly (p < 0.05) with median ADC. The correlations were highly positive (ρ = 0.82), highly negative (ρ = -0.80) and moderately negative (ρ = -0.44), respectively. None of the textural features correlated significantly with median Ktrans or Ve (Table 2).

Table 2 Correlation between T2W MRI-derived textural features and Gleason grade, ADC, Ktrans, and Ve.

T2W MRI-Derived Textural Features Distinguish GS 3+4 from 4+3 Cancers

Patients with GS 4+3 cancers had significantly lower T2W image textural ASM and higher entropy than those with GS 3+4 cancers (Fig. 3), but no significant difference was found for the conventional MRI parameters (Fig. 4). Table 3 lists the areas under the curves (AUCs) and 95% confidence intervals (CIs) for the ROC analysis classification of the two cancer patterns using the individual parameters. ASM and entropy were the best performing individual parameters, whereas correlation and contrast performed worst. The combined textural features resulted in a higher classification accuracy (AUC = 0.82, 95% CI = 0.61–0.99, p < 0.05) than the MRI-based physiological parameters combined (AUC = 0.75, 95% CI = 0.52–0.98, p < 0.05). The highest classification accuracy (AUC = 0.91, 95% CI = 0.75–0.99, p < 0.05) was, however, achieved when all computed parameters were combined (Fig. 5). The differences in AUCs were not statistically significant.

Fig. 3

Box plots comparing the distribution of T2W MRI-derived textural features between Gleason score 3+4 and 4+3 cancers. Black diamonds indicate mean values. p-values corrected for multiple testing. *significant (p-value < 0.05) after multiple test correction. ASM = angular second moment.

Fig. 4

Box plots comparing the distributions of median T2W signal intensity (T2WSI), apparent diffusion coefficient (ADC), volume transfer constant (Ktrans), and extravascular-extracellular volume fraction (Ve) between Gleason score 3+4 and 4+3 cancers. Black diamonds indicate mean values. p-values corrected for multiple testing.

Table 3 Areas under the receiver operating characteristic curves for comparing the performance of T2W MRI-derived textural features and MRI-based physiological parameters in distinguishing Gleason score 3+4 from Gleason score 4+3 cancers.

Fig. 5

ROC curves for T2W MRI-derived textural features (angular second moment, contrast, correlation, and entropy), MRI-based physiological parameters (apparent diffusion coefficient, volume transfer constant, and extravascular-extracellular volume fraction), and combined texture-MRI parameters for distinguishing Gleason score 3+4 versus Gleason score 4+3 cancers.

Discussion

The clinical applicability of texture analysis remains limited despite having been shown to have high accuracy and sensitivity in classifying tumour patterns and discriminating healthy from diseased tissues [22–26, 39]. The latest prostate imaging reporting and data system (PI-RADS) [40] describes the texture of prostate tissue as an important feature for identifying prostate cancer, especially in the transition zone (TZ), yet the evaluation procedure is qualitative. Successful integration of texture-based computer-aided diagnosis tools (i.e. texture analysis) into clinical practice could make this process quantitative and less subjective. However, because texture analysis is a relatively new field in medical image analysis, the pathophysiological semantics and relevance of the textural features are not yet well understood. In addition, the existence of numerous texture analysis approaches [21, 41], and the resulting large number of possible textural features, makes it challenging to select the most relevant features for prostate cancer. This preliminary study therefore investigated the diagnostic relevance of T2W MRI-derived textural features (ASM, contrast, correlation, entropy) in relation to parameters with known pathophysiological significance (i.e. GS, ADC, Ktrans, and Ve).

A number of studies have used texture analysis of histopathology images for automated Gleason grading with high accuracy [42–44]. Other studies found GLCM-based textural features derived from ADC maps to be capable of differentiating between benign and malignant tumours in the peripheral zone (PZ) [26] and GS 3+3 from GS ≥ 7 cancers [45]. In this study, however, we focused the texture analysis on T2W images because they are easy to acquire, less prone to artefacts, non-invasive, and suitable for every patient. Importantly, T2WI offers images of the prostate structures with high signal-to-noise ratio, spatial resolution, and soft tissue contrast [46], an important feature in PI-RADS [40]. Hence, texture analysis on these high-resolution images could reveal subtle alterations in tissue architecture due to cancer. Moreover, this study further improves our understanding of textural features in that it explored additional quantitative physiological MRI parameters, Ktrans and Ve, which the previous studies did not address.

We found the T2W image-derived textural features (ASM and entropy) to correlate significantly with GS. Our results support the findings of Vignati et al [45], who also found significant correlations between T2W image textural features [homogeneity (ASM) and contrast] and GS. Although Wibmer et al [26] did not find T2W image textural features to correlate significantly with GS, their study and that of Rosenkrantz et al [15] found that ADC map textural features correlated significantly with GS and the percentage of Gleason grade 4 in GS 7 cancers, respectively. The differences in data acquisition (resolution) and analysis (e.g. pooling together PZ and TZ cancers) between these studies possibly contributed to the discordance.

More interestingly, we found these two textural features (ASM and entropy) to be significantly different between GS of 3+4 and 4+3 cancers. A similar observation was also reported in [26] for ASM. These findings further underscore the need to consider each cancer group separately in clinical practice during staging or therapeutic intervention, irrespective of having the same GS of 7, as indicated in previous studies [19, 20]. Median T2WSI did not differ significantly between the two cancer groups, an indication that T2W image textural features could be more sensitive in revealing underlying tissue morphology than ordinary summary statistics of T2W image intensities.

Cancer growth requires formation of new blood vessels to sustain the proliferating cells. The abnormal nature of such neo-vasculature causes fast extravasation of injected contrast agent and thus an increase in Ktrans when compared to healthy tissues [10, 47]. DCE pharmacokinetic analysis in particular probes blood vessel function and not the underlying morphology of the tissue per se, a possible explanation for the lack of association with the textural features. Moreover, GS 7 cancers could grow relatively slowly, without heavy demand for new blood vessels, as opposed to more aggressive cancers.

Generally, PCa is characterized by high cellularity and decreased extracellular space as a result of reduced luminal space and stromal matrix [46, 48]. Movement of extracellular and intraductal water molecules is, therefore, restricted, resulting in low ADC. These tissue microenvironment characteristics, which serve as determinants of ADC, also define the underlying textural properties of the tissue and could explain the significant correlation observed between the textural features (ASM, contrast, entropy) and ADC (Table 2).

Textural features ASM (homogeneity) and entropy (randomness/complexity) were particularly found to correlate significantly with both GS and ADC. Increased PCa aggressiveness is characterized by decreased ADC, as well as deterioration of the architectural patterns depicting cellular integrity of the prostate gland due to poor differentiation and glandular structure deformation (i.e. high Gleason grade). All of these could result in less homogeneity and increased complexity of the tissue, which is reflected by the decreased ASM and increased textural entropy, possibly explaining the observed correlations with GS and ADC. Arguably, these two features could be considered the most intuitive textural descriptors of the Gleason grading system.

Vignati et al [45] also found significant correlations between T2W image textural features (ASM and contrast) and ADC values. Similar to Fehr et al [33], our results also indicate that combining traditional MRI parameters and T2W MRI-derived textural features could achieve a higher classification accuracy (91%) than the traditional MRI parameters alone. ADC calculations are dependent on the range of b-values used [40]. Our ADC calculation excluded the b0 image, which minimized the contribution of perfusion to some extent. Though relatively high, the majority of the ADC values are within a range ([816–1891 μm²/s] for GS 3+4 and [753–1405 μm²/s] for GS 4+3 cancers) comparable to other studies [15, 26, 28]. Contrary to other studies [28, 49], but in line with [15], we did not find median ADC to differ significantly between GS 3+4 and 4+3 cancers. The most plausible reason for this observation is the relatively low number of patients in our cohort; nonetheless, the textural features could significantly differentiate the two cancer patterns with higher accuracy.

This study has some limitations. The sample size is relatively small (23 patients), and our focus on separating only GS 3+4 and GS 4+3 PCa patterns could be regarded as limited in terms of tumour diversity. The sample size did not allow for separate analyses of PZ and TZ cancers as in other studies [25, 26, 33]. As all the analysed data represent index tumours of GS 3+4 or 4+3, this small tumour diversity could also be viewed as sample homogeneity, which contributes to the quality of our data. Including multiple cancers from the same patient (multifocal cancers) as separate data points could be a way to increase the sample size, but would have compromised sample homogeneity. Furthermore, in clinical practice, treatment decisions mostly rely on the Gleason score of the index tumour, i.e. the largest histopathologically defined, clinically significant (volume ≥ 0.5 cm³) cancer focus with the highest GS. Hence, including multifocal cancers could generate confounding results that do not accurately represent clinical practice.

Secondly, standardized T2WSI values were used for cancer characterization, rather than directly quantified T2 values through T2-mapping. Further, we employed 2D texture analysis instead of the reportedly improved 3D texture analysis [50, 51], since the latter is not considered optimal for acquisitions with interslice gaps. Finally, the lack of spatial co-registration between the histopathology slides and the MR images, which required us to delineate the ROIs directly on the T2W MR images, is also worth noting.

Conclusions

Our study shows that T2W MRI-derived textural features correlate with the underlying pathophysiology of prostate cancer tissue, and thus could augment existing prostate cancer classification techniques. However, for texture analysis to gain a stronger place in prostate cancer management, more efforts need to be directed towards understanding the biological and/or pathological semantics of the textural features, determining the most relevant features, and standardizing texture analysis methods. To this effect, a validation of this study in a larger, preferably multi-centre cohort with higher tumour diversity should be pursued.