Introduction

Prostate cancer (PCa) is a significant health issue that predominantly affects elderly men worldwide. Its incidence has ranked high in global cancer surveys for many years, and its mortality rate is second only to that of lung cancer [1,2,3]. The Gleason scoring system is the most widely used system for grading the malignancy of PCa; the higher the Gleason score (GS), the higher the malignancy and the corresponding invasiveness [4]. Low-risk tumors (GS < 7) require no immediate treatment and are managed by either watchful waiting or active surveillance; intermediate-risk tumors (GS = 7) are offered monotherapy; and for high-risk tumors (GS > 7), combination therapy is the best treatment option [5]. Recently, much greater attention has been given to subdividing the intermediate-risk group into GS = 3 + 4 and GS = 4 + 3 because of prognostic differences between the two subgroups [4, 6]. Kamel et al. [7] showed that PCa with GS = 4 + 3 was more prone to metastasis than PCa with GS = 3 + 4 (probability of about 2.8% versus 0.9%) and that its overall survival rate was 23% lower. Recent studies have shown that GS = 3 + 4 tumors are biologically indolent and have a good prognosis, and active monitoring is recommended to avoid overtreatment [8]. Transrectal ultrasound-guided prostate biopsy (TRUS-biopsy) can cause side effects including bleeding, pain, and infection, and it is less sensitive than expected [9, 10].

Diffusion-weighted magnetic resonance imaging (DW-MRI) is a noninvasive technique that reflects the diffusion characteristics of water molecules in biological tissues and thereby indirectly reflects microscopic changes in tissue structure. It is an informative MRI modality for detecting PCa and shows moderately high diagnostic accuracy [11]. In routine clinical work, apparent diffusion coefficient (ADC) values are calculated with a monoexponential model, which assumes that diffusion follows a Gaussian distribution, as it does in pure water. In biological tissue, however, the measured motion includes both the molecular diffusion of water and the microcirculation of blood in the capillary network (perfusion). Blood microcirculation can also be regarded as an incoherent motion because of the pseudorandom organization of the capillary network at the voxel level. In a significant development, Le Bihan [12] established an in vivo biexponential model, known as the intravoxel incoherent motion (IVIM) model, which accounts for both molecular diffusion and perfusion. A study by Hiroshi Shinmoto showed that the molecular diffusion coefficient and perfusion fraction in prostate cancer were significantly lower than those found in the peripheral zone (PZ) [13]. Liu et al. found that IVIM could potentially improve the differentiation of prostate cancer in the central gland and offer better accuracy than ADC for differentiating stromal hyperplasia from prostate cancer [14]. Furthermore, some studies concluded that the perfusion-free diffusion parameter D performed better in differentiating the GS of PCa [15,16,17,18].
When higher b values are added, the perfusion contribution is suppressed, and molecular diffusion has been shown to depart from the conventional random diffusion process because of barriers within the complex cellular environment; this is known as non-Gaussian diffusion behavior. Characterizing this behavior, which is thought to reflect tissue heterogeneity and irregularity and is detected at high b values, calls for more advanced DWI models. The diffusion kurtosis imaging (DKI) model allows kurtosis to be estimated, and higher kurtosis values indicate a more peaked, non-Gaussian distribution of diffusion [19]. Previous studies have shown that the DKI model improves PCa detection and diagnosis [20,21,22,23,24]. Wang et al. reported that the 90th percentile of Kapp exhibited better diagnostic performance in differentiating the GS of PCa [23]; Wu’s team reported that DKI may help in predicting GS upgrade in biopsy-proven GS 6 prostate cancer [24]. A recent study by Tamada et al. [25] reported that Kapp performed well in differentiating GS ≤ 3 + 3 from GS ≥ 3 + 4 tumors and GS ≤ 3 + 4 from GS ≥ 4 + 3 tumors, with diagnostic performance similar to that of ADC, and that ADC and Kapp were highly correlated.

Although much work has been done, as mentioned above, studies comparing these two models remain relatively scarce. Moreover, the GS results used in many studies were obtained from biopsies, which can be inaccurate because of sampling error: the GS is upgraded in every third patient following radical prostatectomy (RP) [26]. The GSs used in this study were obtained from RP. We aimed to explore the diagnostic performance of these two models in predicting the aggressiveness of PCa and to determine what their unique parameters add to the monoexponential model (ADC).

Materials and methods

Patient population

From May 2017 to December 2018, 121 consecutive patients were enrolled as part of an ongoing prospective study. All patients underwent diffusion-weighted MR scanning and gave informed consent. Our target population was patients with abnormal findings on serum prostate-specific antigen (PSA) testing and/or digital rectal examination (DRE) and/or ultrasonography. Patients who had previously undergone ultrasound-guided transrectal biopsy were also included because our primary objective was to detect and characterize clinically significant cancer in the gland [27]. Recently, Jung et al. showed that postbiopsy hemorrhage did not negatively affect the detection of tumors with GS ≥ 3 + 4 or with volume ≤ 0.5 ml [28]. Through image analysis, we observed that only 11 (28%) cases had hemorrhage, and the signal of hemorrhagic foci was well suppressed on high-b-value (e.g., b = 2200 s/mm2) DW images. Eighty-one patients were excluded for the following reasons: (a) neither prostatectomy nor biopsy pathological proof (n = 53), including patients who had no significant suspicious foci on mpMRI or who refused biopsy; (b) an interval between prostatectomy/biopsy of more than 3 months (n = 4); (c) prior treatment (n = 4), such as endocrine therapy or transurethral resection of carcinoma of the prostate (TURCaP); (d) no prostatectomy (n = 6); (e) poor image quality (n = 7); and (f) no lesion identified on MR imaging (n = 7). Finally, a total of 40 patients were included in this study. Figure 1 presents a flowchart of the population. All GSs were evaluated using radical prostatectomy gross specimens. The clinical data of the 40 patients are summarized in Table 1. Because GS = 6 cases were lacking and GS = 3 + 4 has a good prognosis, the patients in our study were divided into three risk groups: GS ≤ 3 + 4 (group A, GA), GS = 4 + 3 (GB), and GS > 4 + 3 (GC).

Fig. 1

Flowchart of patient population. PZ peripheral zone, TZ transitional zone, CZ central zone

Table 1 Clinical data of the 40 patients

MR imaging protocol

Multiparametric MR imaging was performed using a 3.0-T MR imager (Discovery MR 750, GE Medical Systems, Milwaukee, WI, USA) and a 32-channel phased-array surface coil without an endorectal coil. Contraindications to contrast-enhanced magnetic resonance imaging had been ruled out. Special preparation, such as gastrointestinal preparation, was not emphasized in this study because there is no consensus regarding patient preparation [27]. Propeller FS T2-weighted MR imaging was used to reduce motion artifacts. The inclination angle of the axial-oblique scan was adjusted according to the inclination of the prostate. Echo-planar DW images were acquired with a single-shot spin-echo echo-planar sequence in the same axial-oblique plane as the T2-weighted images. Eleven b values of 0, 50, 100, 200, 900, 1100, 1400, 1800, 2200, 2500, and 3000 s/mm2 (with 1, 1, 1, 1, 4, 4, 6, 8, 10, 10, and 12 averages, respectively) were used. ADC maps were calculated automatically by voxel-wise monoexponential fitting of the DW images. A 3D T1 liver acquisition with volume acceleration flex (LAVA FLEX) sequence was used for DCE-MR imaging; DCE was used only to facilitate diagnosis in this study. The detailed parameters of the main acquisition sequences are shown in Table 2.
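As an illustration of the voxel-wise monoexponential fitting mentioned above, the following minimal sketch fits S_b = S_0 · exp(−b · ADC) by log-linear least squares. It is not the scanner's implementation: the function name `fit_adc`, the synthetic signal values, and the choice to use all eleven b values are assumptions made for the example (the text does not state which b values entered the ADC calculation).

```python
import math

def fit_adc(b_values, signals):
    """Log-linear least-squares fit of S_b = S_0 * exp(-b * ADC).

    Returns (adc, s0); adc is in mm^2/s when b is given in s/mm^2.
    """
    log_s = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(log_s) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in zip(b_values, log_s))
    slope = sxy / sxx                      # slope of ln(S) versus b is -ADC
    return -slope, math.exp(mean_y - slope * mean_b)

# The eleven b values listed in the protocol (s/mm^2).
B_VALUES = [0, 50, 100, 200, 900, 1100, 1400, 1800, 2200, 2500, 3000]

# Synthetic, noise-free voxel (hypothetical numbers, not study data).
true_adc, true_s0 = 1.0e-3, 1000.0
signals = [true_s0 * math.exp(-b * true_adc) for b in B_VALUES]

adc, s0 = fit_adc(B_VALUES, signals)
print(f"ADC = {adc:.3e} mm^2/s, S0 = {s0:.1f}")
```

On real data, noise at high b values and non-Gaussian effects would make the recovered ADC depend on which b values are included, which is part of the motivation for the IVIM and DKI models.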

Table 2 Acquisition parameter values of the major sequences

IVIM and DKI models

The IVIM parameters D, Dstar, and F are obtained by fitting the following biexponential equation:

$$S_{\text{b}} /S_{0} \, = \,\left( {1 - F} \right) \cdot \exp \left( { - b \cdot D} \right) + F \cdot \exp \left( { - b \cdot {\text{Dstar}}} \right),$$
(1)

where D characterizes the extravascular diffusion of water, Dstar represents signal changes attributable to the intravascular movement of water, and F is the perfusion fraction. Sb is the DWI signal intensity at a specified b value, and S0 is the baseline signal at b = 0.
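To make Eq. (1) concrete, the sketch below evaluates the biexponential signal and recovers F, D, and Dstar with a simple two-step ("segmented") fit. This is a commonly used approach chosen here for illustration only; the fitting routine of the software used in this study is not described in the text. The threshold `b_cut`, the Dstar search grid, and the synthetic parameter values are all hypothetical.

```python
import math

def ivim_signal(b, f, d, dstar):
    """Normalized IVIM signal S_b/S_0 of Eq. (1)."""
    return (1.0 - f) * math.exp(-b * d) + f * math.exp(-b * dstar)

def line_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_ivim_segmented(b_values, ratios, b_cut=900.0):
    """Two-step IVIM fit (b_cut is a hypothetical threshold).

    Step 1: for b >= b_cut the perfusion term is assumed negligible, so
            ln(S_b/S_0) ~ ln(1 - F) - b * D is a straight line in b.
    Step 2: with F and D fixed, Dstar is chosen by a coarse grid search
            that minimizes the squared error at 0 < b < b_cut.
    """
    high = [(b, r) for b, r in zip(b_values, ratios) if b >= b_cut]
    slope, intercept = line_fit([b for b, _ in high],
                                [math.log(r) for _, r in high])
    d = -slope
    f = 1.0 - math.exp(intercept)

    low = [(b, r) for b, r in zip(b_values, ratios) if 0 < b < b_cut]
    best_dstar, best_err = None, float("inf")
    for k in range(10, 501):               # 1.0e-3 ... 50.0e-3 mm^2/s
        dstar = k * 1.0e-4
        err = sum((ivim_signal(b, f, d, dstar) - r) ** 2 for b, r in low)
        if err < best_err:
            best_dstar, best_err = dstar, err
    return f, d, best_dstar

# Protocol b values (s/mm^2) and a synthetic, noise-free voxel
# (F = 0.30, D = 0.60e-3, Dstar = 8.0e-3 are made-up illustrative values).
B_VALUES = [0, 50, 100, 200, 900, 1100, 1400, 1800, 2200, 2500, 3000]
ratios = [ivim_signal(b, 0.30, 0.60e-3, 8.0e-3) for b in B_VALUES]

f, d, dstar = fit_ivim_segmented(B_VALUES, ratios)
print(f"F = {f:.3f}, D = {d:.3e}, Dstar = {dstar:.3e}")
```

The synthetic data follow Eq. (1) exactly, so the straight-line step works well. In tissue, the signal at b values of 900 s/mm2 and above also carries non-Gaussian effects, so a fit of this kind applied to the present b-value scheme could bias D and F.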

The DKI model is based on the following equation:

$$S_{\text{b}} /S_{0} \, = \,\exp \left( { - b \cdot {\text{Dapp}} + b^{2} \cdot {\text{Dapp}}^{2} \cdot {\text{Kapp}}/6} \right).$$
(2)

In Eq. (2), Sb and S0 have the same meaning as in Eq. (1). With S0 known, Dapp and Kapp are obtained by fitting the measured signals. The parameter Kapp represents the apparent diffusional kurtosis (unitless), and Dapp is the diffusion coefficient corrected to account for the observed non-Gaussian behavior [29].
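The kurtosis model can be fitted without iterative optimization, because its logarithm is a quadratic in b. The sketch below assumes the standard kurtosis signal form, in which the quadratic term is b² · Dapp² · Kapp/6, and solves the resulting two-parameter least-squares problem directly. The function names are hypothetical, the synthetic Dapp and Kapp values are borrowed from the Fig. 6 caption for scale only, and using every nonzero b value is an assumption; the study software may fit differently.

```python
import math

def dki_signal(b, dapp, kapp):
    """Normalized kurtosis-model signal S_b/S_0 (standard DKI form)."""
    return math.exp(-b * dapp + (b ** 2) * (dapp ** 2) * kapp / 6.0)

def fit_dki(b_values, ratios):
    """Least-squares fit of ln(S_b/S_0) = a1*b + a2*b^2 (no intercept).

    a1 = -Dapp and a2 = Dapp^2 * Kapp / 6, so both parameters follow
    from the two coefficients of the quadratic.
    """
    pts = [(float(b), math.log(r)) for b, r in zip(b_values, ratios) if b > 0]
    s11 = sum(b ** 2 for b, _ in pts)
    s12 = sum(b ** 3 for b, _ in pts)
    s22 = sum(b ** 4 for b, _ in pts)
    t1 = sum(b * y for b, y in pts)
    t2 = sum(b ** 2 * y for b, y in pts)
    det = s11 * s22 - s12 ** 2             # determinant of the normal equations
    a1 = (t1 * s22 - t2 * s12) / det
    a2 = (s11 * t2 - s12 * t1) / det
    dapp = -a1
    kapp = 6.0 * a2 / dapp ** 2
    return dapp, kapp

# Protocol b values (s/mm^2) and a synthetic, noise-free voxel.
B_VALUES = [0, 50, 100, 200, 900, 1100, 1400, 1800, 2200, 2500, 3000]
ratios = [dki_signal(b, 1.30e-3, 0.72) for b in B_VALUES]

dapp, kapp = fit_dki(B_VALUES, ratios)
print(f"Dapp = {dapp:.3e} mm^2/s, Kapp = {kapp:.3f}")
```

The log-linear form is convenient but amplifies noise at high b values, where the signal is small; weighted or nonlinear fitting is a common alternative.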

ROI analysis

Two experienced radiologists (Professor A with 3 years of experience in prostate MRI, and Professor B with 4 years of experience in prostate MRI) identified suspicious tumors in consensus according to the criteria of the Prostate Imaging-Reporting and Data System, Version 2 [27]. Both radiologists were blinded to the pathological results. Patients often had more than one suspicious focus, because prostatic tumors are usually multifocal, with foci separated by noncancerous tissue. Therefore, the index lesion of each patient was evaluated in this study. An index lesion was defined as a lesion located in the zone described in the prostatectomy/biopsy pathology result that could also be found on MRI. The method for index lesion definition is presented in Fig. 2. The two radiologists drew each region of interest (ROI) separately on high-b-value (b = 2200 s/mm2) DW images, with reference to the ADC map generated automatically after scanning. The IMAge/enGINE MR_Diffusion software (V2.0.3, Vusion Tech, Hefei, China, http://www.vusion.com.cn) was used to process the DW-MR images and obtain the parameters of the IVIM (F, D, and Dstar) and DKI (Dapp and Kapp) models [30]. Their mean values were used for data analysis. The three-dimensional ROI measurement capability of this version offered more convenient measurement and more comprehensive use of the diffusion information of lesions. The 3D ROIs were placed on the index lesion, avoiding the urethra and ejaculatory ducts, as well as hemorrhage. Figure 3 shows an example of manual ROI placement.

Fig. 2

Flowchart of index lesion

Fig. 3

ROIs marked in green by the postprocessing software on DWI at b = 2200 s/mm2

Statistical analysis

Data analysis was conducted using SPSS (version 20.0; SPSS, Chicago, USA) and MedCalc Statistical Software (version 15.8; MedCalc Software bvba, Ostend, Belgium; https://www.medcalc.org; 2015). Interobserver agreement for each parameter measurement was assessed by calculating the intraclass correlation coefficient (< 0.40, poor; 0.40–0.59, fair; 0.60–0.74, good; and 0.75–1.00, excellent) [31]. The mean values of the parameters measured by the two radiologists were used in the following data analysis. The Shapiro–Wilk test was used to assess the normality of each parameter, with P > 0.05 taken to indicate normality. Spearman’s rank correlation coefficient (0.0–0.2, very weak to negligible; 0.2–0.4, weak; 0.4–0.7, moderate; 0.7–0.9, strong; 0.9–1.0, very strong) [32] was used to assess the correlation between each parameter and GS. Correlations between ADC and the unique parameters of the IVIM and DKI models were also computed. The Kruskal–Wallis one-way analysis of variance (k samples) and the Mann–Whitney U test were used to analyze the differences in each parameter between groups. ROC curves were used to analyze the diagnostic performance for predicting the GS of PCa. Areas under the curves (AUCs) were compared using the DeLong method [33], and 95% confidence intervals (CIs), optimal cutoff values, and the corresponding sensitivity and specificity values were calculated. A two-sided significance level of 0.05 was used for all of the above statistical tests.
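For readers less familiar with the ROC quantities named above, the following sketch shows how an AUC and a Youden-index cutoff can be computed from per-lesion values. It only illustrates the definitions; the study itself used SPSS and MedCalc, and the DeLong comparison is not reproduced here. The function names and the ten Kapp-like values with their labels are hypothetical, not study data.

```python
def auc_mann_whitney(scores, labels):
    """AUC as the probability that a positive case scores higher than a
    negative case, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Return (J, cutoff, sensitivity, specificity) for the cutoff that
    maximizes J = sensitivity + specificity - 1, using the decision rule
    'score >= cutoff means positive'."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best

# Hypothetical Kapp-like values; label 1 marks the higher-grade group.
kapp_values = [0.95, 0.90, 0.88, 0.80, 0.70, 0.75, 0.85, 0.60, 0.65, 0.72]
high_grade = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

auc = auc_mann_whitney(kapp_values, high_grade)
j, cutoff, sens, spec = youden_cutoff(kapp_values, high_grade)
print(f"AUC = {auc:.3f}, cutoff = {cutoff}, J = {j:.3f}")
```

For parameters that decrease with increasing GS (ADC, F, D, and Dapp), the same functions can be applied to the negated values so that higher scores still indicate the higher-grade group.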

Results

These 40 index lesions consisted of 4 PI-RADS category 3, 24 PI-RADS category 4, and 23 PI-RADS category 5 foci. Agreement between the two readers was excellent for ADC (intraclass correlation coefficient (ICC): 0.95; 95% CI 0.90–0.97), D (ICC: 0.96; 95% CI 0.92–0.98), Dstar (ICC: 0.94; 95% CI 0.90–0.97), F (ICC: 0.97; 95% CI 0.95–0.99), Dapp (ICC: 0.97; 95% CI 0.94–0.98), and Kapp (ICC: 0.94; 95% CI 0.89–0.97).

GS was moderately inversely correlated with ADC (rho = − 0.487, P < 0.01), F (rho = − 0.473, P < 0.01), D (rho = − 0.432, P < 0.01), and Dapp (rho = − 0.436, P < 0.01), and positively correlated with Kapp (rho = 0.611, P < 0.01); GS showed no significant correlation with Dstar (rho = 0.255, P = 0.11). The differences in ADC, F, D, Dapp, and Kapp values were significant between GC and GA, between GC and GA + GB, and between GA and GB + GC (all P < 0.05), but not between GA and GB or between GB and GC. Details are presented in Table 3. The distribution of each parameter’s values according to GS group is shown in Fig. 4. ADC exhibited a strong positive correlation with F (rho = 0.785; P < 0.001) and a strong negative correlation with Kapp (rho = − 0.849, P < 0.001).

Table 3 Nonparametric test results of diffusion parameters between different GS groups
Fig. 4

Boxplots showing the results of the Kruskal–Wallis test of parameters for independent samples among group 1 (GS ≤ 3 + 4), group 2 (GS = 4 + 3), and group 3 (GS > 7). Center line indicates median, top of box indicates the 75th percentile, bottom of box indicates the 25th percentile, whiskers indicate the 10th and 90th percentiles, asterisks indicate extreme values (more than 3 interquartile ranges), and circles indicate outliers (between 1.5 and 3 interquartile ranges). ADC, F, D, and Dapp display a decreasing trend with GS, while Dstar and Kapp display an increasing trend with GS

Figure 5 and Table 4 display the results of the ROC curve analysis of the diffusion metrics for distinguishing PCa of different GS. The AUCs of ADC, F, D, Dapp, and Kapp in differentiating GS ≤ 3 + 4 from GS > 3 + 4 PCa were 0.744 (95% CI 0.581–0.868), 0.726 (95% CI 0.563–0.855), 0.732 (95% CI 0.569–0.860), 0.752 (95% CI 0.590–0.875), and 0.766 (95% CI 0.606–0.885), respectively, and those in differentiating GS ≤ 7 from GS > 7 PCa were 0.755 (95% CI 0.594–0.877), 0.734 (95% CI 0.571–0.861), 0.724 (95% CI 0.560–0.853), 0.716 (95% CI 0.552–0.847), and 0.828 (95% CI 0.676–0.929), respectively, with all P values less than 0.05. In pairwise comparisons of the ROC curves, there were no significant differences among ADC, F, D, Dapp, and Kapp in differentiating the GS groups (P = 0.0501–0.9414). Figures 6 and 7 display representative patients and diffusion parameter maps.

Fig. 5

Graph a shows the utility of the ROC curves of ADC, F, D, Dapp, and Kapp to differentiate GS ≤ 3 + 4 and GS > 3 + 4 PCa. Graph b shows the utility of the ROC curves of those parameters to differentiate GS ≤ 7 and GS > 7. Gray line = chance diagonal

Table 4 Diagnostic test characteristics of diffusion parameters for the diagnosis of GS
Fig. 6

72-year-old man with prostate cancer (GS 3 + 4 = 7, lesions in left lobe of prostate, < T2, PSA 4.4 ng/ml). Images show the index lesion (0.7 cm) in left PZ, PI-RADS V2 category 4. a Lesion is indicated by an arrow on T2WI; b–f images obtained with b values of 200, 900, 1100, 2200, and 3000 s/mm2; as the b value increases, the high signal of the normal tissue is gradually suppressed, whereas the tumor becomes increasingly conspicuous; g ADC map processed by monoexponential model; h–l pseudocolor maps of D (= 0.67 × 10−3 mm2/s), F (= 35.76%), Dstar (= 5.97 × 10−3 mm2/s), Dapp (= 1.30 × 10−3 mm2/s), Kapp (= 0.72)

Fig. 7

70-year-old man with prostate cancer (GS 4 + 5 = 9, lesions in both lobes of prostate, T3a, PSA 7.9 ng/ml). Images show the index lesion (1.8 cm) in left PZ, PI-RADS V2 category 5. a Lesion is indicated by an arrow on T2WI; b–f images obtained with b values of 200, 900, 1100, 2200, and 3000 s/mm2; as the b value increases, the high signal of the normal tissue is gradually suppressed, whereas the tumor becomes increasingly conspicuous; g ADC map processed by monoexponential model; h–l pseudocolor maps of D (= 0.50 × 10−3 mm2/s), F (= 25.45%), Dstar (= 8.40 × 10−3 mm2/s), Dapp (= 0.94 × 10−3 mm2/s), Kapp (= 0.94)

Discussion

Our findings demonstrated that the IVIM (F and D) and DKI (Dapp and Kapp) parameters were altered in PCa of different GS and showed good diagnostic performance in differentiating GS ≤ 3 + 4 from GS > 3 + 4 PCa and GS ≤ 7 from GS > 7 PCa. These findings can be interpreted in view of the histopathologic composition of tumor tissue. An increasing Gleason pattern is attributed to increased heterogeneity of the prostate histological compartments, which consist of vasculature (i.e., capillaries), fibromuscular stroma, epithelium, and glandular lumen, and this heterogeneity correlates with tumor aggressiveness [34, 35]. Recently, Chatterjee et al. [36] found that Gleason patterns exhibited a strong positive correlation with the epithelium and a negative correlation with the stroma and lumen space, but no remarkable correlation with cellularity metrics. However, no parameter was able to differentiate GS ≤ 3 + 4 from GS = 4 + 3; this might indicate that the microstructures of these two GS tumors do not differ significantly, although it could also be attributed to the small sample. Like conventional ADC, D and Dapp are diffusion coefficients, adjusted for IVIM and DKI, respectively. A number of studies have described the relationship between ADC and GS [37,38,39,40,41], and they have almost consistently reported a negative correlation. An increase in diffusion-restricting components (i.e., vascular and epithelial fractions), together with a loss of diffusion-promoting components (i.e., stromal and luminal space) in tumors [42], leads to a decline in the values of these diffusion parameters.

Although previous studies have demonstrated the influence of tissue perfusion on ADC [43], the nature of the biexponential model has not yet been well explained. A report by Kuru et al. in 2014 [15] also indicated that the perfusion-free diffusion constant D might hold potential for improved image-based tumor grading, which is consistent with our findings. It has been reported that Dstar is at least one order of magnitude greater than D and that perfusion may be appreciable only at very low b values [44]. Restricting IVIM to low b values has therefore been proposed, to avoid interference from high b values, at which the contribution of non-Gaussian diffusion is appreciable. However, for studies that use high b values, as ours did, Le Bihan [45] suggested that the slow diffusion component may represent water that is associated with cell membranes and cytoskeletal structures, while the fast diffusion component represents the remaining, less-restricted water, which is found in both intra- and extracellular spaces. A study with a larger patient population (50 patients) concluded that the b value distribution mainly influences the repeatability of DWI-derived parameters (including IVIM and DKI parameters) rather than the diagnostic performance [46]. In the present study, Dstar showed remarkably large standard deviations in most cancer lesions, similar to previous studies [47, 48], and, like those studies, we found that Dstar did not predict GS. Unlike those studies, however, we found that D and F performed well in differentiating the GS groups; one reason might be their different grouping (GS = 6 and GS ≥ 7). The F value can be calculated by assuming a random direction of the capillary segments at the voxel level [12]. A relatively purer IVIM investigation by Pang et al. [44], which used different combinations of five b values (0, 188, 375, 563, and 750 s/mm2), reported a significant increase in F in tumors compared with benign tissues at b values below 750 s/mm2, and found that when high b values were employed, F might become lower or indistinguishable. However, even for low b values, they did not observe a significant difference in F among tumors of different GS. Some previous studies reported that F was significantly smaller in PCa than in healthy PZ [13, 48], and in our study F was found to be negatively correlated with GS. This may be interpreted by a theory of bulky phenomenon [49], in which F is not specific to perfusion alone but may also be sensitive to glandular secretion and fluid flow in the prostatic ducts, which corresponds to the results obtained by Le Bihan, as stated above.

Previous studies showed that kurtosis had significant correlations with histopathologic parameters (cytoplasmic, cellular, and stromal fractions) [50, 51]. ADCs obtained with b values less than 1000 s/mm2 are thought to mainly reflect diffusion of water in the extracellular space; when the b value increases to more than 1000 s/mm2, intracellular interactions promote non-Gaussian diffusion behavior and increase kurtosis, and the kurtosis parameter is thought to reflect the interaction of water molecules with cell membranes and intracellular components [50, 52]. This may explain the good diagnostic ability of Kapp for high-GS lesions in our results (AUC = 0.828, P < 0.001). A recent study reached similar conclusions [53]. A recent study by Lawrence et al. [51] showed that Dapp exhibited a significant positive correlation with luminal space and a negative correlation with cellularity, which assisted in differentiating cancerous lesions from normal tissue. However, they found that only the median Kapp was significantly different between groups with GS ≥ 4 + 3 and ≤ 3 + 4 (P < 0.05). In contrast, mean values were used for analysis in our study, and we found that Dapp could also help to differentiate GS ≥ 4 + 3 from GS ≤ 3 + 4. In another recent study, Wu et al. [24] reported that both Kapp and Dapp helped in the prediction of GS upgrade in biopsy-proven GS 6 prostate cancer.

Although there was no significant difference in the AUCs among ADC, F, D, Dapp, and Kapp for differentiating GS ≤ 3 + 4 from GS > 3 + 4 PCa or GS ≤ 7 from GS > 7 PCa, Kapp consistently had the largest AUC in each of our interim analyses (when the number of cases was 20, 34, and 40). A previous study with a large sample (n = 121) reported that Kapp exhibited significantly greater sensitivity for differentiating low- and high-grade PCa than ADC or D (68.6% vs 51.0% and 49.0%, respectively; P < 0.004) [54]. In the present study, the sensitivity of Kapp was 92.31%, with a Youden index of 0.63. These results might suggest a potential clinical advantage of incorporating the DKI model into prostate MRI protocols. On the other hand, strong correlations were observed between ADC and both Kapp and F, which may suggest that these metrics individually provide similar information in PCa. A similar correlation between ADC and Kapp has been reported before [32].
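As a reading aid, the Youden index J combines sensitivity (Se) and specificity (Sp), so the two figures quoted above imply an approximate specificity. The worked step below uses only those two reported numbers; because J is rounded to two decimals, the result is approximate, and the values reported in Table 4 take precedence.

$$J = {\text{Se}} + {\text{Sp}} - 1\quad \Rightarrow \quad {\text{Sp}} = J - {\text{Se}} + 1 \approx 0.63 - 0.9231 + 1 \approx 0.71.$$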

The number of GS = 3 + 3 PCa cases in this study was insufficient. The original number of biopsy-proven GS = 3 + 3 patients was 13, and all of them underwent mpMRI examination, but 7 (54%) were upgraded to GS = 3 + 4 at final prostatectomy pathology, and 4 (31%) had no definable lesion on mpMRI. The method used to define prostatic foci was based on PI-RADS V2, which is not comprehensive: it defines clinically significant PCa as GS ≥ 7 (including 3 + 4 with prominent but not predominant Gleason 4 component), and/or volume ≥ 0.5 cc, and/or extraprostatic extension (EPE) [27]. A PI-RADS score of ≥ 3 might rarely yield PCa of GS ≤ 6. In our study and clinical practice, mpMRI may miss some GS ≤ 6 PCa, but fortunately, this group has low-risk or harmless disease that is unlikely to cause problems in a man’s lifetime, and such patients are increasingly being managed with active surveillance [55]. We recommend that men older than 50 years without significant findings on mpMRI be followed up every 3 months with PSA testing, ultrasound, or similar examinations.

There were some limitations to this study. First, the number of cases included was limited, which may lead to errors due to sampling bias. In addition, the geographical source of our patients was relatively limited. These problems are common to single-center studies. Second, the influence of the image signal-to-noise ratio and the one-to-one correspondence between the lesion location on the gross specimen and the lesion location on the image were not resolved in this study; therefore, some measurement error is present. To reduce misregistration, we hope that future research can further improve image quality and use slice-by-slice pathology as a reference. Third, the analysis did not consider differences between central gland and peripheral lesions, because many cases had cancerous lesions in both regions. In addition, the IVIM and DKI data were acquired simultaneously in a single series of b values, so the IVIM measurements might be biased to some degree, as mentioned above. A study with a larger sample is warranted to verify and refine the above results.

In conclusion, both the IVIM and DKI models are useful for predicting the GS of PCa and thus indirectly predicting its aggressiveness. However, neither model showed significantly better performance than ADC in the present study. Work remains to be done to fully understand the mechanisms underlying these two models, as well as the manner in which the choice of b values affects the results.