Introduction

Bladder cancer is one of the most common malignant tumors of the urinary system, with high morbidity and mortality [1]. It is the ninth most frequently diagnosed cancer globally, ranks 13th in mortality, and shows an obvious male predominance, with three-quarters of all cases occurring in men [2]. Approximately 70–85% of bladder cancers are diagnosed without invasion of the detrusor muscle [3]; these are defined as non-muscle-invasive bladder cancer (NMIBC; stage T1 or lower), and the remainder are classified as muscle-invasive bladder cancer (MIBC; stage T2 or higher) [4]. Transurethral resection of bladder tumor (TURBt) is the definitive treatment for NMIBC, whereas radical cystectomy, radiation therapy, and adjuvant chemotherapy are generally recommended for MIBC patients because of the high risk of recurrence and metastasis [5]. Therefore, precise preoperative differentiation of MIBC from NMIBC is essential.

Bladder tumor grade and T category mainly depend on histological evaluation of specimens obtained by transurethral resection or multiple bladder biopsies [6, 7]; however, both procedures are invasive and may under-stage the cancer because the muscle-invasive component is not always sampled. Therefore, a non-invasive and convenient method for assessing the clinical stage of bladder cancer is still needed.

Diffusion-weighted imaging (DWI) is a functional MR technique that reflects a variety of physiological tissue characteristics through the apparent diffusion coefficient (ADC) [8]. Previous studies indicated that DWI could serve as a valuable biomarker of bladder cancer in routine clinical practice [9]. Meanwhile, volumetric ADC histogram analysis has been applied to various tumors, such as glioma, breast cancer, and hepatocellular carcinoma [10,11,12], but it has been used less often in the diagnosis of upper urinary tract urothelial cancer or bladder cancer. Compared with a single ADC value, a volumetric ADC histogram may provide much more statistical information and shows potential for improving the accuracy of diagnosis and grading of bladder cancer [13, 14].

The Vesical Imaging Reporting and Data System (VI-RADS), created in 2018, is based on multiparametric magnetic resonance imaging (mpMRI), including T2-weighted imaging, DWI, and dynamic contrast-enhanced MR imaging (DCE-MRI), and provides a five-point score. mpMRI provides high tissue contrast resolution and differentiates the bladder wall layers much more accurately [14], and VI-RADS has achieved good performance in discriminating NMIBC from MIBC [15, 16]. However, the use of contrast media in DCE-MRI may be limited by the patient's physical condition and increases the cost. It has also been suggested in the literature that bladder tumor contours can be assessed more accurately by DWI, because microvessels or reactive tissue adjacent to the tumor may enhance to a level similar to that of the cancer components on dynamic contrast-enhanced images [17, 18]. It is therefore reasonable to think that DWI plays a very important role in VI-RADS, especially in some specific situations. Could additional quantitative parameters then be added to further improve and supplement DWI-based VI-RADS (DWI-VI-RADS), so that it can play a greater role in clinical practice?

Therefore, our study aimed to explore whether volumetric ADC histogram analysis could provide additional value to DWI-based VI-RADS in differentiating NMIBC from MIBC, that is, in predicting the presence of muscle invasion in bladder cancer.

Materials and methods

Patients

This retrospective study was approved by the Institutional Review Board of our hospital, and the requirement for informed consent was waived. From May 2019 to September 2020, a total of 123 consecutive patients with bladder cancer clinically diagnosed by enhanced CT, ultrasound, or cystoscopy underwent MRI before receiving surgical treatment in our hospital. Patients with bladder cancer who received MRI before undergoing resection of lesions for pathologic diagnosis and treatment were eligible for inclusion. Exclusion criteria were as follows: (1) lack of pathological results (n = 3); (2) pathologically diagnosed inflammatory lesions (n = 6) or stage T0 (n = 8); (3) diagnosis of bladder adenocarcinoma (n = 1) or bladder neuroendocrine tumor (n = 1); (4) tumors too flat or too small (diameter < 10 mm) for drawing volumes of interest (VOIs) (n = 32). Finally, 80 patients with pathologically confirmed bladder urothelial carcinomas were included. For the 34 patients with multiple lesions, we selected the lesion with the largest burden (the largest volume or the highest stage) and excluded the others [19]. A flowchart of the study population is shown in Fig. 1.

Fig. 1 Flowchart of the patient selection process

Imaging acquisition

All subjects underwent the same MRI protocol on a 3.0 T MRI scanner (Discovery 750; GE Healthcare) with a 32-channel torso phased-array coil covering the whole pelvic area. Before the MRI examination, all patients were instructed to urinate 2 h in advance, drink 500–1000 mL of water, and then not urinate for at least 30 min so that the bladder was moderately distended.

The routine pelvic protocol included the following sequences: (1) sagittal FRFSE T2-weighted image (TR/TE, 4902/130 ms; FOV, 240 mm × 240 mm; matrix size, 320 × 320; slice thickness, 4 mm; intersection gap, 0.4 mm; bandwidth, 62.5 kHz); (2) axial FSE T1-weighted image (TR/TE, 568/6.8 ms; FOV, 340 mm × 340 mm; matrix size, 320 × 192; slice thickness, 4 mm; intersection gap, 1 mm; readout bandwidth, 50 kHz); (3) axial FRFSE T2-weighted image and axial fat-saturated T2-weighted image (TR/TE, 3920/68 ms; FOV, 340 mm × 340 mm; matrix size, 320 × 256; slice thickness, 4 mm; intersection gap, 1 mm; readout bandwidth, 62.5 kHz).

DWI was acquired with a free-breathing, fat-suppressed, single-shot echo-planar sequence in the axial plane (TR/TE, 4000/57 ms; FOV, 340 mm × 340 mm; matrix size, 128 × 160; slice thickness, 4 mm; intersection gap, 1 mm; bandwidth, 250 kHz; b-values, 0 and 800 s/mm²).

Imaging analysis

All relevant MR images were transferred to the Picture Archiving and Communication System (PACS) for analysis. Two abdominal radiologists (Meng XY and Li Z, [blinded for review], with 8 and 16 years of experience in clinical MRI diagnosis, respectively) independently reviewed the diffusion-weighted images of each participant and assigned DWI scores without knowledge of the pathologic findings. Each lesion was scored strictly according to the VI-RADS criteria (see Supplementary Material) [20]: (1) probably NMIBC; (2) possibly NMIBC; (3) indeterminate; (4) possibly MIBC; (5) probably MIBC. For multiple lesions, disagreements were resolved through group discussion until a consensus was reached.

Furthermore, the DWI images of each patient were analyzed independently using the freely available FireVoxel software package (NYU Center for Advanced Imaging Innovation and Research, New York, USA) on a personal computer. Volumetric ADC histogram analysis was performed by the two radiologists mentioned above. For each patient, a region of interest (ROI) was drawn to cover the entire lesion on all contiguous slices of the tumor, resulting in a 3D volume of interest (3D VOI). In addition, stalks with low signal intensity in some tumors were included in the ROIs because they may be associated with the depth of bladder cancer infiltration [17, 21, 22]. Finally, the ADC map within the 3D VOI was calculated automatically by the FireVoxel software using a monoexponential model:

$$S = S_{0} \exp(-b \cdot \mathrm{ADC})$$

where S is the signal with diffusion weighting, S0 is the signal without diffusion weighting, and b is the b-value, which determines the degree of diffusion weighting of the signal. The original frequency table of each 3D VOI was generated automatically by the FireVoxel software, including the parameters minimum ADC (ADCmin), mean ADC (ADCmean), median ADC (ADCmedian), maximum ADC (ADCmax), skewness, kurtosis, and entropy. The supplementary parameters calculated with SPSS software were the 10th percentile ADC (ADC10%), 25th percentile ADC (ADC25%), 75th percentile ADC (ADC75%), and 90th percentile ADC (ADC90%).
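The two steps described above, solving the monoexponential model for ADC and summarizing the VOI as a histogram, can be illustrated with a minimal Python sketch. It is not the FireVoxel or SPSS implementation: the function names, the linear-interpolation percentile rule, the use of excess kurtosis, the 64-bin Shannon entropy, and the signal values are all assumptions made for illustration.

```python
import math
import statistics

def adc_from_two_b(s0, sb, b=800.0):
    """Solve S = S0 * exp(-b * ADC) for ADC with two b-values (0 and b)."""
    return math.log(s0 / sb) / b

def percentile(sorted_vals, p):
    """Linear-interpolation percentile, p in 0..100 (assumed rule; SPSS may differ)."""
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)

def histogram_parameters(adc_values, n_bins=64):
    """Histogram summary of the ADC values in one VOI (needs non-constant input)."""
    vals = sorted(adc_values)
    n = len(vals)
    mean = sum(vals) / n
    # Central moments of the voxel distribution.
    m2 = sum((v - mean) ** 2 for v in vals) / n
    m3 = sum((v - mean) ** 3 for v in vals) / n
    m4 = sum((v - mean) ** 4 for v in vals) / n
    # Shannon entropy over fixed-width bins (bin count is an assumption).
    lo, hi = vals[0], vals[-1]
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for v in vals:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    entropy = -sum(c / n * math.log2(c / n) for c in counts if c)
    return {
        "ADCmin": vals[0],
        "ADCmean": mean,
        "ADCmedian": statistics.median(vals),
        "ADCmax": vals[-1],
        "ADC10%": percentile(vals, 10),
        "ADC25%": percentile(vals, 25),
        "ADC75%": percentile(vals, 75),
        "ADC90%": percentile(vals, 90),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis (assumed convention)
        "entropy": entropy,
    }

# Hypothetical (S0, S800) signal pairs for the voxels of one VOI.
voxels = [(1000.0, 420.0), (980.0, 350.0), (1020.0, 300.0), (990.0, 380.0), (1010.0, 150.0)]
adc_map = [adc_from_two_b(s0, sb) for s0, sb in voxels]  # units: mm^2/s
params = histogram_parameters(adc_map)
```

With b in s/mm², the resulting ADC is in mm²/s. A positive skewness in this sketch means that most voxels lie at low ADC values with a tail toward high values.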

We then assessed the consistency of the measurements made by the two radiologists and randomly selected one reader's measurements for further analysis.
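A consistency check between two readers can be sketched with an intraclass correlation coefficient. The text does not state which ICC form was computed in SPSS, so the example below assumes the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1); the function names and reader values are hypothetical, and the interpretation bands follow the thresholds given in the Statistical analysis section.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per lesion and one column per reader."""
    n = len(ratings)       # number of lesions
    k = len(ratings[0])    # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: lesions, readers, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def interpret_icc(icc):
    """Map an ICC to the consistency bands used in this study.

    Values between the published bands (for example 0.605) are assigned
    to the higher band here, which is an assumption.
    """
    if icc > 0.80:
        return "excellent"
    if icc > 0.60:
        return "good"
    if icc > 0.40:
        return "moderate"
    if icc > 0.20:
        return "general"
    return "poor"

# Hypothetical ADCmean values (x 10^-3 mm^2/s) measured by two readers.
reader_table = [[1.21, 1.25], [0.98, 0.95], [1.40, 1.38], [0.85, 0.90], [1.10, 1.12]]
icc = icc_2_1(reader_table)
label = interpret_icc(icc)
```

Because ICC(2,1) measures absolute agreement, a constant offset between readers lowers the coefficient even when their rankings of the lesions are identical.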

Statistical analysis

Statistical analysis was performed using SPSS (version 24.0, IBM), GraphPad Prism (version 8.3.0), and MedCalc (version 12.7, MedCalc Software). Clinical and statistical data were described as mean ± standard deviation (SD). All tests were two-sided, and a p value < 0.05 was considered statistically significant. The intraclass correlation coefficient (ICC) was used to assess the consistency between the two readers, and agreement was interpreted as follows: ICC > 0.80, excellent consistency; 0.61–0.80, good consistency; 0.41–0.60, moderate consistency; 0.21–0.40, general consistency; ICC ≤ 0.20, poor consistency. The non-parametric Mann–Whitney U-test, which does not assume a normal distribution, was used to compare each volumetric ADC histogram parameter between the NMIBC and MIBC groups. Receiver operating characteristic (ROC) analysis was used to explore the diagnostic performance of each parameter in differentiating NMIBC from MIBC. The corresponding area under the ROC curve (AUC) with its 95% confidence interval (95% CI) was calculated, and the optimum cutoff value was determined using the Youden index. The variables with p < 0.05 in the univariate analysis were selected for further analysis. The combination of the VI-RADS score and ADC histogram parameters was analyzed using binary logistic regression and was treated as a new variable in the ROC analysis. AUCs were compared using DeLong's test.
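The ROC steps above can be sketched in plain Python. This is an illustrative example, not the SPSS or MedCalc procedure: the data are invented, the function names are hypothetical, the logistic regression is fitted by simple gradient descent rather than the maximum-likelihood routine a statistics package would use, and taking the predicted probability as the combined variable is an assumption. DeLong's test is not implemented here.

```python
import math

def auc(scores, labels):
    """AUC as the probability that an MIBC case outscores an NMIBC case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A case is called positive when score >= cutoff; ties keep the lowest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        if best is None or sens + spec > best[1] + best[2]:
            best = (c, sens, spec)
    return best

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Binary logistic regression by gradient descent; returns [intercept, w1, ...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w

def predict_proba(w, X):
    """Predicted probability of MIBC for each row of X."""
    return [1.0 / (1.0 + math.exp(-(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))))
            for xi in X]

# Invented example data: label 1 = MIBC, 0 = NMIBC.
vi_rads = [2, 2, 2, 3, 3, 4, 4, 5, 2, 4]
skew = [-0.5, 0.1, 0.3, 1.0, 1.5, 0.4, 2.0, 3.0, 0.2, 0.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

auc_vi_rads = auc(vi_rads, labels)
cutoff, sens, spec = youden_cutoff(vi_rads, labels)

# Combine VI-RADS and skewness; the predicted probability becomes the new ROC variable.
X = [[v, s] for v, s in zip(vi_rads, skew)]
combined = predict_proba(fit_logistic(X, labels), X)
auc_combined = auc(combined, labels)
```

The rank-based AUC used here is the Mann–Whitney U statistic divided by the product of the two group sizes, so the same pairwise comparison underlies both the group comparison and the ROC analysis.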

Results

Patient characteristics

A total of 80 patients (66 males and 14 females; age range, 28–81 years; mean age, 60.23 ± 11.03 [SD] years) were enrolled in our study. Thirty-six patients underwent cystectomy (31 radical and 5 partial), and the remaining 44 patients underwent TURBt. The VI-RADS scores based on DWI were as follows: 47 cases of VI-RADS 2, 8 cases of VI-RADS 3, 23 cases of VI-RADS 4, and 2 cases of VI-RADS 5. There were no statistically significant differences in age or gender between the NMIBC and MIBC groups (Table 1). Representative images of NMIBC and MIBC are presented in Figs. 2 and 3.

Table 1 Characteristics of patients with bladder cancer

Fig. 2 Non-muscle-invasive papillary urothelial carcinoma in the anterior bladder wall in a 56-year-old man. a Axial T2-weighted image; b axial DWI image; and c corresponding ADC map reconstructed from the diffusion-weighted images (ADC values are given in units of × 10⁻³ mm²/s). d Volumetric ADC histogram shows a large portion of voxels with high ADC values, with negative skewness (−0.725) and negative kurtosis (−0.018)

Fig. 3 Muscle-invasive papillary urothelial carcinoma in the posterior bladder wall in a 71-year-old man. a Axial T2-weighted image; b axial DWI image; and c corresponding ADC map reconstructed from the diffusion-weighted images (ADC values are given in units of × 10⁻³ mm²/s). d Volumetric ADC histogram shows a large portion of voxels with low ADC values, with positive skewness of 2.993 and positive kurtosis of 11.225

Pathological findings

In this study, all 80 patients were pathologically diagnosed with urothelial carcinoma. According to the T category, the lesions were classified as NMIBC in 53 patients (stage Ta in 21 cases and stage T1 in 32 cases) and MIBC in 27 patients (stage T2 in 20 cases, stage T3 in 3 cases, and stage T4 in 4 cases).

Comparisons of volumetric ADC histogram parameters between the NMIBC and MIBC groups

The NMIBC group had higher ADCmin, ADCmean, ADCmedian, ADC10%, ADC25%, ADC75%, and ADC90% than the MIBC group, whereas its skewness and kurtosis were lower (Fig. 4). No significant difference in ADCmax or entropy was observed between NMIBC and MIBC. Interobserver consistency for the volumetric ADC histogram parameters was good to excellent, with ICCs ranging from 0.827 to 0.967 (Table 2).

Fig. 4 Boxplots show medians and interquartile ranges of the histogram parameters in NMIBC and MIBC for a ADCmin, ADC10%, ADC25%, ADCmedian, ADCmean, ADC75%, and ADC90% and b kurtosis and skewness. The ADCmin, ADC10%, ADC25%, ADCmedian, ADCmean, ADC75%, and ADC90% were significantly lower in the MIBC group than in the NMIBC group

Table 2 Volumetric ADC histogram parameters and VI-RADS score in differentiating NMIBC and MIBC, and the intraclass correlation coefficient for volumetric parameters and VI-RADS

Receiver operating characteristic curve analysis

Among all parameters mentioned above, DWI-VI-RADS achieved a higher AUC (0.880; 95% CI 0.793–0.940) than any other single parameter in differentiating NMIBC from MIBC. The optimal cutoff value was a VI-RADS score of 3, with a sensitivity of 88.89% and a specificity of 83.61% (Table 3 and Fig. 5). Combining DWI-VI-RADS with any one of ADCmin, ADCmean, ADCmedian, ADC10%, ADC25%, ADC75%, or ADC90% did not significantly change the AUC compared with DWI-VI-RADS alone (p = 0.830, 0.515, 0.708, 0.947, 0.893, 0.499, and 0.717, respectively), whereas combining the DWI-VI-RADS score with skewness or kurtosis yielded a significantly higher AUC (p = 0.012 and p = 0.011, respectively). The combination of DWI-VI-RADS and skewness yielded the best diagnostic performance (AUC 0.923; 95% CI 0.847–0.969). In addition, the AUC of the combination of DWI-VI-RADS, skewness, and kurtosis (0.926; 95% CI 0.850–0.971) was higher than that of the combination of DWI-VI-RADS and skewness, but the difference was not statistically significant (Table 4 and Fig. 6).

Table 3 Diagnostic performance of volumetric ADC histogram parameters in differentiation of MIBC from NMIBC

Fig. 5 ROC curves of VI-RADS score and volumetric ADC histogram parameters

Fig. 6 ROC curves of VI-RADS score combined with volumetric ADC histogram parameters

Discussion

Accurate differentiation between NMIBC and MIBC is of great significance as the management and prognosis vary widely. In this study, based on the existing DWI score, volumetric ADC histogram analysis was added to further improve the diagnostic efficiency.

Table 4 Diagnostic performance of VI-RADS score combined with volumetric ADC histogram parameters

Several prior studies have demonstrated correlations between ADC values and the pathological grade of bladder cancer, and histogram features of the ADC map perform better than the ADC value alone [23,24,25,26]. In this study, we found that the ADC values at different percentiles were significantly lower in MIBC than in NMIBC, which was similar to previous studies [17, 25], indicating that the diffusion of water molecules in MIBC is more restricted, and the pathological changes of bladder cancer may be uniform in voxels. The ADC value is considered a quantitative surrogate for tissue cellularity and reflects the average diffusion of water molecules in voxels. It is generally believed that lower ADC values in malignant tumors are related to increased cellularity, restricted blood perfusion, and decreased membrane permeability [27].

Besides the quantitative percentile ADC values, we found that skewness and kurtosis differed significantly between the NMIBC and MIBC groups, with higher values in the MIBC group. Suo et al. [28] also demonstrated that both skewness and kurtosis were significantly higher in high-grade bladder tumors, which was consistent with our conclusion. Statistically, skewness and kurtosis represent the asymmetry and peakedness of the histogram, respectively, reflecting tumor heterogeneity from different perspectives [29]. We considered that the underlying mechanism may be related to the pathologic features of different grades of cancer, because muscle-invasive tumors are generally solid and poorly differentiated, with atypical cells and disorganized microvessels [30]. In the study of Andrew et al. [24], the mean ADC value showed an association with tumor stage, but skewness and kurtosis showed no statistical differences, which was inconsistent with our conclusion. Several reasons may account for the different results. First, the sample of that study was relatively small, with only 23 cases ultimately included. Second, all tumors included were high-grade, and the absence of low-grade tumors resulted in an uneven histological distribution, whereas our study included more participants with various histological grades. Finally, differences between study cohorts may also have influenced the results.

VI-RADS, which has recently attracted attention, can provide good diagnostic performance for bladder cancer grading and evaluation of muscle invasiveness [19, 31]. For example, Wang et al. [19] found that the AUC of VI-RADS based on multiparametric MRI could reach 0.94 (95% CI 0.90–0.98). In this study, with a cutoff score of 3 or higher, the AUC of VI-RADS was 0.880 (95% CI 0.793–0.940), with a sensitivity of 88.89% (95% CI 70.8–97.6%) and a specificity of 83.61% (95% CI 71.9–91.8%), which was slightly lower than the former because our VI-RADS scores were based on DWI alone. In addition, all tumors scored as VI-RADS 5 were confirmed as MIBC on pathology, and 96.2% of tumors scored as VI-RADS 2 (51 of 53) were confirmed as NMIBC. For tumors scored as VI-RADS 3 and 4, 25.0% (2 of 8) and 26.1% (6 of 23), respectively, were verified to be NMIBC. These results are consistent with previous studies, and the performance is not good enough for lesions scored as VI-RADS 3 or 4. It is worth noting that one patient with a VI-RADS score of 4 was histologically confirmed as NMIBC but was confirmed as MIBC several months later, suggesting that the VI-RADS score has potential value in the preoperative assessment of bladder cancer to some extent. In the present study, we not only assigned the VI-RADS score but also measured the ADC value, which provided additional value for differentiating MIBC from NMIBC.

Although the clinical value of VI-RADS and ADC values has been validated by previous studies, our study differs from them in some respects. For example, Xi Zhang et al. [32] used histogram and gray-level co-occurrence matrix (GLCM) texture features from DW images and ADC maps, combined with a support vector machine (SVM) classifier, to differentiate bladder cancer grade and obtained an AUC of 0.861 with the optimal feature subset. Shuaishuai Xu et al. [33] combined DWI radiomics features with transurethral resection and significantly improved the sensitivity and accuracy in discriminating the presence of muscle invasion (sensitivity, 0.964; accuracy, 0.897); this approach is effective but invasive. In our study, we combined the DWI-based VI-RADS score with ADC values, which makes full use of DW-MRI and is completely non-invasive.

Our study had several limitations. First, as this was a single-center retrospective study, selection bias could not be avoided. Analyzing only one lesion in patients with multiple tumors may also have introduced selection bias. Second, the sample size was relatively small and the distribution of T stages was uneven, with few tumors of stage T3 or higher and many of stage T2 or lower; further study in a larger population is required. Third, small lesions (< 10 mm in diameter) were not included because of the difficulty of drawing ROIs, so the tumors evaluated in our study were relatively large. In addition, follow-up information was insufficient, and further follow-up is required. Finally, the ADC value was measured only with the monoexponential model with b-values of 0 and 800 s/mm², whereas several previous studies have revealed that biexponential and stretched exponential diffusion-weighted MRI may afford more information [34, 35].

In conclusion, our study demonstrated that both volumetric ADC histogram analysis and VI-RADS could contribute to the prediction of muscle invasion in bladder cancer, and that the volumetric ADC histogram parameters could provide additional value. The diagnostic performance could be significantly increased by combining VI-RADS with volumetric ADC histogram parameters, especially the skewness of ADC values.