Abstract
Purpose
The purpose of the study is to perform a meta-analysis of studies investigating the diagnostic performance of apparent diffusion coefficient (ADC) values in separating high-risk from low-risk prostate cancer (PCa).
Methods
MEDLINE and EMBASE databases were searched in December 2015 for studies reporting diagnostic performance of ADC values for discriminating high-risk from low-risk PCa and providing sufficient data to construct 2 × 2 contingency tables. Diagnostic performance was quantitatively pooled using a bivariate random-effects model including subgroup analysis and assessment of study heterogeneity and methodological quality.
Results
Thirteen studies were included, providing 1107 tumor foci in 705 patients. Heterogeneity among studies was moderate (τ² = 0.222). Overall sensitivity was 76.9% (95% CI 68.6–83.6%); overall specificity was 77.0% (95% CI 69.9–82.8%); and summary AUC was 0.67. Inverse correlation between sensitivity and specificity (ρ = −0.58) indicated that interstudy heterogeneity was partly due to variation in the threshold for test positivity. Primary biases were readers’ knowledge of Gleason score during ADC measurement, lack of prespecified ADC thresholds, and lack of prostatectomy as reference in some studies. Higher sensitivity was seen in studies published within the past 2 years and in studies not using a b value of at least 2000 s/mm²; higher specificity was associated with involvement of one, rather than two, readers measuring ADC. Field strength, coil selection, and advanced diffusion metrics did not significantly impact diagnostic performance.
Conclusion
ADC values show moderate accuracy in separating high-risk from low-risk PCa, although important biases may overestimate performance and unexplained sources of heterogeneity likely exist. Further studies using a standardized methodology and addressing identified weaknesses may help guide the use of ADC values for clinical decision-making.
While prostate cancer (PCa) is the most commonly diagnosed non-cutaneous cancer in men in the United States, the large majority of patients die of other causes [1]. Indeed, it is estimated that only 25% of PCa cases would be detected clinically without prostate specific antigen (PSA) screening [2]. Given the high prevalence and relative indolence of most cases of PCa, accurate risk stratification is of paramount importance in order to avoid overtreatment. This need is particularly critical considering the substantial morbidity associated with radical prostatectomy (RP), including high rates of impotence and incontinence in some studies [3]. While the Gleason score (GS) determined at RP is the gold standard for predicting mortality following RP [4], methods are needed to reliably determine the GS prior to RP in order to provide an accurate prognostic assessment and select an appropriate management regimen, for instance between RP and active surveillance. Current alternative risk stratification tools include a combination of laboratory results, clinical findings, and biopsy GS, although these have shown moderate accuracy at best [5–7]. Furthermore, the central role of biopsy in such prognostic schemes necessitates a degree of invasiveness, with associated cost, patient discomfort, and risk of complications.
Diffusion-weighted imaging (DWI) is an MRI technique that has been established to improve tumor detection and localization [8]. Furthermore, an inverse correlation between apparent diffusion coefficient (ADC) values derived from DWI and GS has been well described [9–23]. This association suggests that quantitative ADC values could potentially serve as a prognostic biomarker for PCa treatment selection, and improve upon both the suboptimal accuracy and invasiveness of current risk stratification strategies. However, the extent of the relationship between ADC values and GS has varied between studies, and a greater understanding of this association is warranted in view of the growing interest in both the urologic and radiologic communities in applying ADC values in clinical settings [24]. While past meta-analyses have explored the role of qualitative DWI for tumor detection, no meta-analysis to our knowledge has examined the use of quantitative ADC values for risk stratification. Therefore, the purpose of this study was to determine the performance of quantitative ADC values for separating high-risk from low-risk PCa, as well as to identify factors influencing this performance.
Methods
Literature search
A single radiologist (HS, a fellow in abdominal MRI) performed searches of the MEDLINE and EMBASE databases to identify relevant articles. The “PICO” question to be addressed by the review was as follows [25]: (P) Patients with prostate cancer; (I) Intervention: quantitative ADC measurements obtained from diffusion-weighted MRI; (C) Comparison: none; (O) Outcome/Target Condition: presence of high- or low-risk tumor based on histopathologic findings from either prostate biopsy or radical prostatectomy serving as reference standard. The searches were conducted on December 11th, 2015. The specific search terms are provided in the Appendix in the Electronic Supplementary Material. The study protocol was not registered.
Study selection
The retrieved abstracts were reviewed by the previously noted investigator (HS). Full texts were retrieved for those articles that represented original research and reported diagnostic accuracy of ADC values for separating high-risk from low-risk prostate cancer. To be included in the meta-analysis, the full text then had to provide sufficient data for constructing a 2 × 2 contingency table. If multiple eligible studies reported overlapping patient cohorts, then the larger study was included. For studies lacking data for construction of the 2 × 2 table but otherwise meeting inclusion criteria, efforts were made to contact the study authors and obtain these data. Articles in non-English languages were translated to English to allow for further assessment.
Data extraction
The included articles were reviewed by two investigators (HS and AR; AR is a fellowship-trained body radiologist with 8 years of experience). The relevant data were independently extracted and differences resolved by consensus. The extracted study data included covariates to be analyzed in meta-analysis, all represented in binary form, as well as other study characteristics.
The selected covariates included the following: year of publication (classified as before or after January 1, 2014), MRI field strength (classified as 3T for all patients vs. other), GS threshold for differentiating low-risk from high-risk prostate cancer in the study’s 2 × 2 data, use of an endorectal coil, use of RP as reference standard in all patients, whether all patients had a known diagnosis of prostate cancer at the time of MRI, number of tumor foci per patient contributing to 2 × 2 data (classified as 1, vs. more than 1 tumor focus, per patient), ADC metric selected for the 2 × 2 data (classified as mean ADC vs. other metric, regardless of inclusion of b0 in ADC computation), maximal b value used for the computation of ADC metric within 2 × 2 data (classified as whether or not at least 2000 s/mm²), prostate zone studied (classified as whole gland vs. peripheral zone), number of readers placing regions of interest (classified as 2 readers vs. 1 reader), reader experience (classified as whether at least one reader had at least 5 years of experience), and use of anti-peristaltic agent (classified as used in some or all patients vs. not used in any patients).
Additional extracted study characteristics included the country of publication, number of overall patients with tumors contributing to the 2 × 2 contingency table, total number of tumor foci evaluated in the 2 × 2 table, patient age (as reported in the article), and PSA (as reported in the article). In studies involving more than one reader, the reported interreader variability was also recorded, when available.
For each study, a 2 × 2 contingency table was constructed using the extracted data for the performance of ADC values for discriminating high-risk from low-risk prostate cancer. If multiple diagnostic accuracies were reported for ADC values derived from different b value combinations or for different ADC metrics, then the b value combination or ADC metric which yielded the highest diagnostic performance (defined as the average of sensitivity and specificity) was selected for inclusion in the meta-analysis. Diffusional kurtosis metrics, when provided in addition to conventional ADC metrics, were not selected. If diagnostic accuracy was reported for separating tumors of varying combinations of GS, then the data for separating GS 3 + 3 tumor from higher GS tumors were selected. In studies in which diagnostic accuracies were reported for multiple observers, the raw 2 × 2 data of the observers were averaged. If the provided sensitivities and specificities did not yield whole numbers for the 2 × 2 data, then attempts were made to contact the authors for clarification, and the data were fit to the published sensitivities and specificities when unsuccessful. Additional comments regarding derivation of the 2 × 2 data for included studies are detailed in Table 1. The end result was a single 2 × 2 table per study for the meta-analysis.
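The table-construction rules described above can be illustrated with a minimal sketch. The class name `TwoByTwo`, the helper `average_tables`, and the reader counts are hypothetical and are not taken from any included study; the sketch assumes that a positive test denotes an ADC value classifying the tumor as high risk.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """One 2 x 2 table; 'positive' is assumed to mean ADC calls the tumor high risk."""
    tp: float  # high-risk tumors classified as high risk
    fn: float  # high-risk tumors classified as low risk
    fp: float  # low-risk tumors classified as high risk
    tn: float  # low-risk tumors classified as low risk

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def mean_accuracy(self) -> float:
        # Average of sensitivity and specificity, the criterion described
        # for choosing among several reported metrics or b value combinations.
        return (self.sensitivity() + self.specificity()) / 2.0


def average_tables(tables):
    """Average the raw cell counts of several readers' tables (may be non-integer)."""
    n = len(tables)
    return TwoByTwo(
        tp=sum(t.tp for t in tables) / n,
        fn=sum(t.fn for t in tables) / n,
        fp=sum(t.fp for t in tables) / n,
        tn=sum(t.tn for t in tables) / n,
    )


# Hypothetical counts for two readers evaluating the same 80 tumor foci.
reader_a = TwoByTwo(tp=40, fn=10, fp=6, tn=24)
reader_b = TwoByTwo(tp=43, fn=7, fp=9, tn=21)
averaged = average_tables([reader_a, reader_b])
print(averaged, averaged.sensitivity(), averaged.specificity())
```

Averaging raw counts in this way can produce fractional cell values, which is why non-integer numerators can appear in per-study results.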
Methodological quality assessment
Methodological quality and risk of bias of the articles included in the meta-analysis were assessed using the revised tool for the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) [26]. The questions evaluated for each study are detailed in Table 2. A maximum of 6 months was considered as an acceptable time interval between MRI and biopsy, given the very slow growth rate of PCa with a doubling time of 2 years [27, 28].
Tests for publication bias, including the Deeks method, were not performed. Such tests are not recommended for meta-analyses of diagnostic tests [29, 30], given the expectation of heterogeneity among studies of diagnostic accuracy, the inability of conventional tests for publication bias to handle threshold effects between sensitivity and specificity, and uncertainty regarding whether publication bias in fact exists for studies of diagnostic accuracy.
Statistical analysis
Following computation of the sensitivity and specificity of each individual study for discriminating high-risk from low-risk prostate cancer using an ADC cutoff, the data were pooled and analyzed in sum using the bivariate random-effects model of Reitsma et al. [31]. The sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio across all studies were computed. A summary receiver operating characteristic (sROC) curve was produced and used to calculate the area under the curve (AUC) [32]. A hierarchical summary ROC (HSROC) model was not used given the inability of such a model to allow for separate assessments of study covariates’ impact on test sensitivity and specificity, prior studies suggesting similar results in the sROC when using a bivariate random-effects model vs. HSROC model [33], and initial exploratory assessment of our pooled data prior to covariate analyses showing essentially identical sROC curves between the two models (not published).
The presence of heterogeneity between studies was assessed for sensitivity and specificity using Cochran’s Q test (χ² > df, p < 0.1: heterogeneity is present) [34] and the τ² statistic (quantifies between-study variance when >0) [35, 36]. In the case of a direct correlation between the sensitivity and false-positive rate, a threshold effect to explain interstudy heterogeneity was considered likely, whereby apparent differences in performance are explained by a tradeoff between specificity and sensitivity. Sensitivity and specificity were calculated separately for subgroups defined by the prespecified covariates using a bivariate random-effects model. Metaregression was performed on pairwise combinations of selected study characteristics observed to significantly impact study performance at univariable assessment. Statistical analysis was performed using the MADA (“Meta-analysis of diagnostic accuracy”) package within the R software environment (R Foundation for Statistical Computing; Vienna, Austria; http://www.R-project.org) [37, 38]. A p value of <0.05 was considered statistically significant.
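For readers less familiar with these heterogeneity statistics, the following sketch shows one common way to compute Cochran’s Q and a method-of-moments (DerSimonian-Laird) estimate of τ² from per-study logit-transformed proportions. It is a univariate illustration under stated assumptions, not the procedure implemented in the R package used for the analysis; the function names and the 0.5 continuity correction are choices made here for illustration only.

```python
import math


def logit_with_variance(events, total):
    """Logit of a proportion and its approximate variance.

    A 0.5 continuity correction (an assumption of this sketch) keeps the
    logit finite when events is 0 or equals total.
    """
    a = events + 0.5
    b = total - events + 0.5
    return math.log(a / b), 1.0 / a + 1.0 / b


def cochran_q_and_tau2(effects, variances):
    """Return (Q, df, tau2) using inverse-variance weights.

    tau2 is the DerSimonian-Laird moment estimate, truncated at zero.
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    return q, df, tau2


# Hypothetical per-study counts: (true positives, number of high-risk tumors).
studies = [(26, 51), (45, 60), (30, 36), (58, 65)]
pairs = [logit_with_variance(tp, n) for tp, n in studies]
q, df, tau2 = cochran_q_and_tau2([p[0] for p in pairs], [p[1] for p in pairs])
print(f"Q = {q:.2f} on {df} df, tau2 = {tau2:.3f}")
```

A Q value that does not exceed its degrees of freedom yields a τ² estimate of zero under this estimator, which is one reason the two measures are usually reported together.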
Results
Study selection
The two database searches resulted in 617 abstracts (Fig. 1). 488 of these were excluded at the abstract level (Fig. 1), leaving 129 articles reviewed at the full-text level (Appendix in Electronic Supplementary Material). An additional 116 articles were subsequently excluded (Fig. 1). The remaining 13 studies were included in the meta-analysis [9–21]. One of the included studies [19] was published in Chinese, with an English-language abstract.
Summary of included studies
The 13 included studies are summarized in Tables 3, 4, and 5. Nine studies were published in 2014 or 2015 [13–21]. The MRI field strength was 3T in seven studies [11, 12, 15, 18–21] and 1.5T in the remaining studies. Ten studies classified GS ≥ 7 tumors as high risk, while 3 studies used a higher GS threshold to define high risk: GS ≥ 3 + 4 + 5 [12], GS ≥ 4 + 3 [13], and GS ≥ 8 [16]. In 4 studies, an endorectal coil was used [9, 10, 16, 17]. The reference standard was RP for all patients in 7 studies [10, 12–15, 17, 21], biopsy for all patients in 5 studies [9, 11, 16, 19, 20], and a combination of RP and biopsy in 1 study [18]. In 8 studies, all patients had known PCa at the time of MRI [9–12, 16–18, 21]. There was more than one tumor focus per patient contributing to the 2 × 2 data in 9 studies [9–11, 13, 16–19, 21]. In 4 studies, an ADC metric other than mean ADC was selected as providing highest accuracy: normalized ADC in two studies [15, 16], minimum ADC in one study [13], and 10th percentile ADC derived from an ADC histogram in one study [21]. Five studies used a b value greater than or equal to 2000 s/mm² for calculating ADC [11, 13, 18, 20, 21]. Eight studies included lesions from the whole prostate gland [9, 13–18, 21], while the remaining 5 studies only included peripheral zone lesions. Two readers were involved in placing the ROIs in 10 studies [10–12, 14, 16–21], while one reader was involved in ROI placement in the remaining studies. An expert reader was involved in measuring ADC values in five studies [9, 12, 14, 15, 18], while no expert reader was involved in four studies [11, 13, 16, 17]. An antiperistaltic was used prior to prostate MRI in some or all patients in 5 studies [9, 12, 14, 15, 18].
Of the ten studies with two readers, five of these entailed consensus readings. Of the five two-reader studies with independent readings, only one of these reported interobserver variability [15], noting κ = 0.90.
Quality assessment
The overall scores for the QUADAS-2 questions are shown in Fig. 2. Scores were very high, indicating low risk of bias, for questions relating to quality of patient selection [100% (13/13) of studies receiving a score “yes”] and quality of the flow and timing of study design [85% (11/13)–100% (13/13)] aside from a score of 54% (7/13) for the question relating to appropriate time interval between MRI and the reference standard. Scores were very low, indicating high risk for bias, for questions relating to the quality of index test interpretation [8% (1/13)–31% (4/13)]. Low performance in this category was attributed to the readers’ access to pathologic findings from either biopsy or radical prostatectomy at the time of ADC measurement as well as lack of a prespecified ADC threshold. The question relating to the quality of the reference standard received a low to moderate score [54% (7/13)] due to the lack of RP serving as reference standard for all patients in 6 studies.
Assessment of heterogeneity between studies
Between-study heterogeneity was quantified using τ² among the 13 studies, with resulting τ² = 0.222, although Cochran’s Q was not significant (10.81; df = 12, p = 0.545). A threshold effect was identified [moderate inverse correlation between sensitivity and specificity (r = −0.58), corresponding to a direct correlation between sensitivity and false-positive rate], indicating a tradeoff in sensitivity for specificity and accounting for some part of the apparent heterogeneity in test performance among studies.
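Because the false-positive rate equals 1 − specificity, a correlation between sensitivity and specificity has the same magnitude as, but the opposite sign of, the correlation between sensitivity and false-positive rate. The sketch below illustrates this sign relationship with hypothetical per-study values. The use of Pearson’s coefficient is an assumption made here, as the type of correlation coefficient is not specified in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical per-study estimates showing a threshold-type tradeoff.
sensitivities = [0.60, 0.70, 0.80, 0.90]
specificities = [0.90, 0.85, 0.75, 0.65]
false_positive_rates = [1.0 - s for s in specificities]

r_spec = pearson(sensitivities, specificities)        # negative under a tradeoff
r_fpr = pearson(sensitivities, false_positive_rates)  # same magnitude, positive
print(r_spec, r_fpr)
```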
Overall diagnostic accuracy
There were a total of 1107 tumor foci in 705 patients across the 13 included studies. Sensitivity and specificity both varied among studies [range, 51.0% (26/51)–90.8% (58.8/65) and 61.1% (11/18)–96.2% (24.7/26), respectively]. Figure 3 depicts the forest plots of sensitivity and specificity for these 13 studies. The summarized sensitivity and specificity of ADC values for separating high-risk from low-risk PCa were 76.9% (95% CI 68.6–83.6%) and 77.0% (95% CI 69.9–82.8%), respectively. The positive likelihood ratio was 3.37 (95% CI 2.66–4.25). The negative likelihood ratio was 0.30 (95% CI 0.22–0.39). Figure 4 shows the sROC curve with AUC of 0.67, indicating moderate performance in discrimination of high-risk from low-risk PCa.
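The relationship between the pooled sensitivity and specificity and the likelihood ratios can be checked with the standard definitions, LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below applies these definitions to the pooled point estimates; the function name is hypothetical. The results (about 3.34 and 0.30) are simple plug-in values and need not match the model-based estimates exactly.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Pooled point estimates reported in the text.
lr_pos, lr_neg = likelihood_ratios(0.769, 0.770)
print(f"LR+ ~ {lr_pos:.2f}, LR- ~ {lr_neg:.2f}")
```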
Subgroup analyses
Table 6 details the results of the subgroup analyses. Sensitivities were significantly higher for studies in which a b value of 2000 s/mm² was not included in computation of the ADC metrics (81.8% vs. 67.1%, p = 0.04) and for studies published in 2014 or 2015 (81.3% vs. 63.3%, p = 0.02). Specificities were significantly higher for studies in which ADC values were measured by 1 reader rather than by 2 readers (93.3% vs. 73.2%, p = 0.005). The remaining analyzed covariates showed no statistically significant difference in terms of sensitivity or specificity (p ≥ 0.162). At pairwise metaregression, both using ADC values derived from b values lower than 2000 s/mm² and publishing in 2014 or 2015 remained as significant independent predictors of increased sensitivity (p < 0.001).
Discussion
Our meta-analysis demonstrated that quantitative ADC values have moderate sensitivity and specificity in separating high-risk from low-risk PCa. Nonetheless, both sensitivity and specificity were heterogeneous across the 13 included studies. While studies varied greatly in technical factors, a moderate inverse correlation between reported sensitivity and specificity of individual studies indicates that apparent interstudy heterogeneity was at least partially due to variation along a single ROC curve. Differences in potentially important technical considerations including field strength, coil arrangement, and ADC metric (whether conventional mean ADC or an alternative advanced metric) may also contribute some degree of heterogeneity, though none of these covariates were associated with significant differences in sensitivity or specificity. Nonetheless, our meta-analysis was unable to fully account for all of the observed heterogeneity in test performance across studies, such that additional unexplained sources of heterogeneity likely exist.
The GS determined at RP serves as the single best predictor of mortality in patients undergoing RP for organ-confined PCa [4]. However, the majority of PCa patients have a low-risk tumor with indolent behavior and would be overtreated by RP. The aim in such patients is early and accurate risk stratification in order to avoid RP. To date, no risk stratification tool has been able to predict the GS at RP with sufficient accuracy in order to reliably distinguish between those patients who require RP to prolong cancer-specific survival and those patients who can be managed non-invasively. While the GS determined at systematic biopsy has traditionally served as the mainstay for clinical risk assessments, up to 43% of men with low-grade PCa on standard biopsy will be upgraded on RP [7]. More recently, MRI-targeted biopsy has shown increased accuracy in the determination of tumor grade compared to standard 12 core biopsies [39]. In spite of this improvement, targeted biopsy continues to miss some high-risk lesions. In a large prospective study, Siddiqui et al. demonstrated that fusion biopsy outperformed standard 12 core biopsy in detecting high-risk PCa when using RP as the reference standard (AUC of 0.73 compared to 0.59, respectively) [40]. Other studies have reported a similar trend [41, 42]. While a wide variety of approaches combining results of prostate biopsy with other data elements have also been explored for improved risk stratification, including risk categories incorporating PSA and biopsy findings [5], as well as advanced preoperative nomograms [43], these have also had at best moderate accuracy with AUCs ranging from 0.52 to 0.80 [6].
The previously described prognostic strategies for identifying which patients warrant aggressive treatment vs. those that can be managed non-invasively all fundamentally rely on information obtained from prostate biopsy. Furthermore, monitoring of patients on surveillance would require serial biopsies to detect a potential change. Our meta-analysis indicates that quantitative ADC values also achieve moderate accuracy in risk stratification, albeit in a completely non-invasive manner. In these regards, based on the available literature, ADC measurements appear to compare favorably with other current invasive schemes for risk stratification of PCa.
Only two of the evaluated study parameters were found to be significantly associated with study sensitivity. First, sensitivity was significantly higher in more recently published studies, which underscores the potential of continued optimization in enhancing the value of quantitative diffusion metrics in prostate cancer risk assessment. Such optimization may relate to increasing experience within the radiological community in the interpretation of prostate DWI as well as to continued advancements in the acquisition or postprocessing of prostate DWI, for instance involving hardware- or software-related factors not captured in our analysis. Furthermore, this observation raises the possibility that the diagnostic performance of DWI for PCa risk assessment may continue to improve with further advances in imaging techniques. Second, sensitivity was significantly lower in studies incorporating a b value of 2000 s/mm² in ADC computations. This finding may at first seem surprising given the prior studies that have reported utility from including very high b value images in clinical assessments of prostate DWI [44, 45]. However, it has also been suggested that despite the value of the high b value images themselves, these should not be included in ADC map computations given non-mono-exponential diffusion behavior occurring at very high b values which is not reflected by standard ADC computation techniques, as well as the impact of low SNR at very high b values on quantitative ADC metrics [46]. Indeed, other studies have failed to confirm a benefit of very high b values for tumor detection when interpretation is based primarily on assessment of the ADC map rather than of the acquired diffusion-weighted images [47–49].
Thus, while there is growing support for very high b value images, based in part on studies supporting their role for tumor detection, it is possible that the very high b value images do not have similar added value for non-invasive risk assessment of a detected lesion.
The only study parameter that was observed to be significantly associated with specificity was the number of readers involved in ADC measurement. Specifically, the presence of one reader, rather than two, was associated with significantly increased specificity. The exact explanation for this association is unclear. There was no observed significant association between reader experience and test performance to suggest the presence of a “super-reader” in single-reader studies to account for the finding. Alternatively, we speculate that there may be a tradeoff in sensitivity and specificity when involving multiple readers in selecting areas for region-of-interest measurement, as the presence of 1 reader was also associated with a non-significant decrease in sensitivity. Indeed, the relatively small number of tumor foci in studies with a single reader, rather than 2 readers, performing ADC measurements (144 vs. 687 tumor foci, respectively) may have contributed to the lack of statistical significance for the comparison of sensitivity between these two groups of studies. Further evaluation of the influence of multiple readers is difficult given the common presence of consensus readings as well as the reporting of interobserver variability in only a single one of the included studies.
While the observed AUC was reassuring, our quality assessment demonstrated several key limitations in the summarized literature that may affect the generalizability of the findings when seeking to apply quantitative ADC values prospectively. Specifically, a number of weaknesses in design were common among the included studies: the availability of pathologic information at the time of ADC measurement, the lack of a prespecified ADC threshold, and (in a smaller number of studies) the lack of radical prostatectomy as reference standard. In addition, some studies that used prostatectomy as reference standard performed radiologic-pathologic correlation on a per-patient rather than a per-lesion basis [12, 14, 15]. As a result, it is possible that the observed accuracy of ADC measurements in the included studies may overestimate the actual test performance of ADC. Thus, prior to prospective application of ADC measurements in clinical management, additional investigations are required to validate the role of ADC values using measurements blinded to pathologic information derived from radical prostatectomy in combination with testing of an a priori determined threshold. Aside from these several important concerns, other key aspects of study quality were overall strong among the included studies.
There are several limitations in our meta-analysis. First, our sample size of 13 studies was small. A lack of availability of 2 × 2 data in many studies examining the role of ADC in PCa risk stratification contributed to this sample size. For example, while a number of studies have explored associations between ADC metrics and prostate cancer aggressiveness, potentially with high-quality study designs including radical prostatectomy as the reference standard, the data of such studies may have largely comprised correlation coefficients or summary AUC values without actual sensitivity and specificity data, thereby precluding formal inclusion in our meta-analysis (Appendix in Electronic Supplementary Material). It is anticipated that additional original data will continue to accumulate as further investigations are performed. Also, we pooled studies utilizing different ADC metrics, field strengths, and coil designs. While our meta-analysis showed no significant effect of such technical considerations on accuracy, the study may have been underpowered to detect significant differences in some of the subgroups. In addition, our meta-analysis relies on GS, whether obtained by biopsy or RP, as the reference standard. However, substantial literature demonstrates that determination of the Gleason score itself is variable among pathologists [50–52]. This inherent heterogeneity of the reference standard is a further potential source of variation in performance of ADC metrics not considered by the explored radiological literature. Finally, the retrospective nature of the studies may not reflect the performance of ADC measurements in routine clinical practice.
In conclusion, the available literature supports that quantitative ADC values have moderate accuracy in separating high-risk from low-risk PCa, comparing favorably to that of other currently accepted invasive predictors. Sensitivity was higher in more recent studies and in studies not using a b value of 2000 s/mm², while specificity was higher in studies with a single reader. While heterogeneity in diagnostic performance of ADC across studies could not be fully explained by our meta-analysis, such heterogeneity appeared to represent, at least in part, a tradeoff between sensitivity and specificity. Although the improvement in sensitivity in more recent publications is encouraging regarding the potential future role of quantitative diffusion metrics in prostate cancer risk assessment, methodological heterogeneity exists in numerous technical considerations across studies. Moreover, the available studies may have overestimated the true test performance of ADC due to several identified common methodological biases, including the lack of radical prostatectomy as a reference standard in a number of studies. Thus, additional prospective investigations evaluating the performance of ADC values obtained using a standardized methodology that addresses these particular weaknesses are needed prior to applying ADC values to guide clinical decision-making for PCa.
References
Siegel RL, Miller KD, Jemal A (2015) Cancer statistics, 2015. CA Cancer J Clin 65:5–29
Etzioni R, Cha R, Feuer EJ, Davidov O (1998) Asymptomatic incidence and duration of prostate cancer. Am J Epidemiol 148:775–785
Sanda MG, Dunn RL, Michalski J, et al. (2008) Quality of life and satisfaction with outcome among prostate-cancer survivors. New Engl J Med 358:1250–1261
Epstein JI, Partin AW, Sauvageot J, Walsh PC (1996) Prediction of progression following radical prostatectomy. A multivariate analysis of 721 men with long-term follow-up. Am J Surg Pathol 20:286–292
Caster JM, Falchook AD, Hendrix LH, Chen RC (2015) Risk of pathologic upgrading or locally advanced disease in early prostate cancer patients based on biopsy Gleason score and PSA: a population-based study of modern patients. Int J Radiat Oncol Biol Phys 92:244–251
Chun FK, Karakiewicz PI, Briganti A, et al. (2007) A critical appraisal of logistic regression-based nomograms, artificial neural networks, classification and regression-tree models, look-up tables and risk-group stratification models for prostate cancer. BJU Int 99:794–800
Chun FK, Steuber T, Erbersdobler A, et al. (2006) Development and internal validation of a nomogram predicting the probability of prostate cancer Gleason sum upgrading between biopsy and radical prostatectomy pathology. Eur Urol 49:820–826
Tan CH, Wei W, Johnson V, Kundra V (2012) Diffusion-weighted MRI in the detection of prostate cancer: meta-analysis. AJR Am J Roentgenol 199:822–829
deSouza NM, Riches SF, Vanas NJ, et al. (2008) Diffusion-weighted magnetic resonance imaging: a potential non-invasive marker of tumour aggressiveness in localized prostate cancer. Clin Radiol 63:774–782
Nagarajan R, Margolis D, Raman S, et al. (2012) MR spectroscopic imaging and diffusion-weighted imaging of prostate cancer with Gleason scores. J Magn Reson Imaging: JMRI 36:697–703
Rosenkrantz AB, Sigmund EE, Johnson G, et al. (2012) Prostate cancer: feasibility and preliminary experience of a diffusional kurtosis model for detection and assessment of aggressiveness of peripheral zone cancer. Radiology 264:126–135
Kitajima K, Takahashi S, Ueno Y, et al. (2013) Do apparent diffusion coefficient (ADC) values obtained using high b-values with a 3-T MRI correlate better than a transrectal ultrasound (TRUS)-guided biopsy with true Gleason scores obtained from radical prostatectomy specimens for patients with prostate cancer? Eur J Radiol 82:1219–1226
Bae H, Yoshida S, Matsuoka Y, et al. (2014) Apparent diffusion coefficient value as a biomarker reflecting morphological and biological features of prostate cancer. Int Urol Nephrol 46:555–561
Itatani R, Namimoto T, Kajihara H, et al. (2014) Triage of low-risk prostate cancer patients with PSA levels 10 ng/ml or less: comparison of apparent diffusion coefficient value and transrectal ultrasound-guided target biopsy. AJR Am J Roentgenol 202:1051–1057
Itatani R, Namimoto T, Yoshimura A, et al. (2014) Clinical utility of the normalized apparent diffusion coefficient for preoperative evaluation of the aggressiveness of prostate cancer. Jpn J Radiol 32:685–691
Lebovici A, Sfrangeu SA, Feier D, et al. (2014) Evaluation of the normal-to-diseased apparent diffusion coefficient ratio as an indicator of prostate cancer aggressiveness. BMC Med Imaging 14:15
Nowak J, Malzahn U, Baur AD, et al. (2014) The value of ADC, T2 signal intensity, and a combination of both parameters to assess Gleason score and primary Gleason grades in patients with known prostate cancer. Acta radiologica (Stockholm, Sweden: 1987) 2014
Tamada T, Kanomata N, Sone T, et al. (2014) High b value (2,000 s/mm2) diffusion-weighted magnetic resonance imaging in prostate cancer at 3 Tesla: comparison with 1,000 s/mm2 for tumor conspicuity and discrimination of aggressiveness. PLoS ONE 9:e96619
Min X, Wang L, Feng Z, et al. (2015) Prostate cancer: the correlation between apparent diffusion coefficient values obtained from high resolution diffusion-weighted imaging and Gleason scores. [Chinese]. Chinese J Radiol (China) 49:191–194
Roethke MC, Kuder TA, Kuru TH, et al. (2015) Evaluation of diffusion kurtosis imaging versus standard diffusion imaging for detection and grading of peripheral zone prostate cancer. Invest Radiol 50:483–489
Wang Q, Li H, Yan X, et al. (2015) Histogram analysis of diffusion kurtosis magnetic resonance imaging in differentiation of pathologic Gleason grade of prostate cancer. Urol Oncol 33:337.e315–324
Chang JH, Lim Joon D, Lee ST, et al. (2014) Diffusion-weighted MRI, 11C-choline PET and 18F-fluorodeoxyglucose PET for predicting the Gleason score in prostate carcinoma. Eur Radiol 24:715–722
Kim TH, Jeong JY, Lee SW, et al. (2015) Diffusion-weighted magnetic resonance imaging for prediction of insignificant prostate cancer in potential candidates for active surveillance. Eur Radiol 25:1786–1792
Chen L, Liu M, Bao J, et al. (2013) The correlation between apparent diffusion coefficient and tumor cellularity in patients: a meta-analysis. PLoS ONE 8:e79008
McInnes MD, Bossuyt PM (2015) Pitfalls of systematic reviews and meta-analyses in imaging research. Radiology 277:13–21
Whiting PF, Rutjes AW, Westwood ME, et al. (2011) QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 155:529–536
Carter HB, Morrell CH, Pearson JD, et al. (1992) Estimation of prostatic growth using serial prostate-specific antigen measurements in men with and without prostate disease. Cancer Res 52:3323–3328
Schmid HP, McNeal JE, Stamey TA (1993) Observations on the doubling time of prostate cancer. The use of serial prostate-specific antigen in patients with untreated disease as a measure of increasing cancer volume. Cancer 71:2031–2040
Leeflang MM, Deeks JJ, Takwoingi Y, Macaskill P (2013) Cochrane diagnostic test accuracy reviews. Syst Rev 2:82
van Enst WA, Ochodo E, Scholten RJ, Hooft L, Leeflang MM (2014) Investigation of publication bias in meta-analyses of diagnostic test accuracy: a meta-epidemiological study. BMC Med Res Methodol 14:70
Reitsma JB, Glas AS, Rutjes AW, et al. (2005) Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol 58:982–990
Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29–36
Harbord RM, Deeks JJ, Egger M, Whiting P, Sterne JA (2007) A unification of models for meta-analysis of diagnostic accuracy studies. Biostatistics 8:239–251
Higgins JP, Thompson SG, Deeks JJ, Altman DG (2003) Measuring inconsistency in meta-analyses. BMJ 327:557–560
DerSimonian R, Laird N (1986) Meta-analysis in clinical trials. Control Clin Trials 7:177–188
Higgins JP, Thompson SG (2002) Quantifying heterogeneity in a meta-analysis. Stat Med 21:1539–1558
Doebler P (2012) mada: Meta-analysis of diagnostic accuracy. https://r-forge.r-project.org/projects/mada/. Accessed 19 Aug 2016
Doebler P, Holling H, Bohning D (2012) A mixed model approach to meta-analysis of diagnostic studies with binary test outcome. Psychol Methods 17:418–436
Siddiqui MM, Rais-Bahrami S, Truong H, et al. (2013) Magnetic resonance imaging/ultrasound-fusion biopsy significantly upgrades prostate cancer versus systematic 12-core transrectal ultrasound biopsy. Eur Urol 64:713–719
Siddiqui MM, Rais-Bahrami S, Turkbey B, et al. (2015) Comparison of MR/ultrasound fusion-guided biopsy with ultrasound-guided biopsy for the diagnosis of prostate cancer. JAMA 313:390–397
Le JD, Stephenson S, Brugger M, et al. (2014) Magnetic resonance imaging-ultrasound fusion biopsy for prediction of final prostate pathology. J Urol 192:1367–1373
Sonn GA, Chang E, Natarajan S, et al. (2014) Value of targeted prostate biopsy using magnetic resonance-ultrasound fusion in men with prior negative biopsy and elevated prostate-specific antigen. Eur Urol 65:809–815
Shariat SF, Karakiewicz PI, Suardi N, Kattan MW (2008) Comparison of nomograms with other methods for predicting outcomes in prostate cancer: a critical analysis of the literature. Clin Cancer Res 14:4400–4407
Metens T, Miranda D, Absil J, Matos C (2012) What is the optimal b value in diffusion-weighted MR imaging to depict prostate cancer at 3T? Eur Radiol 22:703–709
Katahira K, Takahara T, Kwee TC, et al. (2011) Ultra-high-b-value diffusion-weighted MR imaging for the detection of prostate cancer: evaluation in 201 cases with histopathological correlation. Eur Radiol 21:188–196
Rosenkrantz AB, Padhani AR, Chenevert TL, et al. (2015) Body diffusion kurtosis imaging: basic principles, applications, and considerations for clinical practice. J Magn Reson Imaging: JMRI 42:1190–1202
Rosenkrantz AB, Hindman N, Lim RP, et al. (2013) Diffusion-weighted imaging of the prostate: comparison of b1000 and b2000 image sets for index lesion detection. J Magn Reson Imaging: JMRI 38:694–700
Kitajima K, Kaji Y, Kuroda K, Sugimura K (2008) High b-value diffusion-weighted imaging in normal and malignant peripheral zone tissue of the prostate: effect of signal-to-noise ratio. Magn Reson Med Sci: MRMS 7:93–99
Kim CK, Park BK, Kim B (2010) High-b-value diffusion-weighted imaging at 3 T to detect prostate cancer: comparisons between b values of 1,000 and 2,000 s/mm2. AJR Am J Roentgenol 194:W33–W37
Oyama T, Allsbrook WC Jr, Kurokawa K, et al. (2005) A comparison of interobserver reproducibility of Gleason grading of prostatic carcinoma in Japan and the United States. Arch Pathol Lab Med 129(8):1004–1010
De la Taille A, Villefond A, Berger N, et al. (2003) Evaluation of the interobserver reproducibility of Gleason grading of prostatic adenocarcinoma using tissue microarrays. Hum Pathol 34(5):444–449
Allsbrook WC Jr, Mangold KA, Johnson MH, et al. (2001) Interobserver reproducibility of Gleason grading of prostatic carcinoma: urologic pathologists. Hum Pathol 32(1):74–80
Acknowledgments
We would like to thank Chenchan Huang MD for assistance in English translation of articles.
Ethics declarations
Funding
This study did not receive funding.
Conflicts of interest
Author A.B.R. receives royalties from Thieme Medical Publishers. Author S.K.K. is supported in part by an AUR GE Radiology Research Academic Fellowship Award. Author H.S. has no conflicts of interest.
Ethical approval
This meta-analysis did not involve human participants and thus did not require institutional review board approval.
Informed consent
n/a (not human subjects research).
Shaish, H., Kang, S.K. & Rosenkrantz, A.B. The utility of quantitative ADC values for differentiating high-risk from low-risk prostate cancer: a systematic review and meta-analysis. Abdom Radiol 42, 260–270 (2017). https://doi.org/10.1007/s00261-016-0848-y