The detection of small, asymptomatic renal lesions is increasing, related in part to increased utilization of cross-sectional imaging [1]. The risk of malignancy increases with the size of the renal lesion [2]. This in turn means that a significant number of small renal masses, typically defined as less than 4 cm in maximal diameter [3], are benign lesions. The difficulty in distinguishing between benign and malignant renal lesions, especially when lesions are small, is reflected in the benign outcome of some surgically resected renal lesions [2]. As there is potential for morbidity in the ablative or surgical management of small renal lesions [4], accurate radiologic assessment may improve patient outcomes.

The overlap in imaging appearance between benign and malignant renal lesions is in part related to the diverse histologic subtypes of renal cell carcinoma (RCC) and the various benign histologic diagnoses which may manifest as a focal renal mass [5]. Some renal lesion subtypes have qualitative imaging features that are generally well accepted and helpful in characterizing a renal mass. These include T2 hypointensity and modest enhancement in papillary RCC (pRCC) [5], and contrast hyperenhancement and intravoxel lipid in clear cell RCC (ccRCC), the most common RCC subtype [5]. However, these same qualitative features may be observed with benign lesions such as angiomyolipomas (AML) and oncocytomas [6, 7].

MR imaging yields several distinct quantifiable features that may differentiate renal masses better than standard qualitative imaging assessment. Several authors have demonstrated that apparent diffusion coefficient (ADC) values measured on diffusion-weighted imaging are lower in solid malignant lesions than in benign lesions [8,9,10,11]. Likewise, studies of chemical shift imaging suggest that quantitative analysis of signal intensity may be more robust than qualitative analysis in demonstrating differences between lesion subtypes [6]. Similar principles may apply to differences in contrast enhancement of benign versus malignant renal lesion subtypes [12, 13].

The goal of our study was to evaluate the utility of quantitative parameters derived from diffusion-weighted imaging, chemical shift imaging, and contrast enhancement, in distinguishing between benign lesions and renal cell carcinoma. Specific benign vs. malignant renal lesion subtype comparisons guided by established qualitative characteristics were performed for some parameters.

Materials and methods

Informed consent was waived by the institutional review board for this Health Insurance Portability and Accountability Act-compliant retrospective study.

Study population

A surgical pathology database search for “renal mass” at our institution from January 2011 to September 2013 yielded 757 renal masses, of which 177 had pre-operative MRI (Fig. 1). Diffusion-weighted images or MR images were not available for 12 masses, leaving 165 masses. Of these, 2 masses were excluded because no renal mass was seen, 1 was excluded for end-stage renal disease (ESRD) with pyelonephritis, and 1 was excluded for pancreatic cancer enveloping the kidney. A further 63 masses were excluded because the lesion was larger than 3 cm. No renal masses were excluded due to cystic appearance. This yielded a final population of 98 masses in 98 patients. Parameter-specific exclusions due to poor image quality from these 98 masses were as follows: ADC n = 11, corticomedullary phase n = 8, nephrographic phase n = 2, and chemical shift n = 8. When multiple masses were present in the surgical specimen, only a single mass (largest under 3 cm and malignant) was included for each patient.
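The exclusion cascade described above can be checked with a few lines of arithmetic. The following is a minimal sketch that only re-adds the counts reported in the text; the variable names are illustrative, and the per-parameter evaluable counts assume each image-quality exclusion is subtracted from the final population of 98.

```python
# Exclusion cascade from the study-population paragraph.
# All counts are taken from the text; names are illustrative only.
masses_with_mri = 177

exclusions = [
    ("DWI or MR images unavailable", 12),
    ("no renal mass seen", 2),
    ("ESRD with pyelonephritis", 1),
    ("pancreatic cancer enveloping the kidney", 1),
    ("lesion larger than 3 cm", 63),
]

remaining = masses_with_mri
for reason, n in exclusions:
    remaining -= n
    print(f"after excluding {n:>2} ({reason}): {remaining}")

final_population = remaining

# Parameter-specific exclusions for poor image quality.
# Assumption: each is subtracted from the final population independently.
quality_exclusions = {
    "ADC": 11,
    "corticomedullary phase": 8,
    "nephrographic phase": 2,
    "chemical shift": 8,
}
evaluable = {name: final_population - n for name, n in quality_exclusions.items()}
print(evaluable)
```

Under that assumption, the number of lesions available for each parameter follows directly from the reported exclusions.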

Fig. 1

Flow chart for study population. Parameter-specific exclusions for poor image quality from the final population of 98 included n = 11 for ADC, n = 8 for corticomedullary phase of enhancement, n = 2 for nephrographic phase of enhancement, and n = 8 for chemical shift imaging

MR image acquisition

Abdominal MRI was performed on one of eleven 1.5 T or two 3 T Siemens scanners (Avanto, Symphony, Espree, Aera, Trio, Skyra) with a phased array body coil. The imaging protocol was similar across scanners and included free-breathing, non-EKG-gated diffusion-weighted imaging with a slice thickness of 8–10 mm and b values of 50, 400, and 800 s/mm². ADC maps generated by the scanner were used to measure ADC values. Axial and coronal pre- and post-contrast T1w fat-suppressed volumetric interpolated breath-hold examination (VIBE) images were obtained using an approximate matrix size of 256 × 320, slice thickness of 3 mm, parallel imaging factor of two, and opposed-phase echo time, with a set scanner delay of 18 seconds, followed by three additional post-contrast phases to capture arteriographic, corticomedullary, and nephrographic phases. Dual echo chemical shift gradient recall T1w images were obtained at nominal in- and opposed-phase echo times with a slice thickness of 6–8 mm and similar resolution to the VIBE sequence. T2w half-Fourier acquisition single-shot turbo spin echo and T2w turbo spin echo images with and without fat suppression were also acquired but were not used for quantitative analysis.

Image analysis

Images were reviewed by an abdominal radiologist with over 20 years of abdominal imaging experience to identify the resected renal lesion by correlating data from pre- and post-operative imaging with the operative summary and surgical pathology report. The laterality and size of the lesion, the presence of cystic components, and the series and image number on which the lesion was best seen were recorded. Two separate readers with 5 and 3 years of post-fellowship abdominal imaging experience, blinded to surgical pathology data, measured quantitative data for three parameters: ADC, signal intensity index (SII), and contrast enhancement (Fig. 2). The largest ellipsoid region of interest (ROI) was placed within the confines of the lesion to measure ADCmean, in-phase signal intensity, opposed-phase signal intensity, precontrast signal intensity, corticomedullary-phase signal intensity, and nephrographic-phase signal intensity. ADCmean of the ipsilateral non-lesion kidney (cortex and medulla) was measured by both readers. ADCmean of the contralateral kidney was measured by a single reader (reader 2). Both readers also measured precontrast, corticomedullary-phase, and nephrographic-phase signal intensity of non-lesion ipsilateral cortex.

Fig. 2

A 60-year-old man with clear cell renal cell carcinoma in the right kidney. Quantitative parameter measurement technique is depicted. A The largest ellipsoid region of interest (ROI) was placed in the lesion (solid ellipse). The ADC of the non-tumor ipsilateral kidney, inclusive of cortex and medulla, was also measured on the same image whenever possible (dashed ellipse). B Similar ROI was placed within the lesion on in-phase (circle, top) and opposed-phase (bottom) images. C ROI was placed in the lesion on corticomedullary- (solid circle, top) and nephrographic (bottom)-phase images. Cortical enhancement in each phase of contrast was measured using an ROI placed in the cortex (dashed circle, top)

The following calculations were performed for each renal lesion:

$$ {\text{ADC}}_{\text{ratio}} \, = \,\left( {{\text{ADC}}_{\text{mean}} {\text{lesion}}} \right)/\left( {{\text{ADC}}_{\text{mean}} {\text{ipsilateral kidney}}} \right) $$
$$ {\text{Signal intensity index}}\, = \,\left( {{\text{in phase }} - {\text{ opposed phase}}} \right)/{\text{in phase}} $$
$$ {\text{Absolute corticomedullary enhancement}}\, = \,\left( {\text{corticomedullary lesion}} \right) \, - \, \left( {\text{precontrast lesion}} \right) $$
$$ {\text{Absolute nephrographic enhancement}}\, = \,\left( {\text{nephrographic lesion}} \right) \, - \, \left( {\text{precontrast lesion}} \right) $$
$$ {\text{Relative corticomedullary enhancement}}\, = \,\left( {\text{corticomedullary lesion}} \right) \, - \, \left( {\text{corticomedullary nonlesion cortex}} \right) $$
$$ {\text{Relative nephrographic enhancement}}\, = \,\left( {\text{nephrographic lesion}} \right) - \left( {\text{nephrographic nonlesion cortex}} \right) $$
$$ {\text{Absolute washout}}\, = \,\left( {\text{corticomedullary lesion}} \right) \, - \, \left( {\text{nephrographic lesion}} \right) $$
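
The seven calculations above translate directly into code. The following is a minimal sketch, not the authors' implementation: the `LesionMeasurements` container, its field names, and the example values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class LesionMeasurements:
    """Hypothetical container for the ROI means of one lesion (reader-averaged)."""
    adc_lesion: float              # ADCmean of the lesion
    adc_ipsilateral_kidney: float  # ADCmean of non-lesion ipsilateral kidney
    in_phase: float                # lesion signal intensity, in-phase
    opposed_phase: float           # lesion signal intensity, opposed-phase
    pre_lesion: float              # lesion signal intensity, precontrast
    cm_lesion: float               # lesion signal intensity, corticomedullary phase
    neph_lesion: float             # lesion signal intensity, nephrographic phase
    cm_cortex: float               # non-lesion cortex, corticomedullary phase
    neph_cortex: float             # non-lesion cortex, nephrographic phase


def derived_parameters(m: LesionMeasurements) -> dict:
    """Apply the formulas listed above, term by term."""
    return {
        "adc_ratio": m.adc_lesion / m.adc_ipsilateral_kidney,
        "signal_intensity_index": (m.in_phase - m.opposed_phase) / m.in_phase,
        "absolute_cm_enhancement": m.cm_lesion - m.pre_lesion,
        "absolute_neph_enhancement": m.neph_lesion - m.pre_lesion,
        "relative_cm_enhancement": m.cm_lesion - m.cm_cortex,
        "relative_neph_enhancement": m.neph_lesion - m.neph_cortex,
        "absolute_washout": m.cm_lesion - m.neph_lesion,
    }


# Made-up values for illustration; these are not study data.
example = LesionMeasurements(
    adc_lesion=1.6e-3, adc_ipsilateral_kidney=2.0e-3,
    in_phase=200.0, opposed_phase=180.0,
    pre_lesion=100.0, cm_lesion=220.0, neph_lesion=250.0,
    cm_cortex=400.0, neph_cortex=380.0,
)
params = derived_parameters(example)
print(params)
```

Because relative enhancement is defined as a difference from cortex rather than a ratio, a lesion that enhances less than the adjacent cortex yields a negative value, as in this example. Likewise, a negative absolute washout indicates a lesion that is brighter in the nephrographic phase than in the corticomedullary phase.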

Statistical analysis

Descriptive statistics were used to summarize the data (mean (SD), frequencies, and percentages). Interreader reliability was assessed using the intraclass correlation coefficient (ICC). ICC values of 0.5–0.75 were considered moderate agreement, 0.75–0.90 were considered good agreement, and greater than 0.90 were considered excellent agreement. Measures obtained from each reader were averaged after ensuring adequate agreement between the readers. Differences between benign and malignant renal lesions or between specific subtype comparisons were evaluated using t tests or Wilcoxon rank tests, as appropriate. Multivariable logistic regression with backward stepwise selection (stopping based on minimum Bayesian information criterion) was used to identify predictors of malignancy. The significance level was set at 0.05 for this study. Analyses were conducted in SAS v9.4.
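
The agreement categories and the reader-averaging step can be sketched as follows. This is an illustration under stated assumptions, not the SAS code used in the study: the text does not name a category for ICC below 0.5 (labeled "poor" here) and does not say which category the shared boundary of 0.75 belongs to (assigned to "good" here). The function names and input values are hypothetical.

```python
def classify_icc(icc: float) -> str:
    """Map an ICC value to the agreement category described in the text."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:   # assumption: the shared boundary 0.75 counts as "good"
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"     # assumption: the text does not label values below 0.5


def average_readers(reader1, reader2):
    """Average paired measurements from two readers, lesion by lesion."""
    if len(reader1) != len(reader2):
        raise ValueError("readers must measure the same lesions")
    return [(a + b) / 2 for a, b in zip(reader1, reader2)]


# Illustrative values only; these are not study data.
for value in (0.60, 0.80, 0.95):
    print(value, classify_icc(value))

print(average_readers([100.0, 150.0], [110.0, 130.0]))
```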

Results

Lesion diagnosis by surgical pathology

Of the 98 total renal lesions, 76 lesions were malignant and 22 lesions were benign (Table 1). All of the malignant lesions were renal cell carcinomas, and the majority were clear cell RCC (ccRCC; n = 42) and papillary RCC (pRCC; n = 19). The benign lesions included 8 oncocytomas and 6 angiomyolipomas (AML). Twelve renal lesions were predominantly cystic: 4 ccRCC, 3 pRCC, 2 cysts, 1 pseudocyst, 1 chronic inflammation, and 1 hemangioma.

Table 1 Surgical pathology diagnoses of the 98 small renal masses

Interobserver reliability

Interreader agreement was moderate to excellent across the quantitative measures assessed (Table 2), with ICC ranging from 0.709 to 0.912.

Table 2 Interreader reliability

Benign versus malignant lesions

There was no significant difference in ADCratio between benign and malignant lesions (1.03 ± 0.456 vs. 0.85 ± 0.294, p = 0.1130, Fig. 3A). There was also no difference in SII between benign and malignant lesions (0.10 ± 0.27 vs. 0.05 ± 0.25, p = 0.9468, Fig. 3C). Benign lesions demonstrated significantly greater absolute corticomedullary enhancement than malignant lesions (150.02 ± 111.52 vs. 81.13 ± 74.81, p = 0.0115; Fig. 3B). This difference was no longer significant when pRCC were excluded from the malignant lesions (150.02 ± 111.52 vs. 96.80 ± 79.21, p = 0.0516). No difference between benign and malignant lesions was found for absolute nephrographic enhancement (160.41 ± 119.55 vs. 110.83 ± 86.58, p = 0.0812), relative corticomedullary enhancement (−142.93 ± 115.80 vs. −137.34 ± 123.44, p = 0.8474), relative nephrographic enhancement (−120.30 ± 112.72 vs. −113.55 ± 99.95, p = 0.8021), or absolute washout (−10.39 ± 66.98 vs. −21.99 ± 37.29, p = 0.4456).

Fig. 3

Box plots for quantitative multiparametric assessment of benign vs. malignant renal lesions. A There is no significant difference in ADCratio between benign and malignant renal lesions. B There is greater absolute corticomedullary enhancement of benign compared to malignant lesions, but this difference is not apparent for absolute nephrographic enhancement. No difference in relative corticomedullary or nephrographic enhancement was found for benign vs. malignant lesions (not shown). C For SII, no difference was found. Boxes represent the 25th and 75th percentile, with horizontal line within the box representing the median, and lines extending from the box indicating minimum and maximum values. Asterisk indicates a significant difference between groups flanked by bracket

ADCmean of the non-lesion ipsilateral kidney was significantly higher in patients with malignant lesions than in patients with benign lesions (2.17 × 10⁻³ ± 0.41 × 10⁻³ vs. 1.95 × 10⁻³ ± 0.44 × 10⁻³ mm²/s, p = 0.0398), and this was not dependent on scanner field strength (not shown). ADCmean of the contralateral kidney was also greater in patients with malignant lesions than in patients with benign lesions (2.19 × 10⁻³ ± 0.37 × 10⁻³ vs. 1.95 × 10⁻³ ± 0.36 × 10⁻³ mm²/s, p = 0.011). There was no significant difference in ADCmean between ipsilateral and contralateral kidneys when controlling for benign or malignant lesions (not shown).

To jointly examine the variables that best predict malignancy, ADCmean lesion, ADCmean kidney, ADCratio, in-phase signal intensity, opposed-phase signal intensity, SII, absolute corticomedullary enhancement, absolute nephrographic enhancement, relative corticomedullary enhancement, relative nephrographic enhancement, and absolute washout were entered into stepwise logistic regression analysis. In the final model, only ADCratio (p = 0.042) and absolute corticomedullary enhancement (p = 0.013) were retained as predictors of malignancy (AUC = 0.785).
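
For readers less familiar with the AUC reported for the final model, it can be read as the probability that a randomly chosen malignant lesion receives a higher model score than a randomly chosen benign lesion. The sketch below computes it by direct pairwise comparison. The scores are made up for illustration and are not outputs of the study model; the function name `auc` is likewise hypothetical.

```python
def auc(scores_malignant, scores_benign):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (malignant, benign) pairs in which the malignant
    lesion has the higher score; ties count as half a win.
    """
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(scores_malignant) * len(scores_benign))


# Hypothetical predicted probabilities of malignancy (not study data).
malignant_scores = [0.9, 0.8, 0.6, 0.4]
benign_scores = [0.7, 0.3, 0.2]

toy_auc = auc(malignant_scores, benign_scores)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.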

Lesion subtype comparisons

ADCratio, absolute corticomedullary phase enhancement, and SII for benign and malignant renal lesion subtypes are shown in Fig. 4. Several significant differences are present in specific subtype comparisons for each parameter.

Fig. 4

Box plots showing quantitative multiparametric assessment of renal lesion subtypes. A ADCratio, B absolute corticomedullary-phase enhancement, and C signal intensity index are shown. Boxes represent the 25th and 75th percentile, with horizontal line within the box representing the median, and lines extending from the box indicating minimum and maximum values

For ADC, pRCC showed a significantly lower ADCratio than benign lesions, but not than other RCC (0.74 ± 0.35 vs. 1.03 ± 0.46 vs. 0.89 ± 0.26, respectively, p = 0.0246; Fig. 5). No significant difference was found for similar analyses of ADCmean (1.58 ± 0.81 vs. 1.92 ± 0.74 vs. 1.92 ± 0.60, p = 0.1813). For specific subtype comparisons (Table 3), there was no difference in ADCratio between oncocytoma and ccRCC, between oncocytoma and chromophobe RCC (cbRCC), or between AML and pRCC.

Fig. 5

Box plot comparison of ADCratio between pRCC, non-pRCC, and benign lesions. ADC ratio of pRCC is significantly lower than benign lesions, but not different than non-pRCC. Boxes represent the 25th and 75th percentile, with horizontal line within the box representing the median, and lines extending from the box indicating minimum and maximum values. Asterisk indicates a significant difference between groups flanked by bracket

Table 3 Select comparisons for benign vs. malignant lesions

Absolute corticomedullary enhancement was greater for oncocytomas than ccRCC (Table 3). Oncocytomas demonstrated greater absolute nephrographic enhancement than cbRCC (194.4 ± 92.4 vs. 83.4 ± 92.4, p = 0.049) but no difference was found for absolute corticomedullary enhancement (Table 3). AML demonstrate greater enhancement than pRCC for both absolute corticomedullary enhancement (Table 3) and nephrographic enhancement (138.6 ± 51.6 vs. 60.8 ± 37.4, p = 0.0117). There was no difference in precontrast T1 signal intensity between ccRCC, cbRCC, pRCC, other RCC, and oncocytoma (p = 0.818).

For chemical shift imaging, there was a significant difference in SII between ccRCC and non-ccRCC subtypes (0.09 ± 0.22 vs. 0.001 ± 0.26, p = 0.0412; Fig. 6). There was no significant difference in SII between AML and ccRCC or between oncocytoma and ccRCC (Table 3).

Fig. 6

Box plot comparison of ccRCC and non-ccRCC shows significantly higher SI index, or greater loss of signal on opposed-phase images, for ccRCC than other RCC. Boxes represent the 25th and 75th percentile, with horizontal line within the box representing the median, and lines extending from the box indicating minimum and maximum values. Asterisk indicates a significant difference between groups flanked by bracket

Discussion

Our study aimed to determine if quantitative data could be used to discriminate between benign and malignant renal lesions, and in this regard differs from many other studies which performed subtype-specific comparisons for specific MR parameters without regard for lesion size. Our approach to focal renal lesions is potentially more reflective of clinical practice, in which benign lesions need to be distinguished from malignant lesions. We found that ADC could not distinguish benign and malignant lesions, contrary to a prior report [8]. This is likely related to the greater diversity of benign lesions included in our study, including solid lesions such as oncocytoma and AML, and cystic-appearing lesions such as chronic inflammation, hemangioma, cysts, and pseudocysts. SII was not helpful in distinguishing benign from malignant lesions, as both malignant and benign lesions can demonstrate intravoxel lipid [6]. With regard to contrast enhancement, there was no difference in relative enhancement between benign and malignant lesions. However, absolute enhancement, calculated by subtracting precontrast lesion signal intensity from corticomedullary-phase lesion signal intensity, did demonstrate a difference. There was greater enhancement of benign lesions in the corticomedullary phase compared to malignant lesions, but this difference was no longer present when excluding pRCC from the malignant group, reflective of the well-established hypoenhancing nature of pRCC [5, 14, 15]. Hyperenhancing benign and malignant lesions were therefore not distinguishable by this measure. Of all the variables assessed across our three MR sequences, stepwise logistic regression analysis shows that ADCratio and absolute corticomedullary enhancement appear to be most helpful in distinguishing between benign and malignant lesions; this is supported by our subtype-specific comparisons as well.

Renal lesion subtype comparisons informed by qualitative MR features are more common in the literature. For example, oncocytomas and cbRCC have overlapping imaging appearance on MR [13] due to their common histologic classification as oncocytic neoplasms derived from intercalated cells of the collecting duct, and distinction on core needle biopsy is difficult [16]. Quantitatively, some authors have demonstrated greater ADC in oncocytomas than cbRCC [12, 17, 18], while others have not [14]. We did not find a difference in ADCratio between oncocytoma and cbRCC. Contrary to other reports [12, 19, 20], we did not find a significant difference in contrast enhancement between oncocytoma and cbRCC.

When comparing oncocytomas to ccRCC, we found no difference in SII or ADCratio, but did find that oncocytomas demonstrate greater enhancement than ccRCC in the corticomedullary phase. The difference in enhancement is consistent with one report in the literature [21], but is discrepant with others who have found that ccRCC enhance more than oncocytomas [22, 23] and with others showing no difference in enhancement [7, 24, 25].

Conventional AML contain detectable bulk fat, allowing for their benign diagnosis, but minimal fat AML can have a T2 hypointense appearance similar to pRCC [26]. We found no difference in ADC between AML and pRCC, contrary to Park and Kim who found that minimal fat AML demonstrated greater ADC than pRCC (and other T2 hypointense RCCs) [27]. However, these lesions can be distinguished by their enhancement. We found that pRCC enhance less than AML, consistent with prior reports in the literature [26, 28,29,30].

We found that the ADCratio of pRCC was significantly lower than that of other renal lesions, consistent with the literature [9, 14, 17, 31,32,33]. In particular, we found that the ADCratio of pRCC was significantly lower than that of benign lesions. This finding is of interest in cases where lesion enhancement is difficult to ascertain, as the ADCratio may aid in differentiating a benign complicated cyst from a pRCC. However, since T1 hyperintense (hemorrhagic or proteinaceous) cysts can have lower ADC than T1 hypointense simple cysts [8], additional studies which include a large number of T1 hyperintense cysts will likely be necessary to confirm the utility of ADC in distinguishing between a hemorrhagic cyst and pRCC.

The SII of ccRCC is significantly greater than other RCC, consistent with prior reports [5, 6, 34, 35], and is reflective of the intracytoplasmic lipid within these lesions [5]. However, we found no difference in SII between ccRCC and benign lesions (AML and oncocytoma), which limits the utility of chemical shift imaging in distinguishing between malignant and benign lesions, and supports the notion that loss of signal on chemical shift imaging is not a specific feature of ccRCC [36].

There are several explanations for our paucity of discriminatory values in benign vs. malignant lesions compared to others. First, we limited our study population to lesions less than 3 cm in size, in contrast to other studies evaluating ADC [8,9,10,11, 14, 17, 31, 32] or contrast enhancement [12, 14, 17, 19, 20], which placed no limits on lesion size. Large lesions allow for selective sampling of the most diffusion-restricting portion of the lesion [8, 9, 31, 32] or assessment of enhancement in the portion of the lesion which meets a minimum threshold of enhancement [12, 14]. This approach is difficult to apply in the small lesions included in our study, and therefore we assessed the entirety of the lesion (largest ellipsoid ROI which could fit in the lesion) for all parameters, potentially “diluting” our quantitative results compared to others. Second, the MR examinations included in our study were performed on one of eleven 1.5 T scanners or two 3 T scanners rather than on the single scanners [9, 14, 27, 31, 33, 37] or single field strength [11, 17, 32] used by others. There is likely variability in ADC values across scanners. For example, reported normal ADC values of kidneys range from 1.79 × 10⁻³ to 3.56 × 10⁻³ mm² s⁻¹ [38,39,40,41]. While our population accurately reflects clinical practice in a large academic institution, single-scanner studies are potentially more robust because they control for this variance.

As a measure to control for scanner variability, we normalized our tumor ADC values to the ipsilateral non-tumor kidney. Normalization in this manner resulted in significant differences between pRCC and benign lesions that were otherwise not appreciated. This is driven in part by greater ADCmean values in the non-tumor ipsilateral kidney of patients with malignant lesions compared to those with benign lesions. Normalization of ADC is not novel [10, 12, 14, 18], but no prior study has directly compared the ADC of the non-tumor kidney between benign and malignant lesions. This finding may perhaps be attributed to perfusion and flow phenomena in addition to Brownian motion of water molecules reflected in the monoexponential calculation of ADC from commercially available scanner output [42]. Combined with prior reports showing decreased renal ADC in acute renal failure, chronic renal failure, dehydration, ureteral obstruction, and renal artery stenosis [38, 41, 43], one can speculate that there may be increased perfusion to the kidneys in the presence of malignant lesions. Validation in larger data sets would be necessary, but if a difference in ADCmean of the background kidney of benign vs. malignant lesions can be redemonstrated independent of scanner type, this may be helpful in lesion characterization.
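
The monoexponential model mentioned above assumes that signal decays as S(b) = S0 · exp(−b · ADC), so ADC is the negative slope of ln S plotted against b. The following is a minimal sketch of that calculation using the three b values from the acquisition protocol. The exact fitting method used by the scanner software is not stated in the text, so the least-squares fit, the function name `fit_adc`, and the synthetic signal values here are assumptions for illustration only.

```python
import math


def fit_adc(b_values, signals):
    """Estimate ADC (mm^2/s) as the negative least-squares slope of ln(S) vs. b.

    Assumes the monoexponential model S(b) = S0 * exp(-b * ADC).
    """
    logs = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_log = sum(logs) / n
    sxy = sum((b - mean_b) * (l - mean_log) for b, l in zip(b_values, logs))
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    return -sxy / sxx


# b values from the imaging protocol (s/mm^2).
b_values = [50, 400, 800]

# Synthetic, noise-free signal generated from a known ADC; not study data.
true_adc = 2.0e-3  # mm^2/s
s0 = 1000.0
signals = [s0 * math.exp(-b * true_adc) for b in b_values]

estimated_adc = fit_adc(b_values, signals)
print(f"{estimated_adc:.6f}")
```

With noise-free monoexponential data the fit recovers the input ADC. In tissue, perfusion and flow add signal loss at low b values that this single-exponential model folds into the ADC estimate, which is the effect the paragraph above speculates about.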

There are some limitations to this study. First, this was a retrospective study, which inherently leads to a selection bias. Second, our results reflect a variable pool of MR scanners and contrast agents. As previously discussed, however, the heterogeneity of scanners may in fact be a strength of this study, challenging the applicability of previously reported single-scanner quantitative data in the clinical setting; if quantitative data are to be of clinical utility, they must be applicable across multiple scanner platforms. Third, we did not control for the Fuhrman grade of tumor, which has previously been shown to affect both ADC values [9] and the amount of intracellular lipid within ccRCC [44].

Conclusion

Despite good interreader reliability of quantitative MR parameters, no significant differences in ADC and SII could be demonstrated between benign and malignant lesions. There was greater corticomedullary phase enhancement of benign compared to malignant lesions. Future quantitative analyses to distinguish between benign and malignant small renal lesions on MR should focus on ADCratio and corticomedullary-phase contrast enhancement.