Introduction

Angiomyolipoma (AML), the most common benign renal tumour, is histologically composed of various proportions of adipose tissue, smooth muscle, and thick-walled blood vessels [1]. Originally, evidence of macroscopic fat within a renal cortical tumour on computed tomography (CT) or magnetic resonance imaging (MRI) was considered a unique, identifying feature of classic AML [2,3,4]. However, macroscopic fat may be subtle or completely absent in minimal-fat AML (mfAML), which accounts for approximately 5% of AML [5]. Meanwhile, numerous reports indicate that some renal cell carcinomas (RCC) can contain either some macroscopic fat or minimal fat content owing to lipid-producing tumour necrosis, bone metaplasia with fatty marrow elements, or perinephric or renal sinus fat entrapment [6, 7]. Moreover, clear cell RCC (cc-RCC) frequently contains varying amounts of intracellular lipid and glycogen in addition to macroscopic fat [8, 9]. Consequently, these overlapping imaging features may lead to the misinterpretation of mfAML as RCC, and may result in some patients undergoing unnecessary surgery.

Chemical shift MRI (CS-MRI), which has been widely used for differentiating adrenal adenoma from other neoplasms by quantifying the minimal fat content [10,11,12], is a useful technique for diagnosing mfAML [13, 14]. In CS-MRI, a decrease in signal intensity (SI) on opposed-phase gradient-echo images is a function of the ratio of lipid content to the total amount of tissue in each voxel. This decrease is quantified by the CS SI index, CS-SII = (SIin − SIopp)/SIin × 100, where SIin is the in-phase SI and SIopp is the opposed-phase SI. In general, a substantial SI decrease (CS-SII > 20–25%) appears indicative of either cc-RCC or AML [15, 16]. In mouse liver, Peng et al. [17] demonstrated a strong correlation between liver fat content and CS-MRI (r = 0.882). Yet, no pathological or in vitro radiological research comparing the actual lipid content of mfAML and cc-RCC has been performed so far, and the added value of in vivo evaluation of the small intratumoural lipid content by MRI using double-echo CS sequences is controversial. Some studies [18, 19] have reported that the CS-SII is a useful value for distinguishing mfAML from other renal neoplasms, whereas others [20,21,22,23,24] have reported negative results.
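The CS-SII definition above can be written as a short function. This is a minimal sketch: the function name `cs_sii` and the example signal intensities are hypothetical and are not taken from any of the cited studies.

```python
def cs_sii(si_in: float, si_opp: float) -> float:
    """Chemical shift signal intensity index in percent.

    si_in  : signal intensity on the in-phase image (SIin)
    si_opp : signal intensity on the opposed-phase image (SIopp)
    """
    if si_in <= 0:
        raise ValueError("in-phase signal intensity must be positive")
    return (si_in - si_opp) / si_in * 100.0


if __name__ == "__main__":
    # Hypothetical ROI measurements in arbitrary units.
    # Signal drops on the opposed-phase image: positive index.
    print(cs_sii(200.0, 150.0))
    # Signal is higher on the opposed-phase image: negative index.
    print(cs_sii(200.0, 220.0))
```

A positive value corresponds to signal loss on the opposed-phase image, and a negative value corresponds to a higher signal on the opposed-phase image than on the in-phase image.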

Therefore, this meta-analysis aimed to review published studies that used the CS-SII value to differentiate mfAML from RCC, to define the impact of this technique in routine practice for the characterisation of renal tumours, and to assess its capability for classifying RCC subtypes.

Materials and methods

Literature search and selection

The PubMed, Cochrane Library and Embase databases were searched systematically for relevant published articles. The search strategy was based on the combination of the following keywords: (“MRI” OR “magnetic resonance imaging”) AND (“renal” OR “kidney”) AND (“neoplasm” OR “tumor” OR “cancer” OR “carcinoma” OR “lesion”). The inclusion criteria were: (1) CS-SII value was used to determine mfAML or primary malignancy of a renal lesion; (2) data were analysed on a per-lesion basis; (3) histopathological results and/or clinical follow-up were used as the reference standard; (4) absolute data of CS-SII values could be obtained. The article search was limited to those published in English. Review articles, letters, case reports and conference abstracts were excluded due to insufficient data.

Data extraction and quality assessment

Two investigators reviewed the included articles and extracted the relevant details independently for the meta-analysis, resolving any differences by consensus. The extracted study characteristics were: first author, year of publication, country of origin, sample characteristics [number of patients and lesions, mean age, sex, study design, type of renal lesion, region of interest (ROI)], CS-SII parameters (modality, magnetic field strength, imaging sequences) and reference standard (Table 1). In addition, the absolute data of CS-SII values were recorded for further analysis. The Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) tool [25] was used to assess the study design characteristics of each study. It consisted of 11 items, each answered with “yes”, “no”, or “unknown”. “Yes” was scored as 1, and “no” or “unknown” as 0. A total score of 9 was used as the cut-off value: studies scoring ≥ 9 were considered high quality and those scoring below 9 low quality.
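The scoring rule described above can be sketched as follows. The item count (11), the answer options and the cut-off (9) come from the text; the function names and the example answers are hypothetical.

```python
from typing import Sequence

N_ITEMS = 11             # number of checklist items stated in the text
HIGH_QUALITY_CUTOFF = 9  # score >= 9 is treated as high quality
ALLOWED_ANSWERS = {"yes", "no", "unknown"}


def quality_score(answers: Sequence[str]) -> int:
    """Count one point per "yes"; "no" and "unknown" count zero."""
    if len(answers) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} answers, got {len(answers)}")
    for answer in answers:
        if answer not in ALLOWED_ANSWERS:
            raise ValueError(f"unexpected answer: {answer!r}")
    return sum(1 for answer in answers if answer == "yes")


def is_high_quality(answers: Sequence[str]) -> bool:
    """Apply the cut-off to the total score."""
    return quality_score(answers) >= HIGH_QUALITY_CUTOFF


if __name__ == "__main__":
    # Hypothetical study: nine "yes", one "no", one "unknown".
    example = ["yes"] * 9 + ["no", "unknown"]
    print(quality_score(example), is_high_quality(example))
```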

Table 1 Characteristics of studies included in the meta-analysis

Statistical analysis

The CS-SII values of mfAML or RCC subtypes were extracted on a per-lesion basis. Based on these data, and regardless of the analysis population (intention to treat vs per protocol), meta-regression analyses based on a linear mixed model for pooled mean CS-SII values with 95% confidence intervals (CIs) were conducted using STATA software (ver. 12.0; StataCorp, College Station, TX, USA). Owing to the lack of information on the different analysis populations, we did not perform sensitivity analyses. We performed subgroup analyses on the RCC subtypes. In addition, publication bias was examined using Begg’s test. The level of statistical significance was set to a two-sided p value of 0.05.
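For readers unfamiliar with Begg’s test, the following is a minimal sketch of the rank correlation idea behind it: each study effect is standardised around the inverse-variance pooled estimate, and Kendall’s tau is computed between the standardised effects and their variances. This is not the STATA routine used in the analysis. The function name and the input values are hypothetical, ties are simply ignored, and no continuity correction is applied, so p values may differ slightly from those of statistical packages.

```python
import math
from typing import Sequence, Tuple


def begg_test(effects: Sequence[float],
              variances: Sequence[float]) -> Tuple[float, float, float]:
    """Return (Kendall's tau, z, two-sided p) for a rank correlation
    between standardised effects and their variances."""
    k = len(effects)
    if k != len(variances) or k < 3:
        raise ValueError("need at least three studies with matching inputs")

    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    pooled = sum(w * t for w, t in zip(weights, effects)) / sum_w

    # Variance of (t_i - pooled) is v_i - 1 / sum_w.
    std_eff = [(t - pooled) / math.sqrt(v - 1.0 / sum_w)
               for t, v in zip(effects, variances)]

    # Kendall's S: concordant minus discordant pairs (ties count zero).
    s = 0
    for i in range(k):
        for j in range(i + 1, k):
            prod = (std_eff[i] - std_eff[j]) * (variances[i] - variances[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1

    tau = s / (k * (k - 1) / 2)
    z = s / math.sqrt(k * (k - 1) * (2 * k + 5) / 18)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal approximation
    return tau, z, p


if __name__ == "__main__":
    # Hypothetical study means and variances, for illustration only.
    print(begg_test([12.0, 9.5, 15.1, 11.2, 13.4], [4.0, 2.5, 6.0, 3.0, 5.0]))
```

A small p value would suggest that smaller (higher-variance) studies report systematically different effects, which is one possible sign of publication bias.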

Heterogeneity across studies was evaluated using Cochran’s Q-statistic (p < 0.05 was considered significant) and the I² test (0%, no heterogeneity; 100%, maximal heterogeneity). A random- or fixed-effects model was used based on the heterogeneity analysis. A random-effects model was used when there was significant heterogeneity among studies (p < 0.05 or I² > 50%); otherwise, a fixed-effects model was used. In either case, the results should be interpreted with care.
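The model selection rule above can be illustrated with a small, self-contained sketch. It computes Cochran’s Q and I² from study means and standard errors, then pools with a fixed-effects (inverse-variance) model or a random-effects model. The DerSimonian-Laird estimator of between-study variance is an assumption made for illustration; the text does not state which estimator was used, and the analysis itself was run in STATA with a linear mixed model. All function names and input values are hypothetical.

```python
import math
from typing import Dict, Sequence


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable (integer df >= 1)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        term = total = 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r          # (x/2)^r / r!
            total += term
        return math.exp(-x / 2) * total
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)      # x^(r-1/2) / (1*3*...*(2r-1))
        total += term
    pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    return math.erfc(root / math.sqrt(2)) + 2 * pdf * total


def pool_means(means: Sequence[float], ses: Sequence[float]) -> Dict[str, object]:
    """Pool study means; choose the model by the rule p < 0.05 or I2 > 50%."""
    k = len(means)
    if k != len(ses) or k < 2:
        raise ValueError("need at least two studies with matching inputs")
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum_w
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    df = k - 1
    p_q = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    model, tau2 = "fixed", 0.0
    if p_q < 0.05 or i2 > 50.0:
        model = "random"
        c = sum_w - sum(wi ** 2 for wi in w) / sum_w
        tau2 = max(0.0, (q - df) / c)    # DerSimonian-Laird (assumed)
        w = [1.0 / (se ** 2 + tau2) for se in ses]
        sum_w = sum(w)

    estimate = sum(wi * m for wi, m in zip(w, means)) / sum_w
    se = math.sqrt(1.0 / sum_w)
    return {"model": model, "estimate": estimate,
            "ci": (estimate - 1.96 * se, estimate + 1.96 * se),
            "q": q, "p_q": p_q, "i2": i2, "tau2": tau2}


if __name__ == "__main__":
    # Hypothetical study means and standard errors, for illustration only.
    print(pool_means([12.0, 9.5, 15.1, 11.2], [2.0, 1.5, 2.5, 1.8]))
```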

Results

Literature search and study description

The search initially yielded 11,689 articles (search cut-off date, 31 May 2017). Figure 1 shows the detailed flow chart of the literature search. Eventually, 11 articles involving 850 patients and 1,183 CS-SII measurements met the inclusion and exclusion criteria and were selected for data extraction and data analysis. There were a total of 127 and 436 measurements for mfAML and RCC, respectively. RCC was divided into cc-RCC (n = 427), papillary RCC (p-RCC, n = 156), and chromophobe RCC (ch-RCC, n = 37). Table 1 lists the pooled characteristics of these 11 articles.

Fig. 1 Flow diagram of the study selection procedure

Quality assessment and publication bias

Table 1 presents the quality assessment results for each study according to the QUADAS-2 questions. Scores were high overall for the questions relating to the quality of patient selection [91% (10/11 studies) to 100% (11/11 studies) of studies received a “yes”] and the quality of index test interpretation [91% (10/11 studies) to 100% (11/11 studies)]. However, the scores for the question on the quality of reference standard interpretation were lower [55% (6/11 studies)], largely because this information was not reported in four of the 11 studies (36%). The question on the interval between the index test and reference standard examination also had lower scores [18% (2/11 studies)], generally related to the inclusion of patients with an undocumented time interval between MRI and the reference standard examination. In summary, ten studies were high quality (score ≥ 9), and only one study was low quality (score = 7). In addition, Begg’s test found no significant publication bias (mfAML, p = 0.175; RCC, p = 0.308; cc-RCC, p = 0.536; ch-RCC, p = 0.296; p-RCC, p = 0.566).

Pooled estimates and subgroup analyses

Table 2 shows the pooled results, and Figs. 2 and 3 show forest plots of the CS-SII values of mfAML, RCC and the RCC subtypes. The summarised CS-SII values for mfAML, RCC, and cc-RCC were 13.63 (95% CI, 10.15–17.12), 7.92 (95% CI, 4.78–11.07) and 9.99 (95% CI, 7.17–12.82), respectively. The CS-SII measurement of mfAML was significantly higher than that of RCC (p = 0.017), yet no significant difference was observed between mfAML and cc-RCC (p = 0.11). Among the RCC subtypes, the CS-SII measurement of cc-RCC was significantly higher than that of p-RCC [9.99 (95% CI, 7.17–12.82) vs -5.69 (95% CI, -8.40 to -2.98), p < 0.001] and ch-RCC [9.99 (95% CI, 7.17–12.82) vs 1.82 (95% CI, -5.68 to 9.32), p = 0.045]. However, the CS-SII values of p-RCC and ch-RCC were not significantly different (p = 0.06). This is illustrated in Fig. 4.
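The text does not state which test produced the between-group p values. As an illustration only, the sketch below assumes a two-sided z-test on the difference between two independent pooled means, with each standard error recovered from the reported 95% CI. Under that assumption the results appear close to the reported p values, but this should not be read as the authors' actual method. The function names are hypothetical; the numeric inputs are the pooled estimates quoted above.

```python
import math
from typing import Tuple

Z_95 = 1.96  # normal quantile used to convert a 95% CI into a standard error


def se_from_ci(lower: float, upper: float) -> float:
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z_95)


def compare_pooled(est_a: float, ci_a: Tuple[float, float],
                   est_b: float, ci_b: Tuple[float, float]) -> Tuple[float, float]:
    """Return (z, two-sided p) for the difference of two independent estimates."""
    se_diff = math.hypot(se_from_ci(*ci_a), se_from_ci(*ci_b))
    z = (est_a - est_b) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


if __name__ == "__main__":
    mfaml = (13.63, (10.15, 17.12))
    rcc = (7.92, (4.78, 11.07))
    cc_rcc = (9.99, (7.17, 12.82))

    print("mfAML vs RCC:", compare_pooled(*mfaml, *rcc))
    print("mfAML vs cc-RCC:", compare_pooled(*mfaml, *cc_rcc))
```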

Table 2 Number of cases and CS-SII values of renal lesion subtypes

Fig. 2 Forest plots of CS-SII values for mfAML (a) and RCC (b)

Fig. 3 Forest plots of CS-SII values for cc-RCC (a), p-RCC (b), and ch-RCC (c)

Fig. 4 Box and whisker plot of CS-SII values of mfAML, overall RCC, and the three main RCC subtypes. *p < 0.05, **p < 0.01, ***p < 0.001

Discussion

Generally, the diagnosis of renal AML depends on the detection of intratumoural macroscopic fat on CT. Nevertheless, only 3–10% mature fat is microscopically detected in AML [26], in which fat is not visible on CT and is distributed heterogeneously [5, 27, 28]. New evidence [20, 29] shows that the estimated percentage volume of intratumoural fat is substantially lower in surgically resected mfAML than in surgically removed “classic” AML. Moreover, the presence of microscopic fat is non-specific and can also be seen in RCC, especially cc-RCC. Owing to the histological characteristic of dissolved lipids and cholesterol accumulation in the cytoplasm in cc-RCC [9, 30], abundant cytoplasmic lipid is observed in as many as 60% of cc-RCC [16]. Therefore, preoperative differentiation of mfAML from RCC, especially cc-RCC, is challenging.

CS-MRI (also known as in-phase and out-of-phase imaging, or opposed-phase imaging) has proved extremely useful for characterising lesions and organs with fatty [12, 31,32,33] or lipid-poor [12, 34] components, and appears to be the most reliable method for differentiating adenomas from non-adenomas [34,35,36]. Recently, the CS-SII value has also been used to differentiate mfAML from other renal neoplasms. Outwater et al. [16] first used CS-MRI for identifying intracellular lipid within renal neoplasms, and suggested that both AML and cc-RCC exhibit a decreased SI on opposed-phase images. However, the value of CS-MRI for differentiating mfAML from RCC is controversial. Kim et al. [18] and Sasiwimonphan et al. [19] reported that SI loss on opposed-phase images of CS-MRI was higher for mfAML than for cc-RCC. In contrast, Jhaveri et al. [24] reported a significantly higher percentage SI decrease in cc-RCC cases (median, 24.3%) than in either mfAML (median, 3.2%) or non–cc-RCC (median, −0.8%). Other studies [20,21,22,23] found no difference between the CS-SII values of mfAML and cc-RCC. The present meta-analysis pooled those estimates and showed that, overall, mfAML and cc-RCC exhibit a similar decrease in SI, reflecting a similar amount of intravoxel fat (p = 0.11) (Fig. 4). This means that CS-SII might not be a suitable MRI parameter for differentiating mfAML from cc-RCC, and relying on it may lead to inaccurate diagnosis. However, multivariate analysis of other MRI findings may help: the combination of low signal on T2-weighted imaging (T2WI) and/or on apparent diffusion coefficient (ADC) maps is highly accurate for diagnosing mfAML, whereas cc-RCC always manifests as heterogeneously isointense to hyperintense relative to the renal cortex on T2WI and with moderately restricted diffusion [37]. The enhancement pattern, i.e. hyperenhancement during the corticomedullary phase with gadolinium washout over time, overlaps between the two tumours [38]. Furthermore, the presence of intratumoural haemorrhage or calcification is highly specific for RCC and is not observed in mfAML [37].

Different RCC subtypes have unique histopathological features, genetic expression patterns and clinical behaviour [36, 39]. Previous studies have suggested that patients with p-RCC or ch-RCC have a better prognosis than patients with cc-RCC [40, 41]. In addition, particularly in patients with advanced and metastatic RCC, these subtypes respond differently to molecular targeted therapies: the tyrosine kinase inhibitors sunitinib and sorafenib are more effective against cc-RCC, whereas temsirolimus has recently been shown to be more effective against non-cc-RCC [42,43,44]. Therefore, accurate identification of the specific pathological diagnosis prior to treatment is critical. Current multiparametric MRI, such as dynamic contrast enhancement (DCE) and diffusion-weighted imaging (DWI), might be a valid diagnostic approach for characterising these subtypes accurately. Sun et al. [45] reported that signal intensity changes of the corticomedullary phase on DCE MRI were the most effective parameter for distinguishing cc-RCC from p-RCC. Mytsyk et al. [46] demonstrated that cc-RCC had the largest mean ADC value among the three subtypes, while ch-RCC had the lowest ADC value. Nevertheless, no report has compared the CS-SII values among the three subtypes. Our analysis of RCC subtypes showed significantly different CS-SII values between cc-RCC and p-RCC (9.99 ± 1.44 vs -5.69 ± 1.38, p < 0.001), and between cc-RCC and ch-RCC (9.99 ± 1.44 vs 1.82 ± 3.83, p < 0.05) (Fig. 4), suggesting that CS-SII measurements might aid the differentiation of cc-RCC from the other two major RCC subtypes.

Under electron microscopy, cc-RCC contains substantially more intracytoplasmic lipids than other histological RCC subtypes [9], whereas only moderate amounts of intratumoural lipids (limited to that in the clear cell component) have been shown in p-RCC, and only small amounts have been shown in ch-RCC [9]; these differences account for the distinct SI values on opposed-phase MR images [3, 16, 47]. In their 2014 study, Childs et al. [48] found that visual in-phase SI loss occurred in 42% of p-RCC. In our study, p-RCC had negative CS-SII values, suggesting the presence of areas exhibiting increased SI on opposed-phase images in p-RCC. This might be helpful for identifying haemosiderin in p-RCC [49]. Moreover, the negative CS-SII value of p-RCC may be the reason for the decreased overall RCC CS-SII value, which rendered the CS-SII values of mfAML and RCC distinguishable (13.63 ± 1.77 vs 7.92 ± 1.61, p < 0.05) (Fig. 4) even though mfAML and cc-RCC were not. Nevertheless, the difference between the p-RCC and ch-RCC CS-SII values was not significant (p = 0.06). Current multiparametric MRI, such as DCE and DWI, might be a valid diagnostic approach for accurate characterisation of these two renal masses.

Our meta-analysis has several limitations. First, as renal tumours are generally non-homogeneous, the various definitions of the ROIs and their reproducibility among different readers and papers (Table 1) may lead to different results in the literature on the value of CS-SII. Second, the number of included studies is relatively small, and to some extent the pooled results might be affected by inherent factors such as random error. Third, there is no consensus on the CS-MRI sequences and field-strength parameters, which may have influenced the CS-SII measurements. Furthermore, owing to the limitations of the published data (e.g. unavailability of individual patient data) and to heterogeneity, it was not possible to calculate the receiver operating characteristic (ROC) curves or a reliable threshold value. Fourth, publication bias is unavoidable for clinical evidence, as the relevant data were extracted from non-randomised controlled trials. Finally, as we excluded reports in languages other than English, there might have been language bias.

In summary, we conclude that the CS-SII values of CS-MRI cannot be used to accurately distinguish mfAML from cc-RCC, but cc-RCC has significantly higher CS-SII values than p-RCC and ch-RCC. Further adequately designed prospective studies with CS-MRI standardisation, especially standardisation of the cut-off threshold value, should be conducted to confirm our results.