Introduction

Hepatic steatosis, defined as more than 5% of hepatocytes containing triglyceride vesicles [1], can be caused by a number of chronic liver conditions, including alcoholic liver disease and nonalcoholic fatty liver disease (NAFLD). The prevalence of NAFLD is increasing with the epidemic of obesity, and it has become the most common chronic liver disorder worldwide [2, 3]. Early detection and quantification of hepatic steatosis is critical to the management of NAFLD before it progresses to irreversible liver injury. Hepatic siderosis refers to excessive accumulation of iron in hepatocytes or Kupffer cells. It is the histological hallmark of hereditary hemochromatosis and transfusion hemosiderosis. Iron can also accumulate within the liver in chronic liver diseases. Excessive hepatic iron (> 7 mg/g of dry liver tissue) can result in hepatocellular damage and cirrhosis. Liver biopsy is the gold standard for diagnosing hepatic steatosis and iron deposition. However, biopsy is an invasive procedure associated with potential complications such as hemorrhage and pseudoaneurysms. It is also limited by high inter-/intra-observer variability [4] and risk of sampling bias. Hence, there is an unmet need for an alternative noninvasive tool to detect and quantify hepatic steatosis and iron overload.

MRI has a unique advantage for evaluation of hepatic steatosis because of its ability to differentiate chemical shift signals from water, triglyceride, and iron [5]. MR spectroscopy (MRS) directly measures and compares the relative proton abundance from water and fat signals, and is considered the most reliable and accurate method for quantification of fat [5]. The most widely used MR spectroscopy sequence for this purpose is a multi-echo single-voxel STEAM sequence. An implementation with inline evaluation acquires five echoes in a single breath-hold of 15 s [6], which separately quantifies T2 of hepatic water and fat, and can measure T2-corrected fat fraction.

In contrast to MRS, which evaluates a single voxel, chemical-shift-encoded MR imaging can evaluate the entire liver. The conventional chemical-shift-encoded MR technique is the T1-weighted in- and opposed-phase dual-gradient-echo method with Dixon reconstruction. However, accurate quantification of steatosis by MRI requires compensation for multiple variables including T2* relaxation effects [7] and the spectral complexity of fat [8, 9]. Proton density fat fraction (PDFF) measured from multi-echo Dixon is a more accurate MR measure of tissue triglyceride content because it compensates for these confounding factors. It is a well-established quantitative biomarker for hepatic steatosis, equivalent to MR spectroscopy [10,11,12]. However, applying multi-echo Dixon to every patient may not be necessary, even though it only requires a single breath-hold. One solution is a hybrid approach combining two-echo Dixon and multi-echo Dixon. It is possible to detect both liver steatosis and iron overload from routinely used two-echo Dixon via dual-ratio signal analysis [13]. In addition, liver segmentation and a rough evaluation of water/fat contents can be derived from two-echo Dixon. If an abnormal liver state is detected in the first step, multi-echo Dixon [12] can be adaptively applied.
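The basic arithmetic behind two-echo Dixon water/fat separation can be sketched as follows. This is a minimal magnitude-based illustration, not the vendor's inline implementation and not the dual-ratio analysis of [13]. It assumes water is the dominant species in every voxel and ignores T2* decay and the multi-peak fat spectrum; the function name and the sample signal values are hypothetical.

```python
import numpy as np

def two_point_dixon(in_phase, opposed_phase):
    """Magnitude-based two-point Dixon: returns (water, fat, fat_fraction).

    Assumes water-dominant voxels; applies no T2* or multi-peak fat correction.
    """
    ip = np.asarray(in_phase, dtype=float)
    op = np.asarray(opposed_phase, dtype=float)
    water = (ip + op) / 2.0   # signals add in phase, cancel in opposed phase
    fat = (ip - op) / 2.0
    total = water + fat       # equals the in-phase signal
    with np.errstate(divide="ignore", invalid="ignore"):
        fat_fraction = np.where(total > 0, fat / total, 0.0)
    return water, fat, fat_fraction

# Hypothetical signal values for two voxels (arbitrary units).
water, fat, ff = two_point_dixon([100.0, 100.0], [80.0, 100.0])
```

Because the in-phase echo is acquired later than the opposed-phase echo, T2* decay (for example from iron) lowers the in-phase signal and biases this simple estimate, which is one reason a T2*-corrected multi-echo method is needed for quantification.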

Iron overload can occur simultaneously with steatosis, confounding evaluation at conventional MR imaging. Accurate MR imaging quantification of liver iron concentration can provide an estimate of total body iron concentration, enabling timely treatment to prevent complications arising from iron overload. Indeed, MR quantification of liver iron has been shown to be superior to liver biopsy for assessment of total body iron due to high sampling variability of liver biopsy [14]. Several MR-imaging-based iron quantification techniques have been proposed, all of which need distinct protocols besides having their own limitations [15, 16]. Hence, development of a robust sequence that can be incorporated into routine liver MRI protocols in clinical practice is necessary.

3D multi-echo Dixon at 1.5T has shown promise for accurate estimation of liver fat irrespective of iron overload [17] and for simultaneous estimation of iron content based on R2* [18]. This sequence allows fast evaluation of the entire liver in one breath-hold with inline calculation of R2* (iron) and PDFF (fat) maps, without the need for offline software or post-processing tools. Iron quantification using R2* relaxometry at 1.5T has demonstrated good calibration with biopsy-measured hepatic iron concentration [19]. Accurate quantification of iron content based on R2* values at 3T is challenging due to rapid signal decay and low reproducibility, and no accurate calibration has been established yet [20]. Theoretically, based on phantom studies, liver iron concentration can also be calculated from R2 values obtained with the MRS sequence. The utility of R2 values obtained from the MRS sequence has rarely been evaluated in clinical studies at 3T [16].

Currently, little published literature compares the performance of automated detection of both hepatic fat and iron overload from two-point Dixon with quantitative fat fraction, R2*, and R2water values obtained from multi-echo Dixon and MRS in a clinical setting at 3T [21].

This study has two aims: (1) compare the diagnostic performance of computer-aided detection of fat and iron overload based on two-point Dixon and dual-ratio discrimination with quantitative results obtained by the multi-echo Dixon technique and high-speed T2-corrected multi-echo single-voxel spectroscopy (HISTO) MRS sequences; (2) investigate whether R2* values obtained from multi-echo Dixon and R2water values from the MRS sequence can be used to assess hepatic iron overload at 3T.

Methods

This retrospective, Health Insurance Portability and Accountability Act-compliant study was approved by the NYU institutional review board. A total of 479 patients with known or suspected liver disease were included in this study.

MR protocol

3T MRI (MAGNETOM Skyra, Siemens Healthcare, Erlangen, Germany) was performed with T1 VIBE two-point Dixon (TEout-phase = 1.3 ms, TEin-phase = 2.5 ms, TR = 4.0 ms, flip angle = 9°, matrix: 320 × 195, field of view: 380 × 308 mm), which provides inline liver segmentation, suggests an ROI placement, and yields a qualitative characterization of fat/iron contents (Fig. 1a). The results are displayed as “Normal”, “Fat deposition”, “Fat and Iron deposition”, or “Iron Deposition”.

Fig. 1

Hepatic steatosis techniques using MRI PDFF and MR spectroscopy. a T1 VIBE two-point Dixon technique included fat and water images, in- and opposed-phase images, and provided inline liver segmentation as well as a qualitative description of fat/iron deposition. b T1-independent T2*-corrected multi-echo Dixon technique quantifies fat/water content and R2* value in each voxel and provides a report based on ROI as well as whole liver segmentation (white box). c MR spectroscopy at multiple echo times with T2 curve fit of water and lipid spectrum calculates T2-corrected lipid fraction and the R2water (white box)

Then, following the same breath-holding instruction, the multi-echo Dixon VIBE sequence was performed with the following parameters: TE 1.09/2.3/3.57/4.84/6.11/7.38 ms, TR 9.17 ms, flip angle: 4°, matrix: 160 × 111, field of view (FOV): 380 × 332 mm, slice thickness: 3 mm, number of slices: 72. This sequence generates a fat fraction map and an R2* map, projects the previous liver segmentation onto them, and calculates the mean PDFF and R2* values for the entire segmented liver as well as for a single predefined ROI (Fig. 1b). The low flip angle minimizes T1 bias [22]. CAIPIRINHA (“Controlled Aliasing in Parallel Imaging Results in Higher Acceleration”) undersampling with an acceleration factor of 3 allows acquisition in a single breath-hold of 19 s. Two-point Dixon and multi-echo Dixon were acquired in two consecutive breath-holds.

Finally, high-speed T2-corrected multi-echo single-voxel spectroscopy (HISTO) was performed with five echoes (TE = 12/24/36/48/72 ms, TR = 3000 ms, voxel size: 30 × 30 × 30 mm³, single breath-hold of 15 s). The voxel was placed in the right lobe of the liver avoiding major vessels, liver edges, large bile ducts, and lesions. The calculation of PDFF involves extrapolating fat and water integrals to TE = 0 ms using an exponential fit. The sequence generates a graph with an exponential decay fit for the five echoes (Fig. 1c).
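The T2 correction described above can be illustrated with a short sketch. It assumes mono-exponential decay of the water and fat integrals and uses a log-linear least-squares fit, which is a simplification; the fitting procedure actually used by the inline HISTO evaluation is not specified here. The function names and the synthetic, noise-free input values are hypothetical.

```python
import numpy as np

def extrapolate_to_te0(te_ms, integrals):
    """Log-linear fit of S(TE) = S0 * exp(-R2 * TE); returns (S0, R2 in 1/s)."""
    te_s = np.asarray(te_ms, dtype=float) / 1000.0
    log_signal = np.log(np.asarray(integrals, dtype=float))
    slope, intercept = np.polyfit(te_s, log_signal, 1)
    return float(np.exp(intercept)), float(-slope)

def t2_corrected_fat_fraction(te_ms, water_integrals, fat_integrals):
    """Fat fraction from the water and fat integrals extrapolated to TE = 0."""
    w0, r2_water = extrapolate_to_te0(te_ms, water_integrals)
    f0, r2_fat = extrapolate_to_te0(te_ms, fat_integrals)
    return f0 / (w0 + f0), r2_water, r2_fat

# Synthetic data at the five echo times of the protocol (values are made up).
te_ms = np.array([12.0, 24.0, 36.0, 48.0, 72.0])
water = 90.0 * np.exp(-te_ms / 1000.0 * 40.0)  # assumed W0 = 90, R2water = 40 1/s
fat = 10.0 * np.exp(-te_ms / 1000.0 * 20.0)    # assumed F0 = 10, R2fat = 20 1/s
pdff, r2_water, r2_fat = t2_corrected_fat_fraction(te_ms, water, fat)
```

On this synthetic input the fit recovers the assumed fat fraction of 10% and the assumed decay rates. With real, noisy spectra a weighted or nonlinear fit may be preferable.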

Imaging data collection

The computer-aided inline dual-ratio classification of the presence or absence of fat and iron deposition was taken from the results generated by the two-point Dixon technique. PDFF and R2* values were calculated by manually placing an ROI (minimum size of 3 cm², in the right hepatic lobe in an area devoid of blood vessels and bile ducts) on the fat fraction and R2* maps generated by the multi-echo Dixon technique. Fat fraction and R2water values were obtained directly from the MR spectroscopy sequence report.

Histopathology correlation

Eighty patients had liver biopsies as part of their clinical work-up, within 3 months of the MRI. Hepatic steatosis was graded on a four-point scale based on the percentage of fat-containing hepatocytes on H&E stain (0 = < 5%; 1 = 5%–33%; 2 = 33%–66%; 3 = > 66%). Iron deposition was graded semi-quantitatively based on Perls' Prussian blue staining, as follows: grade 0, iron granules absent or barely discernible in a high-power field (× 400); grade 1, granules easily confirmed at × 400 or barely discernible at × 250; grade 2, granules resolved at × 100; grade 3, granules resolved at × 25; and grade 4, masses visible at low power (× 10) or with the naked eye. ROC analyses were conducted to assess the diagnostic performance of dual-ratio Dixon discrimination, multi-echo Dixon, and MRS sequences for fat fraction and hepatic iron overload. The accuracy was characterized in terms of area under the ROC curve (AUC). The optimal cutoff values were determined by the shortest distance to the upper left corner of the ROC plot.
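The "shortest distance to the upper left corner" criterion can be sketched as follows. This is an illustrative implementation under stated assumptions, not the statistical software used in the study: every observed value is tried as a candidate cutoff, a case is called positive when its value is at or above the cutoff, and ties are resolved in favor of the lower cutoff. The function name and the example values are made up.

```python
import numpy as np

def closest_to_corner_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) closest to the ROC point (0, 1).

    A case is classified as positive when value >= cutoff.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = None
    for cutoff in np.unique(values):
        predicted = values >= cutoff
        sens = np.sum(predicted & labels) / np.sum(labels)
        spec = np.sum(~predicted & ~labels) / np.sum(~labels)
        dist = np.hypot(1.0 - sens, 1.0 - spec)
        if best is None or dist < best[0]:
            best = (dist, float(cutoff), float(sens), float(spec))
    return best[1:]

# Made-up PDFF values (%) and biopsy labels (True = steatosis), illustration only.
pdff_values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 8.0, 12.0, 20.0]
steatosis = [False, False, False, True, False, False, True, True, True]
cutoff, sens, spec = closest_to_corner_cutoff(pdff_values, steatosis)
```

For these made-up values the selected cutoff is 8.0, with sensitivity 0.75 and specificity 1.0.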

Imaging analysis

Qualitative automated results generated from two-point Dixon and fat fraction obtained from multi-echo Dixon were compared with MRS as the reference standard. The optimal cutoffs for fat fraction (6.05% for MRS and 3.4% for multi-echo Dixon) and for iron overload (60.5 s−1 for R2* and 40.85 s−1 for R2water) were determined from ROC analysis against pathology in the subset of 80 patients. These cutoff values were then applied in the analysis and comparison of the entire cohort of 479 patients.

Results

This study population consists of 479 patients with known or suspected liver diseases, including 257 males (17–88 years) and 222 females (20–87 years). The most common known liver diseases included Hepatitis C (N = 83), Hepatitis B (N = 63), NAFLD (N = 60) and alcoholic liver disease (N = 27) (Fig. 2). The range of fat fractions and iron levels by biopsy is grade 0–3 and grade 0–2, respectively.

Fig. 2

Indications of MRI study for patient population

Multi-echo Dixon failed in 4.0% of cases (19/479) due to fat/water swap. MRS failed in 0.6% of cases (3/479). After excluding all failed cases, PDFF from multi-echo Dixon correlated well with PDFF from MR spectroscopy, with an R² of 0.908, slope of 1.03 (95% CI 1.00–1.06), and intercept of 1.03 (95% CI 0.72–1.33) (Fig. 3). The correlation between R2* from multi-echo Dixon and R2water from MRS was poor, with an R² of 0.191, slope of 0.12 (95% CI 0.09–0.14), and intercept of 31.06 (95% CI 29.81–32.31) (Fig. 4).
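A regression of this kind (slope, intercept, and coefficient of determination) can be computed as in the sketch below. It uses ordinary least squares and does not compute confidence intervals. Which measurement is placed on which axis is an assumption, since the text does not state it. The function name and the data are hypothetical.

```python
import numpy as np

def linear_fit_with_r2(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus R squared."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(slope), float(intercept), float(1.0 - ss_res / ss_tot)

# Synthetic paired measurements lying exactly on y = 2x + 1 (not study data).
x = [0.0, 5.0, 10.0, 20.0, 30.0]
y = [1.0, 11.0, 21.0, 41.0, 61.0]
slope, intercept, r_squared = linear_fit_with_r2(x, y)
```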

Fig. 3

Correlation of steatosis quantification by MRI PDFF and MR spectroscopy. The slope is 1.03 (95% CI 1.00–1.06), and intercept is 1.03 (95% CI 0.72–1.33)

Fig. 4

Correlation between R2* obtained from multi-echo Dixon and R2water obtained from MRS. The slope is 0.12 (95% CI 0.09–0.14), and the intercept is 31.06 (95% CI 29.81–32.31)

Using biopsy as the reference standard for a subset of 80 patients, the AUC for detection of hepatic steatosis was 0.88 (95% CI 0.80–0.96) for PDFF obtained from multi-echo Dixon and 0.90 (95% CI 0.84–0.90) for PDFF obtained from MRS (Fig. 5). Thirty-eight of the 80 patients who underwent biopsy had hepatic steatosis (20 mild, 11 moderate, and 7 severe). The results are summarized in Table 1. The optimal cutoff value was 3.4% (sensitivity 0.82, specificity 0.82) for PDFF based on multi-echo Dixon, and 6.05% (sensitivity 0.77, specificity 0.90) for PDFF from MRS. Of note, a cutoff of 3.95% for multi-echo Dixon PDFF yields a similar distance to the upper left corner of the ROC plot, with sensitivity (0.76) and specificity (0.90) close to those of MRS PDFF at the 6.05% cutoff (Table 1). Eight of the 80 patients who underwent biopsy had iron overload. The AUC for detection of hepatic iron overload was 0.89 (95% CI 0.77–1.00) for R2* values obtained from multi-echo Dixon and 0.88 (95% CI 0.76–1.00) for R2water obtained from MRS. The optimal cutoff value for iron overload was 60.5 s−1 for R2* from multi-echo Dixon (sensitivity 0.88, specificity 0.83) and 40.85 s−1 for R2water from MRS (sensitivity 0.88, specificity 0.86).

Fig. 5

ROC analysis for PDFF and iron content assessment using pathology as the reference standard. PDFF was calculated by multi-echo Dixon and MRS sequences. Iron content was evaluated by R2* from multi-echo Dixon and R2water from MRS sequence

Table 1 Comparison of methods for hepatic steatosis and siderosis characterization on 80 patients using pathology as standard

The results from dual-ratio Dixon discrimination were also compared with pathology for these 80 patients. For fat detection (“Fat” + “Fat and Iron”), sensitivity, specificity, PPV, and NPV of two-echo Dixon were 0.95 (95% CI 0.95–0.95), 0.48 (95% CI 0.47–0.48), 0.64 (95% CI 0.63–0.64), and 0.90 (95% CI 0.90–0.91), respectively. For iron overload (“Iron” + “Fat and Iron”), sensitivity, specificity, PPV, and NPV of two-echo Dixon were 0.63 (95% CI 0.61–0.64), 0.62 (95% CI 0.62–0.62), 0.15 (95% CI 0.15–0.16), and 0.94 (95% CI 0.93–0.94), respectively.

Using the optimal cutoff determined by pathology-based ROC analysis, PDFF obtained from multi-echo Dixon for the entire study cohort was compared with MRS as the reference standard, and the results are summarized in Table 2. The PDFF from multi-echo Dixon demonstrated a sensitivity of 0.96 (95% CI 0.96–0.97), specificity of 0.85 (95% CI 0.85–0.85), positive predictive value (PPV) of 0.83 (95% CI 0.83–0.83), and negative predictive value (NPV) of 0.97 (95% CI 0.97–0.97) for detection of any degree of hepatic steatosis. In comparison, using MRS as the reference standard, the dual-ratio discrimination obtained from two-point Dixon showed a sensitivity of 0.97 (95% CI 0.97–0.97), specificity of 0.47 (95% CI 0.47–0.47), PPV of 0.58 (95% CI 0.58–0.58), and NPV of 0.95 (95% CI 0.95–0.95) for detection of hepatic steatosis.

Table 2 Comparison of methods for hepatic steatosis and siderosis characterization on the entire study cohort of 479 patients

Using R2* from multi-echo Dixon with a cutoff of 115 s−1 as reference standard for iron overload, two-point Dixon showed a sensitivity of 0.78 (95% CI 0.77–0.79) and specificity of 0.73 (95% CI 0.73–0.73), PPV 0.06 (0.05–0.06), and NPV 0.99 (0.99–0.99) for detection of hepatic siderosis.
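The sensitivity, specificity, PPV, and NPV reported in this section follow from a 2 × 2 table of test result against reference standard. The sketch below shows the standard definitions; the function name is hypothetical and the counts are invented for illustration, not taken from the study.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2 x 2 diagnostic metrics from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives detected among diseased
        "specificity": tn / (tn + fp),  # negatives detected among non-diseased
        "ppv": tp / (tp + fp),          # diseased among test-positive
        "npv": tn / (tn + fn),          # non-diseased among test-negative
    }

# Hypothetical counts (not study data).
metrics = diagnostic_metrics(tp=45, fp=15, fn=5, tn=35)
```

Unlike sensitivity and specificity, PPV and NPV depend on disease prevalence in the tested population, so a low prevalence tends to produce a low PPV and a high NPV.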

Discussion

This retrospective clinical study compares qualitative computer-aided results for detection of hepatic fat and iron overload obtained from the 3D T1-weighted two-point Dixon technique with dual-ratio discrimination with the quantitative results obtained from T1-independent T2*-corrected multi-echo Dixon technique (PDFF, R2*) and T2-corrected multi-echo single-voxel MR spectroscopy (MRS, PDFF, R2water). Biopsy was used as the reference standard in a subset of these patients.

Hepatic steatosis evaluation

ROC analysis in our study showed that multi-echo PDFF (AUC 0.88) and MRS PDFF (AUC 0.90) have equivalent accuracy for fat quantification (Table 1). This is comparable to previously published literature [23, 24]. However, the optimal cutoff for multi-echo PDFF (3.4%) is lower than that of MRS PDFF (6.05%). This difference might partially result from higher fat fraction estimation by MRS than by multi-echo Dixon (Fig. 3). In addition, the shape of the multi-echo PDFF ROC curve could contribute to the different balance of specificity and sensitivity, with a higher multi-echo Dixon cutoff (3.95%) yielding a similar distance to the upper left corner of the ROC plot as the optimal cutoff (3.4%) (Table 1). Applying the optimal cutoff, multi-echo Dixon and MRS sequences yielded similar sensitivity and specificity, which are comparable to previously published literature [25]. In comparison, qualitative data obtained from two-point Dixon were more sensitive (0.95) but less specific (0.48) for detection of hepatic steatosis as confirmed by pathology (Table 1). This is consistent with the original publication [13], and mirrors the “screening” aspect of the method [26].

A previously published study on dual-echo Dixon and multi-echo Dixon for hepatic steatosis in asymptomatic individuals using MRS as the reference standard showed that dual-echo Dixon has high sensitivity (0.94) but lower specificity (0.75) for detection of hepatic steatosis [21]. That study only involved asymptomatic individuals, and none of its participants underwent liver biopsy. Our results are similar, showing high sensitivity and negative predictive value of automated dual-ratio Dixon discrimination in excluding hepatic steatosis. However, the specificity of the automated results is considerably lower in our study (0.48 using biopsy, and 0.47 using MRS as the reference standard).

Two-point Dixon, however, had a lower failure rate (0%). Multi-echo Dixon failed in 4.0% of cases due to fat/water swap. Fat/water swap is not specific to the multi-echo Dixon technique, and the higher failure rate is possibly attributable to artifacts or suboptimal execution of multi-echo Dixon compared with the two-point technique. After excluding the failed cases, linear regression analysis showed that multi-echo Dixon and MRS results correlate well, with an R² of 0.908, within the range of previously published results (0.76–0.99) [12, 17, 21]. Applying the optimal cutoff determined by pathology correlation, multi-echo Dixon has similar sensitivity but higher specificity than dual-ratio Dixon discrimination for steatosis detection (Table 2).

Hepatic siderosis evaluation

Results from previously published studies at 1.5T on simultaneous iron and fat quantification using 3D multi-echo Dixon have been promising [27]. The effect of fat on R2* can be minimized by multi-peak spectral modeling of fat [28]. Henninger et al. showed a strong correlation (r = 0.992) between R2* values measured at 1.5T by 3D multi-echo Dixon and by 2D fat-saturated multi-echo GRE sequences calibrated by biopsy [18]. More recently, simultaneous quantification of hepatic steatosis and siderosis by multi-echo chemical-shift-encoded multi-gradient-echo MRI at 3T has been shown to correlate well with biopsy grades (R = 0.718 for steatosis, and R = 0.704 for siderosis) [25]. R2* values at 3T can be extrapolated by multiplying the R2* value at 1.5T by a factor of 2 [20]. However, reported cutoff R2* values for hepatic iron overload at 3T vary considerably. The normative mean R2* value at 3T from a 2D multi-echo GRE sequence with minimum echo times was reported as 69.1 ± 21.9 s−1 [29]. Banerjee et al. reported that a T2* threshold of 12.5 ms (R2* of 80 s−1) from a 2D multi-echo GRE sequence with in-phase echo times had a sensitivity of 86% and a specificity of 93% for detection of any degree of hepatic siderosis [30]. Using an iron content cutoff value of 2 mg/g dry weight, together with the calibration and conversion formulae of Hankins et al. and Storey et al., an R2* threshold of 115 s−1 was derived [29]. The ROC analysis in this study showed an optimal R2* cutoff value of 60.5 s−1 at 3T for the detection of iron overload, with sensitivity of 0.88 and specificity of 0.83 (Table 1). Our results further support that R2* obtained from multi-echo Dixon can be utilized for detection of iron overload at 3T with high performance.
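The unit conversions used in this paragraph are simple enough to state explicitly. The sketch below converts a T2* value in milliseconds to R2* in s−1 (R2* = 1/T2*) and applies the factor-of-2 field-strength extrapolation cited from [20]. The function names are hypothetical, and the scaling factor is taken from the text rather than derived here.

```python
def t2star_ms_to_r2star(t2star_ms):
    """Convert T2* in milliseconds to R2* in 1/s (R2* = 1000 / T2* in ms)."""
    return 1000.0 / t2star_ms

def scale_r2star_1p5t_to_3t(r2star_1p5t, factor=2.0):
    """Extrapolate an R2* value from 1.5T to 3T using the factor cited in [20]."""
    return factor * r2star_1p5t

# The T2* threshold of 12.5 ms quoted above corresponds to an R2* of 80 1/s.
r2star_threshold = t2star_ms_to_r2star(12.5)
```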

Using an R2* cutoff of 115 s−1 as the reference, the sensitivity, specificity, and NPV of iron overload detection at 3T by two-point Dixon for the entire study cohort are 0.78, 0.73, and 0.99, respectively. Compared with the pathology results of 80 patients, the sensitivity and NPV of two-point Dixon for detecting iron overload are 0.63 and 0.94, respectively. Bashir et al. showed that computer-aided dual-ratio Dixon MRI at 1.5T is excellent at differentiating isolated steatosis or iron deposition from combined disease, with a sensitivity of 100% [13]. A previously published study on automated two-point Dixon at 1.5T also showed excellent correlation between the screening sequence and the multi-echo sequence for detection of iron overload (r = 0.988) as well as hepatic steatosis (r = 0.978) [31]. That study, however, had neither pathology correlation nor MR spectroscopy. Compared with these studies at 1.5T, our results for two-point Dixon at 3T demonstrated relatively lower sensitivity but high NPV. This might be related to the low prevalence of patients with pathology-proven iron overload (8/80 patients) or with high R2* (9/479 patients with R2* > 115 s−1). Therefore, two-point Dixon might be an effective screening tool for siderosis detection in a patient population similar to that of this study. It should be noted that the dual-ratio Dixon discrimination technique was aimed at incidental findings in a general patient population. If liver disease is suspected, quantitative techniques like multi-echo Dixon will always be performed.

R2water obtained from the MRS sequence is generally not used for iron quantification because the echo times are relatively long. Pineda et al. showed excellent correlation between R2water obtained on MRS and iron levels in a phantom, and suggested a shorter echo time (TE) range of 12/15/18/21/24 ms for high iron levels, instead of the conventional 12/24/36/48/72 ms for regular acquisitions [6]. Recently, a modification of the MRS sequence (HISTOV) involving eight echoes has been proposed to improve its ability to detect and quantify iron. Results from the modified MRS sequence at 1.5T showed a strong correlation between MRS-R2_water and FerriScan-R2 (R² = 0.889) with excellent overall accuracy (AUC of 0.948, 0.970 and 1 for LIC thresholds of 1.8, 3.2, and 7.0 mg/g dw, respectively) [32]. However, the accuracy of MRS in vivo at 3T has not been clear. In our study, pathology-based ROC analysis showed that R2water (AUC 0.88) has equivalent accuracy for iron overload detection compared with R2* (AUC 0.89) (Table 1). With an optimal cutoff of 40.85 s−1, R2water-based iron overload detection has a sensitivity of 0.88 and specificity of 0.87, almost identical to those of R2*-based analysis. However, the correlation between R2water and R2* in our study is relatively poor (R² = 0.191), suggesting that R2* and R2water might have different advantages and limits in iron assessment and therefore might be complementary to each other. Note that, while relaxation rates from gradient-echo acquisitions have a linear relationship with iron, the relationship is curvilinear for relaxation rates from spin-echo acquisitions [33]. Further research is necessary to investigate whether higher diagnostic accuracy can be achieved by a combined analysis of R2* and R2water values for detection of iron overload at 3T.

Limitations

This study has several limitations. First, the study population included patients with mixed known or suspected liver diseases instead of the general population or a targeted disease population such as NAFLD. This population bias may limit the generalizability of the study results. Second, the reference standard for steatosis used in the entire study cohort is MR spectroscopy instead of biopsy. Biopsy for diffuse liver disease has been shown to be limited by sampling variability [4, 17], and MR spectroscopy has excellent correlation with histology [17]. Thus, using biopsy as the reference for steatosis may not be necessary for all patients. Third, a relatively small number of patients who underwent biopsy had iron overload, and absolute liver iron concentration was not calculated in these patients. A larger study with pathology correlation is needed to analyze the relationship between R2* and R2 values and to establish R2* and R2 calibration curves for iron overload at 3T. Fourth, the ROI used to obtain PDFF and R2* values on multi-echo Dixon was not exactly co-localized with the voxel on MR spectroscopy. However, the MRS voxel and the ROI in multi-echo Dixon were both consistently placed in the right lobe of the liver, in an area devoid of blood vessels and bile ducts, limiting the chances of significant variation.

In conclusion, this retrospective clinical study showed that computer-aided dual-ratio discrimination with the T1 VIBE two-point Dixon sequence is a useful qualitative screening tool with high sensitivity and success rate for detection of hepatic steatosis and iron overload. The multi-echo Dixon technique is a more specific and quantitative technique with diagnostic performance equivalent to that of MR spectroscopy, but it demonstrated a higher failure rate, possibly reflecting its higher complexity. Both R2* values obtained from multi-echo Dixon and R2water values obtained from MRS sequences have relatively good diagnostic performance for detection of liver iron overload.