Introduction

Hepatic steatosis is a condition in which large vacuoles of triglyceride fat accumulate in liver cells. Common causes of steatosis include obesity, insulin resistance, hepatitis C infection, and certain medications [1]. Most cases of hepatic steatosis are reversible, and steatosis may be the key histologic feature of alcoholic and nonalcoholic fatty liver disease (NAFLD) [2]. NAFLD is the most common cause of chronic liver disease in Western societies and is closely associated with the metabolic syndrome, a constellation of diseases including type 2 diabetes mellitus, hypertension, obesity, and dyslipidemia. NAFLD encompasses a broad spectrum of liver diseases including isolated steatosis, nonalcoholic steatohepatitis (NASH), cirrhosis, and hepatocellular carcinoma [3–5]. Regardless of the cause of steatosis, progression to cirrhosis and end-stage liver disease may occur [6–9]. Because the earliest and most reliable feature of NAFLD is steatosis, early and accurate diagnosis is important so that proper management may be used to prevent long-term complications [10–14].

Currently, core liver biopsy is considered the reference standard for diagnosing and grading steatosis. However, widespread use of biopsy is limited by its invasiveness and the potential for significant bleeding or organ perforation. Furthermore, high rates of sampling error and variability are common problems with this procedure. Additionally, histologic interpretation of biopsy results is subjective, semi-quantitative, and based on the observation of only a fraction of a potentially steatotic liver. Because hepatic steatosis is often heterogeneous, quantifying liver fat using a core biopsy may lead to sampling bias and thus underestimation or overestimation of steatosis [15–17]. Therefore, a noninvasive, objective, quantitative diagnostic alternative to the core liver biopsy is needed for more accurate whole-liver diagnosis of steatosis.

Magnetic resonance (MR) techniques, such as MR spectroscopy (MRS), the Dixon method, and imaging with and without fat saturation, have been used for detection and quantification of steatosis [18, 19]. Among such techniques, MRS is regarded as the most accurate method for the assessment of hepatic triglyceride content and uses 5.56 % as the diagnostic threshold for steatosis [20]. When performed correctly, MRS measures the hepatic proton density fat fraction (PDFF), which is platform and protocol independent, and can be used as a measurement of liver fat content. However, a tradeoff of using MRS is extended acquisition time due to automatic shimming. Sampling bias has also been reported when working with small sample volumes, because only one voxel-of-interest (VOI) is usually measured [20–22].

MR imaging-based proton density fat fraction (MRI-PDFF) calculation is a recently described chemical shift-based technique that uses either magnitude or complex-based algorithms to separate water and fat. It can be acquired in a single breath hold and allows for simple calculation of the fat fraction (FF) of the entire liver. This method also corrects for factors that confound the MR signal, such as T1 bias, T2* decay, spectral complexity of fat, eddy currents, and noise bias [23–26]. Over the past few years, several studies have reported the accuracy and usefulness of MRS as a reference standard [19, 24, 27]. Additionally, recent studies have shown that MRI-PDFF correlates well with liver biopsy results and performs well in phantom and in vivo patient studies [28–33]. However, previous studies estimated MRI-PDFF from a relatively small area using a round or elliptical region-of-interest (ROI) measurement, which has the potential to affect the accuracy of the overall hepatic fat measurements because hepatic steatosis is often unevenly distributed throughout the liver [34, 35]. No previous studies have measured hepatic fat deposition with MRI-PDFF over almost the entire liver. Thus, the aim of this study was to evaluate the diagnostic performance of MRI-PDFF with variable ROI measurement, including free-drawn ROI measurement to quantify hepatic fat over nearly the entire liver, using MRS as the reference standard.

Materials and methods

Study population

This is a retrospective, single-center study that was approved by the institutional review board of our hospital and informed consent was obtained. Between February 2013 and January 2014, 199 patients who underwent abdominal MRI with PDFF measurement as part of a diagnostic protocol were included in this study. Patients were excluded if: (a) a severe MR artifact was present (n = 10), (b) parenchyma was replaced by large space-occupying lesions (>5 cm or infiltrative lesions) (n = 31), or (c) the MRS voxel was inappropriately placed during the procedure (n = 2). Ultimately, 156 patients were included in the study (105 males and 51 females, age range: 12–83 years, mean age: 57.8 years). Among them, 47 patients had liver cirrhosis due to hepatitis B (n = 42) or C (n = 5), 38 patients had hepatocellular carcinomas, and 82 patients had benign hepatic masses, such as hemangiomas or cysts.

MR imaging and spectroscopy

All imaging was performed on a 3T MR scanner (Magnetom Verio with Syngo MR B17 software; Siemens Healthcare, Erlangen, Germany) using a typical set-up of 12 elements of the integrated body and spine RF coil system. Patients fasted for 8 hours before their MR examination and were imaged while in the supine position. To estimate MRI-PDFF, an investigational variant of “hybrid multi-step adaptive fitting approach with multi-echo volume interpolated breath-hold examination (VIBE) acquisition” was used, which combines the strengths of both magnitude and complex-based methods and provides liver PDFF and R2* maps for hepatic iron quantification [26, 31]. The parameters of this sequence were as follows: repetition time (TR) = 9.2 milliseconds (ms); first echo time (TE) = 1.23 ms with 6 echoes collected with ΔTE = 1.23 ms; flip angle = 5°; slice thickness = 4 mm; field of view = 42 × 32 cm; matrix = 256 × 165; number of signals acquired = 1; bandwidth = 1090 Hz/Pixel; bipolar readout. A parallel acceleration technique (Controlled Aliasing In Parallel Imaging Results IN Higher Acceleration, CAIPIRINHA) was used, with acceleration factors of 2 in both phase encoding and partition directions [26, 31]. The sequence was acquired in a single breath hold with a scan time of 16–20 s. The images were processed using online software to create water/fat images, water/fat R2* maps, an effective R2* map, and water/fat percentage maps.

Single-voxel high-speed T2-corrected multiple-echo 1H-MR spectroscopy (SVS) [36], a single-voxel stimulated echo acquisition mode (STEAM) spectroscopy sequence with five different TEs, was performed to serve as the reference standard. The parameters included mixing time (TM) = 10 ms; TR = 3000 ms; TE = 12, 24, 36, 48, and 72 ms; bandwidth = 1200 Hz; sampling points = 1024; voxel size = 30 × 30 × 30 mm; and total acquisition time = 15 s. The voxel was placed in the posterior segment of the right hepatic lobe (segment VI or VII), taking care to avoid major blood vessels, bile ducts, or any space-occupying abnormalities, and was shimmed automatically. The fat and water peaks retrieved from the spectroscopy data were detected automatically, fat and water values were integrated over each peak, and the integrated values at the different TEs were then fitted to an exponential. The fat percentage was calculated from the fat and water values extrapolated to TE = 0, and the result was reported as a DICOM text report.
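The T2 correction described above can be illustrated with a minimal sketch. It assumes a mono-exponential decay S(TE) = S0 · exp(−TE/T2) fitted separately to the water and fat peak integrals by log-linear least squares; the scanner's actual fitting routine is not described in the text, and the peak integrals and T2 values below are synthetic, hypothetical numbers chosen only for illustration.

```python
import math

def fit_monoexponential(tes_ms, signals):
    """Log-linear least-squares fit of S(TE) = S0 * exp(-TE / T2).

    Returns (S0, T2_ms). Assumes strictly positive, decaying signals.
    """
    ys = [math.log(s) for s in signals]
    n = len(tes_ms)
    mean_x = sum(tes_ms) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in tes_ms)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(tes_ms, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -1.0 / slope

def t2_corrected_fat_percentage(tes_ms, water_integrals, fat_integrals):
    """Extrapolate water and fat to TE = 0 and return fat / (fat + water) in %."""
    water_s0, _ = fit_monoexponential(tes_ms, water_integrals)
    fat_s0, _ = fit_monoexponential(tes_ms, fat_integrals)
    return 100.0 * fat_s0 / (fat_s0 + water_s0)

# TEs from the protocol; the integrals are synthetic (hypothetical T2 values).
tes = [12, 24, 36, 48, 72]
water = [940.0 * math.exp(-te / 25.0) for te in tes]  # assumed water T2 = 25 ms
fat = [60.0 * math.exp(-te / 60.0) for te in tes]     # assumed fat T2 = 60 ms

fat_pct = t2_corrected_fat_percentage(tes, water, fat)
uncorrected_pct = 100.0 * fat[0] / (fat[0] + water[0])  # ratio at TE = 12 ms only
print(f"T2-corrected: {fat_pct:.2f} %, uncorrected at TE = 12 ms: {uncorrected_pct:.2f} %")
```

Because water typically decays faster than fat, the uncorrected ratio at a single TE overstates the fat percentage in this synthetic case, which is the bias the extrapolation to TE = 0 is meant to remove.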

Image processing

All ROI measurements were performed by one radiologist with 3 years of clinical experience reading liver MRIs. To assess the agreement between MRI-PDFF and SVS, three square-shaped ROIs of 3 × 3 cm on three contiguous slices were placed at the location of the single VOI used in SVS (“VOI-PDFF”) (Fig. 1a). A 4-cm² elliptical ROI was placed on the MRI-PDFF fat percentage map in each of the seven Couinaud segments from II to VIII to estimate segmental fat deposition (“segmental-PDFF”) (Fig. 1b–d). Also, to cover nearly the entire liver at three different levels, free-drawn ROIs along the liver margin were measured (“free-drawn-PDFF”). The three levels included the confluence of the right hepatic vein, the umbilical portion of the left portal vein, and the posterior branch of the right portal vein. To take precise measurements, free-drawn ROIs were carefully drafted using a tablet pen on the most clearly distinguishable image of the six echo images (Fig. 1e–g). Drawings were placed at least 5 mm from the liver margin to avoid peri-hepatic fat. Any space-occupying lesions and artifacts that could affect hepatic fat estimation were avoided where possible. These ROIs were copied to the fat percentage map using the copy and paste function of the PACS system (Maroview 5.4; Marotech, Seoul, Korea). Using a similar free-drawn ROI measurement, we measured the fat content in each hepatic lobe separately (“free-drawn-PDFF-2” as the mean fat percentage of both lobes, “free-drawn-PDFF-Rt” and “free-drawn-PDFF-Lt” as the fat percentages of the right and left lobes, respectively) (Fig. 1h, i). A weighted average using the ROI size was calculated such that an average fat fraction across the liver was reported.
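The ROI-size-weighted average mentioned above can be sketched as follows. This is only an illustration of the arithmetic, assuming each ROI contributes in proportion to its area; the function name and the ROI areas and fat fractions are hypothetical, not study data.

```python
def area_weighted_fat_fraction(rois):
    """Return the area-weighted mean fat fraction (%).

    rois: list of (area_cm2, mean_fat_fraction_percent) tuples, one per ROI.
    """
    total_area = sum(area for area, _ in rois)
    if total_area <= 0:
        raise ValueError("total ROI area must be positive")
    return sum(area * ff for area, ff in rois) / total_area

# Hypothetical free-drawn ROIs at the three anatomical levels.
example_rois = [(95.0, 5.9), (120.0, 5.4), (85.0, 5.0)]

weighted_ff = area_weighted_fat_fraction(example_rois)
simple_ff = sum(ff for _, ff in example_rois) / len(example_rois)
print(f"area-weighted: {weighted_ff:.3f} %, unweighted: {simple_ff:.3f} %")
```

Weighting by area keeps a small ROI from influencing the whole-liver estimate as much as a large one, which matters when fat deposition differs between levels.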

Fig. 1

A 66-year-old male with 6.17 % fat fraction (FF) in SVS: Three square-shaped ROIs from three contiguous slices were placed on the same location as the single voxel-of-interest (VOI) used in SVS. The mean FF value of three square-shaped ROIs (5.8 %) was defined as “VOI-PDFF” (a). The elliptical region-of-interest (ROI) was placed on the fat percentage map at seven Couinaud segment locations from II to VIII (“segmental-PDFF”). The mean FF of the seven segments was 5.5 % (b–d). Three free-drawn ROIs were made using a tablet pen along the liver margin. The mean FF value of the three free-drawn ROIs (5.4 %) was defined as “free-drawn-PDFF” (e–g). The mean FF values of the six free-drawn ROIs of both lobes (three each for right and left lobe) were defined as “free-drawn-PDFF-2” (h, i)

Statistical analysis

All statistical analyses were performed using SPSS software, version 20.0 (IBM Corp, Armonk, NY, USA). All results are expressed as a mean ± standard deviation (SD).

All patients were dichotomized into either the steatotic or the non-steatotic group based on whether they were above or below the SVS-determined fat percentage of 5.56 % [20]. An independent t test was used to compare liver fat content between the two liver lobes for each MRI-PDFF method (VOI-PDFF, segmental-PDFF, free-drawn-PDFF, and free-drawn-PDFF-2) as well as SVS. An analysis of variance (ANOVA) with Bonferroni correction was used to compare the fat content measured by each MRI-PDFF method and SVS. To assess the accuracy of MRI-PDFF, correlations between each MRI-PDFF method and SVS were assessed using Pearson’s correlation and a linear regression test. Bland–Altman analysis was performed to estimate the agreement between each of the PDFF methods.
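As a rough illustration of the agreement statistics named above, the sketch below computes Pearson's correlation coefficient and Bland–Altman limits of agreement. It assumes the common convention of mean difference ± 1.96 standard deviations of the paired differences; the study itself used SPSS, and the paired values here are hypothetical.

```python
import math
import statistics

def pearson_r(a, b):
    """Pearson's correlation coefficient between paired measurements."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    sxy = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sxx = sum((x - mean_a) ** 2 for x in a)
    syy = sum((y - mean_b) ** 2 for y in b)
    return sxy / math.sqrt(sxx * syy)

def bland_altman(a, b):
    """Return (mean difference, lower limit, upper limit) for a - b.

    Limits of agreement are assumed to be mean difference +/- 1.96 SD.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired fat percentages from two ROI methods.
method_a = [2.0, 4.0, 6.0, 8.0]
method_b = [1.5, 4.5, 5.5, 7.5]

r = pearson_r(method_a, method_b)
bias, lower, upper = bland_altman(method_a, method_b)
print(f"r = {r:.3f}; bias = {bias:.2f} %; limits = {lower:.2f} to {upper:.2f} %")
```

A high correlation alone does not establish agreement, which is why the two statistics are reported together: the Bland–Altman limits show how far individual paired measurements can differ.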

The liver fat content difference between the right and the left lobe was compared using free-drawn-PDFF-2 with an independent t test. Repeated ANOVA with one repeated factor was used to compare liver fat content as determined by each PDFF method and SVS. To test segmental heterogeneity, repeated ANOVA with one repeated factor was also used to compare the fat content of seven segments using segmental-PDFF.

Diagnostic performance of each method was compared using McNemar’s test. Receiver operating characteristic (ROC) analysis with Youden index was used to determine the optimal cut-off values of MRI-PDFF methods to diagnose hepatic steatosis. A p value of less than 0.05 was considered to be a statistically significant difference.
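The Youden-index selection of a cut-off can be sketched as below. This is a minimal illustration, assuming a case is called steatotic when its PDFF value is at or above the candidate cut-off and that the optimal cut-off maximizes J = sensitivity + specificity − 1; the function name and the data are hypothetical and do not reproduce the SPSS procedure used in the study.

```python
def youden_optimal_cutoff(values, reference_positive):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    values: PDFF fat percentages; reference_positive: booleans from the
    reference standard. A case is test-positive when value >= cutoff.
    Ties are resolved in favor of the lowest cutoff.
    """
    n_pos = sum(reference_positive)
    n_neg = len(reference_positive) - n_pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, p in zip(values, reference_positive) if p and v >= cutoff)
        tn = sum(1 for v, p in zip(values, reference_positive) if not p and v < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical PDFF values with reference-standard labels.
pdff = [1.8, 2.6, 3.9, 4.4, 4.9, 5.2, 6.3, 8.0]
svs_steatotic = [False, False, False, False, True, False, True, True]

cutoff, sens, spec = youden_optimal_cutoff(pdff, svs_steatotic)
print(f"cut-off = {cutoff} %, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this toy data the selected cut-off lies below 5.56 %, because lowering the threshold recovers a reference-positive case at the cost of one false positive.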

Results

Liver fat content

The mean fat percentage of each method was estimated in all patients, in the steatotic group, in the non-steatotic group, and separately in each liver lobe (Table 1). There were no significant differences among the different methods in all patients and in the steatotic group. However, in the non-steatotic group, the mean fat percentage estimated by free-drawn-PDFF was significantly higher than that estimated by segmental-PDFF (2.78 vs. 2.35 %; p = 0.024).

Table 1 Liver fat content measured by each method

Estimation accuracy and agreement between ROI methods

Correlation between SVS and each MRI-PDFF method showed overall good agreement (r = 0.909–0.977). The strongest correlation was found between SVS and VOI-PDFF (r = 0.977, p < 0.001) (Fig. 2). Free-drawn-PDFF-2 of the right lobe also showed good correlation (r = 0.970, p < 0.001). Free-drawn-PDFF-2 of the left lobe showed the weakest correlation (r = 0.909). Figure 3 shows Bland–Altman plots between VOI-PDFF and segmental-PDFF, VOI-PDFF and free-drawn-PDFF, free-drawn-PDFF and segmental-PDFF, and free-drawn-PDFF and free-drawn-PDFF-2. Among them, the Bland–Altman plot comparing fat percentage with VOI-PDFF and segmental-PDFF showed the highest mean difference and widest 95 % limits of agreement (mean difference, 0.32 %; limits of agreement, −2.02 to 2.65 %). The 95 % limits of agreement between the fat percentage of free-drawn-PDFF and free-drawn-PDFF-2 were the narrowest in this study (−0.82 to 1.39 %).

Fig. 2

Scatter plot and regression line correlations between SVS and PDFF estimation from the same voxel-of-interest (VOI) location used in SVS (VOI-PDFF)

Fig. 3

Bland–Altman plot representing the difference between VOI-PDFF, segmental-PDFF, free-drawn-PDFF, and free-drawn-PDFF-2. Among them, the plot between free-drawn-PDFF and free-drawn-PDFF-2 demonstrates the best agreement (mean difference, 0.29 %; 95 % Bland–Altman limits of agreement, −0.82 to 1.39 %)

Heterogeneity of hepatic fat deposition

In general, we found that the right lobe had a greater degree of fat deposition than the left lobe when using the free-drawn-PDFF-2 method (p < 0.001) (Table 1). The mean difference in fat percentage between the two lobes, estimated by free-drawn-PDFF-2, was 0.93 % (range 0.01–5.01 %). The difference in fat percentage between the two lobes was statistically significant in both the steatotic and non-steatotic groups (p < 0.001) (Fig. 4). There were five cases of discrepancy in which the estimated free-drawn-PDFF-2 was lower than 5.56 % on one side of the liver and higher than 5.56 % on the other side (Fig. 5).

Fig. 4

Waterfall plot highlighting the differences in steatosis between the right and left liver lobe

Fig. 5

An 18-year-old female with hepatic steatosis: In-phase (a), opposed-phase (b) and PDFF fat percentage map (c) show uneven fatty distribution. A free-drawn region-of-interest (ROI) was placed at the umbilical portion of the left portal vein as representative image of “free-drawn-PDFF” (d), the free-drawn ROI of the right lobe (e), and the free-drawn ROI of the left lobe (f) were placed at the same level as representative images of “free-drawn-PDFF-2”. Fat fractions (FF) of both lobes, right lobe, left lobe, and SVS were 7.5, 9.8, 6.1, and 15.7 %, respectively

The average maximum difference in fat content between segments across the entire study sample was 3.83 % (range 0.90–17.10 %). The maximum difference between segments was greater than 5.56 % in 34 patients (21.8 %). There was no systematic variation in segmental-PDFF between segments in steatotic and non-steatotic groups when analyzed by repeated ANOVA (F = 0.74, p = 0.62).

Diagnostic performance of each PDFF method

Among 156 patients, 54 patients (34.6 %) were classified as having hepatic steatosis using SVS with a cut-off value of 5.56 %. Using VOI-PDFF with the same cut-off value as with SVS (5.56 %), 51 patients (32.7 %) were classified as having hepatic steatosis. Therefore, VOI-PDFF had a sensitivity and specificity of 94.4 and 100 %, respectively, for diagnosing hepatic steatosis. When using segmental-PDFF for diagnosing hepatic steatosis, 44 patients (28.2 %) were diagnosed, and sensitivity and specificity were calculated as 81.5 and 100 %, respectively. Using free-drawn-PDFF with the same cut-off value, 46 patients (29.5 %) were classified as having hepatic steatosis, and a sensitivity and specificity of 85.2 and 100 % were calculated. Using the free-drawn-PDFF-2 technique, 46 patients (29.5 %) were diagnosed, and a sensitivity and specificity of 85.2 and 100 % were calculated. Free-drawn-PDFF-2 of the right lobe showed the same sensitivity and specificity as the mean free-drawn-PDFF-2, but the left lobe had calculated sensitivity and specificity values of 79.6 and 100 %. All cases of discrepancy between SVS and each PDFF method were classified as having hepatic steatosis only by SVS and not by the PDFF method.
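The sensitivity and specificity values above follow directly from the reported counts. The short sketch below reproduces the arithmetic, relying on the statement that every discrepant case was positive by SVS only, so each PDFF method has no false positives and its positive count equals its true-positive count; the function and variable names are illustrative.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) in percent, rounded to one decimal."""
    sens = 100.0 * tp / (tp + fn)
    spec = 100.0 * tn / (tn + fp)
    return round(sens, 1), round(spec, 1)

TOTAL_PATIENTS = 156
SVS_POSITIVE = 54                                # reference-standard steatosis
SVS_NEGATIVE = TOTAL_PATIENTS - SVS_POSITIVE     # 102 non-steatotic patients

# Patients classified as steatotic by each method at the 5.56 % cut-off.
method_positive = {
    "VOI-PDFF": 51,
    "segmental-PDFF": 44,
    "free-drawn-PDFF": 46,
    "free-drawn-PDFF-2": 46,
}

results = {}
for name, positives in method_positive.items():
    # No false positives were reported, so all method positives are true positives.
    results[name] = sensitivity_specificity(
        tp=positives, fn=SVS_POSITIVE - positives, tn=SVS_NEGATIVE, fp=0
    )
    print(name, results[name])
```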

The diagnostic performance of SVS differed significantly from that of all PDFF methods except VOI-PDFF (p < 0.05). The performance of VOI-PDFF was significantly different from that of free-drawn-PDFF-2 of the left lobe (p = 0.021).

Using SVS as a reference, the area under the ROC curve (AUROC) was calculated for MRI-PDFF methods. AUROC for VOI-PDFF was 0.997 and the cut-off value for steatosis was 5.1 %. Using the same reference, AUROC for segmental-PDFF was 0.999 and the cut-off value was 4.2 %. AUROC for free-drawn-PDFF and free-drawn-PDFF-2 were 0.997 and 0.998 with cut-off values of 4.5 and 5.48 %, respectively. AUROC for free-drawn-PDFF-2 of the right lobe and free-drawn-PDFF-2 of the left lobe were 0.996 and 0.993 with cut-off values of 4.5 and 4.0 %, respectively.

Discussion

We used a hybrid multi-step adaptive fitting approach with multi-echo VIBE acquisition for hepatic fat quantification, which is a newly developed MRI-PDFF estimation method. It is similar to the iterative decomposition of water and fat with echo asymmetry and least squares estimation (IDEAL), which has been used for hepatic fat quantification and has shown good histologic correlation [28, 30]. Our MRI-PDFF method differs from IDEAL in that it uses both magnitude and complex data calculations, combining the strengths of both: the insensitivity to phase errors and eddy current effects inherent to magnitude-based methods and the broader dynamic range of complex-based methods [31]. Many other factors, such as T2* decay, T1 bias, and multi-peak fat modeling, are also taken into account in this method for accurate fat quantification. Additionally, R2*/T2* measurements are obtained, which are valuable indicators of hepatic iron deposition. Finally, separate measurements of water and fat R2* are obtained, which may also be informative.

In this study, we found the best correlation to be between the SVS and the VOI-PDFF methods. However, correlation between the SVS and the free-drawn-PDFF-Lt was relatively poor. Because VOI-PDFF measurements were performed in almost the same anatomical region as in SVS, fat estimation by VOI-PDFF should be very similar to that of SVS, which was one of our findings (r = 0.977). According to a previous study by Pineda et al., SVS is an accurate and reproducible MRS for hepatic fat quantification [36]. Considering there have been no previous in vivo studies on multi-echo VIBE acquisition with adaptive fitting with a large patient cohort, we showed that this technique enables accurate hepatic fat quantification.

Recent studies showed that PDFF was well correlated with histological methods, but there were some important differences in agreement. One of the reasons for the low accuracy of agreement is that, when using the PDFF method, one estimates the proportion of mobile protons contained within fat molecules in a three-dimensional liver voxel, while histological analysis assesses the proportion of hepatocytes that contain macrovesicles of fat in a two-dimensional slide [28, 30]. Another reason for the agreement difference is the possibility for liver fat content to change over time, which may be due to the time interval that passes between liver biopsy and PDFF analysis. Also, liver biopsy may suffer from sampling variability, as liver biopsies are typically performed percutaneously from only one segment. Tissue samples only represent 1/50,000–1/65,000 of the whole liver, a significant weakness of percutaneous liver biopsy. Interobserver and intraobserver variability is another well-known limitation of histological assessment [17, 37, 38].

Additionally, the heterogeneity of fat deposition in the liver could simultaneously affect MR fat quantification and biopsy results. Several recent studies showed heterogeneity in liver fat content using MR fat quantification. Idilman et al. [30] reported that the right lobe had higher fat content than the left, but this difference was not significant. Capitan et al. [34] also found that steatosis is usually greater in the right lobe than the left lobe; they further found that heterogeneous fat deposition can sometimes lead to a misdiagnosis of hepatic steatosis. To explain this phenomenon, Capitan et al. suggested a streamline theory, wherein the right and left liver lobes act as functionally separate organs. In support of this theory, they cited that portal blood flow, which is conveyed via the superior mesenteric vein, contains dietary fat and flows mainly into the right liver and blood from the splenic vein flows mainly into the left liver [39]. Additionally, steatosis within the right and left lobes could be segmentally heterogeneous. Therefore, Capitan et al. [34] cited a need for methods that quantify steatosis over a larger region.

The findings of our study were similar to those cited above regarding heterogeneity of liver fat content. The fat percentage of the right lobe, as determined by free-drawn-PDFF-2, was higher than that of the left lobe in both the steatotic and non-steatotic groups (p < 0.001). Voxels used in SVS were usually located in segment VI or VII of the right lobe. Since the right lobe had a relatively higher fat content as compared to the left lobe, SVS results may have had a tendency to overestimate the fat content of the liver, particularly if the difference between the two lobes was large. There were five discrepant cases in our study in which the fat content of one side of the liver was high enough to qualify as steatotic while the other side did not meet such criteria. Among the discrepant cases, four had steatosis only in the right lobe and one had steatosis only in the left lobe. This asymmetry could account for possible over- or underestimation when fat percentage is estimated in only one side of the liver. Furthermore, in cases of severely asymmetrical fatty infiltration, SVS may lead to erroneous results because this technique reflects only fat deposition in a single voxel (Fig. 5).

To overcome the heterogeneity of fat deposition and sampling variability, and to acquire a more representative measure of the fat content of the entire liver, a large area of estimation is needed. We used free-drawn-PDFF to measure a large area of the liver without shape or size limitations. Free-drawn-PDFF was measured by averaging the FF of free-hand drawn ROI measurements at three different levels. Extreme cranial- and caudal-level views of the liver, which may have been affected by motion artifacts from the diaphragm or intestine, were excluded. The three levels that we used were relatively easy to define, nearly free from motion artifact, and identifiable by discrete anatomical structures. We found that large-area estimations such as free-drawn-PDFF and free-drawn-PDFF-2 showed relatively lower agreement with VOI-PDFF. Also, the difference in diagnostic performance between VOI-PDFF and free-drawn-PDFF approached but did not reach statistical significance (p = 0.063), and a significant difference was present between SVS and free-drawn-PDFF (p = 0.008). Although there is a lack of statistical evidence to prove that free-drawn-PDFF is superior to the conventional reference standard, we speculate that the relatively low agreement with MRS and the difference in diagnostic performance are due to the large-area coverage that free-drawn-PDFF achieves as a means of whole-liver fat quantification. Therefore, this technique could be regarded as a new method to accurately measure liver fat content.

With an emphasis on the cut-off value for discriminating the grade of steatosis, three recent studies comparing core needle biopsy or 1-cm³-sized surgical specimen results and MRI-PDFF with a mean fat percentage of eight to nine Couinaud segments had different cut-off values compared with previous studies [28, 30, 40]. According to a study by Tang et al. [28], MRI-PDFF cut-off values of 6.4, 17.5, and 22.1 % proved to be highly accurate and showed moderate sensitivity and high specificity for distinguishing between steatosis grade 0 and grade 1 or greater, between grade 1 or less and grade 2 or greater, and between grade 2 or less and grade 3, respectively. Also, Idilman et al. [30] showed that the sensitivity and specificity were 93 and 85 % with an MRI-PDFF cut-off value of 15.03 % to differentiate moderate or severe steatosis from mild or no steatosis. Another study with living liver donors showed that an MRI-PDFF cut-off value of 5 % resulted in 100 % sensitivity and 91 % specificity for detecting more than 5 % steatosis [40]. However, our study showed lower cut-off values (4.0–5.5 %) in MRI-PDFF methods for identification of steatosis compared to previous studies. The discrepancy in these cut-off values may be due to the various reasons mentioned previously; however, the small sampling size of liver biopsy compared to multiple ROI measurements using MRI-PDFF is likely a key aspect. As mentioned above, the right lobe had a greater extent of steatosis than the left lobe in our study. Therefore, hepatic fat heterogeneity could be considered in novel standards for PDFF, and our method may be particularly useful for establishing new reference values.

Our study has several limitations. First, it was a retrospective study and our study population had a limited number of patients with alcoholic liver disease and homogeneous fatty liver, which may have biased our results. Our results require a large-scale prospective validation study to be applied in a more general population. Second, the reference standard we used for measuring liver fat content was a single-voxel MRS instead of histology. Because we sought to estimate fat content in the largest area possible, comparing free-drawn-PDFF with MRS or histology with whole liver coverage is impractical. However, based on previous studies regarding the accuracy of MRS, we first compared VOI-PDFF and MRS to show that our PDFF method is accurate and feasible for estimating a wide range of fat percentages. Third, we could not estimate the fat content of segment I because the anatomical border was vague in many samples. Fourth, as mentioned above, because there is not yet a clear cut-off value for diagnosing steatosis via PDFF, we adopted 5.56 % as the cut-off value. There have been no studies focusing on the MRI-PDFF-determined cut-off value and therefore a large, prospective study with healthy subjects without a history of liver disease or risk factors for hepatic steatosis is needed. We assume that after further studies, the cut-off value may shift from the 5.56 % MRS-determined value used in this study [20].

In conclusion, free-drawn-PDFF measurements provide fast, accurate, and more generalized information regarding hepatic fat deposition; this is particularly true if fat deposition is heterogeneous. This method combined with a large area of estimation could be considered to be a new reference standard for MR fat quantification to replace MRS.