Introduction

Nonalcoholic fatty liver disease (NAFLD) comprises a continuum from isolated hepatic steatosis (HS) to steatohepatitis (NASH), through bridging fibrosis and eventually cirrhosis, and is emerging as the leading cause of hepatic failure in the Western world [1, 2]. The prevalence of NAFLD may be as high as 25 % of overweight adolescent girls and up to 38 % of all overweight children [1, 3, 4]. Insulin resistance (IR) and metabolic syndrome are strongly associated with NAFLD and contribute to development of NASH [5, 6] by facilitating intrahepatocellular accumulation of triglycerides and fatty acids [7, 8]. Accumulation of fatty acids causes oxidative stress and activation of stellate cells, which can lead to hepatocellular injury [9].

Early diagnosis is important because prognosis is improved when NAFLD is identified before progression to NASH [1, 10]. Unfortunately, identification of isolated steatosis in children is difficult and up to 68 % of children and adolescents with NAFLD already have NASH at diagnosis [4, 11]. While elevations in liver transaminases are often used to screen for NAFLD, multiple studies in children have shown that alanine aminotransferase (ALT) correlates poorly or not at all with early steatosis [1, 3, 7, 12, 13]. Given the insensitivity of ALT as a marker of NAFLD, it is likely that NAFLD in children and adolescents is under-diagnosed, particularly in the early stages [14].

A number of imaging techniques have been used to detect and quantify HS. Ultrasound (US) is commonly used, but sensitivity is poor when histological steatosis grading is <30 % [15]. Computed tomography (CT) is more specific than US, but it also performs poorly at lower degrees of steatosis [16] and requires ionizing radiation.

Quantitative MR spectroscopy (MRS) is widely considered to be the non-invasive reference standard to quantify liver fat and correlates strongly with steatosis measured by biopsy [17, 18]. In recent adult studies, emerging confounder-corrected quantitative MRI methods for estimating hepatic triglyceride concentration demonstrated equivalent accuracy to single-voxel MRS, with the added advantage of providing high spatial resolution over the entire liver [19–22]. Both MRS and quantitative MRI methods estimate the proton density fat-fraction (PDFF), which is a fundamental property of tissue that measures hepatic triglyceride concentration [23]. Although studies in adults are promising, there is a paucity of data on the use of quantitative MRI to measure hepatic PDFF in healthy populations of children and adolescents [24].

The purpose of this work was to perform a prospective comparison of a complex confounder-corrected chemical shift-encoded quantitative MR imaging method with MR spectroscopy for quantification of HS in adolescent girls. A secondary goal of this work was to determine the clinically significant PDFF threshold of HS in this population.

Materials and methods

Study design and subjects

This Health Insurance Portability and Accountability Act-compliant study was approved by our institutional review board. Study subjects comprised females who responded to a general invitation to participate in this study that was distributed to our general and endocrine paediatric clinics and a local middle school. After informed written consent and assent were obtained, an MRI safety screen, a brief survey of personal and family medical history, medication use, and self-identified race and ethnicity (per National Institutes of Health race and ethnicity criteria for subjects in clinical research) were collected. Study entrance criteria included female sex and age between 11 and 22 years. Exclusion criteria included a history of chronic disease that affected hepatic or renal function including: Type 1 or Type 2 diabetes mellitus, known liver disease or other chronic illness, treatment with medications including oral contraceptives, lipid-lowering or glucose metabolism altering agents, or vitamin E supplements greater than 100 IU daily, pregnancy, or excess alcohol consumption defined as greater than an average of 1.5 drinks per day, and standard contraindications to MRI (metallic implants, claustrophobia, etc.). We enrolled 136 subjects, and 132 subjects successfully completed both MRI and MRS measures. It should be noted that data acquired from the complete group of subjects were previously reported in a study which proposed a risk assessment model for early detection of HS using common anthropometric and metabolic markers [25]. The only overlapping data are patient characteristics, and comparison of MRI and MRS was not evaluated in the previous manuscript.

Height was measured using a stadiometer and recorded to the nearest 0.5 cm. Waist circumference (WC) was measured twice just above the iliac crests with Graham-Field® cloth woven measuring tape, and the average was recorded to the nearest 1 mm. Weight was measured without shoes in light clothes on a beam balance platform scale to the nearest 0.1 kg. Body mass index (BMI) was then calculated. Self-assessment of Tanner staging for breast and pubic hair was also performed [26].

Laboratory

Fasting blood samples were obtained within 30 days of MRI and analyzed at the University of Wisconsin Laboratory for lipids [total cholesterol, high-density lipoprotein (HDL), calculated low-density lipoprotein (LDL), and triglycerides], AST, ALT, glucose, and insulin. Glucose was determined by the hexokinase method and insulin by chemiluminescent immunoassay. ALT was determined by an NADH assay with pyridoxal-5-phosphate. Total cholesterol and triglycerides were determined by enzymatic assay, and HDL by a direct homogeneous assay. At the time of this study, the normal reference range for ALT at the university laboratory was less than or equal to 65 U/L. The homeostasis model assessment of insulin resistance (HOMA-IR) was calculated as fasting glucose (mg/dL) × fasting insulin (μU/mL)/405 [27]. The presence of metabolic syndrome was identified using two different sets of criteria. The first, Met-IFG, refers to the presence of at least three of the following five criteria: fasting blood glucose ≥100 mg/dL, blood pressure >90th percentile for age/height/sex [28], waist circumference >90th percentile for age/sex [29], HDL <40 mg/dL, and triglycerides >150 mg/dL [30]. The second, Met-IR, substitutes HOMA-IR ≥4.0 for impaired fasting glucose [31].
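The HOMA-IR formula and the two sets of metabolic syndrome criteria described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the software used in the study: the `Subject` class and its field names are hypothetical, and the blood pressure and waist circumference percentile lookups [28, 29] are assumed to have been done elsewhere and passed in as booleans.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    # Hypothetical container; field names are illustrative assumptions.
    fasting_glucose_mg_dl: float
    fasting_insulin_uU_ml: float
    hdl_mg_dl: float
    triglycerides_mg_dl: float
    bp_above_90th_pct: bool     # percentile lookup assumed done elsewhere
    waist_above_90th_pct: bool  # percentile lookup assumed done elsewhere


def homa_ir(glucose_mg_dl: float, insulin_uU_ml: float) -> float:
    """HOMA-IR = fasting glucose (mg/dL) x fasting insulin (uU/mL) / 405."""
    return glucose_mg_dl * insulin_uU_ml / 405.0


def metabolic_syndrome(s: Subject, use_homa_ir: bool = False) -> bool:
    """Met-IFG by default; Met-IR when use_homa_ir is True.

    Met-IR swaps the fasting glucose criterion for HOMA-IR >= 4.0.
    Either way, at least three of the five criteria must be met.
    """
    if use_homa_ir:
        glucose_criterion = (
            homa_ir(s.fasting_glucose_mg_dl, s.fasting_insulin_uU_ml) >= 4.0
        )
    else:
        glucose_criterion = s.fasting_glucose_mg_dl >= 100.0
    criteria = [
        glucose_criterion,
        s.bp_above_90th_pct,
        s.waist_above_90th_pct,
        s.hdl_mg_dl < 40.0,
        s.triglycerides_mg_dl > 150.0,
    ]
    return sum(criteria) >= 3


# Made-up values for illustration; not a study subject.
example = Subject(90.0, 20.0, 38.0, 120.0, False, True)
example_homa_ir = homa_ir(example.fasting_glucose_mg_dl, example.fasting_insulin_uU_ml)
example_met_ifg = metabolic_syndrome(example)
example_met_ir = metabolic_syndrome(example, use_homa_ir=True)
```

In this made-up example, fasting glucose is normal, so only two Met-IFG criteria are met, whereas the HOMA-IR of about 4.4 supplies the third criterion under Met-IR. This illustrates how the two definitions can classify the same subject differently.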

Quantitative MRI-PDFF measurements

Imaging was performed using a clinical 3 T system (MR750, GE Healthcare, Waukesha, WI, USA) with a 32-channel phased array body coil (Neocoil, Pewaukee, WI, USA). Volumetric imaging of the liver was performed using an investigational version of a 3D multi-echo complex-based chemical shift-encoded water-fat separation method, similar to that previously reported at 1.5 T [21, 22], to generate PDFF maps over the entire liver. Specific image acquisition parameters included: field-of-view = 44 × 40 cm, first echo time (TE)/repetition time (TR) = 1.2/8.6 ms, echo spacing = 2.0 ms, echo-train length = 6 (two shots of three echoes), BW = ±111 kHz, flip angle = 3° to minimize T1 bias, 8 mm slices, 32 slices, and 256 × 160 matrix. An autocalibrated 2D parallel imaging method [32] with an effective acceleration factor of 2.86 was used to reduce imaging time to a 23-s breath-hold.

Separated water-only and fat-only images, as well as MRI-PDFF maps [23] were automatically generated using an online reconstruction method that addresses or corrects for all known confounders of fat quantification. These include: spectral modelling of fat [33, 34], eddy currents [35], T1 bias [36], T2* decay [33], and noise-related bias [36]. Because all known confounders have been addressed, the resulting MRI-PDFF maps provide an accurate and fundamental measure of the triglyceride concentration in tissue [23].

MRI-PDFF was measured in two ways. First, MRI-PDFF was measured from PDFF maps by using a 2.0 × 2.0 cm ROI (167 pixels) co-localized with the MR spectroscopy voxel and identical in size (in-plane) to the MR spectroscopy voxel. Co-localization was performed by using the coordinates of the MR spectroscopy voxel recorded in the header of the MR spectroscopy data, on the single imaging slice that was closest to the centre of the MR spectroscopy voxel. The ROI was centred at the same anterior-posterior and/or left-right in-plane coordinates as the MR spectroscopy voxel. Second, MRI-PDFF was measured by placing a single region of interest (ROI) in each of the nine Couinaud segments of the liver. The largest circular ROI that could be placed while avoiding large vessels or bile ducts was used. The final estimate of MRI-PDFF was determined from the average of these values [21]. HS was defined as a hepatic MRS-PDFF >5.6 % [37].

Quantitative MRS-PDFF measurements

Single-voxel MRS was performed to serve as the reference for PDFF, using a single-voxel STEAM (stimulated echo acquisition mode) acquisition without water suppression [38]. A 2.0 × 2.0 × 2.0 cm³ voxel was placed in the posterior segment of the right hepatic lobe (segment VI or VII) in an area that avoided the lung base, large vessels, bile ducts, or obvious abnormalities (e.g., mass). After a single pre-acquisition excitation, five single-average spectra were acquired consecutively at progressively longer echo times of 10, 15, 20, 25, and 30 ms, with a repetition time (TR) of 3500 ms to avoid T1-weighting, for a total breath-hold time of 21 s. Mixing time was 5 ms, and receiver bandwidth was ±2.5 kHz with 2048 readout points. All MRS spectra were analyzed using the AMARES method under jMRUI, as previously described [21, 22]. Correction for T2 decay was performed for both the water and fat peaks, providing a T2-corrected estimate of MRS-PDFF.
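The idea behind T2 correction of multi-echo spectra can be illustrated with a minimal sketch. It assumes that water and fat peak areas have already been extracted at each echo time (the study did this with AMARES in jMRUI, which is not reproduced here) and that each signal decays mono-exponentially, so that a log-linear fit can extrapolate the areas back to TE = 0. The function names, the single lumped fat peak, and the T2 values in the synthetic example are assumptions made for illustration.

```python
import numpy as np

TE_MS = np.array([10.0, 15.0, 20.0, 25.0, 30.0])  # echo times from the protocol


def extrapolate_to_te0(te_ms, peak_areas):
    """Fit ln(S) = ln(S0) - TE/T2 by least squares; return (S0, T2 in ms)."""
    slope, intercept = np.polyfit(te_ms, np.log(peak_areas), 1)
    return float(np.exp(intercept)), float(-1.0 / slope)


def t2_corrected_pdff(te_ms, water_areas, fat_areas):
    """PDFF (%) from water and fat peak areas, each extrapolated to TE = 0."""
    w0, _ = extrapolate_to_te0(te_ms, water_areas)
    f0, _ = extrapolate_to_te0(te_ms, fat_areas)
    return 100.0 * f0 / (w0 + f0)


# Synthetic, noise-free signals: true PDFF is 10 %, T2 values are assumed.
water = 90.0 * np.exp(-TE_MS / 25.0)
fat = 10.0 * np.exp(-TE_MS / 60.0)

pdff_corrected = t2_corrected_pdff(TE_MS, water, fat)
# Fat fraction taken directly from the first echo, with no T2 correction.
pdff_uncorrected = 100.0 * fat[0] / (water[0] + fat[0])
```

Because water is assumed here to decay faster than fat, the uncorrected fat fraction at the first echo overestimates the true value, while the extrapolated estimate recovers it. With real, noisy spectra a weighted or non-linear fit may be preferable to this log-linear one.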

Statistical analysis

Subject characteristics and metabolic markers were summarized using means and standard deviations or frequencies and percentages. The comparison of ALT and metabolic markers between subjects with HS and subjects without HS was performed using a two-sample t-test. Regression analysis was conducted to evaluate the association between MRI-PDFF and MRS-PDFF measurements. Since the distribution of PDFF was skewed at lower PDFF values, all MRI and MRS-PDFF values were log-transformed when conducting the regression analysis. Furthermore, to quantify the level of reproducibility between MRI and MRS-PDFF measurements, the intra-class correlation (ICC) coefficient was calculated using a one-way random effects model. The bootstrap method was used to calculate the 95 % confidence interval of the ICC. The reproducibility between MRI and MRS-PDFF measurements was displayed in graphical format using a Bland-Altman plot [39]. Sensitivity and specificity of MRI-PDFF was evaluated using MRS-PDFF as the reference with commonly used threshold of 5.6 % [37]. Non-parametric Spearman’s rank correlation analysis was conducted to examine the association between MRI-PDFF and metabolic measures. To evaluate the clinical utility of MRI-PDFF and its relationship with markers of metabolic syndrome, a receiver operating characteristics (ROC) curve analysis was also conducted. The predictive power of MRI-PDFF for identifying subjects with metabolic syndrome was quantified by calculating the area under the curve (AUC) of the ROC curve. The Youden method was used to determine optimal thresholds for predicting metabolic syndrome. Statistical analysis was conducted using SAS software (SAS Institute Inc., Cary, NC, USA) version 9.3. All p-values are two-sided, and p < 0.05 was used to determine statistical significance.
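The agreement statistics named above can be sketched with NumPy. This is not the SAS code used in the study. It shows the textbook one-way random effects ICC, a percentile bootstrap over subjects, and the conventional Bland-Altman bias with 95 % limits of agreement (bias ± 1.96 SD of the differences), which may not match every detail of the study's implementation. The function names and the PDFF values are made up for illustration.

```python
import numpy as np


def icc_oneway(measurements):
    """ICC(1,1) from an (n_subjects, k_methods) array, one-way random effects."""
    x = np.asarray(measurements, dtype=float)
    n, k = x.shape
    subject_means = x.mean(axis=1)
    ms_between = k * np.sum((subject_means - x.mean()) ** 2) / (n - 1)
    ms_within = np.sum((x - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def bootstrap_ci(measurements, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling subjects (rows) with replacement."""
    x = np.asarray(measurements, dtype=float)
    rng = np.random.default_rng(seed)
    stats = [stat(x[rng.integers(0, len(x), len(x))]) for _ in range(n_boot)]
    low, high = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(low), float(high)


def bland_altman(mri, mrs):
    """Return bias (mean difference) and the 95 % limits of agreement."""
    diff = np.asarray(mri, dtype=float) - np.asarray(mrs, dtype=float)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return float(bias), float(bias - half_width), float(bias + half_width)


# Made-up PDFF values (%) for illustration only; not study data.
mrs = np.array([1.2, 2.0, 2.5, 3.1, 4.0, 6.5, 9.8, 15.0])
mri = mrs + np.array([0.9, 0.7, 1.0, 0.6, 0.8, 0.9, 0.7, 0.8])

paired = np.column_stack([mri, mrs])
icc = icc_oneway(paired)
icc_low, icc_high = bootstrap_ci(paired, icc_oneway, n_boot=500)
bias, loa_low, loa_high = bland_altman(mri, mrs)
```

A one-way model treats the two methods as interchangeable raters, so any systematic offset between MRI and MRS lowers the ICC. The Bland-Altman bias reports that offset directly.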

Results

Subjects and anthropometric markers

Characteristics of the subjects are presented in Table 1. Using the diagnostic criteria for HS of MRI-PDFF >5.6 % [37], HS was detected in 15 % (20/132) of all subjects, in 25 % of overweight subjects (18/71 of those with BMI >85th percentile), and in two subjects who were not overweight. Median MRI-PDFF in subjects with HS was 9.8 % (mean 13.5, SD 9.3). There were no significant differences in mean age, BMI, or waist circumference between overweight subjects with and without HS. All subjects were pubertal with self-assessed breast Tanner stage [26] of 2 or greater, and the average self-assessed breast Tanner stage [26] was not statistically different for overweight subjects with HS (4.5, SD 0.9) and overweight subjects without HS (4.3, SD 1.0); p-value 0.58.

Table 1 Subject characteristics in those with and without hepatic steatosis (HS)

Measurement of MRI-PDFF

Representative MRI-PDFF maps and the corresponding MR spectra for three subjects with low, medium, and high PDFF are shown in Fig. 1. Subjective agreement between MRI-PDFF and MRS-PDFF is noted in these examples. Linear regression analysis to compare MRI-PDFF with MRS-PDFF (Fig. 2) demonstrated excellent correlation and agreement, with an r² of 0.96, a slope of 0.97 (95 % CI: 0.94–1.00), and an intercept of 0.78 % (95 % CI: 0.58–0.98 %) when MRI-PDFF was measured as the average of ROIs obtained in all nine Couinaud segments of the liver, and an r² of 0.97, a slope of 1.04 (95 % CI: 1.01–1.07), and an intercept of 0.84 % (95 % CI: 0.64–1.03 %) when MRI-PDFF was measured co-localized with the MRS voxel. Since the distribution of PDFF was skewed at lower PDFF values, all MRI-PDFF and MRS-PDFF values were also log-transformed for regression analysis and continued to show strong correlation (Fig. 3), with r² = 0.75 when MRI-PDFF was measured as the average of ROIs obtained in all nine Couinaud segments of the liver and r² = 0.73 when MRI-PDFF was measured co-localized with the MRS voxel. For the remainder of our analysis, we used the MRI-PDFF measured as the average of the nine Couinaud liver segments, as this method was superior to the co-localized measurement on log-transformation and the majority of our subjects had PDFF values at the lower end of the scale. ICC reproducibility analysis between MRI-PDFF and MRS-PDFF found an ICC of 0.74 (95 % CI: 0.65–0.81), indicating an excellent level of reproducibility between the two measurements [40]. Furthermore, the Bland-Altman plot (Fig. 4) confirms the excellent level of reproducibility between the PDFF measures, with an estimated bias of 0.8 % (95 % CI: 0.52–0.88 %) for the MRI-PDFF measurements when compared to the MRS-PDFF reference standard.

Fig. 1

Representative examples of MRI-PDFF maps and T2-corrected MRS in three subjects, with low, medium, and high concentrations of fat.

Fig. 2

Scatterplots of MRI-PDFF plotted against MRS-PDFF in all 132 subjects: (a) MRI-PDFF measured as the average value of ROIs obtained in the nine Couinaud segments of the liver and (b) MRI-PDFF measured from ROIs that were co-localized with the MR spectroscopy voxel. Linear regression analysis of both plots demonstrated excellent correlation and agreement.

Fig. 3

Scatterplots of MRI-PDFF plotted against MRS-PDFF on a logarithmic scale, shown because clustering was observed at lower PDFF values (Fig. 2). (a) MRI-PDFF measured as the average value of ROIs obtained in the nine Couinaud segments of the liver and (b) MRI-PDFF measured from ROIs that were co-localized with the MR spectroscopy voxel. Although excellent logarithmic correlation was observed, a small positive bias appears to be present at low PDFF values.

Fig. 4

Bland-Altman plot between MRI- and MRS-PDFF measurements. The centre dotted line represents the estimated bias of the MRI-PDFF when compared to MRS-PDFF. The upper and lower dotted lines represent the 95 % confidence limits of the mean difference.

Fig. 5

Linear correlation of MRI-PDFF with common metabolic indicators was analyzed for three groups: all subjects (black linear regression line), overweight subjects (BMI >85th percentile) with hepatic steatosis (HS) (light gray linear regression line), and overweight subjects without HS (medium gray linear regression line). MRI-PDFF correlated with both BMI (a) and waist circumference (b) in all subjects, but neither correlated with MRI-PDFF in a sub-analysis of overweight subjects with and without HS. MRI-PDFF correlated strongly with ALT (c) and fasting insulin (d) in all subjects and in overweight subjects with HS, but not in overweight subjects without HS.

To evaluate the clinical utility of MRI to diagnose HS (i.e. PDFF >5.6 %), we calculated the sensitivity and specificity of MRI-PDFF for detecting HS using MRS-PDFF as the reference. MRI-PDFF diagnosis of HS had a sensitivity of 100 % (95 % CI: 79–100 %), a specificity of 96.6 % (95 % CI: 91–99 %), and a kappa of 0.87 (95 % CI: 0.75–0.99), which represents an excellent level of agreement [40].
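For readers who want to see how sensitivity, specificity, and Cohen's kappa follow from a 2 × 2 table, a short sketch is given here. The source does not report the underlying table. The counts used (16 true positives, 4 false positives, 0 false negatives, 112 true negatives) are an assumption, chosen only because they are consistent with 132 subjects, 20 of them MRI-positive, and with the reported summary values. The function name is hypothetical.

```python
def diagnostic_summary(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, and Cohen's kappa from a 2 x 2 table."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the marginal totals of each method.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1.0 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# ASSUMED counts (not given in the source); see the note above.
summary = diagnostic_summary(tp=16, fp=4, fn=0, tn=112)
```

Under these assumed counts, specificity is 112/116 and kappa is roughly 0.87. Kappa sits below the raw agreement of 128/132 because most subjects are negative by both methods, so a large share of the agreement would be expected by chance.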

MRI-PDFF and metabolic markers of HS

Figure 5 shows the associations between MRI-PDFF and common metabolic indicators in all subjects, overweight subjects with HS, and overweight subjects without HS. As shown in Fig. 5a, b, MRI-PDFF had a moderately strong correlation with both BMI (r = 0.46, p < 0.0001) and WC (r = 0.30, p < 0.001) in all subjects. However, in a sub-analysis of overweight subjects, MRI-PDFF did not correlate with either BMI or WC.

As shown in Fig. 5c, MRI-PDFF correlated moderately with ALT in all subjects (r = 0.24, p = 0.005). Sub-analysis of overweight subjects showed that MRI-PDFF correlated strongly with ALT in those with HS (r = 0.84, p < 0.0001), but not in those without HS.

Similarly, Fig. 5d shows a moderately strong correlation between MRI-PDFF and fasting insulin levels in all subjects (r = 0.63, p < 0.001). Sub-analysis of overweight subjects showed a strong correlation of MRI-PDFF with fasting insulin in those with HS (r = 0.83, p < 0.001), but no correlation in those without HS.

Additional sub-analysis of overweight subjects with and without HS is shown in Table 2. Fasting glucose, fasting insulin, HOMA-IR, triglycerides, and Met-IR were significantly higher in overweight subjects with HS (p < 0.02). However, ALT was not significantly different between these two groups. In addition, mean ALT for subjects with HS was 39 U/L (SD 25.6 U/L), and ALT was within the laboratory reference range (<65 U/L) in 16/18 of these subjects.

Table 2 Comparison of metabolic markers of hepatic steatosis in overweight subjects

Analysis of a metabolically significant MRI-PDFF threshold

ROC analysis was performed to evaluate the relationship between MRI-PDFF and clinical markers of metabolic syndrome. MRI-PDFF was found to be a good predictor of metabolic syndrome based on Met-IFG criteria, with an AUC of 0.81 (95 % CI: 0.67–0.95), and on Met-IR criteria, with an AUC of 0.81 (95 % CI: 0.67–0.95). The optimal MRI-PDFF threshold, based on the Youden method, for predicting metabolic syndrome using Met-IFG criteria was 3.5 %, with a sensitivity of 83 % (95 % CI: 55–95 %) and a specificity of 75 % (95 % CI: 67–83 %). Analogously, the optimal threshold for predicting metabolic syndrome using Met-IR criteria was 3.0 %, with a sensitivity of 80 % (95 % CI: 63–90 %) and a specificity of 81 % (95 % CI: 71–86 %).
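The Youden method selects the cut-off that maximizes J = sensitivity + specificity − 1 along the ROC curve. A minimal sketch of that selection follows. It is not the SAS procedure used in the study. The function name, the "positive when value ≥ threshold" convention, and the data are assumptions made for illustration.

```python
import numpy as np


def youden_threshold(values, labels):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    A subject is called positive when value >= threshold. Ties in J are
    resolved in favour of the lowest threshold.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best = None
    for t in np.unique(values):  # candidate cut-offs, in ascending order
        predicted = values >= t
        sens = np.sum(predicted & labels) / np.sum(labels)
        spec = np.sum(~predicted & ~labels) / np.sum(~labels)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, float(t), float(sens), float(spec))
    return best[1:]


# Made-up PDFF values (%) and metabolic syndrome labels; not study data.
pdff = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 6.0, 9.0, 12.0]
met = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
threshold, sens, spec = youden_threshold(pdff, met)
```

Youden's J weights sensitivity and specificity equally. If missed cases and false alarms carry different costs in practice, a different operating point on the same ROC curve may be more appropriate.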

Discussion

In this group of adolescent girls and young women, complex confounder-corrected chemical shift-encoded quantitative MRI accurately quantified hepatic steatosis, using MRS as the reference. Thus, this study extends findings of quantitative MRI-based methods in adult studies [21, 22] to younger subjects and demonstrates the feasibility and potential clinical utility for use in a paediatric population.

With regard to clinical relevance, MRI-PDFF proved to be a highly sensitive and specific predictor of HS and therefore may be a potential aid in early detection of NAFLD. MRI-PDFF thresholds of 3.0 % and 3.5 % were predictive of metabolic syndrome using two commonly accepted criteria incorporating fasting glucose and HOMA-IR. Importantly, these thresholds are lower than the commonly used threshold of 5.6 % to define HS in adults [37]. That value was based upon the 95th percentile of MRS-derived hepatic triglyceride content in adult subjects with no risk factors for HS, and those data were not correlated with metabolic disease markers. Data from our study suggest that a lower threshold for hepatic PDFF may be clinically relevant as an indicator of emerging metabolic syndrome in children and adolescents.

Although anthropometric markers (BMI and WC) were predictive of MRI-PDFF in the entire group, they did not correlate significantly with MRI-PDFF in overweight subjects with or without HS. This implies that BMI and WC are not useful discriminators of HS risk for adolescents and young women. In this population, overweight subjects with HS showed adverse metabolic effects, including significantly elevated fasting glucose, fasting insulin, HOMA-IR, triglycerides, and rates of metabolic syndrome compared to similar weight children without HS. This observation strengthens previous findings that hepatic triglyceride content is associated with higher rates of dyslipidemia and insulin resistance in adolescents [41–44].

Interestingly, levels of ALT, a marker of hepatocellular injury, did not significantly differ between overweight subjects with and without HS. Further, 90 % (18/20) of all subjects with HS and 89 % (16/18) of overweight subjects with HS had an ALT within the laboratory reference range (normal <65 U/L). Based on data reported in the Screening ALT for Elevation in Today’s Youth (SAFETY) study [45], Schwimmer et al. recommended using an ALT threshold of 22.1 U/L to improve sensitivity for detection of NAFLD. When applied to our subjects, this threshold identified 80 % (16/20) of all subjects with HS and 78 % (14/18) of overweight subjects with HS. However, this ALT threshold is less specific, as 42 % (22/53) of overweight subjects without HS also had an ALT ≥22.1 U/L. ALT is limited as a predictor of HS in this population. However, in the sub-group of overweight subjects who were identified as having HS, as defined by an MRS-PDFF > 5.6 %, MRI-PDFF correlated strongly with ALT. This suggests that increasing liver fat content may be associated with hepatocellular injury in these subjects.

In a previous study, we also found ALT to be a poor predictor of HS risk and developed a clinically feasible risk assessment model using fasting insulin, total cholesterol, waist circumference, and ethnicity to improve early identification of hepatic steatosis in adolescents [25]. The combination of clinical risk assessment with diagnostic imaging (e.g. ultrasound, CT, or MRI) in the evaluation of liver disease may allow for early detection of disease. In particular, MRI-PDFF may be a useful means to establish the presence of HS, while ALT may be a useful marker of hepatocellular injury once HS has been identified. Further, the low MRI-PDFF threshold identified by our ROC analysis suggests that quantitative MRI, which is more accurate than ultrasound and CT at low fat concentrations, may be useful as part of the clinical evaluation of early HS in this population.

A unique contribution of this study is the simultaneous acquisition of both imaging and serum metabolic markers in a large, relatively healthy paediatric population. A limitation is that only female subjects were enrolled. Given the significance of pubertal progression on development of IR and NAFLD, the choice to limit enrolment to girls was intentionally designed to limit variability in stages of puberty in the age range of the study group. Several studies, including the SAFETY study, suggest that gender-specific guidelines are necessary to increase sensitivity for early detection of NAFLD [45]. Consequently, future studies of male and female adolescents that include determination of Tanner stage by clinician exam are needed.

Another limitation of this study is that liver biopsy was not performed. However, the aim of this study was to evaluate the prevalence of HS and its relationship to metabolic markers in a large, generally healthy population, in whom liver biopsy was impractical. Other studies evaluating quantitative MRI-based methods have primarily focused on adult populations with known or suspected liver disease. One paediatric study included percutaneous biopsy [24] in subjects with known liver disease, but did not assess the relationship of serum markers of metabolic syndrome with MRI-PDFF.

While there was close agreement between complex quantitative MRI and MRS in this study, there was considerable variability in the lower PDFF range (0–5 %). This may reflect the fact that prior technical development, optimization, and validation of these methods have all been performed over a wide PDFF range, in contrast to the relatively low PDFF levels observed in this population. A small positive bias in low PDFF values was best observed in the logarithmic regression. Therefore, further technical development is needed to reduce the variability at low PDFF values. A reduction in PDFF variability will likely improve the accuracy and precision of quantitative MRI near clinically relevant PDFF thresholds, such as those identified by this study.

In conclusion, this study demonstrated excellent correlation and agreement of confounder-corrected chemical shift-encoded MRI with MRS for measuring hepatic steatosis in healthy adolescent girls and young women, and identified an MRI-PDFF threshold that is predictive of metabolic syndrome in this group.