Nonalcoholic fatty liver disease (NAFLD) is the leading cause of chronic liver disease in the United States [1, 2]. It can progress to nonalcoholic steatohepatitis (NASH) and cirrhosis as early as in the first decade of life [3, 4], is implicated in development of type 2 diabetes and cardiovascular disease [5], and is associated with reduced quality of life [6–8]. NAFLD is estimated to affect about ten percent of the US pediatric population [9]; eight million children are thus at risk for long-term morbidity and potential complications of NAFLD unless appropriately diagnosed and treated.

Fat accumulation within hepatocytes, or steatosis, is a defining histologic feature of most NAFLD [10–12]. The current gold standard for NAFLD diagnosis is liver biopsy; however, biopsy is invasive and often requires sedation in children. These factors may contribute to reluctance towards NAFLD workup or follow-up in children, and may adversely affect long-term outcome.

Non-invasive imaging modalities are needed to assess pediatric hepatic steatosis [13]. Possible modalities include ultrasonography (US), computed tomography (CT), magnetic resonance spectroscopy (MRS), and magnetic resonance imaging (MRI). Available published evidence does not support US to diagnose fatty liver or to grade or monitor hepatic steatosis [13]. CT is objective but use of ionizing radiation precludes routine utilization in children [14, 15]. MRS allows direct estimation of proton density fat fraction (PDFF; a measure of hepatic fat content), is non-invasive, and is considered to be accurate and precise [16–18]. However, MRS is time consuming, technically demanding, restricted in spatial coverage, and currently only available in select centers. MRI is minimal risk and non-invasive, more rapid and easier to perform than MRS, more accurate and precise than ultrasound, does not use ionizing radiation, and provides spatial coverage of the entire liver. Awai et al. concluded that available published evidence for MRI was promising, but that data were as yet insufficient to recommend its use in children to assess hepatic steatosis [13].

Recent clinical studies have shown close agreement of MRI with MRS for hepatic PDFF estimation in adults [19–23]. However, no study has systematically assessed the feasibility of both MRI and MRS together in children. MR examinations might be less well tolerated in children than in adults, which could limit their use in a clinical setting and the feasibility of including them in clinical trials.

Hence, the purpose of this study was to assess feasibility of and agreement between MRI and MRS for estimating hepatic PDFF in children with known or suspected NAFLD.

Materials and methods

Study design

This was a retrospective, single-site, cross-sectional, observational analysis of MRI and MRS examinations performed prospectively in prior research studies. The current analysis and the prior studies were approved by an institutional review board, and are compliant with the Health Insurance Portability and Accountability Act.

Between March 2008 and December 2009, children aged 8–18 years were referred for MR examination as part of two prior prospective observational studies by a UCSD pediatric hepatologist. The lower bound of 8 years of age in the parent studies was most likely due to the fact that younger children often have trouble undergoing MR examinations without sedation. As part of those studies, children gave written informed assent, and parent(s)/guardian(s) gave written informed consent. Inclusion criteria for those studies were known diagnosis of NAFLD, or the presence of risk factors for NAFLD [24–27]. Exclusion criteria were liver disease other than NAFLD, contraindication to MR, pregnancy, severe claustrophobia, or severe mental or developmental disorder. Age, sex, and body mass index (BMI) were recorded in those studies. Biopsy data were not uniformly available from the parent studies from which we selected subjects, and so were not included in this retrospective analysis.

Subjects for this retrospective analysis were selected by review of children who had at least one MR examination as part of those prior studies in which at least three MRI and at least three MRS acquisitions were obtained. If subjects had more than one qualifying MR examination, the first one was reviewed for this study. If more than three MRI or MRS acquisitions were obtained in that examination, the first three (of each) were used for the current analysis.

MR examination protocol

Subjects were positioned supine on a 3T MR scanner (GE Signa EXCITE HDxt, GE Medical Systems, Milwaukee, Wisconsin) with an 8-channel torso phased-array coil and a dielectric pad placed over their upper abdomen. Before imaging, subjects practiced breath holding with the MR technologist. Abdominal bellows were used to monitor respiratory motion. No anxiolytics, sedatives, or contrast agents were administered.

MR imaging

Breath hold axial multi-echo 2D spoiled gradient-recalled echo (SPGR) liver imaging was performed using a previously described magnitude-MRI (M-MRI) PDFF estimation protocol [19]. Default parameters were TR of 150–175 ms and low flip angle (10°) to avoid T1 weighting, nominally out-of-phase and in-phase echo times (1.15, 2.3, 3.45, 4.6, 5.75, and 6.9 ms) to permit correction for T2* signal decay, single breath hold, one signal average, echo fraction of 0.8 and receive bandwidth of ±142 kHz to permit 1.15 ms echo spacing, 8–10-mm slice thickness, 0-mm inter-slice gap, acquisition matrix 192 × 192, parallel imaging off, and rectangular field of view (FOV) adjusted for body habitus (70%–100% phase FOV). In children with breath-hold difficulty, default parameters were modified in the following order as needed: (a) TR was reduced as low as 100 ms to shorten acquisition time, (b) number of phase-encoding steps was reduced as low as 96 by reducing base matrix and/or phase FOV (as low as 65%), and (c) Array Spatial Sensitivity Encoding Technique (ASSET) parallel imaging with an acceleration factor of 2.0 was applied. In subjects with good breath-hold capacity, TR and/or matrix size were increased above default values at MR technologist discretion. Parameters were modified to maximize liver coverage, keeping acquisition time within breath-hold capacity. The imaged liver volume always included the MRS voxel location. The MRI sequence was acquired at least three times covering the same volume, each in a separate breath hold, with no change in subject positioning or scan parameters. MRI acquisition time was recorded.

MR spectroscopy

A single 20 × 20 × 20 mm voxel was selected in the right lobe of the liver, avoiding blood vessels, bile ducts, and liver edges. The MRS voxel was shimmed automatically with manual adjustment as necessary. A localizing axial image was stored showing the MRS voxel position. No frequency or spatial saturation was applied. Breath-hold stimulated echo acquisition mode (STEAM) proton spectroscopy was acquired with one signal average using a previously described PDFF estimation protocol [28]. An initial excitation was done to better approach steady state. Long TR (3500 ms) was used to avoid T1 weighting. Short mixing time (5 ms) was used to minimize j-coupling effects. T2 correction was permitted by acquiring TEs of 10, 15, 20, 25, and 30 ms. Spectra collected from each of the eight surface coil elements were combined using singular value decomposition [29]. The STEAM sequence was acquired at least three times at the same voxel location, each in a separate breath hold, with no change in subject positioning or scan parameters. MRS acquisition time was constant (21 s).

Hepatic PDFF analysis

MR imaging

M-MRI source images were post-processed to generate hepatic parametric PDFF maps by fitting multi-TE signal intensity values pixel-by-pixel to a fat–water spectral model [23]. Water signal was modeled as a single-frequency signal (4.7 ppm). Fat signal was modeled as a sum of 0.9, 1.3, 2.1, 2.75, 4.2, and 5.3 ppm frequency signals with relative weights 0.088, 0.70, 0.12, 0.006, 0.039, and 0.047, respectively. Blinded to clinical history and MRS results, a trained research assistant (E.A.; 3 years experience) manually placed a circular region-of-interest (ROI; 20-mm diameter) on source MR images at the MRS voxel location, propagated that ROI to the PDFF map, and recorded mean PDFF.
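The fat–water spectral model described above can be illustrated with a minimal Python sketch for a single pixel. It uses the peak positions and weights given in the text, and it assumes a proton frequency of about 127.7 Hz per ppm at 3T and a single mono-exponential T2* decay shared by water and fat. The coarse grid search and the function names (`fat_dephasing`, `model_shape`, `fit_pdff`) are illustrative stand-ins and not the fitting procedure of the cited protocol [23].

```python
import cmath
import math

# Spectral model from the text: fat peak positions (ppm) and relative weights.
FAT_PPM = [0.9, 1.3, 2.1, 2.75, 4.2, 5.3]
FAT_WEIGHTS = [0.088, 0.70, 0.12, 0.006, 0.039, 0.047]
WATER_PPM = 4.7
HZ_PER_PPM_3T = 42.577 * 3.0  # assumption: proton frequency at 3T, Hz per ppm


def fat_dephasing(te_s):
    """Complex sum of the fat peaks, relative to water, at echo time te_s (seconds)."""
    return sum(
        w * cmath.exp(2j * math.pi * (ppm - WATER_PPM) * HZ_PER_PPM_3T * te_s)
        for ppm, w in zip(FAT_PPM, FAT_WEIGHTS)
    )


def model_shape(pdff, r2star, te_s):
    """Magnitude signal for unit total amplitude, fat fraction pdff, decay rate r2star (1/s)."""
    return abs((1.0 - pdff) + pdff * fat_dephasing(te_s)) * math.exp(-r2star * te_s)


def fit_pdff(signals, echo_times_s):
    """Grid search over PDFF and R2*; the overall amplitude is solved in closed form."""
    best = None
    for i in range(101):                      # PDFF from 0.00 to 1.00 in steps of 0.01
        pdff = i / 100
        for r2star in range(0, 201, 5):       # R2* from 0 to 200 1/s in steps of 5
            shape = [model_shape(pdff, r2star, te) for te in echo_times_s]
            amp = sum(s * m for s, m in zip(signals, shape)) / sum(m * m for m in shape)
            rss = sum((s - amp * m) ** 2 for s, m in zip(signals, shape))
            if best is None or rss < best[0]:
                best = (rss, pdff, r2star)
    return best[1], best[2]


# Usage: noise-free synthetic pixel at the six echo times listed in the protocol.
echo_times = [0.00115, 0.0023, 0.00345, 0.0046, 0.00575, 0.0069]
synthetic = [100.0 * model_shape(0.20, 40.0, te) for te in echo_times]
print(fit_pdff(synthetic, echo_times))
```

In practice a non-linear least-squares solver would replace the grid search, and the ROI mean would be taken over the resulting PDFF map rather than from a single pixel.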

MR spectroscopy

MR spectra were analyzed offline, blinded to clinical history and MRI results, using the AMARES spectral fitting algorithm [30] in Java-based Magnetic Resonance User Interface (jMRUI) software [31] by an MR physicist (G.H.; >11 years experience) using a multi-peak spectral model based on prior knowledge [32]. T2-corrected peak areas were calculated by non-linear least square fitting that minimized differences between observed peak areas and the exponential decay of water (4.7 ppm) and sum of visible fat peaks (0–3 ppm). PDFF values were corrected for fat included in the water peak from a previously established standard MRS liver spectrum [32]. The R2 values for each fit were recorded.
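The idea behind the T2 correction can be sketched in Python under simplifying assumptions. The sketch below swaps the non-linear least-squares fit described above for a log-linear mono-exponential fit, extrapolates the water and summed fat peak areas to TE = 0, and forms the fat fraction from the extrapolated areas. It omits the correction for fat signal under the water peak, and the function names and synthetic peak areas are hypothetical.

```python
import math


def extrapolate_to_te0(echo_times_ms, peak_areas):
    """Log-linear mono-exponential fit; returns (area at TE = 0, T2 in ms)."""
    n = len(echo_times_ms)
    logs = [math.log(a) for a in peak_areas]
    mean_t = sum(echo_times_ms) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in echo_times_ms)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(echo_times_ms, logs))
    slope = sxy / sxx                      # equals -1/T2 for a decaying signal
    intercept = mean_y - slope * mean_t    # log of the area at TE = 0
    return math.exp(intercept), -1.0 / slope


def t2_corrected_pdff(echo_times_ms, water_areas, fat_areas):
    """Fat fraction from T2-corrected (TE = 0) water and summed fat peak areas."""
    water0, _ = extrapolate_to_te0(echo_times_ms, water_areas)
    fat0, _ = extrapolate_to_te0(echo_times_ms, fat_areas)
    return fat0 / (fat0 + water0)


# Usage with synthetic areas at the five TEs from the protocol (values are hypothetical).
tes = [10, 15, 20, 25, 30]
water = [100.0 * math.exp(-te / 25.0) for te in tes]
fat = [20.0 * math.exp(-te / 60.0) for te in tes]
print(t2_corrected_pdff(tes, water, fat))
```

Because water and fat generally decay at different rates, the uncorrected ratio at any single TE would differ from the TE = 0 ratio, which is why the multi-TE acquisition is needed.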

Assessment of acquisition and examination acceptability

MR imaging

Motion artifact was scored subjectively on source MR images, blinded to clinical data, by a senior radiology resident (T.Y.; 4 years experience) for each MR examination using a modified 4-point ordinal scale (Table 1) [33]. An MRI acquisition was considered acceptable if motion artifact score was ≤2 (absent or mild motion artifact). The MRI portion of each MR examination was considered acceptable if at least one of the three MRI acquisitions was acceptable. Modifications to the MRI protocol to accommodate breath-hold limitations were noted.

Table 1 Motion artifact assessment

MR spectroscopy

MRS acceptability was assessed subjectively, blinded to subject clinical data, by one co-author (G.H.; >11 years experience). An MRS acquisition was considered acceptable if the water peak MRS signal was normal (clearly distinct from the 0–3 ppm fat peaks and lacking obvious artifact) and the water peak T2-estimation goodness-of-fit Pearson’s r correlation coefficient was >0.90. The MRS portion of each MR examination was considered acceptable if at least one of the three selected MRS acquisitions was acceptable.
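The acceptability rules for MRI and MRS can be written as a short Python sketch. The thresholds (motion score ≤2, Pearson's r >0.90, at least one acceptable acquisition per examination) come from the text; the function names and the encoding of the subjective water-peak judgment as a boolean are assumptions made for illustration.

```python
def mri_acquisition_acceptable(motion_score):
    """Acceptable if the 4-point motion artifact score is 2 or lower (absent or mild)."""
    return motion_score <= 2


def mrs_acquisition_acceptable(water_peak_normal, t2_fit_r):
    """Acceptable if the water peak was judged normal and the T2-fit Pearson's r exceeds 0.90."""
    return water_peak_normal and t2_fit_r > 0.90


def examination_acceptable(acquisition_flags):
    """An examination portion is acceptable if at least one acquisition is acceptable."""
    return any(acquisition_flags)


# Usage: hypothetical scores for the three MRI and three MRS acquisitions of one subject.
mri_flags = [mri_acquisition_acceptable(s) for s in (3, 2, 1)]
mrs_flags = [mrs_acquisition_acceptable(ok, r) for ok, r in ((True, 0.99), (True, 0.85), (False, 0.95))]
print(examination_acceptable(mri_flags), examination_acceptable(mrs_flags))
```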

Statistical analyses

Pediatric age- and sex-adjusted BMI-percentile (BMI-Z) score was computed according to the U.S. Centers for Disease Control (CDC) guidelines and used in the analyses (downloaded from CDC website 05 July 2014, http://www.cdc.gov/growthcharts/percentile_data_files.htm). Demographic, anthropometric, and MRI acquisition time data were summarized. Means and standard deviations of the three MRI and MRS acquisitions for each subject were computed and summarized.
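For readers unfamiliar with how a BMI-Z score is derived from the CDC data files, the following Python sketch shows the LMS transformation on which the CDC growth charts are based. The text states only that CDC guidelines were followed, so the use of this exact formula is an assumption, and the L, M, and S values in the usage line are hypothetical rather than taken from the CDC tables.

```python
import math


def lms_z_score(bmi, l, m, s):
    """LMS z-score: ((bmi / M) ** L - 1) / (L * S), or ln(bmi / M) / S when L is zero."""
    if l == 0:
        return math.log(bmi / m) / s
    return ((bmi / m) ** l - 1.0) / (l * s)


# Usage: L, M, S would be looked up by age and sex in the CDC tables (values here are hypothetical).
print(lms_z_score(25.0, -2.0, 20.0, 0.1))
```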

MRI motion artifact scores, protocol modifications, and reasons for unacceptability were summarized. Subject factors that might affect mean MRI motion artifact score were assessed by multiple linear regression, using age, sex, BMI-Z score, and acquisition time as the predictor variables, and mean MRI motion artifact score as the response variable.

Accuracy of MRI-estimated PDFF relative to MRS-estimated PDFF (referred to hereinafter as MRI-PDFF and MRS-PDFF, respectively) was assessed using linear regression and Bland–Altman analysis. The average of the three acquisitions of MRS-PDFF was regressed on the average of the three acquisitions of MRI-PDFF, resulting in an equation describing the transformation of MRI-PDFF to predict MRS-PDFF. Four parameters were computed from the regression model: the intercept of the regression line, the slope of the regression line, average bias of the regression (defined as the square root of squared differences between the regression line and the y = x identity line), and the regression coefficient of determination R2. Bootstrap-based 95% confidence intervals (CIs) were computed around each of the regression parameters. Bland–Altman analysis assessed MRI–MRS differences over the entire range of observed mean PDFF. Bland–Altman bias (the mean of the MRS-MRI differences) and the 95% limit of agreement were calculated.
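The regression and Bland–Altman quantities described above can be sketched in plain Python. The sketch regresses MRS-PDFF on MRI-PDFF, reports slope, intercept, and R2, and computes the Bland–Altman bias as the mean MRS−MRI difference. It assumes the 95% limits of agreement are bias ± 1.96 standard deviations of the differences, and it leaves out the bootstrap confidence intervals and the average regression bias. The function names and input values are hypothetical.

```python
import math


def linear_fit(x, y):
    """Ordinary least squares of y on x; returns (slope, intercept, R2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def bland_altman(mri, mrs):
    """Returns (bias, lower limit, upper limit) for the MRS - MRI differences."""
    diffs = [b - a for a, b in zip(mri, mrs)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))  # sample SD
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Usage: per-subject mean PDFF values in percent (hypothetical numbers).
mri_pdff = [2.1, 5.4, 11.8, 19.6, 30.2]
mrs_pdff = [3.0, 6.9, 12.7, 21.1, 31.0]
print(linear_fit(mri_pdff, mrs_pdff))
print(bland_altman(mri_pdff, mrs_pdff))
```

With MRI-PDFF on the x axis, the fitted line is the transformation from MRI-PDFF to predicted MRS-PDFF, so a slope near 1 and an intercept near 0 indicate close agreement.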

Subject factors that may affect agreement between MRI-PDFF and MRS-PDFF were assessed by multiple linear regression, using age, sex, BMI, acquisition time, and mean MRI motion artifact score as the predictor variables, and the size of PDFF estimation difference between MRI and MRS (absolute MRS-MRI difference) as the response variable.

Results

Subjects

Eighty-six children (61 boys and 25 girls; mean age 14.7 ± 2.3 years) were included. Demographic and anthropometric data are summarized in Table 2. MRI acquisition times were 25.5 ± 5.1 s (range 13.4–46.1 s).

Table 2 Summary of study population

Examination acceptability

MRI examinations

MRI examination acceptability is summarized in Table 3. In all (86/86) subjects, at least one MRI acquisition was acceptable, and in 93% (80/86) all three MRI acquisitions were acceptable. Average PDFF standard deviation of the three MRI acquisitions was 0.35 ± 0.21% (range 0.06%–1.35%). Of 258 completed MRI acquisitions, 204 had no appreciable motion artifact, whereas 45 had mild and nine had moderate artifact. No acquisition had marked motion artifact. Multiple regression analysis demonstrated that only sex was associated with mean MRI motion artifact score (multiple regression coefficient, p = 0.020), with boys having a higher motion artifact score overall (Fig. 1). There was no significant effect of age, BMI-Z, or acquisition time on motion artifact (p = 0.179, 0.286, and 0.998, respectively).

Table 3 Summary of MRI examination acceptability

Fig. 1 MRI average motion score vs. sex.

In 8% of children (7/86), sequence parameter modification was required due to breath-hold limitations (Table 4).

Table 4 Summary of MRI parameter modifications

MRS examinations

All three spectra were acceptable for all (86/86) subjects. Average PDFF standard deviation of the three MRS acquisitions was 0.50% ± 0.36% (range 0.06%–1.89%).

MRI- vs. MRS-PDFF agreement

Figure 2 plots the linear regression of MRI- vs. MRS-PDFF. Regression slope and intercept were 0.969 (95% CI: 0.94, 0.997) and 1.591% (95% CI: 0.1207, 2.043), respectively. The average bias and R2 were 1.212 (95% CI: 0.967, 1.468) and 0.982 (95% CI: 0.972, 0.988), respectively.

Fig. 2 MRS- vs. MRI-PDFF scatterplot. Linear regression analysis between MRS and MRI (n = 86). The regression line (solid) is close to the identity line (dotted), indicating close agreement between MRI- and MRS-PDFF values.

The Bland–Altman plot (Fig. 3) shows close agreement between MRI- and MRS-PDFF. The 95% limit of agreement between MRI- and MRS-PDFF was 1.17% ± 2.61%. Close MRI and MRS agreement for estimated PDFF is illustrated in Fig. 4.

Fig. 3 Bland–Altman plot. Assessment of difference between MRI- and MRS-PDFF as a function of mean PDFF. Open circle data points correspond to subjects with BMI-Z <2.5; solid circle data points to subjects with BMI-Z >2.5. The solid line is the regression line of difference as a function of average (illustrating the relationship of PDFF magnitude and error). Slope is not significantly different from zero. Note close agreement of MRI- with MRS-PDFF across a wide PDFF range. The nominal 95% confidence interval is 1.17% ± 2.61%.

Fig. 4 MRI hepatic parametric PDFF maps of four subjects in this study (age–sex from left to right: 17-M, 17-M, 13-M, 11-M). Listed PDFFs are within an MRI ROI co-localized to the single-voxel MRS location.

Modeling confounders

Multiple regression analysis demonstrated that MRI–MRS-PDFF estimation difference was affected by BMI-Z at a trend level (multiple regression coefficient p = 0.0731, higher BMI-Z is associated with greater absolute error) but not by age, sex, PDFF level, mean MRI motion artifact score, or acquisition time (p = 0.859, 0.825, 0.859, 0.436, and 0.213, respectively).

Discussion

To assess feasibility of MRI and MRS estimation of hepatic PDFF, and to demonstrate agreement between MRI- and MRS-PDFF, we performed a retrospective, single-site, cross-sectional, observational analysis of children with known or suspected NAFLD enrolled in prior prospective studies. In principle, MR examinations in children can be challenging because of limited breath-hold capacity or tolerance for MR imaging, potentially resulting in unacceptable motion artifact and/or requiring MRI parameter modification to reduce breath-hold time.

We found that both MRI and MRS estimation of hepatic PDFF are well tolerated, and thus feasible, in children with known or suspected NAFLD. All subjects had acceptable examinations, defined as having at least one acceptable acquisition among the three MRI acquisitions and at least one among the three MRS acquisitions. Motion artifact was infrequent, and when present was mild or moderate; no acquisitions showed marked artifact. Only a minority of children required MRI parameter modification. Hence, planning for possible repeat sequences with sequence parameter modification might be necessary for children, both clinically and in research.

MRI-PDFF in our population of 86 children aged 10–18 years closely agreed with MRS-PDFF, independent of age, sex, BMI, acquisition time, and PDFF value. Close agreement with MRS suggests that MRI is accurate since MRS is widely regarded as accurate and has been shown to correlate with hepatic triglyceride in animal studies [34, 35].

Our close agreement between MRI and MRS was similar to that reported for previous adult or predominantly adult studies [19–23]. While statistically significant, our observed Bland–Altman bias of 1.17% probably is not meaningful clinically. Where it may matter is in classification of normal vs. fatty liver, where a 1% PDFF difference could be relevant. The cause of this 1% bias is not clear to us. Both the MRS voxel and the MRI regions of interest were selected so as to avoid large vascular structures, and volume averaging with small blood vessels (which do not contain fat) probably also does not account for the discrepancy. One possible explanation is that with magnitude imaging noise is always positive and has a Rician distribution, which may cause systematic underestimation of the observed fat fraction.

Our study extends the findings of prior mainly adult studies to children, provides evidence that PDFF estimation by MRI is accurate in children, and supports the feasibility of its application in pediatric research and clinical practice. Previous studies have shown that, for two-dimensional MRI at 3T, keeping TR ≥150 ms and setting flip angle to 10° ensure sufficient T1 independence so that the expected T1-bias effect is less than 0.01 in absolute PDFF [23]. MRI examinations may be shortened by decreasing TR, using fewer phase-encoding steps, and using parallel imaging. It has been reported that TR should not be decreased to below 150 ms so that PDFF is not overestimated due to T1 dependence [23]. In our study, we decreased TR as low as 100 ms in three subjects (Table 4). That is too small a number of subjects to conclude that additional T1 weighting is not meaningful. Using fewer phase-encoding steps results in lower spatial resolution, but resolution is not a critical issue for liver PDFF estimation [36].

Both MRI and MRS are minimal risk, non-invasive methods that are well tolerated in children, but MRI is likely better suited for pediatric applications unless some additional information only available by MRS is needed. MRI can be tailored to reduce total examination time. By modifying MRI parameters, acquisition time in our study was reduced to as little as 13 s. Motion artifact and acquisition failure are readily detected during MRI examinations, permitting recognition of need for repeat imaging, and sequence parameter modification.

MRS usually requires offline analysis, and so recognition of examination unacceptability may not occur until after the examination is over. Limited availability is another problem associated with MRS; by comparison, MRI is more widely available. MRI also assesses the entire liver which likely makes it more appropriate for longitudinal follow-up than MRS; hepatic PDFF can be estimated with MRI over the entire liver at different time points, whereas with MRS measurement voxel co-localization over different time points is technically challenging.

A limitation of this study is that subjects were selected from two previous clinical studies, in part based on their ability to tolerate three MRI and three MRS acquisitions, so our strong feasibility results are to some extent expected. However, our selected study population included children aged 10 to 18 years, as well as both boys and girls with a wide range of BMI and PDFF. This study was designed mainly to show feasibility, and to determine whether sequence parameter modification might be helpful in children, and so this limitation probably does not seriously affect those goals.

Another limitation of this study is that it was performed at a single site, which over the study period accrued experience in performing MR examinations in children. Hence our results may not be generalizable to other sites with less pediatric experience. MR examinations in this study were obtained on a 3T General Electric MR scanner. However, recent studies have shown reproducibility of results obtained on MR scanners from different manufacturers at different field strengths, so this limitation likely is not significant [37, 38].

In conclusion, this study showed that MRI and MRS examination for hepatic PDFF estimation is feasible in children as young as 10 years of age and that MRI and MRS estimates of hepatic PDFF closely agree with each other, and it suggests that MRI sequence parameter modification is sometimes necessary to obtain acceptable examinations. Extension of these results to longitudinal and multi-center studies should help better define the role of MRI and MRS for children with NAFLD.