Introduction

Non-alcoholic fatty liver disease (NAFLD) is the most common cause of chronic liver disease in the Western World affecting up to a third of the adult population [1]. The disease varies in severity from accumulation of liver fat only (simple steatosis) to fat associated with inflammation (non-alcoholic steatohepatitis; NASH) and fibrosis and cirrhosis. It is now well established that patients with fibrosis are at increased risk of morbidity and mortality, while patients with simple steatosis generally have better prognosis [2, 3]. The prognostic importance of NASH remains a matter of debate [4].

The diagnostic classification of NAFLD into simple steatosis and NASH and the assessment of fibrosis rely on liver biopsy. This presents a challenge in clinical practice and in the conduct of clinical trials. In clinical practice, it is important to identify patients with high risk of morbidity/mortality from NAFLD so that they can be prioritised for follow up in secondary care and for appropriate surveillance in cases of liver cirrhosis. As NAFLD is highly prevalent, liver biopsy is not practical as a diagnostic tool to be applied at the level of the population, due to its cost and invasiveness.

Liver biopsy and histological assessment of fibrosis and NASH are the only approved surrogate end points in clinical trials. Patients taking part in clinical trials therefore need to have repeated liver biopsies. Sampling errors and observer dependent variability in liver biopsy reporting mean that more patients have to be recruited to achieve sufficient statistical power, while studies also suffer from high screening failure rates and dropouts. The presence of NASH and fibrosis is also important for inclusion into clinical trials.

Steatosis has traditionally been regarded as a “benign” feature in NAFLD that has no bearing on the progression of liver disease. This may be in part because steatosis is routinely quantified histologically as the “number of hepatocytes containing lipid droplets”, which may not give an accurate estimate of the liver fat content. MRI, on the other hand, can quantify liver fat as a proportion of liver tissue (fat fraction; %) and is more accurate than histology [5]. In a natural history study, patients with liver fat fraction ≥15.7% as measured by MRI proton density fat fraction (PDFF) were more likely to have fibrosis progression than patients with fat fraction <15.7% (multivariable adjusted odds ratio 6.7; 95% CI 1.01–44.1; p = 0.049) [6].

Furthermore, evidence is emerging from clinical trials in which liver fat is assessed with MRI PDFF that reduction in liver fat is associated with histologic improvements. Data suggest that a relative decrease of 30% in liver fat is associated with improvements in NASH (≥2 points reduction in the NAS score) [7,8,9] and steatosis on biopsy and serum markers of fibrosis and NASH activity [10,11,12,13]. In addition, in the study of the fibroblast growth factor 19 analogue NGM282, treatment produced a relative reduction in liver fat of 58% and 67% in those treated with 1 mg and 3 mg respectively, and this was associated with improvement in fibrosis [14, 15].
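The distinction between a relative and an absolute change in liver fat matters when applying the 30% figure above. The following is a minimal sketch of the arithmetic only; the function names and the patient values are hypothetical and are not taken from any of the cited trials.

```python
def relative_reduction(baseline: float, follow_up: float) -> float:
    """Relative reduction as a fraction of baseline (0.30 means 30%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - follow_up) / baseline


def is_pdff_responder(baseline_pdff: float, follow_up_pdff: float,
                      threshold: float = 0.30) -> bool:
    """True when the relative fall in liver fat meets the threshold.

    The default of 0.30 mirrors the 30% relative decrease discussed in
    the text; treating it as a fixed cut-off is an assumption of this sketch.
    """
    return relative_reduction(baseline_pdff, follow_up_pdff) >= threshold


# Hypothetical patient: PDFF falls from 20% to 13% of liver tissue.
# The absolute change is 7 percentage points; the relative change is 35%.
print(round(relative_reduction(20.0, 13.0), 2))  # 0.35
print(is_pdff_responder(20.0, 13.0))             # True
print(is_pdff_responder(20.0, 16.0))             # False
```

A fall from 20% to 16% is a 4 percentage point absolute change but only a 20% relative change, so it would not meet the threshold in this sketch.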

In summary, there is an unmet need for non-invasive biomarkers of fibrosis as this is an important prognostic factor, for the diagnosis of NASH for inclusion in clinical trials and assessment of effectiveness and for steatosis that can be an early predictor of response to treatment. To address these areas of unmet clinical need and to reduce reliance on liver biopsy for the assessment of NAFLD in different contexts, several non-invasive techniques have been developed. These are generally divided into serum-based biomarkers (direct and indirect), ultrasound elastography based biomarkers and magnetic resonance based biomarkers, which will be the focus of this chapter. In general, simple indirect serum based markers are recommended for population screening in the community with direct serum markers [16] and transient elastography [17] reserved as a second tier of assessment. MR based biomarkers are generally reserved for cases where transient elastography fails [17].

Several MR biomarkers have been explored for several aspects of liver disease, focusing mainly on the distinction of NASH vs. non-NASH, the quantification of fibrosis, and for the monitoring of treatment response.

Magnetic Resonance Elastography

Overview

Magnetic Resonance Elastography (MRE; Resoundant, Rochester, US) is an MR technique that measures liver stiffness. Additional hardware and software are needed to carry out MRE, and adaptations need to be made to the MR suite to accommodate these. During MRE, a plastic circular device is attached to the patient over the region of the liver. Mechanically generated shear waves are transmitted through the circular device to the liver and their propagation is visualised using specific MR sequences (e.g. 2D-gradient recalled echo (GRE) pulse sequences). These data are then used to provide an estimate of the liver stiffness, which is mostly considered a biomarker of fibrosis. 2D-MRE is clinically available and is the most validated of the MR based biomarkers in NAFLD, having been tested in approximately 700 patients. 3D-MRE is in development and has also been explored in the assessment of patients with NAFLD. 3D-MRE gives information additional to stiffness and early studies show that it may result in improved performance.

MRE has a low failure rate (4.3%) [18] and excellent inter-observer agreement (intraclass correlation coefficient 0.95) [19].

NASH vs. Non NASH

In a retrospective study the area under the receiver operating characteristic curve (AUROC) of 2D MRE for the diagnosis of NASH was reported as 0.93 [20] (threshold 2.74 kPa: sensitivity (Se) 0.94, specificity (Sp) 0.73, positive predictive value (PPV) 0.85, negative predictive value (NPV) 0.89; threshold 2.90 kPa: Se 0.83, Sp 0.82, PPV 0.88, NPV 0.75). However, this level of performance was not replicated in five prospective studies that reported area under the curve (AUC) ranging from 0.70 to 0.81 [21,22,23,24,25]. Furthermore, these studies report the best thresholds derived from their own populations. There is therefore no prospective validation of the performance of pre-defined cut-offs. MRE does not offer any improvement in the diagnosis of NASH compared to transient elastography [22, 25].
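Predictive values, unlike sensitivity and specificity, depend on how common the disease is in the population tested. This is one reason why thresholds derived in one cohort need validation elsewhere. The sketch below applies Bayes' rule to the sensitivity and specificity quoted above for the 2.74 kPa threshold; the two prevalence values are hypothetical and are not taken from the cited studies.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple[float, float]:
    """Return (PPV, NPV) for a binary test using Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Se 0.94 and Sp 0.73 are quoted in the text; the prevalences are assumed.
for prevalence in (0.2, 0.6):
    ppv, npv = predictive_values(0.94, 0.73, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

With these inputs the positive predictive value falls from about 0.84 at an assumed prevalence of 60% to about 0.47 at 20%, while the negative predictive value rises from about 0.89 to about 0.98.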

Studies that have examined 3D MRE for the diagnosis of NASH have also reported only moderate diagnostic accuracy. In a study of 100 patients 3D MRE (60 Hz) and 3D MRE (40 Hz) had AUROC of 0.76 and 0.74 respectively, compared to 2D MRE (60 Hz) of 0.75 [23]. In patients who were undergoing bariatric surgery, the AUROC for the diagnosis of NASH was 0.73 and for the evaluation of disease activity using the NAS score was 0.82 [26].

Staging of Fibrosis in NAFLD

The performance of MRE for the assessment of fibrosis has been the subject of meta-analysis. In a meta-analysis of 5 studies including 628 patients, the mean AUC of the pooled data for the diagnosis of significant fibrosis (F ≥ 2), advanced fibrosis (≥3) and cirrhosis were 0.88 (95% CI 0.83–0.92), 0.93 (0.90–0.97) and 0.92 (0.80–1.00) respectively. In an individual patient data meta-analysis of 115 patients from eight studies, the AUC for the diagnosis of fibrosis stage ≥1, ≥2, ≥3, and 4 were 0.89 (0.81–0.97), 0.90 (0.79–0.93), 0.94 (0.91–0.98) and 0.90 (0.64–0.94) respectively. MRE performed better than TE in a comparative individual patient data meta-analysis of 230 patients [27]. 2D MRE also performs better than serum based indirect biomarkers [28]. Data on diagnostic performance of MRE in selected individual studies are shown in Table 8.1.

Table 8.1 Diagnostic performance of magnetic resonance elastography for the assessment of fibrosis in patients with non-alcoholic fatty liver disease

Monitoring Treatment Response

MRE has been evaluated as an exploratory end point in several clinical trials. In an analysis of the data from the phase II trial of selonsertib [31], MRE had an AUC of 0.62 (95% CI 0.46–0.78) for the prediction of fibrosis improvement, and an AUC of 0.57 (95% CI 0.36–0.79) for the prediction of fibrosis progression [32]. In another secondary analysis of the placebo arms of two clinical trials [7, 33], a decrease of ≥5% in body mass index was associated with a decrease in MRE liver stiffness, while patients who did not lose weight did not show any MRE changes [34].

Predicting Adverse Clinical Outcomes

There are no studies looking at the predictive value of MRE in patients with NAFLD. In a retrospective study of patients with advanced fibrosis (25% had NAFLD), MRE liver stiffness predicted decompensation independently of age, MELD score, serum albumin and hepatitis C diagnosis [35].

LiverMultiscan

Overview

LiverMultiScan™ (LMS; Perspectum Diagnostics, Oxford, UK) uses multiple MRI parameters (shMOLLI T1 mapping, T2∗ and PDFF) to provide quantitative measures of liver fibrosis and inflammation, fat and iron. Central to this technology is the correction of the T1 relaxation time, as measured by the shMOLLI technique [36], for iron. T1 is an inherent property of tissues that can change with varying fibrosis and inflammation. T1 is, however, confounded by the presence of iron. In LMS, the measured T1 is corrected for the amount of iron present (as measured by T2∗) to produce the “iron corrected T1 (cT1)”, which improves the diagnostic accuracy [37]. Even though this technique has not been validated to the same extent as MRE in patients with NAFLD, it is being used as part of the abdominal imaging protocol in the UK Biobank study [38,39,40,41], which makes it by far the most validated technique in terms of total participants scanned and whose data were subsequently published. Figure 8.1 illustrates this technique in a patient who has undergone bariatric surgery.

Fig. 8.1 Liver Multiscan iron corrected T1 maps. Liver Multiscan produces iron corrected T1 maps that can be used to measure mean cT1. The figure illustrates how the technique can be used to measure change in cT1 after therapeutic intervention, like bariatric surgery

The failure rate of LMS is very low (2–5%) [42, 43] in clinical studies. The main reasons for failed scans are participant related factors (e.g. claustrophobia). The failure rate remains at the same low levels when LMS is used in population level studies [38, 39]. LMS cT1 is also a robust technique with excellent reproducibility across scanners and magnet strengths (coefficient of variation 3.3%, bias 6.5 ms, 95% limits of agreement: −76.3 to 89.2 ms) and scan-rescan repeatability (coefficient of variation 1.7%, bias −7.5 ms, 95% limits of agreement: −53.6 to 38.5 ms) [44]. In a head-to-head comparison, LMS had superior test-retest repeatability compared to MR elastography and transient elastography [45].
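Bias and 95% limits of agreement of the kind quoted above are conventionally obtained from a Bland-Altman analysis of paired measurements, where the limits are the mean difference plus or minus 1.96 standard deviations of the differences. The sketch below shows that calculation together with one common definition of a within-subject coefficient of variation. It is an illustration under those assumptions, not the analysis used in [44], and the cT1 values are hypothetical.

```python
import math
import statistics


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [b - a for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def within_subject_cov(first, second):
    """Within-subject SD of paired scans divided by the grand mean."""
    n = len(first)
    wsd = math.sqrt(sum((a - b) ** 2 for a, b in zip(first, second)) / (2 * n))
    grand_mean = statistics.mean(list(first) + list(second))
    return wsd / grand_mean


# Hypothetical scan and rescan cT1 values in ms for four participants.
scan = [800.0, 850.0, 900.0, 950.0]
rescan = [810.0, 840.0, 905.0, 945.0]

bias, lower, upper = bland_altman(scan, rescan)
print(f"bias {bias:.1f} ms, limits of agreement {lower:.1f} to {upper:.1f} ms")
print(f"coefficient of variation {within_subject_cov(scan, rescan):.2%}")
```

Because the limits are constructed symmetrically around the bias, the midpoint of a reported interval should equal the reported bias, which is the case for both intervals quoted in the text.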

NASH vs. Non-NASH and Staging of NAFLD Fibrosis

Two studies have examined the value of LMS in the staging of fibrosis and the identification of NASH compared to liver biopsy. In a study of 71 patients from one centre [46], LMS cT1 had an excellent diagnostic accuracy for the identification of significant NAFLD as defined by the FLIP consortium algorithm [47] (AUROC 0.89), while there was good performance for the differentiation of NASH vs. simple steatosis (AUROC 0.80). Furthermore, LMS cT1 could identify patients with significant activity (ballooning + lobular inflammation; AUROC 0.83) and cirrhosis (AUROC 0.85). In a two centre study of 50 patients [48], LMS cT1 had moderate diagnostic performance for the separation of NASH vs. simple steatosis (AUROC 0.69), but it must be noted that a different definition of NASH [49] was used in this study. Even though LMS cT1 did not perform as well for the diagnosis of fibrosis compared to alternative tests, it had the highest negative predictive value for the exclusion of significant disease where biopsy could be avoided, and an algorithm in combination with transient elastography had the lowest cost per correct diagnosis [48].

Monitoring Treatment Response

In a study of an engineered fibroblast growth factor 19 analogue (NGM282), both LMS cT1 and PDFF decreased as early as 6 weeks after treatment indicating that this method could be used to assess effectiveness at early time points. This can improve the design and conduct of clinical trials. LMS cT1 has also been used as a primary end-point in a study without histologic verification of effectiveness, that showed no therapeutic benefit of the investigational product [45]. Along with LMS cT1, there was no improvement in MRE or TE or liver fat measured by LMS PDFF.

Predicting Adverse Clinical Outcomes

LMS has not been specifically tested for the prediction of clinical outcomes in cohorts of patients with NAFLD. In a study including patients with mixed aetiologies (35% NAFLD) and varying degrees of fibrosis, LMS cT1 had a hazard ratio of 9.7 for the prediction of liver related events [50]. In the same study, a model including all three LMS variables (cT1, T2∗ and PDFF) had a hazard ratio of 75.7 demonstrating how the multi-parameter approach in this test can provide improved performance.

It should also be noted that liver T1 was found to correlate with heart failure, atrial fibrillation, and coronary heart disease in the Multi-Ethnic Study of Atherosclerosis [51]. This is important, as it is well documented that cardiovascular disease is the main cause of mortality in patients with NAFLD [2, 3].

Detection of Metabolic Liver Injury (deMILI) MRI

Overview

Detection of metabolic liver injury (deMILI) MRI uses optical analysis of magnetic resonance images to define NASHMRI (0-1) and FibroMRI (0-1), measures of NASH and liver fibrosis respectively. Image acquisition does not require injection of intravenous contrast and includes the sequences SSFSE-T2 (Single Shot Fast Spin Echo T2-weighted), FAST-STIR (Fast Short inversion Time Inversion Recovery), inPHASE-outPHASE (in and out Phase) and DYNAMIC [52]. Figure 8.2 illustrates the image processing and the report for NASHMRI and FibroMRI.

Fig. 8.2 DeMILI image processing and report. The deMILI image processing includes steps for (a) the manual outlining of the liver boundary, (b) segmentation and overlapping of a grid, (c) a process for selection of valid regions of interest. (d) The final report is presented as NASHMRI (0-1) where a score above 0.5 indicates NASH and FibroMRI (0-1) where a score above 0.5 indicates significant fibrosis

This technique has been validated on 1.5T Philips and General Electric (GE) scanners. Available data suggest that the between scanner reproducibility is good when tested using independent cohorts in Philips and GE scanners [52]. In a small number of patients (n = 9) assessed by both Philips and GE scanners, FibroMRI correctly detected fibrosis in 3/3 cases and correctly excluded it in 5/6 cases using both Philips and GE devices. Furthermore, NASH was correctly diagnosed in 3/4 cases and correctly excluded in 4/5 cases using NASHMRI on data from both scanners [52].

NASH vs. Non-NASH and Staging of NAFLD Fibrosis

In a prospective study, NASHMRI and FibroMRI were defined based on the most predictive parameters in an estimation cohort and a validation cohort. For the diagnosis of NASH, which was defined histologically based on the overall distribution of lesions, especially lobular inflammation and ballooning, NASHMRI had an AUROC of 0.88 (best cut-off 0.5, sensitivity (Se) 0.87, specificity (Sp) 0.74, positive predictive value (PPV) 0.8, negative predictive value (NPV) 0.82) in the estimation cohort and 0.83 (cut-off 0.5, Se 0.87, Sp 0.6, PPV 0.71, NPV 0.81) in the validation cohort. NASHMRI performed better than Cytokeratin 18 (CK-18) for the diagnosis of NASH [52].

For the diagnosis of significant fibrosis (F0-F1 vs. F2-F4), FibroMRI had an AUROC of 0.94 (cut-off 0.5, Se 0.81, Sp 0.85, PPV 0.77 and NPV 0.86) in the estimation cohort and 0.85 (cut-off 0.5, Se 0.77, Sp 0.80, PPV 0.67, and NPV 0.87) in the validation cohort. FibroMRI had superior performance compared to serum based fibrosis scores, and similar performance to transient elastography [52].

Dynamic Contrast Enhanced MRI

Overview

Dynamic contrast enhanced MRI relies on the MR signal change in tissues after the injection of intravenous contrast agents. Several contrast agents are available. For the assessment of chronic liver disease, gadoxetic acid is preferred as the liver actively excretes it in bile. In these scans, gadoxetic acid is injected intravenously after acquisition of baseline data. Scans are then acquired at different time points to reflect how the contrast is distributed at the arterial and portal venous phases. Gadoxetic acid is actively taken up by liver cells and is then selectively excreted into bile. Transmembrane transporters control uptake and excretion. The number of liver cells and their summative level of function ultimately determine how much contrast is taken up into and secreted from the liver. This can be assessed by measuring the resultant change in signal intensity in the liver. Figure 8.3 illustrates how the decrease in signal intensity (T1 in this case) can be used to distinguish normal liver from diseased livers.

Fig. 8.3 Gadoxetic acid enhanced MRI. The relative reduction in T1 20 min after gadoxetic acid injection in a healthy male and a patient with cirrhosis from non-alcoholic fatty liver disease (NAFLD). In the healthy male, T1 decreases from (a) a baseline of 768 ms to (b) a post contrast T1 of 266 ms, a relative reduction (percentage rT1) of 65%, while in the case of the patient, T1 decreases from (c) a baseline of 727 ms to (d) a post contrast T1 of 504 ms, a relative reduction of 31%

This technique requires the injection of intravenous contrast, which is contraindicated in patients with significant renal dysfunction. The advantage of this technique is that it can be applied across scanners and magnet strengths. As it is assessing relative change it requires no further standardisation to make it applicable between scanners. Most of the validation of this technique has been carried out in retrospective studies of patients who were having MR scans as part of their clinical care, so applicability to the wider NAFLD population has not been assessed.
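The relative reduction in T1 quoted in the caption of Fig. 8.3 can be reproduced from the baseline and post contrast values given there. The sketch below shows that arithmetic only; the function name is hypothetical, and the formula is the plain percentage change from baseline, which is assumed to be what the caption reports.

```python
def relative_t1_reduction(pre_contrast_ms: float, post_contrast_ms: float) -> float:
    """Percentage fall in T1 from the pre contrast baseline."""
    if pre_contrast_ms <= 0:
        raise ValueError("pre contrast T1 must be positive")
    return 100.0 * (pre_contrast_ms - post_contrast_ms) / pre_contrast_ms


# T1 values in ms taken from the caption of Fig. 8.3.
print(round(relative_t1_reduction(768, 266)))  # 65 (healthy male)
print(round(relative_t1_reduction(727, 504)))  # 31 (patient with cirrhosis)
```

Because the measure is a ratio to each individual's own baseline, any scanner dependent scaling of T1 that affects both acquisitions equally cancels out, which is the basis for the claim that no further standardisation between scanners is required.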

NASH vs. Non-NASH and Staging of NAFLD Fibrosis

There have been some studies showing utility of this technique in animal models of NAFLD/NASH [53,54,55]. A retrospective human study of 81 patients showed that the relative signal enhancement after contrast injection was associated with lobular inflammation (p = 0.002), ballooning (p = 0.04) and fibrosis (p < 0.0001) but not with steatosis (p = 0.38) [56]. For the diagnosis of NASH as defined by the Steatosis Activity Fibrosis (SAF) classification [47], this technique had an AUC of 0.85 (threshold 1.24, Se 0.97, Sp 0.63).

Several studies have assessed DCE MRI in mixed cohorts of patients showing some utility in the assessment of liver fibrosis [57], cirrhosis severity [58,59,60], and liver function [60, 61], including some studies showing superior performance of DCE MRI for the assessment of fibrosis compared to unenhanced T1 and diffusion weighted imaging [62, 63]. However, generalisation of these results to NAFLD patients must not be assumed.

A related approach to using gadolinium based contrast agents is to use iron containing contrast agents. Superparamagnetic iron oxide particles have been tested, but these have since been taken off the market [64]. More recently, there has been some interest in ultrasmall superparamagnetic iron oxide particles. The iron containing contrast leads to changes in tissue R2∗ which can be measured. In a small, prospective, proof-of-concept study, the AUC for the diagnosis of NASH vs. simple steatosis was 0.87 (95% CI 0.72–1.0) [65]. However, the post contrast scans are acquired 72 h after injection, which is impractical in clinical practice.

Diffusion Weighted Imaging

Overview

Diffusion weighted imaging (DWI) uses MRI acquisition and analysis techniques to track diffusion of water in tissues. Quantitative measures of diffusion can be produced by measuring the magnitude (apparent diffusion coefficient; ADC) and directionality (fractional anisotropy) of diffusion. The accumulation of steatosis, inflammation and fibrosis can lead to changes in water diffusion and these can be measured using various DWI techniques. Intravoxel incoherent motion (IVIM) is a DWI method that can account for the diffusion signal contributed from blood flowing in vascular beds [66].
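The relationship between ADC and the IVIM parameters can be made concrete with the standard signal models: a mono-exponential decay for ADC and a bi-exponential decay for IVIM. The sketch below uses these textbook forms, which are not necessarily the exact fitting methods of the studies cited in this section, and all parameter values are hypothetical.

```python
import math


def ivim_signal(b: float, s0: float, f: float, d: float, d_star: float) -> float:
    """Bi-exponential IVIM model.

    b is the b-value in s/mm^2, f the perfusion fraction, d the pure
    molecular diffusion coefficient and d_star the perfusion related
    (pseudo) diffusion coefficient, both in mm^2/s.
    """
    return s0 * (f * math.exp(-b * d_star) + (1.0 - f) * math.exp(-b * d))


def adc_two_point(s_low: float, s_high: float, b_low: float, b_high: float) -> float:
    """ADC from two b-values, assuming S(b) = S0 * exp(-b * ADC)."""
    return math.log(s_low / s_high) / (b_high - b_low)


# Hypothetical tissue: f = 0.2, D = 1.0e-3 mm^2/s, D* = 20e-3 mm^2/s.
f, d, d_star = 0.2, 1.0e-3, 20e-3
s_b0 = ivim_signal(0.0, 1.0, f, d, d_star)
s_b800 = ivim_signal(800.0, 1.0, f, d, d_star)

adc = adc_two_point(s_b0, s_b800, 0.0, 800.0)
print(f"ADC {adc:.2e} mm^2/s versus true D {d:.2e} mm^2/s")
```

In this example the two point ADC is higher than D because the rapid signal loss from the perfusion compartment at low b-values is attributed to diffusion, which is the contribution that IVIM is designed to separate.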

The failure rate of this technique was up to 17.5% in one study [67]. The method of analysis can also have a significant impact on results [68].

NASH vs. Non-NASH and Staging of NAFLD Fibrosis

A study of 59 patients with type 2 diabetes mellitus and NAFLD evaluated the IVIM parameters of “pure molecular diffusion; D”, “perfusion related diffusion; D∗” and “perfusion fraction; f”. The study found only moderate diagnostic accuracy for the diagnosis of NASH (AUC 0.74 for D, 0.68 for D∗, 0.61 for f) and fibrosis (AUC 0.69 for D, 0.68 for D∗, 0.62 for f) [67]. In a separate study of 89 patients with NAFLD, steatosis and fibrosis had significant and independent effects on D and f [68]. The effects of steatosis have also been observed in other studies [69,70,71,72].

In an interesting retrospective study of 15 patients (only 2 with NAFLD), a method is proposed by which IVIM can be used to generate a “virtual elastogram” based on a calibrated relationship between ADC and liver elasticity [73]. This lacks prospective validation in patients with NAFLD but could provide an added advantage to MRE as it could potentially produce equivalent data without the need for additional hardware.

Conclusions

The field of MR based biomarkers is relatively new compared to serum-based biomarkers and ultrasound based elastography techniques. Of the techniques that have been reviewed in this chapter, MRE (+PDFF for fat) and LMS have had most validation in NAFLD and they show promise for further clinical utility. MRE has the best performance for assessment of late stages of fibrosis. PDFF for liver fat content quantification is emerging as an important parameter for predicting histological response.

How various MR techniques are utilised in clinical pathways and clinical trials remains to be determined. Current recommendations [17] favour application of MR based techniques as a third tier of non-invasive tests after serum based tests and ultrasound elastography. While this approach may be more practical, there are no cost effectiveness data to support it, and it could be that application of MR based techniques “up-front” is more cost effective if they have superior diagnostic accuracy.

One other area that needs further attention is the validation of pre-defined thresholds to be used in different situations (contexts of use). For example, there is growing evidence that a relative reduction of 30% in liver fat content predicts histological response but data are still lacking on prospective validation of predefined cut-offs for varying fibrosis severities. Data on the prognostic value of MR based biomarkers in NAFLD cohorts are also needed.

MR based biomarkers will certainly have a role in the assessment of patients with NAFLD as the data reviewed here demonstrate advantages in some key areas beyond diagnostic accuracy. MR based biomarkers are robust with excellent reproducibility and repeatability, and can be applied at population level, as in the case of Liver Multiscan being used in the UK Biobank imaging study. Further technical improvements are also possible, as in the use of diffusion weighted imaging to perform “virtual elastography”.