Introduction

Chronic liver disease accounts for approximately 3.5% of all deaths worldwide owing to complications of cirrhosis and hepatocellular carcinoma (HCC) [1]. While liver biopsy remains the gold standard for assessing hepatic fibrosis [2, 3], the significance of its results can be limited by sampling errors, and there is always the risk of complications due to its invasiveness. Moreover, although repeated assessment of hepatic fibrosis is becoming more important as recent studies suggest possible regression of fibrosis during treatment [4, 5], repeating liver biopsy is impractical in actual clinical practice.

With the recent advances in imaging techniques, several noninvasive methods have been explored for diagnosing liver fibrosis including ultrasound elastography (USE), magnetic resonance elastography (MRE), and diffusion-weighted imaging (DWI) [6,7,8,9,10]. Among the USE techniques, transient elastography (TE) is the most widely used and validated technique [11,12,13]. It is rapid and user-friendly enabling almost immediate results with good reproducibility [11]. Among the MR techniques, MRE shows the highest diagnostic accuracy for liver fibrosis, but high cost and requirements for additional hardware and commercially available software are some of its disadvantages.

DWI using the mono-, bi-, and stretched exponential models has also been explored for staging hepatic fibrosis [7, 14, 15]. The stretched exponential model can reflect physiological characteristics of in vivo tissue without restricting the number of tissue compartments, in contrast to the two compartments of the biexponential model [7, 16]. Fewer studies have examined the stretched exponential model for liver fibrosis than the mono- and biexponential models; one previous study showed better performance of the stretched exponential model compared with the mono- and biexponential models [7]. However, no study has yet compared the diagnostic performance of the stretched exponential model with that of TE. Additionally, although hepatic inflammation and steatosis are known to affect parameters in the mono- and biexponential DWI models [6, 10], no study has yet investigated the possible effect of these histopathologic findings on the stretched exponential model when assessing liver fibrosis.

Therefore, the purposes of this study were to compare the diagnostic performance of the stretched exponential DWI model with those of TE and the mono- and biexponential DWI models and to determine the influence of hepatic inflammation and steatosis on these DWI models and TE.

Materials and methods

Study population

This study was approved by our Institutional Review Board and written informed consent was waived due to its retrospective study design. From October 2015 to November 2018, 1296 adult patients who underwent liver dynamic MRI including DWI with multiple b values were screened. Most of the patients underwent liver MRI for evaluation of focal liver lesions. Patients who underwent both liver MRI and TE and had pathologic results of liver fibrosis were eligible for our study. The exclusion criteria were as follows (Fig. 1): (a) patients who did not undergo TE, (b) no pathologic results of the liver, (c) interval longer than 1 year between liver pathology and imaging, (d) patients who received right hemihepatectomy thus precluding region of interest (ROI) measurement of parameters, (e) iron deposition in liver pathology, and (f) chemotherapy-related sinusoidal obstruction in liver pathology. Patients with either one of these two pathology results were excluded because these factors are known to affect diffusion parameters [6, 17, 18]. We searched electronic medical records to collect data on patient demographics, underlying liver disease, laboratory findings, and body mass index for the above patients.

Fig. 1

Study flow diagram. MRI, magnetic resonance imaging; DWI, diffusion-weighted imaging; TE, transient elastography; SOS, sinusoidal obstruction syndrome

Transient elastography examination

Liver stiffness measurements (LSMs) were obtained using FibroScan® (Echosens) with a vibrator applied to the intercostal spaces at the level of the right hemiliver. LSM was expressed in kilopascals (kPa) and the median value was chosen to represent the elastic modulus of the liver. Interquartile range (IQR) was defined as an index of intrinsic variability for LSM, corresponding to the interval of LSM containing 50% of valid measurements between the 25th and 75th percentiles. LSMs with at least ten valid measurements with a success rate of at least 60% were considered reliable [19].
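The reliability rule above can be written out as a short check. The following is a minimal sketch, not the vendor's algorithm: the function name `summarize_lsm`, the dictionary keys, and the use of the "inclusive" quartile definition are assumptions made for illustration; only the thresholds (at least ten valid measurements, success rate of at least 60%) come from the text.

```python
import statistics


def summarize_lsm(valid_kpa, total_attempts):
    """Summarize one transient elastography session (hypothetical helper).

    valid_kpa: valid liver stiffness readings in kPa.
    total_attempts: number of shots attempted (valid + invalid).
    """
    n_valid = len(valid_kpa)
    success_rate = n_valid / total_attempts if total_attempts else 0.0
    result = {
        "n_valid": n_valid,
        "success_rate": success_rate,
        "median_kpa": None,
        "iqr_kpa": None,
        "iqr_over_median": None,
    }
    if n_valid >= 2:
        # Quartile method is an assumption; the paper only names the
        # 25th and 75th percentiles.
        q1, _, q3 = statistics.quantiles(valid_kpa, n=4, method="inclusive")
        median = statistics.median(valid_kpa)
        result["median_kpa"] = median
        result["iqr_kpa"] = q3 - q1
        result["iqr_over_median"] = (q3 - q1) / median
    # Reliability criterion stated in the text: >= 10 valid shots and
    # a success rate of >= 60%.
    result["reliable"] = n_valid >= 10 and success_rate >= 0.60
    return result


# Made-up readings for illustration only.
session = summarize_lsm(
    [5.1, 5.4, 5.6, 5.8, 6.0, 6.1, 6.3, 6.5, 6.9, 7.2], total_attempts=12
)
print(session)
```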

MR image acquisition

All images were acquired using a 3-T MR scanner (Achieva TX or Ingenia: Philips Healthcare) with anterior and posterior coils. The routine liver dynamic MRI protocol consisted of noncontrast T1-weighted and T2-weighted images, DWI, and contrast-enhanced T1-weighted images. DWI was obtained using free-breathing, fat-suppressed single-shot echo-planar imaging with the following parameters: echo time, 50.2 ms; repetition time, 5000 ms; echo train length, 27; receiver bandwidth, 2877 per pixel; field of view, 400 mm; matrix size, 90 × 92; number of excitations, 3; section thickness, 5 mm; spectral adiabatic inversion-recovery fat suppression; and acquisition time, approximately 4 min 15 s. Three orthogonal spatial directions were encoded with nine diffusion-weighted gradients using the b values of 0, 10, 25, 50, 75, 100, 200, 500, and 800 s/mm2.

Postprocessing and image analysis of diffusion-weighted MRI

DWI data were analyzed using postprocessing software developed for research purposes (EXPRESS; Philips Healthcare), which produced the DWI parameters and parametric maps.

The ADC value was calculated using the monoexponential linear (least-squares) fitting technique (i.e., monoexponential model) with all nine b values according to the following equation:

$$ S(b)/S(0)=\exp \left(-b\cdot \mathrm{ADC}\right) $$

where b is the b factor applied in DWI, S(b) is the signal intensity at a given b value, and S(0) is the signal intensity in the absence of any diffusion weighting.
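As an illustration of the monoexponential fit, the equation becomes linear after taking logarithms, ln S(b) = ln S(0) − b·ADC, so the ADC is the negative slope of an ordinary least-squares line. The sketch below assumes this log-linear formulation with a free intercept; the function name `fit_adc` and the synthetic signal are hypothetical and are not taken from the postprocessing software used in the study.

```python
import math

# The nine b values of the acquisition protocol (s/mm^2).
B_VALUES = [0, 10, 25, 50, 75, 100, 200, 500, 800]


def fit_adc(b_values, signals):
    """Return (ADC, S0) from least squares on ln S(b) = ln S0 - b * ADC."""
    y = [math.log(s) for s in signals]
    n = len(b_values)
    mean_b = sum(b_values) / n
    mean_y = sum(y) / n
    sxx = sum((b - mean_b) ** 2 for b in b_values)
    sxy = sum((b - mean_b) * (yi - mean_y) for b, yi in zip(b_values, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_b
    return -slope, math.exp(intercept)


# Synthetic, noise-free signal with an assumed ADC of 1.1e-3 mm^2/s.
true_adc = 1.1e-3
signals = [1000.0 * math.exp(-b * true_adc) for b in B_VALUES]
adc, s0 = fit_adc(B_VALUES, signals)
print(adc, s0)
```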

In the case of the biexponential model, the true diffusion coefficient (Dt) was calculated using b values > 200 s/mm2 with the simple linear fit equation. The pseudodiffusion coefficient (Dp) and perfusion fraction (f) were then obtained via a nonlinear regression algorithm [20]:

$$ S(b)/S(0)=\left[\left(1-f\right)\cdotp \exp \left(-b\cdotp {D}_t\right)\right]+\left[f\cdotp \exp \left(-b\cdotp {D}_p\right)\right] $$
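The segmented procedure can be sketched in two steps. Step 1 follows the text: Dt is the negative slope of ln S(b) against b for b > 200 s/mm2. Step 2 deliberately swaps in a simpler technique than the nonlinear regression used in the study: with Dt fixed, f has a closed-form least-squares solution for any given Dp, so Dp is found by a one-dimensional grid search. The function names, the Dp search range, and the grid size are assumptions for illustration.

```python
import math

B_VALUES = [0, 10, 25, 50, 75, 100, 200, 500, 800]  # s/mm^2


def fit_dt_high_b(b_values, signals, threshold=200):
    """Step 1: log-linear fit for b > threshold; the slope equals -Dt."""
    pts = [(b, math.log(s)) for b, s in zip(b_values, signals) if b > threshold]
    n = len(pts)
    mean_b = sum(b for b, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((b - mean_b) ** 2 for b, _ in pts)
    sxy = sum((b - mean_b) * (y - mean_y) for b, y in pts)
    return -sxy / sxx


def fit_f_dp(b_values, signals, dt, dp_min=5e-3, dp_max=0.5, n_grid=400):
    """Step 2 (simplified): grid search over Dp with closed-form f.

    Uses S/S0 - exp(-b*Dt) = f * (exp(-b*Dp) - exp(-b*Dt)).
    This is NOT the Levenberg-Marquardt fit described in the paper.
    """
    s0 = signals[b_values.index(0)]
    y = [s / s0 - math.exp(-b * dt) for b, s in zip(b_values, signals)]
    best = None
    for k in range(n_grid):
        dp = dp_min * (dp_max / dp_min) ** (k / (n_grid - 1))
        g = [math.exp(-b * dp) - math.exp(-b * dt) for b in b_values]
        gg = sum(v * v for v in g)
        if gg == 0.0:
            continue  # Dp equal to Dt carries no information about f
        f = sum(yi * gi for yi, gi in zip(y, g)) / gg
        f = min(max(f, 0.0), 1.0)  # keep the perfusion fraction in [0, 1]
        sse = sum((yi - f * gi) ** 2 for yi, gi in zip(y, g))
        if best is None or sse < best[0]:
            best = (sse, f, dp)
    return best[1], best[2]


# Synthetic, noise-free signal with assumed parameters.
f_true, dt_true, dp_true = 0.2, 1.0e-3, 0.05
signals = [
    1000.0 * ((1 - f_true) * math.exp(-b * dt_true) + f_true * math.exp(-b * dp_true))
    for b in B_VALUES
]
dt = fit_dt_high_b(B_VALUES, signals)
f, dp = fit_f_dp(B_VALUES, signals, dt)
print(dt, f, dp)
```

Fixing Dt before estimating f and Dp leaves fewer free parameters in the second step, which is the stabilizing idea behind segmented fitting.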

Likewise, in the stretched exponential model, the distributed diffusion coefficient (DDC) and intravoxel heterogeneity index (α) were calculated via the following equation [16]:

$$ S(b)/S(0)=\exp \left[-{\left(b\cdotp \mathrm{DDC}\right)}^{\alpha}\right] $$

Here, DDC represents the mean intravoxel diffusion rate and α represents the intravoxel water molecular diffusion heterogeneity which ranges from 0 to 1. To evaluate the diagnostic performance of DDC and α using abbreviated b values (0, 50, 500, 800 s/mm2) generally covered in liver MRI, DDC and α using these four b values were also calculated and are herein defined as DDCa and αa. Using the postprocessing software, diffusion parameters were calculated using selected b values for the same ROIs with the full b-value imaging set. For both biexponential and stretched exponential models, the Levenberg–Marquardt fit was used [21].
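To make the roles of DDC and α concrete, the stretched exponential model S(b)/S(0) = exp[−(b·DDC)^α] can be linearized as ln(−ln(S(b)/S(0))) = α·ln b + α·ln DDC, so α is the slope and DDC follows from the intercept. The sketch below uses this log-log linearization as a simplification; it is not the Levenberg–Marquardt fit used in the study, and the function name `fit_stretched` and the synthetic values are assumptions. The same routine is applied to the full and the abbreviated b-value sets, mirroring the DDC/DDCa and α/αa definitions.

```python
import math

FULL_B = [0, 10, 25, 50, 75, 100, 200, 500, 800]  # s/mm^2
ABBREVIATED_B = [0, 50, 500, 800]                 # s/mm^2


def fit_stretched(b_values, signals):
    """Return (DDC, alpha) from a log-log linearized stretched exponential."""
    s0 = signals[b_values.index(0)]
    xs, ys = [], []
    for b, s in zip(b_values, signals):
        # The transform is only defined for b > 0 and 0 < S(b) < S(0).
        if b > 0 and 0 < s < s0:
            xs.append(math.log(b))
            ys.append(math.log(-math.log(s / s0)))
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    intercept = mean_y - alpha * mean_x   # equals alpha * ln(DDC)
    ddc = math.exp(intercept / alpha)
    return ddc, alpha


def simulate(b_values, ddc, alpha, s0=1000.0):
    """Noise-free stretched exponential signal for testing."""
    return [s0 * math.exp(-((b * ddc) ** alpha)) for b in b_values]


# Assumed ground truth for illustration only.
ddc_full, alpha_full = fit_stretched(FULL_B, simulate(FULL_B, 1.2e-3, 0.7))
ddc_a, alpha_a = fit_stretched(ABBREVIATED_B, simulate(ABBREVIATED_B, 1.2e-3, 0.7))
print(ddc_full, alpha_full, ddc_a, alpha_a)
```

On noisy data the linearized estimate and a direct nonlinear fit can differ, because the double-logarithm transform reweights the low-b points.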

Two radiologists, one board-certified abdominal radiologist with 8 years of experience in liver MRI and one junior radiology resident, performed image analyses independently. They were blinded to clinicopathologic findings and ROIs were drawn on DWI images (b = 0 s/mm2) using the postprocessing software. Three circular ROIs were drawn in different locations on the right hemiliver while avoiding large intrahepatic vessels and focal hepatic lesions. The left hemiliver and right liver dome were avoided as measurements could be unreliable from cardiac motion artifacts [22]. The mean area of the ROIs was 335.0 ± 105.2 mm2. The six ROI values attained by the two reviewers were used to analyze variability according to liver location and to calculate interobserver agreements for diffusion parameters. The average of the six ROI values was used for other statistical analyses.

Histopathological analysis

Pathological assessment was used as the reference standard. The mean time interval between the date of imaging studies and pathologic diagnosis was 34.5 ± 98.1 days for MRI and 94.5 ± 181.2 days for TE. Pathologic specimens were obtained via hepatic resection (n = 69), percutaneous liver biopsy (n = 6), and transplantation (n = 3). The degrees of fibrosis, steatosis, and inflammation were assessed according to the METAVIR and steatosis, activity, and fibrosis (SAF) scoring systems [23, 24]. Fibrosis was staged as follows: F0, no fibrosis; F1, periportal fibrosis without septa; F2, periportal fibrosis with few septa; F3, septal fibrosis; and F4, cirrhosis. Hepatic steatosis was scored from 0 to 3 via visual assessment of the percentage of hepatocytes presenting large- or medium-sized fatty droplets as follows: S0, < 5%; S1, 5–33%; S2, 34–66%; and S3, > 67%. Inflammatory activity was graded as follows: A0, no activity; A1, mild activity; A2, moderate activity; and A3, severe activity.

Statistical analysis

Continuous variables are expressed as means ± standard deviations, while categorical variables are expressed as number of cases and percentages. TE and DWI parameters were compared between patients with F0–1, F2–3, and F4 using the one-way analysis of variance (ANOVA) test. Before the ANOVA test, data were tested for normal distribution using the Kolmogorov–Smirnov test. Post hoc analysis with Bonferroni multiple comparisons was performed for pairwise comparisons between each fibrosis group. The influence of histologic fibrosis, inflammation, and steatosis on the imaging parameters was determined by Spearman’s rank correlation and multiple regression analysis. Spearman’s rank correlation coefficient ρ ranges from − 1 to + 1 with the absolute value representing the strength of the correlation (0, no correlation; 0.2, weak correlation; 0.5, moderate correlation; 0.8, strong correlation; 1, perfect correlation) [25]. After confirming correlation, a stepwise multiple regression analysis with conditional backward elimination was performed to determine independent predictors for imaging parameters. Standardized coefficients called β weights and coefficients of determination (R2) were reported for each parameter. Variance inflation factors (VIFs) were calculated to evaluate the effect of collinearity between independent factors. A VIF greater than 10 indicated the presence of multicollinearity.
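Because fibrosis stage is ordinal and contains many ties, Spearman's ρ is computed on average ranks. The following is a minimal sketch of that calculation, written from the textbook definition (Pearson correlation of tie-averaged ranks) rather than from the statistical packages used in the study; the function names are hypothetical. The example input reuses the five per-stage DDC values quoted in the Fig. 4 caption purely as sample data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# Fibrosis stage (F0-F4) against the example DDC values from Fig. 4.
stages = [0, 1, 2, 3, 4]
ddc = [1.659, 1.435, 1.227, 1.108, 1.020]
print(spearman_rho(stages, ddc))
```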

The overall diagnostic performances of TE and DWI parameters for staging liver fibrosis were evaluated using the Obuchowski measure [26], and the measures were compared between different parameters using the DeLong test. The Obuchowski measure is a generalization of the area under the receiver-operating characteristic curve that can be used when a reference standard is multinominal [26]. Post hoc analysis with Bonferroni multiple comparisons was performed for pairwise comparisons between each parameter. The reliability of the DWI parameters was evaluated using the coefficient of variation (CoV) and intraclass correlation coefficients (ICC). Intraindividual variability among the six ROIs in the liver was calculated with CoV. Interreader agreement for ROI measurements between the two readers was assessed with ICC. ICC values of ≥ 0.75 indicate excellent agreement; 0.60–0.74, good; 0.40–0.59, fair; and < 0.40, poor [27]. Analyses were performed using R package version 3.6.0 (R Foundation for Statistical Computing), MedCalc version 19.0.7 (MedCalc Software), and SPSS software version 25.0. P values < 0.05 were considered statistically significant.
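The two reliability summaries can be illustrated briefly. The sketch below computes the CoV across the ROI values of one patient and maps an ICC value to the agreement categories listed above. The function names are hypothetical, the use of the sample standard deviation is an assumption, and ICC values falling between the published bins (for example 0.745) are assigned to the lower category by assumption.

```python
import statistics


def cov_percent(roi_values):
    """Coefficient of variation (%) across ROI measurements.

    Uses the sample standard deviation (n - 1 denominator); the paper
    does not state which definition was applied.
    """
    return 100.0 * statistics.stdev(roi_values) / statistics.mean(roi_values)


def icc_agreement(icc):
    """Map an ICC value to the agreement categories quoted in the text."""
    if icc >= 0.75:
        return "excellent"
    if icc >= 0.60:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


# Six made-up ROI values for one parameter in one patient.
rois = [1.05, 1.10, 0.98, 1.12, 1.01, 1.07]
print(round(cov_percent(rois), 1), icc_agreement(0.80))
```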

Results

Patient characteristics

Based on the aforementioned criteria, a total of 78 patients (mean age, 58 ± 10 years; 62 men) were found eligible for our study (Fig. 1). Clinical and histopathologic findings of the patients are summarized in Table 1. Of the 78 patients, 63 (80.8%) had underlying liver disease, and the most common etiology was chronic hepatitis B (64.1%, 50/78). Sixteen (20.5%) patients had serum alanine aminotransferase levels more than two times higher than the normal upper limit. Regarding pathologic liver fibrosis, 12, 7, 6, 21, and 32 patients had F0, F1, F2, F3, and F4, respectively. Most patients had hepatic malignancy (96.2%, 75/78) with the malignancies being diagnosed as hepatocellular carcinoma (HCC) (n = 56), metastasis (n = 14), cholangiocarcinoma (n = 2), combined HCC and cholangiocarcinoma (n = 1), adult-type hepatoblastoma (n = 1), and leukemic involvement (n = 1).

Table 1 Clinical–pathologic characteristics of 78 patients

Imaging parametric values according to the stage of hepatic fibrosis

LSM values obtained using TE and DWI parameters according to fibrosis stage are summarized in Table 2 and Figs. 2, 3, and 4. TE (p < 0.001), ADC (p = 0.007), f (p = 0.042), DDC (p < 0.001), and DDCa (p < 0.001) values were significantly different between the F0–1, F2–3, and F4 stages. Among these parameters, all except f were significantly different between F2–3 and F4 (ps ≤ 0.043). However, only DDC was significantly different between F0–1 (1.272 ± 0.177) and F2–3 (1.067 ± 0.124; p < 0.001).

Table 2 Values of TE and DWI parameters according to the hepatic fibrosis stage
Fig. 2

Box plots of transient elastography (TE) and diffusion-weighted imaging parameters according to fibrosis stage (F0–1, F2–3, and F4). DDC was the only parameter to show significant difference between F0–1 and F2–3 and between F2–3 and F4. ADC, apparent diffusion coefficient; f, perfusion fraction; Dp, pseudodiffusion coefficient; Dt, true diffusion coefficient; DDC, distributed diffusion coefficient; α, intravoxel diffusion heterogeneity index; DDCa, DDC obtained using a four b-value dataset; αa, α obtained using a four b-value dataset

Fig. 3

Images of liver cirrhosis in a 67-year-old woman. a A native image acquired with a b value of 0 s/mm2. b A portal venous phase T1-weighted image. c A hepatobiliary phase image shows a cirrhotic liver. Parametric maps of d ADC (10−6 mm2/s), e f (10−3), f Dp (10−3 mm2/s), g Dt (10−6 mm2/s), h DDC (10−6 mm2/s), and i α (10−3) are shown. Liver stiffness measurement by transient elastography was 16.3 kPa. Results of liver pathology were F4 (cirrhosis), A2 (moderate inflammatory activity), and S0 (no steatosis)

Fig. 4

DWI (b = 0 s/mm2), ADC (10−6 mm2/s), Dt (10−6 mm2/s), and DDC (10−6 mm2/s) maps comparing different degrees of hepatic fibrosis. DDC maps show values decrease with the progression of liver fibrosis: 1.659, 1.435, 1.227, 1.108, and 1.020 (10−3 mm2/s) for patients with F0, F1, F2, F3, and F4, respectively. Liver stiffness measured by transient elastography was 5.4 kPa, 4.6 kPa, 11.8 kPa, 10.5 kPa, and 9.5 kPa for patients with F0, F1, F2, F3, and F4, respectively. DWI, diffusion-weighted imaging; ADC, apparent diffusion coefficient; Dt, true diffusion coefficient; DDC, distributed diffusion coefficient

Influence of hepatic fibrosis, inflammation, and steatosis on transient elastography and diffusion parameters

TE showed a significant positive correlation (ρ = 0.539, p < 0.001) with hepatic fibrosis. DWI parameters except Dp, α, and αa also showed a significant correlation with fibrosis (Table 3). DDC had the strongest negative correlation (ρ = − 0.639, p < 0.001) followed by DDCa (ρ = − 0.496, p < 0.001) and ADC (ρ = − 0.343, p = 0.002). For hepatic inflammation, TE (ρ = 0.459, p < 0.001), DDC (ρ = − 0.363, p = 0.001), and DDCa (ρ = − 0.324, p = 0.004) showed a significant correlation with inflammatory activity. As for hepatic steatosis, the three DWI parameters ADC (ρ = − 0.325, p = 0.004), Dp (ρ = 0.267, p = 0.029), and Dt (ρ = − 0.236, p = 0.038) showed significant correlation. DDC (p = 0.619) and TE (p = 0.057) did not show a significant correlation with steatosis.

Table 3 Correlation and multiple regression analyses: influence of hepatic fibrosis, inflammation, and steatosis on the imaging parameters

In the multiple regression analysis, fibrosis was the only independent factor associated with TE (β = 2.981, p = 0.001), DDC (β = − 0.114, p < 0.001), and DDCa (β = − 0.071, p < 0.001). On the other hand, only steatosis was significantly associated with Dp (β = 15.233, p = 0.029), while both fibrosis and steatosis were associated with ADC (p = 0.001 for both fibrosis and steatosis) and Dt (p = 0.020 and p = 0.007).

Diagnostic performance of transient elastography and diffusion parameters for staging liver fibrosis

DDC had the highest Obuchowski measure for staging liver fibrosis (0.717; 95% CI, 0.653–0.765) followed by TE (0.681, 0.623–0.733), DDCa (0.663, 0.599–0.719), and ADC (0.608, 0.543–0.671) (Table 4). DDC showed a significantly better performance than ADC (p = 0.032), f (p < 0.001), Dp (p = 0.014), Dt (p = 0.015), α (p < 0.001), and αa (p < 0.001). However, there were no significant differences between DDC and TE (p > 0.999) and between DDC and DDCa (p = 0.473).

Table 4 Overall diagnostic performance for staging liver fibrosis

Reliability of TE and diffusion parameters

Reliability of TE was evaluated by the IQR/median ratio (IQR/med) of the LSM values. The median IQR/med value was 13 (range, 0–26), indicating reliable examinations. The reliability of the DWI parameters was assessed using CoV and ICC (Table 5). Higher CoV values indicate higher variability of DWI parameters among the different locations of the liver. CoV was lowest for ADC (12.5%) followed by DDC (17.2%), Dt (18.2%), and DDCa (21.0%). On the other hand, CoV was highest for perfusion-related parameters, including Dp (61.0%) and f (27.4%). Interreader agreement for DWI analysis between the two reviewers was good to excellent (ICC, 0.733–0.848).

Table 5 Reliability of DWI analysis

Discussion

Our results showed that DDC from the stretched exponential DWI model had the best diagnostic performance for staging liver fibrosis, followed by TE, although the difference between the two was not significant. Moreover, only DDC could differentiate both between F0–1 and F2–3 and between F2–3 and F4. DDC as well as TE was significantly correlated with hepatic fibrosis and inflammation, but fibrosis was the only independent factor for DDC and TE. In addition, steatosis was not associated with either the stretched exponential DWI parameters or TE, but was associated with ADC, Dp, and Dt.

A stretched exponential model reflects non-Gaussian molecular diffusion and represents the degree of intravoxel heterogeneity of biological tissue [7, 16]. Consistent with a previous study [7], DDC from the stretched exponential model showed significantly better diagnostic performance for staging liver fibrosis than both the mono- and biexponential DWI models. DDC, which represents a continuous distribution of diffusion coefficients from different diffusion compartments, seems to reflect the degree of liver fibrosis more accurately than the mono- or biexponential DWI models. DDC also showed smaller CoV than parameters from the biexponential DWI, indicating less variability across different liver locations. On the other hand, α, another parameter from the stretched exponential model which reflects intravoxel diffusion heterogeneity, did not show a significant correlation with fibrosis in this study. Previous studies have shown inconsistent results for the association between α and hepatic fibrosis [7, 28]. An animal study showed no significant correlation between α and hepatic fibrosis [28], whereas another human study reported that α as well as DDC had a significant negative correlation with fibrosis [7]. While the presence of more connective tissues, narrowing of sinusoids, and decreased blood flow may explain architectural heterogeneity in the fibrotic liver, these changes may not aggravate heterogeneity in advanced fibrosis stages [29, 30]. Also, the larger proportion of patients with no or early stage fibrosis (F0–1) included in the previous study (44/95, 46.3%) compared with our study (19/78, 24.4%) may have contributed to the discordant study results. Further larger studies are warranted to validate the role of α in hepatic fibrosis.

Regarding the diagnostic performance of TE and MRI for staging liver fibrosis, TE has been reported to be less accurate than MRE parameters, but comparable to the biexponential DWI models [10, 12, 31]. Our results showed that DDC has good diagnostic performance comparable to TE. Of note, unlike TE, DDC could differentiate between F0–1 and F2–3, as well as between F2–3 and F4. TE is reported to be less effective when identifying early-stage fibrosis [9, 10]. In addition, the diagnostic performance of DDCa using four abbreviated b values (0, 50, 500, 800 s/mm2) was also evaluated because DWI using fewer b values can reduce scanning time. Although the diagnostic performance of DDCa for staging liver fibrosis was slightly lower than that of DDC, the difference was not significant. Considering the disadvantages of TE including operator dependency, restrictions on its use in patients with ascites or obesity, and the inability to evaluate the background liver or focal liver lesion, the stretched exponential DWI model using abbreviated b values can be easily applied to the evaluation of fibrosis even in patients who undergo routine liver MRI without additional scan time.

Several different methods can be used for curve fitting in nonmonoexponential DWI models, and the chosen fitting method can affect the accuracy and reliability of the models. In this study, the Levenberg–Marquardt algorithm (LMA) was used for the biexponential and stretched exponential models. It is commonly used for nonlinear least-squares fitting and has been applied for biexponential model fitting in previous studies [21, 32, 33]. LMA identifies the parameters that best fit the datasets using iterative minimization. LMA requires an initial guess as the starting point for the iterative minimization, and the chosen initial value affects the accuracy of the parameters. In our study, instead of using fixed initial values, initial values roughly calculated pixel by pixel from the actual acquired dataset were used. By using calculated initial values that had the same physical significance as the actual fitting results, the quality of the overall map improved with a better chance of accurate fitting. For IVIM analysis, we used segmented fitting instead of fully unconstrained free fitting. The segmented IVIM analysis used in this study, with its precalculation of Dt, can improve the mathematical stability of the results by reducing the degrees of freedom during the fitting process.

This study investigated the possible confounding factors of the stretched exponential model for evaluating fibrosis. DDC showed a significant correlation with hepatic fibrosis and inflammation but not with steatosis. In the multiple regression analysis, fibrosis was the only independent factor associated with DDC. Inflammation may increase liver stiffness value by increasing cellularity, cell size, or hydrostatic pressure [34]. However, the effect of inflammation on DDC was weaker than fibrosis in this study, and no independent association was identified between inflammation and DDC. Regarding the other DWI models, both steatosis and fibrosis were independently associated with ADC and Dt, which is generally consistent with previous reports [6, 35, 36]. Although most patients in our study were categorized as having no or mild steatosis (55.1% and 37.2%, respectively), the effect of steatosis on ADC and Dt was significant, indicating a confounding effect of even mild steatosis on the mono- and biexponential DWI models.

This study has several limitations. First, this study has an inherent selection bias due to its retrospective design. While the etiology of chronic liver disease was diverse, most of the patients had chronic hepatitis B. In addition, the number of patients with each grade of fibrosis, steatosis, and inflammation was not even, as relatively large portions of patients had advanced fibrosis and no or mild steatosis. Second, the time interval between imaging and pathology confirmation was relatively long. Third, discrepancies may exist between the locations of the ROI measurements for TE or DWI models and pathologic evaluation, and this might have caused potential bias as hepatic fibrosis can be heterogeneously distributed. Finally, the abbreviated b-value set for the stretched exponential model was chosen by arbitrarily selecting four b values that are generally included in the routine liver MRI protocol. Further prospective studies are needed to evaluate the optimal number and composition of b values for the stretched exponential model.

In conclusion, DDC from the stretched exponential model is the most accurate DWI parameter for staging liver fibrosis, with overall similar diagnostic performance to TE. Compared with other DWI models, the stretched exponential model seems more advantageous in the staging of liver fibrosis due to higher diagnostic performance, no confounding effects by steatosis, and better reliability.