Introduction

Chronic hepatitis conditions, such as viral hepatitis, steatohepatitis, and autoimmune liver disease, are risk factors for hepatocellular carcinoma, portal hypertension, and chronic liver failure [1,2,3]. To prevent these complications, it is important to assess liver fibrosis and steatosis accurately. Liver biopsy is the standard method for staging liver fibrosis. Although ultrasound-guided liver biopsy is generally safe, complications including hemorrhage, pneumothorax, hemothorax, cholangitis, and bile leakage may occur with repeated biopsies [4,5,6], and biopsy results may be subject to sampling errors [7]. Efforts have therefore been made to develop noninvasive diagnostic methods [8,9,10]. Transient elastography (TE) and the controlled attenuation parameter (CAP) have been used for the diagnosis of liver fibrosis and steatosis, respectively. However, previous reports have indicated that obesity limits the accuracy of elastography [11]. Therefore, an XL probe was developed and validated for obese patients [12]. In addition, two-dimensional shear wave elastography (2D-SWE) and ultrasound-based attenuation imaging (ATI) have been developed, but there are few reports about the effects of obesity on the measurement reliability and diagnostic ability of 2D-SWE and ATI. We therefore assessed the association of the skin–liver capsule distance (SCD) with the rate of reliable measurements and the diagnostic ability of each modality.

Materials and methods

Patients

A total of 85 patients who underwent TE/CAP, 2D-SWE/ATI, and liver biopsy on the same day at our hospital between October 2019 and July 2021 and were diagnosed with chronic hepatitis based on liver biopsy were enrolled in this study.

Exclusion criteria

Patients with acute hepatitis, autoimmune hepatitis, primary biliary cholangitis, or hepatic cancer, and patients in whom either FibroScan or 2D-SWE was not performed, were excluded.

Transient elastography and controlled attenuation parameter

TE and CAP were performed using a FibroScan 502 Touch system (Echosens, France) with both a 3.5-MHz “M” probe (Echosens, France) and a 2.5-MHz “XL” probe (Echosens, France). Patients were in the supine position for scanning. Scans were taken in the right lobe of the liver through the intercostal spaces with both arms in maximum abduction. Measurements were performed by four experienced examiners. We used the median value for the analysis.

Two-dimensional shear wave elastography and attenuation imaging

2D-SWE and ATI were performed using an Aplio i800 (Canon Medical Systems, Japan). Patients were in the supine position for scanning. Scans were taken in the right lobe of the liver through the intercostal spaces with both arms in maximum abduction. Measurements were performed by four experienced examiners. For 2D-SWE, a circular region of interest (ROI) was placed within the acquisition box. For ATI, a fan-shaped sampling box was placed, and an ROI (2 cm wide by 4 cm high) was positioned around the center of the sampling box. We used the median value for the analysis.

Skin and liver capsule distance (SCD)

SCD was defined as the distance between the skin and the liver capsule, measured with the probe positioned in the intercostal window.

Definition of measurement reliability

We defined the measurement reliability as follows (Table 1). TE and CAP measurements were considered reliable when they fulfilled all of the following criteria: 10 valid measurements, ≥ 60% success rate, and interquartile range (IQR)/median ≤ 30%. For 2D-SWE and ATI, the manufacturer did not recommend any quality criteria; therefore, we defined the measurement reliability of 2D-SWE as follows: five valid measurements, ≥ 3 of the five measurements from a homogeneous color map, and an IQR/median ≤ 30%. The measurement reliability of ATI was defined as follows: five valid measurements, ≥ 3 of the five measurements with an R² (coefficient of determination) of 0.90 or higher, and an IQR/median ≤ 30%.
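
As an illustration only, the criteria above can be written as a short Python sketch. The function names, the quartile method used for the IQR, and the reading of "10 valid measurements" and "five valid measurements" as "at least" that many are assumptions made for this example; they are not taken from the study protocol or from any device software.

```python
from statistics import median, quantiles


def iqr_over_median(values):
    """Return IQR/median; the 'inclusive' quartile method is an assumption."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return (q3 - q1) / median(values)


def te_cap_reliable(valid, attempts):
    """TE/CAP: >= 10 valid measurements, success rate >= 60%, IQR/median <= 30%."""
    if len(valid) < 10 or attempts < len(valid):
        return False
    return len(valid) / attempts >= 0.60 and iqr_over_median(valid) <= 0.30


def swe_reliable(valid, n_homogeneous):
    """2D-SWE: >= 5 valid measurements, >= 3 from a homogeneous color map,
    IQR/median <= 30%."""
    return (len(valid) >= 5 and n_homogeneous >= 3
            and iqr_over_median(valid) <= 0.30)


def ati_reliable(valid, r2_values):
    """ATI: >= 5 valid measurements, >= 3 with R^2 >= 0.90, IQR/median <= 30%."""
    n_good_fit = sum(r2 >= 0.90 for r2 in r2_values)
    return (len(valid) >= 5 and n_good_fit >= 3
            and iqr_over_median(valid) <= 0.30)
```

Here `valid` is the list of valid readings for one examination, `attempts` is the total number of acquisitions, `n_homogeneous` is the number of 2D-SWE readings taken from a homogeneous color map, and `r2_values` holds the R² reported for each ATI reading; all four are hypothetical inputs.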

Table 1 Definition of measurement reliability

Liver biopsy and hepatopathologic evaluation

Liver biopsy was performed under real-time US guidance with a 16- or 18-gauge automatic biopsy gun (BioPince; Argon Medical Devices, Frisco, Texas, United States) or a suction biopsy needle (SoloCut; Create Medic, Yokohama, Japan).

We used the new Inuyama classification [13] for pathological fibrosis staging. Liver fibrosis was staged as follows: F0, no fibrosis; F1, mild fibrosis, fibrous portal expansion; F2, moderate fibrosis, bridging fibrosis; F3, severe fibrosis, bridging fibrosis with distorted acinar architecture; and F4, cirrhosis. We used the steatosis grade of the NAFLD activity score [14] for liver steatosis. Steatosis was graded as follows: S0, 0–10% fat; S1, 11–30% fat; S2, 34–66% fat; S3, > 67% fat.

Statistical analysis

Univariate analysis was performed using the Mann–Whitney U test. Receiver-operating characteristic (ROC) curves were constructed to evaluate the diagnostic performance of each method in detecting each grade of steatosis and stage of fibrosis. A p value < 0.05 was considered statistically significant. All statistical analyses were performed using EZR (Saitama Medical Center, Jichi Medical University, Saitama, Japan), a graphical user interface for R (The R Foundation for Statistical Computing, Vienna, Austria). EZR is a modified version of R Commander designed to add statistical functions frequently used in biostatistics.
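
For readers unfamiliar with ROC analysis, the following Python sketch shows one way an area under the ROC curve and a cut-off value can be computed from paired measurements and biopsy labels. It is not the EZR procedure used in this study. The function names are hypothetical, the example data are invented, and the use of the Youden index (sensitivity + specificity − 1) to choose the cut-off is an assumption, since the criterion for the optimal cut-off is not stated here. The area is computed as the proportion of diseased/non-diseased pairs in which the diseased case has the higher value, which equals the Mann–Whitney U statistic divided by the number of such pairs.

```python
def auroc(scores, labels):
    """Area under the ROC curve: fraction of (positive, negative) pairs in
    which the positive case scores higher; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximising the Youden index.
    A case is called positive when its score is >= cutoff (an assumption)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:  # keep the lowest cutoff on ties
            best = (j, c, sens, spec)
    return best[1:]


# Invented stiffness values (kPa) and labels (1 = target stage or higher).
stiffness = [4.1, 5.0, 6.2, 7.5, 8.1, 9.0, 10.4, 12.8]
disease = [0, 0, 0, 1, 0, 1, 1, 1]
print(auroc(stiffness, disease))
print(youden_cutoff(stiffness, disease))
```

Comparing two AUROC values measured in the same patients, as reported in the Results, requires a paired test that is not shown in this sketch.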

Results

Study 1: the adequate measurement rate of TE/2D-SWE and CAP/ATI

Patient background characteristics

The characteristics of the patients are summarized in Table 2. The median age was 57 (range 22–81) years, and 49 (57.6%) patients were male. The median body mass index (BMI) was 25.0 (range 14.9–40.2). The median SCD was 18.5 (range 6.7–30.8) mm.

Table 2 Patient characteristics

Histopathologic evaluation

Forty-five (52.9%) patients had non-alcoholic steatohepatitis (NASH) or non-alcoholic fatty liver disease (NAFLD). Thirty-five (41.2%) patients had non-specific chronic hepatitis diagnosed based on liver biopsy. The numbers of patients by stage of fibrosis were as follows: F0, n = 10; F1, n = 42; F2, n = 21; F3, n = 10; F4, n = 2. The numbers of patients by grade of hepatic steatosis were as follows: S0, n = 33; S1, n = 36; S2, n = 9; S3, n = 7.

The effect of skin–liver capsule distance on measurement reliability

A comparison of the rate of reliable measurements by SCD is shown in Fig. 1. We divided the patients into three groups as follows: SCD < 17.5 mm, n = 35; 17.5 mm ≤ SCD < 22.5 mm, n = 34; SCD ≥ 22.5 mm, n = 16. Figure 1a and b indicate that the rate of reliable measurements with the M probe was low in the group with SCD ≥ 22.5 mm (M-TE: 43.8%; M-CAP: 43.8%). The rate for the XL probe was high in the group with SCD over 25.0 mm (XL-TE: 81.8%, 93.8%, respectively; XL-CAP: 100%, 83.3%, respectively). The rate for 2D-SWE was high in all groups regardless of the SCD (at least 87.5%).
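
The grouping and the per-group rate can be sketched in Python as follows. This is a minimal illustration that assumes group boundaries at 17.5 mm and 22.5 mm and a record format of (SCD in mm, reliable or not) per patient; the function names and the example records are hypothetical and are not study data.

```python
def scd_group(scd_mm):
    """Assign an SCD value (mm) to one of three groups (assumed boundaries)."""
    if scd_mm < 17.5:
        return "<17.5"
    if scd_mm < 22.5:
        return "17.5-22.5"
    return ">=22.5"


def reliable_rate_by_group(records):
    """records: iterable of (scd_mm, is_reliable) pairs.
    Return {group: percentage of reliable measurements in that group}."""
    counts = {}
    for scd_mm, is_reliable in records:
        group = scd_group(scd_mm)
        total, good = counts.get(group, (0, 0))
        counts[group] = (total + 1, good + bool(is_reliable))
    return {g: 100.0 * good / total for g, (total, good) in counts.items()}


# Invented example: six patients, two per group.
example = [(10.0, True), (12.0, True), (18.0, True),
           (20.0, False), (23.0, False), (30.8, True)]
print(reliable_rate_by_group(example))
```

As a purely arithmetic example, a hypothetical group of 16 patients with 7 reliable examinations would give 7/16 = 43.75%.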

Fig. 1 Comparison of the reliability of measurement by skin–liver capsule distance

Study 2: the diagnostic accuracy of TE/2D-SWE and CAP/ATI

Patient background characteristics

Based on the results of study 1, we used the following probes for TE/CAP: the M probe for SCD < 22.5 mm and the XL probe for SCD ≥ 22.5 mm. We analyzed 66 patients with reliable measurements on both TE and 2D-SWE (elastography group) and 62 patients with reliable measurements on both CAP and ATI (steatography group) to assess the diagnostic accuracy of TE/2D-SWE and CAP/ATI.

The patient demographics are shown in Table 3. In the elastography group, the median age was 56 (range 22–78) years. Thirty-nine patients were male. Thirty-four patients were diagnosed with either NASH or NAFLD and 27 with non-specific chronic hepatitis. The number of patients with fibrosis stage 0/1/2/3/4 was 9/31/17/7/2, respectively.

Table 3 Demographics of patients with reliable elastography and steatography measurements

In the steatography group, the median age was 58 (range 22–81) years. Thirty-eight patients were male. Thirty-eight patients were diagnosed with either NASH or NAFLD and 24 with non-specific chronic hepatitis. The number of patients with steatosis grade 0/1/2/3 was 17/31/7/7, respectively.

The validity of the liver stiffness measurements obtained with TE and 2D-SWE

We compared the liver stiffness measured with TE and with 2D-SWE. For the detection of any fibrosis (F0 vs F1 or higher), no statistically significant difference was seen between the area under the ROC curve (AUROC) values of TE and 2D-SWE (0.723 and 0.652, respectively, p = 0.417) (Fig. 2a). There was no significant difference in the AUROC between TE and 2D-SWE for distinguishing F0/1 from F2 or higher (0.869 and 0.732, respectively, p = 0.051) (Fig. 2b). For the detection of severe fibrosis (F0/1/2 vs F3 or higher), there was no significant difference between TE and 2D-SWE (0.895 and 0.761, respectively, p = 0.260) (Fig. 2c). The optimal cut-off values for TE and 2D-SWE for F ≥ 3 fibrosis were 8.8 kPa (AUROC 0.895, sensitivity 87.5%, and specificity 78.0%) and 1.59 m/s (AUROC 0.761, sensitivity 87.5%, and specificity 78.0%), respectively.

Fig. 2 Difference in AUROC to stratify fibrosis stage between TE and 2D-SWE

The validity of liver steatosis measurements obtained with CAP and ATI

We compared CAP and ATI for evaluating steatosis. No statistically significant difference was seen between the AUROC values of CAP and ATI in the stratification of steatosis grade (S0 vs S ≥ 1: 0.936 and 0.844, respectively, p = 0.059; S ≤ 1 vs S ≥ 2: 0.817 and 0.835, respectively, p = 0.724; S ≤ 2 vs S3: 0.808 and 0.801, respectively, p = 0.904) (Fig. 3). The optimal cut-off values for CAP and ATI for S ≥ 1 steatosis were 253 dB/m (AUROC 0.936, sensitivity 86.4%, and specificity 88.9%) and 0.76 dB/cm/MHz (AUROC 0.844, sensitivity 70.5%, and specificity 83.3%), respectively.

Fig. 3 Difference in AUROC to stratify steatosis grade between CAP and ATI

The effect of SCD on diagnostic performance of TE/2D-SWE and CAP/ATI

A comparison by SCD of the rate at which the measurements agreed with the pathological findings (rate of accuracy) is shown in Fig. 4. Based on the results of this study, we defined the cut-off values for TE and 2D-SWE for F ≥ 3 fibrosis as 8.8 kPa and 1.59 m/s, respectively. The rate of accuracy for 2D-SWE, XL-TE, and M-TE in the group with SCD ≥ 22.5 mm was 56.3%, 81.3%, and 31.3%, respectively (Fig. 4a). The rates for 2D-SWE and M-TE were thus low in this group. Similarly, we defined the cut-off values for CAP and ATI for S ≥ 1 steatosis as 253 dB/m and 0.76 dB/cm/MHz, respectively. The rate of accuracy for ATI, XL-CAP, and M-CAP in the group with SCD ≥ 22.5 mm was 87.5%, 75%, and 75%, respectively. The rate for ATI was higher than that for the other modalities.
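
A minimal Python sketch of how such a rate of accuracy could be computed is given below. It assumes that the rate is the percentage of patients in whom the cut-off-based classification matches the biopsy result, and that a value equal to the cut-off counts as positive; neither detail is spelled out in the text. The function name and the example values are hypothetical.

```python
def accuracy_rate(measurements, stages, cutoff, threshold_stage=3):
    """Percentage of patients in whom (measurement >= cutoff) agrees with
    (biopsy stage >= threshold_stage). Both comparisons are assumptions."""
    agree = sum((m >= cutoff) == (s >= threshold_stage)
                for m, s in zip(measurements, stages))
    return 100.0 * agree / len(stages)


# Invented TE values (kPa) and biopsy fibrosis stages for four patients,
# evaluated against the 8.8 kPa cut-off for F >= 3.
te_kpa = [5.2, 9.1, 12.0, 7.0]
fibrosis_stage = [1, 3, 4, 3]
print(accuracy_rate(te_kpa, fibrosis_stage, cutoff=8.8))
```

The same function applies to steatosis by passing CAP or ATI values, steatosis grades, the corresponding cut-off, and `threshold_stage=1`.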

Fig. 4 Rate of accuracy for fibrosis and steatosis

Discussion

With FibroScan, the cut-off value for switching from the M probe to the XL probe is an SCD ≥ 25 mm, as indicated in a previous report [15] and in the manufacturer’s recommendation. However, Kumagai et al. showed that inadequate liver stiffness measurements (LSMs) with the M probe increased with longer SCDs, with a significant difference between subgroups at an SCD ≥ 22.5 mm [16]. This result is similar to that found in our study: the rate of reliable measurements for fibrosis obtained with the M probe was low in the group with SCD ≥ 22.5 mm. It is reported that the measurement depth of the M probe is 25–64 mm in obese patients, and that the presence of adipose tissue in the region of interest explored by the M probe leads to overestimation of liver stiffness [12]. In contrast, we found that the rate of reliable measurements with 2D-SWE was high in all SCD groups. We hypothesize that this is because 2D-SWE is integrated in conventional ultrasound systems, which allows for visual control of the measurement location, so that a reliable measurement can be obtained from the color mapping area. However, a previous report [17] found that the odds of a successful 2D-SWE examination decreased with higher SCD. We believe that the results differed because there were no patients with extremely high SCDs in our study. The manufacturer has stated that the reliability is low in patients with extremely high SCDs over 40 mm. In our study, the highest SCD was 30.8 mm.

Some studies [18,19,20] have investigated the diagnostic efficacy of TE and 2D-SWE for fibrosis compared to pathological assessment and concluded that the diagnostic accuracy of TE was as good as that of 2D-SWE. Similarly, in our study, there was no significant difference in the AUROC for the diagnosis of fibrosis between TE and 2D-SWE. We concluded that the diagnostic ability of 2D-SWE was as good as that of TE. However, in a comparison by SCD, M-TE and 2D-SWE showed a low rate of accuracy in the group with SCD ≥ 22.5 mm (Fig. 4). In the case of M-TE, the reason stemmed from the measurement depth of the M probe, as mentioned above. The reason that 2D-SWE showed low accuracy was unclear. It might have been due to the cut-off value for diagnosis of F ≥ 3 fibrosis. There are no reports of the optimal cut-off value for 2D-SWE; therefore, the cut-off value should be investigated in a future study.

For steatosis, CAP and ATI are both useful for diagnosis [21,22,23,24,25]. However, there are no reports of a direct comparison of the diagnostic accuracy of CAP and ATI for steatosis in which biopsy was used as the reference standard. In the present study, there was no significant difference in the AUROC between CAP and ATI in terms of the diagnostic accuracy for steatosis. Therefore, we concluded that the diagnostic accuracy of ATI was as good as that of CAP. In a comparison by SCD, each modality showed at least 71.4% accuracy for S ≥ 1 steatosis regardless of the SCD.

This study had some limitations. First, the number of patients per group was small (in particular, the F3 and F4 groups), as was the number of patients who had extremely high SCDs. Therefore, the assessment of the diagnostic efficacy of elastography may be insufficient. Second, this was a retrospective, single-institution study; additional multicenter studies with larger sample sizes are needed to validate the findings.

Conclusion

The diagnostic ability of FibroScan was as good as that of 2D-SWE for fibrosis and steatosis. For measurement in patients with high SCDs, FibroScan (XL probe) might be better for the diagnosis of fibrosis stage. However, 2D-SWE had the advantage that its rate of reliable acquisition was high regardless of the SCD with a single probe. Moreover, because it is integrated into a conventional ultrasound system, 2D-SWE had the advantage of allowing a conventional ultrasound examination and fibrosis and steatosis measurements to be performed concurrently. As such, we concluded that 2D-SWE could be useful as a screening test because of its convenience and its high rate of reliable acquisition.