Introduction

Viral, autoimmune, hereditary, metabolic and toxin-mediated liver disease can result in hepatocellular dysfunction, expansion of the extracellular matrix with liver fibrosis, portal hypertension and finally liver cirrhosis [1]. Estimation of the degree of liver fibrosis is important for prognosis, surveillance and treatment decisions in patients with chronic liver disease (CLD) [2, 3]. Liver biopsy, the most commonly used reference method for the assessment of liver fibrosis, is an invasive method associated with patient discomfort and sometimes with serious complications [4]. In addition, the accuracy of liver biopsy is limited owing to intra- and inter-observer variability and sampling error [5]. Transient elastography (FibroScan; Echosens, Paris, France) is currently a well-established method of non-invasive ultrasound-based liver fibrosis staging worldwide and has been evaluated in several meta-analyses with good diagnostic accuracy [6–10]. Besides fibrosis staging, conventional ultrasound is advised every 6–12 months in patients with chronic liver disease to screen for hepatocellular carcinoma and signs of liver cirrhosis, to evaluate portal vein flow and to exclude thrombosis [11]. However, conventional ultrasound cannot be performed with FibroScan.

Acoustic radiation force impulse (ARFI) imaging is a novel ultrasound-based elastography method integrated into a conventional ultrasound machine and can be performed with ultrasound probes during an abdominal ultrasound. It could represent an alternative method to transient elastography (TE) for the non-invasive assessment of liver fibrosis.

The aim of the present study was to assess the overall performance of ARFI imaging for the diagnosis of liver fibrosis by including all relevant publications (including abstracts) with the main focus on the meta-analysis of the area under the ROC curve (AUROC).

Materials and methods

Acoustic radiation force impulse imaging

ARFI imaging (ACUSON S2000, Virtual Touch tissue; Siemens Healthcare, Erlangen, Germany) is performed with a curved array for B-mode imaging of the abdomen. No limitations concerning measurement are known. It involves targeting of an anatomical region to be interrogated for elastic properties with a region-of-interest (ROI) cursor. Tissue at the ROI is mechanically excited using short-duration acoustic pulses with a fixed transmit frequency of 2.67 MHz to generate localised displacements in tissue. The displacements result in shear-wave propagation away from the region of excitation and are tracked using ultrasonic, correlation-based methods [12]. The maximum displacement is estimated for many ultrasound tracking beams laterally adjacent to the single push-beam. By measurement of the time to peak displacement at each lateral location, the shear-wave speed of the tissue can be reconstructed [12]. The propagation velocity is proportional to the square root of tissue elasticity [13, 14]. The results are expressed in metres per second (range, 0.5–4.4 m/s with ± 20 % accuracy over the range).
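The square-root relation between propagation velocity and elasticity can be illustrated with a short Python sketch. It is not part of the ARFI software: it assumes that liver tissue behaves as an incompressible, isotropic, linearly elastic medium with a density of about 1000 kg/m³, so that Young's modulus is approximately E = 3ρv². The function name and the default density are assumptions chosen for illustration.

```python
def youngs_modulus_kpa(shear_wave_speed_m_s, density_kg_m3=1000.0):
    """Approximate Young's modulus in kPa from a shear-wave speed in m/s.

    Assumptions (not stated in the text): incompressible, isotropic,
    linearly elastic tissue, so that E = 3 * rho * v**2, with an assumed
    tissue density of about 1000 kg/m^3.
    """
    if shear_wave_speed_m_s <= 0:
        raise ValueError("shear-wave speed must be positive")
    # rho * v^2 is in pascals; divide by 1000 to report kilopascals
    return 3.0 * density_kg_m3 * shear_wave_speed_m_s ** 2 / 1000.0
```

Under these assumptions, doubling the shear-wave speed corresponds to a fourfold increase in the estimated elasticity, which is the inverse of the square-root relation described above.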

Literature search

A systematic literature search covering 2007 to February 2012 was performed to evaluate the performance of ARFI for the diagnosis of liver fibrosis in CLD. Sources searched for original articles included PubMed and the Cochrane Library. Sources searched for abstracts included ISI Web of Knowledge, American Association for the Study of Liver Diseases (AASLD, Hepatology), European Association for the Study of the Liver (EASL, J Hep), Digestive Disease Week (DDW), European Journal of Ultrasound (Ultraschall in Med) and Ultrasound in Medicine and Biology.

The search was performed independently by two researchers (A.C.O. and J.N.) with the keywords “Acoustic Radiation Force Impulse Imaging”, “ARFI”, “elastography and liver”, “elastography and hepatic” and “Virtual Touch Tissue Quantification”. The input into PubMed was “[(Virtual touch tissue quantification) or (Acoustic Radiation Force Impulse Imaging) or (ARFI)] and (liver or hepatic)”.

Inclusion and exclusion criteria

The researchers defined the inclusion and exclusion criteria a priori. Studies were included if they evaluated the performance of ARFI of the liver in adults with CLD, using liver biopsy as the reference standard. Further, the studies had to use a comparable liver biopsy staging system (METAVIR, Ishak, Brunt, Ludwig’s, Knodell, Desmet, Scheuer), assess the AUROC value for the fibrosis stages F ≥ 2, F ≥ 3 or F = 4 according to METAVIR or a comparable staging system and/or assess sensitivity, specificity, positive predictive value (PPV) or negative predictive value (NPV) for the diagnosis of a fibrosis stage based on stated cut-offs for liver stiffness.

Studies were excluded if they were abstracts or full papers with data already published as a full paper. Where abstracts evidently presented data from the same study at different meetings, only the most recent abstract was included. Authors of abstracts were contacted before exclusion to confirm that the abstracts presented data from the same patients. Reviews, corresponding letters and editorials not reporting their own results were also excluded. Details are given in Fig. 1.

Fig. 1 Flow diagram of study inclusion. AUROC area under the ROC curve, LB liver biopsy, CLD chronic liver diseases, PPV positive predictive value, NPV negative predictive value

The meta-analysis was conducted along a pre-specified short protocol but was not registered.

Data analysis

The meta-analysis was performed by estimating the mean AUROC over all studies using the random effects model (DerSimonian and Laird estimator) [15]. The AUROC of some studies was partly unknown for single fibrosis stages; therefore only the available AUROCs were analysed separately for the different fibrosis stages and for different aetiologies. The standard errors of the AUROC values could be determined or approximated from the available data, mainly using the 95 % confidence intervals (CIs). The random effects model accounts for heterogeneity of the different studies in the analysis of the overall AUROC value. Studies with a larger sample size and therefore a smaller standard error received more weight when calculating the mean AUROC.
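The pooling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R code: it assumes that a standard error can be recovered from a symmetric 95 % CI under a normal approximation, and it implements the usual DerSimonian and Laird moment estimator of the between-study variance. The function names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate a standard error from a symmetric 95 % CI
    (assumption: normal approximation, CI = estimate +/- z * SE)."""
    return (upper - lower) / (2.0 * z)


def dersimonian_laird(estimates, std_errors):
    """Random effects pooled estimate (DerSimonian and Laird).

    Returns (pooled estimate, standard error of the pooled estimate,
    between-study variance tau^2).
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]          # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q: weighted squared deviations from the fixed effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random effects weights add tau^2 to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sw_star
    return pooled, math.sqrt(1.0 / sw_star), tau2
```

For example, `dersimonian_laird([0.70, 0.90], [0.01, 0.01])` (hypothetical values, not data from the included studies) gives a pooled estimate of 0.80 with a positive between-study variance, reflecting the disagreement between the two inputs. Studies with smaller standard errors receive larger weights, as stated in the text.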

In addition, mean diagnostic odds ratio (DOR), calculated by [sensitivity/(1 – sensitivity)] × [specificity/(1 – specificity)], was also estimated by the random effects model as a further measure of the diagnostic accuracy of ARFI [15]. A continuity correction of 0.5 was applied to each cell with a zero cell count.
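The DOR calculation can be illustrated as follows. The sketch follows the formula given above and reads the continuity correction literally, adding 0.5 only to cells with a zero count; a common variant adds 0.5 to all four cells whenever any cell is zero, and the text does not settle which was used. The standard error of the log DOR uses the usual large-sample formula, which is an assumption not stated in the text. Function names are hypothetical.

```python
import math


def dor_from_sens_spec(sensitivity, specificity):
    """DOR = [sens / (1 - sens)] * [spec / (1 - spec)]."""
    return (sensitivity / (1.0 - sensitivity)) * (specificity / (1.0 - specificity))


def dor_from_counts(tp, fp, fn, tn, correction=0.5):
    """DOR and standard error of log(DOR) from a 2x2 table.

    Assumption: the continuity correction replaces only the zero cells
    (literal reading of the text); other conventions exist.
    """
    tp, fp, fn, tn = (float(c) if c != 0 else correction
                      for c in (tp, fp, fn, tn))
    dor = (tp * tn) / (fp * fn)
    # Large-sample standard error of the log odds ratio
    se_log_dor = math.sqrt(1.0 / tp + 1.0 / fp + 1.0 / fn + 1.0 / tn)
    return dor, se_log_dor
```

With hypothetical counts of 80 true positives, 10 false positives, 20 false negatives and 90 true negatives, both functions agree: the sensitivity is 0.8, the specificity is 0.9 and the DOR is 36.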

Different fibrosis scoring systems all using scores from 0 to 4 were pooled to estimate an overall AUROC value and an overall diagnostic odds ratio. The Ishak score, using a scale from 0 to 6, was transformed into METAVIR as in a former meta-analysis [16].

Possible reasons for the heterogeneity between studies and comparisons of subgroups were analysed by the Q-test based on analysis of variance and by a meta-regression approach regarding the effect of different factors (mean age, mean body mass index [BMI], sex distribution, underlying liver disease, country where the study was performed, staging system, abstract versus full-length article, number of patients excluded because of ARFI or liver biopsy measurement failure, depth and site of ARFI measurement, median length of liver biopsy sample and the quality criteria) on the AUROC.

To calculate a summary ROC (SROC) curve from all studies in which sensitivity and specificity were known, a bivariate random effects model [17] was fitted. Further, the area under the SROC curve was calculated. A weighted linear regression with weights chosen according to sample size was used to assess the influence of the cut-off levels for liver stiffness on sensitivity and specificity.
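The weighted regression of sensitivity (or specificity) on the cut-off level can be sketched with a closed-form weighted least-squares fit. This is only an illustration of the technique under the assumption that each study's weight equals its sample size; the function name and the example values are hypothetical, and the bivariate model used for the SROC curve is not reproduced here.

```python
def weighted_linear_fit(x, y, weights):
    """Weighted least-squares line y = intercept + slope * x.

    Minimises sum(w_i * (y_i - intercept - slope * x_i)**2), with
    weights assumed to be the study sample sizes.
    """
    sw = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / sw
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    if sxx == 0:
        raise ValueError("all cut-off values are identical")
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

Here `x` would hold the cut-off levels in m/s, `y` the corresponding sensitivities or specificities, and `weights` the sample sizes. Larger studies pull the fitted line towards their own values.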

The presence of publication bias, heterogeneity of studies or data irregularities was assessed by a funnel plot analysis of the log DOR. According to Deeks et al. [18], sample size-related precision is more appropriately expressed by the effective sample size (ess) when analysing diagnostic test accuracy. Therefore the log DOR was plotted against 1/√ess in the funnel plot, where √ess denotes the square root of the effective sample size. The rank correlation test based on Kendall’s tau was performed to evaluate the significance of asymmetry [19]. There is no established method to adjust for possible publication bias when analysing diagnostic test accuracy. Here, we used the trim and fill method by Duval and Tweedie [20] to adjust for a potential effect of publication bias on the funnel plot using 1/√ess. Then, we recalculated the standard error of log DOR of the added hypothetical studies by taking the standard error of the counterpart on the other side of the funnel plot. Afterwards, we used a random effects estimate for log DOR including the added hypothetical studies to estimate an effect of publication bias, and compared it with the original random effects estimate of log DOR.
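The coordinates of one study in such a funnel plot can be computed as in the following sketch. The text does not spell out the effective sample size; the code assumes the definition commonly attributed to Deeks et al., ess = 4·n₁·n₂/(n₁ + n₂), where n₁ and n₂ are the numbers of patients with and without the target fibrosis stage. Function names are hypothetical, and the 2×2 counts are assumed to contain no zero cells (a continuity correction would be needed otherwise).

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """Effective sample size, assumed as 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def funnel_point(tp, fp, fn, tn):
    """Return (log DOR, 1 / sqrt(ess)) for one study.

    Assumes all four cell counts are greater than zero.
    """
    log_dor = math.log((tp * tn) / (fp * fn))
    ess = effective_sample_size(tp + fn, fp + tn)
    return log_dor, 1.0 / math.sqrt(ess)
```

A balanced hypothetical study with 50 diseased and 50 non-diseased patients has an effective sample size of 100 and is therefore plotted at 1/√ess = 0.1; unbalanced studies of the same total size have a smaller effective sample size and appear further from the top of the funnel.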

To assess the quality of the included studies, the Quality Assessment of Studies of Diagnostic Accuracy Included in Systematic Reviews (QUADAS) questionnaire was used [21].

All analyses and graphical illustrations were performed using R (R Foundation for Statistical Computing, Vienna, Austria), particularly the R package meta by G. Schwarzer (2.1-0, 18.05.2012) and the R package metafor by W. Viechtbauer (1.6-0, 12.04.2011). The meta-analysis was performed according to the guideline Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA Statement) [22].

Results

The literature search yielded 637 articles evaluating ARFI. Finally, 36 studies with a total of 3,951 patients could be included in the meta-analysis according to the inclusion and exclusion criteria (Fig. 1). The 36 studies present various patient characteristics and study results, which are shown in Table 1, Table 2 and Table S1. Further, the histological staging systems varied among the studies.

Table 1 Characteristics and histology distribution of studies evaluating the performance of acoustic radiation force impulse (ARFI) imaging for the diagnosis of liver fibrosis
Table 2 Results of studies evaluating the performance of ARFI for the diagnosis of liver fibrosis

Diagnostic accuracy of ARFI imaging

For the diagnosis of significant fibrosis (F ≥ 2) AUROCs with the corresponding standard errors were available in 26 studies. The mean AUROC for the diagnosis of significant fibrosis was 0.84 (95 % CI, 0.80–0.87). The mean AUROC for the diagnosis of severe fibrosis (F ≥ 3) was 0.89 (95 % CI, 0.87–0.92) and for the diagnosis of liver cirrhosis (F = 4) it was 0.91 (95 % CI, 0.89–0.94), with data available from 26 studies for F ≥ 3 and 27 studies for F = 4.

Because the diagnostic accuracy of ARFI for the diagnosis of severe fibrosis differed significantly among studies (F ≥ 3, P = 0.048), subanalyses were performed. The studies were divided into: (1) studies examining chronic hepatitis C (HCV)-infected patients only; (2) studies examining patients with NASH only; (3) studies including patients with chronic liver diseases of mixed aetiology. Detailed results are given in Fig. 2.

Fig. 2 Forest plots from meta-analysis of the AUROC. Forest plots for F ≥ 2 (a), F ≥ 3 (b) and F = 4 (c) using the random effects model. The sizes of the boxes are proportional to the weight that the respective study contributes to the estimator of the overall AUROC. Studies (represented by the lines) with a larger sample size get more weight than studies with a small sample size. ¹ Study including only HBV-infected patients

In addition studies including chronic hepatitis B (HBV) patients and studies without HBV-patients were compared. A part of the heterogeneity for F ≥ 3 might be explained by differentiating the studies into those including HBV-infected patients and those without HBV-infected patients (P = 0.017). The overall AUROC for studies including HBV-infected patients was 0.87 (95 % CI, 0.85–0.90) and for studies without HBV-infected patients it was 0.92 (95 % CI, 0.89–0.95).

Further, analysing quantitative factors showed a significant influence of the mean BMI on the AUROC for F ≥ 2, with a reduction of the AUROC with increasing BMI (P = 0.0062). The number of patients excluded because of ARFI measurement failure, quality of liver biopsy, median length of liver biopsy sample, the different staging systems, the countries, the mean age and sex distribution had no significant influence on the AUROC. For the diagnosis of significant and severe fibrosis the depth of ARFI measurement had no influence on the AUROC, but it did for the diagnosis of liver cirrhosis (P = 0.028). Studies with depths of 2 cm had an overall AUROC of 0.94 (95 % CI, 0.91–0.97), studies with 3 cm an overall AUROC of 0.89 (95 % CI, 0.86–0.91) and studies with 5 cm an overall AUROC of 0.93 (95 % CI, 0.89–0.96). Further, a significant reduction of heterogeneity was found when differentiating between abstracts and full-length articles with regard to the AUROC for the diagnosis of liver cirrhosis (P = 0.04). Full-length articles showed a significantly higher diagnostic accuracy (AUROC, 0.93; 95 % CI, 0.90–0.95) than abstracts (AUROC, 0.89; 95 % CI, 0.86–0.92).

Measurement of ARFI on the right side showed a significantly higher diagnostic accuracy for the diagnosis of significant fibrosis, with an overall AUROC of 0.86 (95 % CI, 0.83–0.89), than measurement on both sides, with an overall AUROC of 0.82 (95 % CI, 0.79–0.84) (P = 0.038).

For the diagnosis of significant fibrosis the DOR could be estimated in 19 studies. The mean DOR for the diagnosis of significant fibrosis was 11.54 (95 % CI, 9.46–14.09). For the diagnosis of severe fibrosis the DOR of only 19 studies was available and the mean DOR was 33.54 (95 % CI, 22.61–49.76). The mean DOR for the diagnosis of liver cirrhosis was 45.35 (95 % CI, 30.16–68.18) by inclusion of 20 studies with available DORs. Forest plots of the DORs for all fibrosis stages are given in Fig. S1.

A subgroup analysis identified significantly lower DORs for abstracts than for full-length articles for the diagnosis of significant and severe fibrosis (F ≥ 2, P = 0.013; F ≥ 3, P = 0.013; F = 4, P = 0.2). Further, studies with more than nine QUADAS items rated as “yes” had significantly higher diagnostic odds ratios than those with nine or fewer items rated as “yes” for the diagnosis of significant and severe fibrosis (F ≥ 2, P = 0.035; F ≥ 3, P = 0.017; F = 4, P = 0.16).

The funnel plot analysis was performed for the diagnosis of the different fibrosis stages, and some deviations from symmetry could be seen among the single studies for all fibrosis stages. Kendall’s rank correlation test applied to the different fibrosis stages indicated significant asymmetry only for F ≥ 2 (F ≥ 2, P = 0.0057; F ≥ 3, P = 0.1955; F = 4, P = 0.95). Further analyses with the trim and fill method yielded a recalculated DOR of 10.8 (95 % CI, 8.33–14.01) for the diagnosis of significant fibrosis with six added studies. For the diagnosis of severe fibrosis the recalculated DOR was 24.25 (95 % CI, 16.34–35.98) with eight added studies and for the diagnosis of liver cirrhosis the recalculated DOR was 34.32 (95 % CI, 22.5–52.34) with six added studies. A new funnel plot analysis for the diagnosis of the different fibrosis stages with the studies added by the trim and fill method is shown in Fig. S2.

Performing the funnel plot analysis with only the original articles included in the meta-analysis seems to reduce the asymmetry in the funnel plot for F ≥ 2 and F ≥ 3 (Kendall’s rank correlation test: F ≥ 2, P = 0.13; F ≥ 3, P = 0.4). With the trim and fill method no further studies have to be added for F ≥ 2 and two studies have to be added for F ≥ 3, with a recalculated DOR of 42.23 (95 % CI, 27.06–65.89) in comparison to a mean DOR of 43.34 (95 % CI, 27.23–68.99) before.

In addition, a funnel plot analysis without the studies of lower quality (nine or fewer QUADAS items rated as “yes”) was performed. For the diagnosis of severe fibrosis, deviations from symmetry could no longer be seen in the funnel plot (Kendall’s rank correlation test, P = 0.352) and no further studies had to be added by the trim and fill method.

SROC curves

The SROC curves for the different fibrosis stages are shown in Fig. 3. The areas under the SROC curves were 0.86 for the diagnosis of significant fibrosis, 0.89 for the diagnosis of severe fibrosis and 0.93 for the diagnosis of liver cirrhosis.

Fig. 3 Summary ROC curves. SROC curves for F ≥ 2 (a), F ≥ 3 (b) and F = 4 (c). The sizes of the symbols for 1 – specificity and sensitivity of the single studies in the ROC space are derived from the respective sample sizes

Cut-off levels with the corresponding sensitivities and specificities were available in 20 studies for F ≥ 2, in 20 studies for F ≥ 3 and in 23 studies for F = 4. The dependence of sensitivity and specificity on the respective cut-off levels was evaluated, but heterogeneity between the studies did not allow decisive conclusions on cut-off levels of the SROC curve (Fig. 4). The suggested cut-off values are 1.35 m/s for the diagnosis of significant fibrosis, 1.61 m/s for the diagnosis of severe fibrosis and 1.87 m/s for the diagnosis of liver cirrhosis.

Fig. 4 Sensitivities and specificities of respective cut-off values of the single studies. They are shown for F ≥ 2 (a, b) and F = 4 (c, d). The sizes of the circles represent the sample size of the single studies. A weighted linear regression line and the optimal sensitivity and specificity from the SROC are shown. The crossing of the two lines suggests an optimal cut-off value (F ≥ 2, 1.35 m/s; F = 4, 1.87 m/s)

Quality assessment using QUADAS

Detailed information on the rating of items is given in Table S1. The proportions of studies rated as yes, no or unclear for each of the QUADAS items are shown in Fig. S3. Heterogeneity of the AUROC was found among the categories of selection criteria, reference execution details, test review bias and diagnostic review bias.

Discussion

The systematic literature search revealed 36 studies evaluating the diagnostic performance of ARFI for the staging of liver fibrosis that fulfilled the inclusion criteria. The aim of the systematic literature search was to perform a meta-analysis of the AUROC with the publications and abstracts included.

ARFI performed best for the diagnosis of liver cirrhosis, with an overall AUROC of 0.91. For the diagnosis of severe fibrosis the AUROC was 0.89 and for the diagnosis of significant fibrosis 0.84. A diagnostic tool is defined as perfect if the AUROC is 1, excellent if the AUROC is greater than 0.9 and good if the AUROC is greater than 0.8 [59, 60]. According to these results, ARFI imaging is a good diagnostic tool for the diagnosis of significant and severe fibrosis and an excellent tool for the diagnosis of liver cirrhosis. The results are in accordance with those of a previous pooled meta-analysis of ARFI including eight studies with individual patient data (IPD) from 518 patients, which reported diagnostic accuracies of 0.93 for the diagnosis of liver cirrhosis, 0.91 for the diagnosis of severe fibrosis and 0.87 for the diagnosis of significant fibrosis [16]. Thus, both meta-analyses show similar results, although the results of the present meta-analysis are not based upon IPD, which allow a more extensive analysis of the data. However, the present meta-analysis includes more than four times as many studies (36 versus 8) and almost eight times as many patients (3,951 versus 518) as the previous meta-analysis [16]. Hence, the similar results in the two meta-analyses suggest that there is no strong bias due to study selection in the previous meta-analysis, which can be the main deficiency of IPD meta-analysis [61]. Thus the two meta-analyses, one IPD-based and the other based on summary data from published studies, complement each other and confirm the results.

A comparable meta-analysis on TE (FibroScan) revealed overall diagnostic accuracies of 0.94 for the diagnosis of liver cirrhosis, 0.89 for the diagnosis of severe fibrosis and 0.84 for the diagnosis of significant fibrosis, respectively [6]. The advantage of ARFI compared with TE is that it is integrated into a conventional ultrasound system, enabling simultaneous screening for focal liver lesions and elastography with the same machine and ultrasound probe. In addition, the measurement site of ARFI can be visualised with B-mode ultrasound, which allows more exact measurement of liver tissue elasticity by excluding small non-parenchymatous areas within the measurement site. ARFI can be performed at different measurement sites in the left and right liver lobes, which might even enable a better overall estimation of liver fibrosis by capturing its distribution across the liver. However, this assumption needs further investigation; optimally, multiple biopsies would be needed from different locations to make a direct comparison with ARFI imaging performed at different locations within the liver. Another advantage over TE is that ARFI can be performed in obese patients, as well as in patients with ascites, and therefore more patients can be evaluated with this method. In patients with chronic liver disease regular ultrasound examinations are indicated once or twice per year, depending on the presence of severe fibrosis or cirrhosis.

An advantage of TE is the larger measurement area of 4 cm in length compared with only 1 cm for ARFI; however, the possibility of measuring in different areas of the liver might overcome this shortcoming of ARFI.

Optimal cut-off values for the different stages of liver fibrosis were estimated from the SROC analysis. For the diagnosis of significant fibrosis a cut-off value of 1.35 m/s is suggested, for the diagnosis of severe fibrosis a cut-off value of 1.61 m/s and for the diagnosis of liver cirrhosis a cut-off value of 1.87 m/s. These cut-offs are very close to the optimal cut-offs calculated in the previously published pooled meta-analysis of individual patient data with 1.34 m/s, 1.55 m/s and 1.80 m/s, respectively [16]. These cut-offs are, however, only suggestions since they vary with aetiology, necroinflammation and depth of measurement. Therefore, in clinical practice disease-specific cut-offs should be preferred.

In addition, the SROC analysis shows similar results for the overall AUROCs with 0.86 for F ≥ 2, 0.89 for F ≥ 3 and 0.93 for F = 4.

Regarding the diagnostic accuracy, the results of the AUROC are in accordance with the estimations of the overall DOR. The overall DOR was highest for the diagnosis of liver cirrhosis, at 45.35. The overall DORs for the diagnosis of significant and severe fibrosis (F ≥ 2, 11.54; F ≥ 3, 33.54) also indicated good diagnostic accuracy.

The slight asymmetry in the funnel plot provides evidence of publication bias for the diagnosis of the different fibrosis stages. The trim and fill method also indicates missing studies, and the recalculated DORs are smaller than the mean DORs without these refilled studies. However, publication bias does not have to be the main source of asymmetry in the funnel plot; the heterogeneity of studies could also cause this asymmetry.

Abstracts included in the meta-analysis showed significantly lower diagnostic odds ratios than full-length articles for the diagnosis of significant and severe fibrosis, and a significantly lower AUROC for the diagnosis of liver cirrhosis. However, the exclusion of grey literature, e.g. unpublished studies or conference abstracts, could lead to an overestimation of the effect size, as mentioned in McAuley et al. [62]. Thus, the authors suggest the inclusion of grey literature, here the inclusion of conference abstracts that satisfy the inclusion criteria in a meta-analysis. Another cause of asymmetry could be the quality of the studies for the diagnosis of severe fibrosis. When the funnel plot analysis is restricted to studies with more than nine QUADAS items rated as “yes”, there is no longer any evidence of publication bias.

Significant heterogeneity was found among the individual studies for all fibrosis stages with regard to the AUROC. Sources of heterogeneity are the different distributions of liver diseases between the studies for the diagnosis of severe fibrosis and liver cirrhosis. A reason for this heterogeneity might be the slight trend towards higher diagnostic accuracy in studies including only HCV-infected patients than in studies including patients with different liver diseases. Another cause of the difference might be the more homogeneous distribution of liver fibrosis within the liver in patients with chronic hepatitis C compared with patients without HCV. Further subgroup analyses showed a significantly better diagnostic accuracy for the diagnosis of severe fibrosis for the studies without HBV-infected patients than for those including HBV-infected patients. A reason for this difference might be that liver fibrosis is distributed more heterogeneously in patients with HBV, in whom liver cirrhosis is predominantly macronodular rather than micronodular as in HCV.

The only quantitative factor that seemed to have a significant influence on the AUROC was the BMI for the diagnosis of significant fibrosis. Studies with a higher mean BMI reported a significantly lower overall AUROC. In patients with a higher BMI, adipose non-hepatic tissue could lead to an overestimation of liver stiffness [63]. This might explain why, in particular, significant fibrosis is more difficult to separate from non-significant fibrosis. A study evaluating TE in 286 patients with chronic liver diseases revealed increasing liver stiffness values with increasing BMI [64]. Similar results were reported for TE by Petta et al. [65] and Gaia et al. [66], who showed that a high BMI interferes with precise measurement of the liver tissue. Furthermore, Jaffer et al. [67] show that the deeper the measurement within the liver the better the AUROC with ARFI imaging. A high BMI implies a thick layer of fat, but the ARFI measurement is limited in depth. This could explain the lower AUROC for studies with a higher mean BMI. This association needs to be evaluated for ARFI imaging in future studies.

Sporea et al. [40] discovered that ARFI measurements at 1–2 cm and at 2–3 cm below the capsule seem to be the best depths to determine the stage of fibrosis. In the present meta-analysis, the performance of ARFI seems to be better at a depth of 2 cm than at a depth of 3 cm. But for the studies with the deepest measurements of 5 cm, the diagnostic accuracy was again higher in the present meta-analysis. Because these deep measurements were only observed in two studies, the depth of ARFI measurements and other reasons for ARFI measurement failure need to be evaluated in future studies.

Measurement failure with ARFI was reported in 12 of the 36 studies, affecting 3 % of patients overall, but no significant influence on the AUROC was observed. The same applied in the previous meta-analysis. The main reason for failure in the present meta-analysis was a high BMI [25, 26, 29, 35, 47, 53]. Other reasons are probably age and gender [25], a narrow intercostal space [26], ascites (one case in [29]) and deep measurements in the liver capsule [39].

Another reason for heterogeneity regarding the AUROC could be the site of ARFI measurement for the diagnosis of significant fibrosis. ARFI measurement of the right liver lobe seems to perform better than ARFI measurement of the left liver lobe (lower AUROC in studies performing ARFI in both liver lobes). These results are in accordance with the study of Toshima et al. [41]. One explanation could be that ARFI measurement in the left liver lobe may lead to increased liver stiffness and therefore increased shear-wave velocity induced by manual compression during inspiration. Higher shear-wave velocities in the left liver lobe compared with the right liver lobe may support this assumption.

Furthermore, the assessment of quality by the QUADAS criteria could not explain the heterogeneity between the studies sufficiently.

In conclusion, the present meta-analysis including 36 studies with 3,951 patients revealed good diagnostic accuracy of ARFI imaging for the staging of significant and severe fibrosis, and excellent diagnostic accuracy for the diagnosis of liver cirrhosis.