Introduction

Accurate diagnosis of focal liver lesions (FLLs) remains challenging [1, 2], yet it is essential for planning intervention and estimating prognosis [3, 4]. The introduction of contrast-enhanced ultrasound (CEUS), which uses gas-filled microbubbles as contrast agents (CAs), has dramatically improved the characterization of FLLs compared with conventional US (either B-mode or Doppler ultrasound) [5, 6].

All commercially available ultrasound CAs consist of an inert gas encapsulated by a shell molecule. The low-solubility gas determines the major acoustic properties, while the shell mainly affects stability and durability in blood [7, 8]. With an intravenous ultrasound contrast agent, CEUS makes it possible to observe hemodynamics in real time. Advanced low mechanical index technologies, along with sophisticated software, provide high-resolution, real-time, contrast-specific imaging for detecting macro- and micro-vascularization in lesions [9]. Almost all malignancies show contrast wash-out in the delayed phase compared with normal liver tissue; conversely, benign lesions are typically iso- or hyper-enhancing. Consequently, many clinical studies have shown that CEUS is useful for characterizing FLLs on the basis of these features [1, 10].

In 2016, after years of off-label use, the US Food and Drug Administration approved SonoVue (marketed as Lumason) for CEUS examination of the liver [11]. This approval may mark a breakthrough for CEUS research. We therefore carried out a meta-analysis to summarize the studies published so far and to present the diagnostic value of CEUS in the work-up of FLLs as a reference for researchers in the field. In addition, a wide variety of contrast agents is on the market, and sonographers face numerous choices. Because no comparative studies among different CAs have been published to date, CA selection is often made without evidence-based guidance. The second aim of our study was therefore to explore the diagnostic performance of different CAs and to offer a theoretical foundation for clinical practice.

Materials and methods

The systematic review was conducted according to the recommendations of the PRISMA guidelines.

Literature search

A comprehensive search was performed to identify suitable diagnostic studies from electronic databases (the Cochrane Library, PubMed and Web of Science) up to February 10th, 2017. The search terms used in this meta-analysis were as follows: (focal liver lesions OR FLL OR hepatocellular carcinoma OR cholangiocarcinoma OR metastatic hepatic carcinoma OR liver metastases OR liver tumor OR hepatic haemangioma OR focal nodular hyperplasia OR liver adenoma OR liver abscess OR liver neoplasms [Mesh]) AND (contrast-enhanced ultrasound OR contrast-enhanced US OR CEUS). The search had no language restriction, but only full articles written in English were further evaluated. The references of relevant reviews were also manually searched and screened to identify eligible studies.

Two reviewers independently selected eligible studies, with disagreements resolved by consensus. The following inclusion criteria were applied: (1) human patients with suspected FLLs; (2) CEUS used in the differential diagnosis of FLLs; (3) per-lesion or per-patient statistics sufficient to construct a 2×2 diagnostic table; (4) at least 20 samples per study; (5) final diagnosis confirmed by histology or by clinical diagnosis with imaging follow-up for at least 6 months; (6) full article available and written in English.

Studies were excluded if they: (1) were reviews, letters, meta-analyses, case reports or editorial articles; (2) included fewer than 20 patients; (3) did not provide sufficient data for diagnostic meta-analysis; or (4) evaluated FLLs after treatment. When the same authors presented data in more than one study, either the most recently published study or the study with the largest sample size was included.

Data extraction

All selected studies were screened by two reviewers to retrieve the following data: first author’s name, publication year, country of origin, the number of patients, the number of lesions, average age, gender ratio, final diagnosis standard, final diagnosis (the specific disease types and quantities), the number of benign and malignant lesions, average lesion size, CA, true positive (TP), true negative (TN), false positive (FP) and false negative (FN).
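The four cell counts extracted from each study (TP, FP, FN and TN) are enough to derive every accuracy measure that is pooled later. The sketch below is a minimal illustration in Python, not the software used in this study. The class and function names, the choice of malignancy as the positive class, the 0.5 continuity correction applied when a cell is zero, and the example counts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cell counts of a diagnostic 2x2 table (hypothetical container).

    Assumption: "positive" means the lesion was called malignant by CEUS.
    """
    tp: int  # malignant lesions called malignant
    fp: int  # benign lesions called malignant
    fn: int  # malignant lesions called benign
    tn: int  # benign lesions called benign


def diagnostic_metrics(t: TwoByTwo, correction: float = 0.5) -> dict:
    """Return sensitivity, specificity, PLR, NLR and DOR for one study.

    Assumption: `correction` is added to every cell only when at least
    one cell is zero, so that the ratios stay finite.
    """
    cells = (t.tp, t.fp, t.fn, t.tn)
    add = correction if 0 in cells else 0.0
    tp, fp, fn, tn = (c + add for c in cells)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": sensitivity / (1 - specificity),
        "nlr": (1 - sensitivity) / specificity,
        "dor": (tp * tn) / (fp * fn),
    }


# Made-up counts, not taken from any included study.
example = diagnostic_metrics(TwoByTwo(tp=90, fp=10, fn=10, tn=90))
print(example)
```

Note that the DOR equals PLR divided by NLR within a single table, which is why it can serve as a single summary of sensitivity and specificity.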

Methodology quality assessment

The quality of eligible studies was evaluated by the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool by the same reviewers who performed data extraction. Fourteen items (maximum score 14) were included to assess the overall quality of each study.

Statistical analysis

The pooled estimates of sensitivity, specificity, diagnostic odds ratio (DOR), positive likelihood ratio (PLR) and negative likelihood ratio (NLR), with corresponding 95% confidence intervals (CIs), were summarized and presented graphically to represent the diagnostic value of CEUS in differentiating malignant from benign FLLs. The hierarchical summary receiver operating characteristic (HSROC) curve and the area under the curve (AUC) were then calculated. Heterogeneity across studies was assessed with the chi-square-based Q statistic and the I² statistic. The random effects model (the DerSimonian–Laird method) was used if heterogeneity was significant (P for heterogeneity < 0.05 or I² ≥ 50%); otherwise, the fixed effects model (the Mantel–Haenszel method) was used. The Spearman correlation coefficient was used to investigate the threshold effect. Subgroup analysis and meta-regression analysis were also used to explore potential sources of heterogeneity. Publication bias was assessed with funnel plots. All statistical analyses were performed with Meta-Disc (version 1.4) and STATA (version 13.1).
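To make the heterogeneity criterion and the random effects pooling concrete, the following Python sketch computes Cochran's Q, I² and a DerSimonian–Laird pooled DOR on the log scale. It is an illustration under stated assumptions, not a reproduction of the Meta-Disc or STATA routines used here: the function names are hypothetical, 0.5 is added to every cell before taking logs, Q is computed with inverse-variance weights (the Mantel–Haenszel fixed effects estimator is not shown), and the study counts are made up.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its variance for one study.

    Assumption: 0.5 is added to every cell so the estimate stays finite.
    """
    a, b, c, d = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(effects, variances):
    """Pool study effects with the DerSimonian-Laird random effects model."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variation attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {"pooled": pooled, "se": se, "q": q, "tau2": tau2, "i2": i2}


# Hypothetical (TP, FP, FN, TN) counts for three studies.
studies = [(45, 5, 4, 46), (80, 20, 10, 90), (30, 2, 6, 40)]
effects, variances = zip(*(log_dor(*s) for s in studies))
result = dersimonian_laird(effects, variances)
low = math.exp(result["pooled"] - 1.96 * result["se"])
high = math.exp(result["pooled"] + 1.96 * result["se"])
print(f"pooled DOR = {math.exp(result['pooled']):.2f} "
      f"(95% CI {low:.2f} to {high:.2f}), I2 = {result['i2']:.1f}%")
```

When tau² is estimated as zero, the random effects weights reduce to the inverse-variance weights, so the two models give the same pooled value.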

Results

Study identification and selection

The initial database search with the above strategy yielded a total of 4579 potentially relevant studies (29 from the Cochrane Library, 2642 from PubMed and 1908 from Web of Science). After 311 duplicates were removed, 4268 studies remained. Screening of titles and abstracts against the inclusion criteria excluded a further 4025 studies, leaving 243 for full-text review. Of these, 186 were excluded for various reasons, leaving 57 eligible studies [1, 10, 12–66] for this meta-analysis. The detailed flow chart is shown in Fig. 1.

Fig. 1 Flow chart of the study selection process

Characteristics of eligible studies

Basic characteristics of the eligible studies, which were published from 2001 to 2017 (Fig. 2A), are presented in Table 1. Thirty-five studies were conducted in Western countries (10 in Germany, 9 in Italy and 16 in other Western countries), and the remaining 22 in Asian countries (10 in China, 8 in Japan and 4 in other Asian countries; Fig. 2B). The numbers of both patients and lesions varied from 30 to 1328. The average age of the included patients ranged from 13 (in one study of paediatric patients) to 70 years. Most of the malignant lesions were hepatocellular carcinomas (HCCs) and liver metastases, and most of the benign lesions were haemangiomas and regenerative or dysplastic nodules. The first-generation contrast agent (Levovist) was used in 12 studies, and second-generation contrast agents in the other 45 studies [39 used SonoVue, 4 used Sonazoid (a US contrast agent with a late liver-specific phase) and the remaining 2 used Definity and Optison, one study each]. QUADAS scores are also summarized in Table 1.

Table 1 The characteristics of eligible studies

Fig. 2 Distribution of studies according to publication year (A) and country (B)

Diagnostic accuracy

The pooled sensitivity and specificity of CEUS for characterization of FLLs were 0.92 (95% CI: 0.91–0.93) and 0.87 (95% CI: 0.86–0.88), respectively (Fig. 3). The pooled PLR and NLR were 7.38 (95% CI: 5.86–9.31) and 0.09 (95% CI: 0.07–0.11), respectively (Fig. S1), and the pooled DOR was 104.20 (95% CI: 70.42–154.16; Fig. S2). Figure 4 shows the SROC curve, with an AUC of 0.9665. The Spearman correlation coefficient showed no significant correlation between sensitivity and specificity (r = −0.158, P = 0.242), indicating no threshold effect.
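The threshold-effect check can be sketched as a rank correlation between per-study sensitivity and specificity: if studies used different positivity thresholds, higher sensitivity would tend to come with lower specificity, giving a strongly negative coefficient. The Python code below is a minimal illustration with hypothetical function names and made-up values; it does not compute a P value, and dedicated software may apply the test to transformed values (for example logits) rather than the raw proportions used here.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up per-study values, not taken from the included studies.
sens = [0.95, 0.88, 0.91, 0.97, 0.85, 0.93]
spec = [0.80, 0.92, 0.85, 0.90, 0.86, 0.83]
print(f"Spearman r = {spearman(sens, spec):.3f}")
```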

Fig. 3 Sensitivity (A) and specificity (B) for characterization of FLLs with CEUS

Fig. 4 Summary receiver operating characteristic curves

Subgroup and meta-regression analysis

Several potential factors were explored for their effect on diagnostic accuracy (Table 2, Fig. 5). Because the DOR is a single indicator that combines sensitivity and specificity, we used the pooled DOR to represent diagnostic accuracy. As shown in Table 2, both the number of lesions and the CA type used in CEUS greatly influenced diagnostic accuracy. The DOR of the big-sample-size group (number of lesions ≥ 100) was higher than that of the small-sample-size group (number of lesions < 100; 135.86 vs. 58.19). Heterogeneity was still observed in the big-sample-size group (I² = 84.3%), whereas it was greatly reduced in the small-sample-size group (I² = 39.1%). CA type also affected diagnostic accuracy. Because Definity and Optison were each used in only one study, their pooled DORs could not be calculated in subgroup analysis. After these two CAs were excluded, Sonazoid had the highest DOR (227.39) and Levovist the lowest (62.78). Notably, heterogeneity was almost eliminated in the Sonazoid group (I² = 15.5%), whereas it persisted in the SonoVue and Levovist groups. When the included studies were divided by CA generation, the second-generation CAs showed higher diagnostic accuracy than the first-generation CA (Levovist; DOR: 118.27 vs. 62.78). However, heterogeneity persisted in both groups.

Table 2 Subgroup analysis of DOR of CEUS for the diagnostic performance of FLLs

Fig. 5 Subgroup analysis of DOR for SonoVue (A), Sonazoid (B) and Levovist (C) in characterization of FLLs

Meta-regression analysis was performed to take all the above factors into account. As shown in Table 3, none of the factors (including region, number of lesions and CA type) was the major source of heterogeneity.

Table 3 Meta-regression analysis of potential sources of heterogeneity

Publication bias

Funnel plots were created to assess publication bias among the eligible studies. As shown in Fig. 6, the plot was symmetric, indicating no evidence of publication bias among the included studies (P = 0.630).
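Funnel plot symmetry is usually judged with a regression test in addition to visual inspection. The text does not state which test produced the P value above, so the Python sketch below should be read only as one common option for diagnostic accuracy studies: a Deeks-style regression of the log DOR on the inverse square root of the effective sample size, weighted by the effective sample size. The function names, the 0.5 cell correction and the counts are assumptions for illustration, and the sketch returns the fitted slope and intercept without a significance test.

```python
import math


def effective_sample_size(n_diseased, n_non_diseased):
    """Effective sample size: 4 * n1 * n2 / (n1 + n2)."""
    return 4.0 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def deeks_regression(studies):
    """Return (slope, intercept) of ln(DOR) against 1/sqrt(ESS).

    `studies` is a list of (TP, FP, FN, TN) tuples. A slope far from zero
    suggests that small studies report systematically different DORs.
    """
    x, y, w = [], [], []
    for tp, fp, fn, tn in studies:
        a, b, c, d = (v + 0.5 for v in (tp, fp, fn, tn))  # assumed correction
        ess = effective_sample_size(tp + fn, fp + tn)
        x.append(1.0 / math.sqrt(ess))
        y.append(math.log((a * d) / (b * c)))
        w.append(ess)
    return weighted_line(x, y, w)


# Hypothetical counts for four studies of different sizes.
studies = [(45, 5, 4, 46), (80, 20, 10, 90), (30, 2, 6, 40), (200, 30, 25, 250)]
slope, intercept = deeks_regression(studies)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```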

Fig. 6 Funnel plot for the evaluation of potential publication bias of included studies

Discussion

The results of this meta-analysis showed that CEUS had excellent diagnostic capability in differentiating malignant from benign FLLs. The pooled sensitivity, specificity, DOR, PLR, NLR and AUC for CEUS in characterization of FLLs were 92%, 87%, 104.20, 7.38, 0.09 and 0.9665, respectively. Subgroup analyses suggested that several factors, such as the number of lesions, CA generation and CA type, might affect diagnostic performance.

Diagnostic performance in the big-sample-size group was markedly better than in the small-sample-size group (DOR: 135.86 vs. 58.19). The performance of CEUS is more strongly influenced by the experience of the sonographer than that of CT or MRI. Sonographers in large medical centers with high patient volumes tend to have more experience in distinguishing FLLs during on-site reading in clinical practice [12].

Another major factor influencing the diagnostic accuracy of CEUS was the type of CA used. Ultrasonic CAs have a unique structure, consisting of an inert gas and a shell molecule. Since the lifetime of air bubbles is very short, soft-shell materials are used to stabilize the CA and to improve nonlinear oscillation. The terms “first- and second-generation ultrasound CA” are usually used to differentiate CAs according to the kind of gas they contain [7]. Although this is a simplification, the development of second-generation ultrasound CAs has led to the near-complete disappearance of first-generation CAs, owing to greatly improved image quality and effectiveness [7, 67]. In our study, the first-generation CA (Levovist) was used in 12 studies, and second-generation CAs in the remaining 45 studies. Levovist was used between 2001 and 2009; thereafter, second-generation CAs replaced it entirely. Subgroup analysis indicated higher diagnostic accuracy for the second-generation CAs than for the first-generation CA, Levovist (DOR: 118.27 vs. 62.78). The higher DOR of second-generation CAs suggests that the CA upgrade improved diagnostic efficacy for FLLs.

Among the second-generation ultrasound CAs, Sonazoid is distinctive. Its unique feature is accumulation in the reticuloendothelial system (RES), such as the liver and spleen [68]. This phenomenon might involve the Kupffer cells, which are present in the hepatic parenchyma. Because Kupffer cells are absent from malignant lesions, contrast-enhanced images can readily show the difference in contrast effect between malignant lesions and normal parenchyma or benign lesions in the post-vascular phase (also known as the Kupffer phase) [9, 69]. This late liver-specific phase starts at around 6 to 10 min and continues to over 60 min. The advent of Sonazoid has been a major breakthrough in the CEUS characterization of FLLs. However, it is so far available only in Japan, South Korea and Norway [9]. SonoVue, another second-generation CA, is widely used in most countries and regions [2]. In our meta-analysis, we evaluated the diagnostic value of CEUS in differentiating malignant from benign FLLs and also explored the diagnostic capabilities of different CAs. Since no comparative studies between Sonazoid and SonoVue are available at present, our study may offer an evidence base for clinical practice. The liver-specific contrast agent (Sonazoid) was used in only 4 studies, far fewer than SonoVue (39 studies). Nevertheless, Sonazoid demonstrated the highest diagnostic accuracy among the three major CAs used in CEUS practice (SonoVue, Levovist and Sonazoid; DOR: 118.82 vs. 62.78 vs. 227.39). These results suggest that Sonazoid is an outstanding CA; however, global research is still needed to verify its diagnostic ability.

Marked heterogeneity was found among the included studies. To address this, the Spearman correlation coefficient, subgroup analyses and meta-regression were combined to detect its sources. According to the subgroup analyses, the number of lesions and CA type might contribute to heterogeneity, which was mainly observed in the big-sample-size group and the non-Sonazoid groups. However, meta-regression analysis did not provide evidence supporting these findings. This might be because multiple factors involved in this clinical diagnostic procedure could not be analysed statistically. For example, the study by Fracanzani [66] indicated that vascularity in a small nodule could not be easily assessed by CEUS. Because data on small nodules could not be obtained from most of the eligible studies, the diagnostic value of CEUS for small FLLs could not be estimated. The heterogeneity within our study may therefore have influenced the reliability of our results.

There are some limitations in this meta-analysis. Firstly, the performance of CEUS is strongly influenced by the experience of the sonographer, so heterogeneity among studies might not be fully eliminated. Secondly, in obese patients, or when the lesion is very deep, the lesion might be difficult to assess. This intrinsic limitation of CEUS might decrease diagnostic performance to some extent. Thirdly, US techniques have evolved over the last decade; low mechanical index imaging along with phase inversion mode has greatly improved spatial resolution [70]. Recent studies may therefore show markedly better diagnostic capacity than studies performed without these techniques. Lastly, meta-regression failed to reveal the source of the heterogeneity in this meta-analysis. This might affect the credibility of our findings, and further research is needed.

Taken together, our meta-analysis indicates that CEUS performs very well in differentiating malignant from benign FLLs, with both high sensitivity and high specificity. The use of second-generation CAs, especially Sonazoid, greatly improved the diagnostic accuracy of CEUS. As CEUS becomes more widely available, its role in managing patients with FLLs will increase.