Introduction

Hepatocellular carcinoma (HCC) is the second most common cause of cancer-related deaths worldwide and the most common primary hepatic malignancy [1]. Unlike most other cancers, a noninvasive diagnosis based on imaging characteristics without mandatory pathologic confirmation is acceptable for HCC. Therefore, it is important to establish well-defined imaging diagnoses for HCC that provide high sensitivity while maintaining high specificity [2].

Many advances have been made in contrast-enhanced magnetic resonance imaging (MRI), including the use of hepatobiliary contrast agent (HBA). In contrast to extracellular contrast agent (ECA), HBA enables the delineation of focal hepatic lesions as hypointense defects with high lesion-to-liver contrast [3], leading to improved sensitivity for HCC diagnosis [4]. Given this advantage, HBA has been actively incorporated into the major clinical guidelines [1, 2, 5]. However, as HBA begins to be taken up by hepatocytes about 60–90 s after contrast injection, it is unclear whether hypointensity on the transitional phase (TP) or hepatobiliary phase (HBP) represents true washout or a lack of hepatocyte uptake [6]. Thus, in the European Association for the Study of the Liver (EASL) 2018 and American Association for the Study of Liver Diseases (AASLD) 2018 guidelines, only hypointensity on the portal venous phase (PVP) is considered washout when using HBA [1, 2].

In two previous meta-analyses, MRI using HBA (HBA-MRI) had higher sensitivity (87% vs. 74–75%) and specificity (94% vs. 86%) than MRI using ECA (ECA-MRI) [4, 7]. However, two recent prospective studies compared diagnostic performance between ECA-MRI and HBA-MRI, and both studies reported ECA-MRI to have a higher sensitivity than HBA-MRI (71.2–77.9% vs. 46.4–66.3%) [8, 9], findings that conflict with the previous meta-analyses. Although these two prospective studies conducted head-to-head comparative analyses, their results might be limited by a small number of subjects, specific underlying liver disease, and different imaging criteria for diagnosing HCC. Considering the paucity of studies comparing ECA and HBA, we considered it timely and important to clearly determine and compare the diagnostic performance of ECA-MRI and HBA-MRI.

Therefore, we aimed to evaluate and compare the diagnostic performance of ECA-MRI and HBA-MRI using detailed comparison criteria.

Materials and methods

This systematic review and meta-analysis was performed in compliance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [10].

Literature search strategy

A literature search of the PubMed, EMBASE, and Cochrane Library databases was conducted to find original studies that investigated the diagnostic performance of contrast-enhanced MRI for diagnosing HCC. The search was limited to English-language studies on human subjects. The search period was limited to January 1, 2010 through February 12, 2020, considering the time point of the commercial release of HBA. The detailed search strategy is described in Supplementary Table 1.

Inclusion and exclusion criteria

Studies meeting the following criteria were included: (a) population: patients at high risk for HCC [1, 2, 5]; (b) index test: liver MRI with a full protocol using gadolinium-based contrast agent (HBA or ECA); (c) reference standard: histopathology and clinical diagnosis such as imaging follow-up or laboratory markers; (d) outcomes: sufficient details to be able to obtain the number of true positives, false positives, false negatives, and true negatives to allow calculation of the sensitivity and specificity for the diagnosis of HCC on a per-lesion basis.
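As an illustration of criterion (d), the following minimal Python sketch shows how per-lesion sensitivity and specificity follow from the four cell counts of a 2 × 2 table. The counts are hypothetical, and the Wilson score interval is an assumption made for illustration; the text does not state which confidence interval method was used for individual studies.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, not stated in the paper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def per_lesion_accuracy(tp, fp, fn, tn):
    """Sensitivity and specificity, each with an approximate 95% CI."""
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for a single study (not taken from any included study).
result = per_lesion_accuracy(tp=72, fp=8, fn=28, tn=92)
print(result)
```

A study that does not report, or allow reconstruction of, all four counts cannot be passed through this calculation, which is why such studies were excluded.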

Studies meeting any of the following criteria were excluded: (a) studies not reporting sufficient data to clearly establish outcomes; (b) studies for which it was not possible to obtain separate outcomes using HBA and ECA; (c) studies with hepatic lesions previously treated with systemic therapy; (d) studies with case–control designs; (e) studies with partially overlapping cohorts; (f) case reports or series including fewer than ten patients; and (g) protocols, conference abstracts, reviews, guidelines, books, letters, editorials, and errata.

Two reviewers (≥ 5 years of experience in abdominal imaging) independently screened the titles and abstracts for potential eligibility, and full-text reviews of potentially relevant articles were conducted to determine their eligibility for the analysis. Disagreements were resolved by consensus, with arbitration by a third reviewer (14 years of experience in abdominal imaging).

Data extraction and quality assessment

The following data were extracted from each eligible study: (a) study characteristics: authors, year of publication, institution, country, duration, and study design (prospective vs. retrospective); (b) patient characteristics: number of patients, sex, age, underlying liver disease, and Child–Pugh score; (c) lesion characteristics: lesion number, lesion size, and final diagnosis; (d) MRI techniques: magnetic field, MRI protocol, and type of contrast agent; (e) reference standard; and (f) study outcome: true-positive, false-positive, false-negative, and true-negative values of the MRI.

The methodological quality of the selected studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool [11]. The risk of bias and applicability of each eligible study were assessed according to the four different domains of patient selection, index test, reference standard, and flow and timing. Studies without a high risk of bias in any domain were considered to have a low-to-moderate overall risk of bias. Likewise, studies without a high concern for applicability in any domain were considered to have a low-to-moderate overall concern for applicability.
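The overall rating rule described above can be written as a short Python sketch. The domain keys and the rating labels ("low", "unclear", "high") are hypothetical names chosen for illustration, not identifiers from the QUADAS-2 tool itself.

```python
# Hypothetical domain keys for the four domains named in the text.
DOMAINS = ("patient_selection", "index_test", "reference_standard", "flow_and_timing")


def overall_rating(domain_ratings, domains=DOMAINS):
    """Return 'high' if any domain is rated 'high', otherwise 'low-to-moderate'."""
    missing = [d for d in domains if d not in domain_ratings]
    if missing:
        raise ValueError(f"missing domain ratings: {missing}")
    if any(domain_ratings[d] == "high" for d in domains):
        return "high"
    return "low-to-moderate"


# Hypothetical study: one unclear domain, no high-risk domain.
example = {
    "patient_selection": "low",
    "index_test": "low",
    "reference_standard": "unclear",
    "flow_and_timing": "low",
}
print(overall_rating(example))
```

Under this rule an "unclear" rating does not by itself move a study into the high-risk group; only a "high" rating in at least one domain does.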

The data extraction and quality assessment were independently conducted by the two reviewers, with any disagreements being resolved by discussion with the third reviewer.

Statistical analysis

The aim of this study was to compare the performance of ECA- and HBA-MRI for the diagnosis of HCC. Therefore, the results of ECA- and HBA-MRI in all articles were segregated and analyzed as separate studies. The per-lesion sensitivity and specificity with 95% confidence intervals (CIs) were determined for each individual study. The meta-analytic pooled sensitivity and specificity of ECA- and HBA-MRI were calculated using the bivariate random-effects model [12]. The summary receiver operating characteristic curve was obtained using hierarchical summary receiver operating characteristic (HSROC) modeling. Heterogeneity was evaluated using the Higgins I² statistic (I² > 50% indicating substantial heterogeneity). The presence of a threshold effect was evaluated by visual assessment of the coupled forest plots of sensitivity and specificity and by the Spearman correlation coefficient between sensitivity and false-positive rate (> 0.6 indicating a considerable threshold effect) [13]. The pooled performances were compared between ECA- and HBA-MRI using a joint-model bivariate meta-regression [14].
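The two screening statistics named above can be sketched in plain Python. This is an illustrative sketch, not the Stata routine used in the study: computing I² from Cochran's Q on logit-transformed proportions with a 0.5 continuity correction is an assumption, and the study counts are hypothetical.

```python
from math import log, sqrt


def logit_and_variance(events, total):
    """Logit of a proportion and its approximate variance (0.5 correction assumed)."""
    a, b = events + 0.5, total - events + 0.5
    return log(a / b), 1 / a + 1 / b


def i_squared(effects, variances):
    """Higgins I² (%) from Cochran's Q with inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100 if q > 0 else 0.0


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-study counts: (TP, FP, FN, TN).
studies = [(45, 5, 15, 35), (80, 2, 40, 58), (30, 10, 5, 55), (60, 4, 20, 76)]
sens = [tp / (tp + fn) for tp, fp, fn, tn in studies]
fpr = [fp / (fp + tn) for tp, fp, fn, tn in studies]
logits = [logit_and_variance(tp, tp + fn) for tp, fp, fn, tn in studies]
i2_sens = i_squared([e for e, _ in logits], [v for _, v in logits])
rho = spearman(sens, fpr)
print(f"I2 (sensitivity) = {i2_sens:.1f}%, substantial: {i2_sens > 50}")
print(f"Spearman rho = {rho:.3f}, threshold effect: {rho > 0.6}")
```

The cutoffs of 50% and 0.6 are the ones stated in the text; everything else in the sketch is a stand-in for the corresponding step of the analysis.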

Subgroup analyses were performed for predefined subsets of studies based on (a) study design (prospective and retrospective studies), (b) underlying liver disease (hepatitis B and hepatitis C or alcoholic hepatitis), (c) lesion size (≤ 2 cm), (d) reference standard (pathology only for malignancy), and (e) imaging criteria (EASL 2018 and the Liver Imaging Reporting and Data System [LI-RADS] category 5 [i.e., the criteria adopted in the AASLD 2018 guideline]). For HBA-MRI, a subgroup analysis using modified EASL 2018 or LI-RADS criteria with extended washout (i.e., hypointensity in the portal venous phase [PVP], TP, or HBP) was additionally performed. The available subgroup data of each individual study were extracted to perform the subgroup analyses.
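The difference between conventional and extended washout on HBA-MRI can be made concrete with a small Python sketch. The phase labels follow the abbreviations used in the text; the function name and its interface are hypothetical, and the sketch covers only the washout feature, not the full EASL 2018 or LI-RADS criteria.

```python
# Phases in which hypointensity counts as washout on HBA-MRI, per the text.
CONVENTIONAL_PHASES = {"PVP"}
EXTENDED_PHASES = {"PVP", "TP", "HBP"}


def hba_washout(hypointense_phases, extended=False):
    """True if the lesion shows washout under the chosen definition."""
    allowed = EXTENDED_PHASES if extended else CONVENTIONAL_PHASES
    return bool(allowed & set(hypointense_phases))


# Hypothetical lesion that is hypointense only on the hepatobiliary phase.
lesion = {"HBP"}
print(hba_washout(lesion))                 # conventional definition
print(hba_washout(lesion, extended=True))  # extended definition
```

A lesion that is hypointense only on TP or HBP is counted as showing washout under the extended definition but not under the conventional one, which is the mechanism by which the modified criteria can change sensitivity and specificity.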

To explore the causes of study heterogeneity, meta-regression analysis was performed using the following covariates: (a) study design (prospective and retrospective studies), (b) study location (western and eastern), (c) study period (after 2015 and before 2015), (d) lesion size (all lesions ≤ 3 cm and others), (e) number of lesions (≥ 200 and < 200), (f) MRI magnet strength (3.0-T only and 1.5-T or combined use), (g) reference standard (pathology only for malignancy and pathology or clinical diagnosis [i.e., composite reference standard]), and (h) diagnostic cutoff (definite HCC and probable HCC or others).

Publication bias was evaluated by visual assessment of Deeks’ funnel plot and application of Deeks’ asymmetry test [15]. Statistical analyses were conducted using Stata version 15.1 (StataCorp LP, College Station, TX, USA).
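For readers unfamiliar with Deeks' asymmetry test, the sketch below shows its core regression in plain Python: the log diagnostic odds ratio is regressed on the inverse square root of the effective sample size, weighted by the effective sample size. It is a simplified illustration, not the Stata implementation used here; the 0.5 continuity correction and the study counts are assumptions, and only the slope and its t statistic are returned (the p value would come from a t distribution with n − 2 degrees of freedom).

```python
from math import copysign, log, sqrt


def ln_dor(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio with a continuity correction (assumed 0.5)."""
    a, b, c, d = (v + correction for v in (tp, fp, fn, tn))
    return log((a * d) / (b * c))


def effective_sample_size(n_diseased, n_nondiseased):
    """ESS = 4 * n1 * n2 / (n1 + n2)."""
    return 4 * n_diseased * n_nondiseased / (n_diseased + n_nondiseased)


def weighted_slope(x, y, w):
    """Weighted least-squares slope of y on x and its t statistic."""
    n = len(x)
    if n < 3:
        raise ValueError("at least three studies are needed")
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    se = sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else copysign(float("inf"), slope)
    return slope, t_stat


def deeks_regression(studies):
    """Slope and t statistic of ln(DOR) on 1/sqrt(ESS), weighted by ESS."""
    x, y, w = [], [], []
    for tp, fp, fn, tn in studies:
        ess = effective_sample_size(tp + fn, fp + tn)
        x.append(1 / sqrt(ess))
        y.append(ln_dor(tp, fp, fn, tn))
        w.append(ess)
    return weighted_slope(x, y, w)


# Hypothetical per-study counts: (TP, FP, FN, TN).
studies = [(45, 5, 15, 35), (80, 2, 40, 58), (30, 10, 5, 55), (60, 4, 20, 76)]
slope, t_stat = deeks_regression(studies)
print(f"slope = {slope:.3f}, t = {t_stat:.3f}")
```

A slope that differs significantly from zero suggests funnel plot asymmetry, that is, a relationship between study size and reported diagnostic odds ratio.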

Results

Study characteristics

A total of 1760 studies were identified from MEDLINE (n = 1087), EMBASE (n = 1891), and the Cochrane Library (n = 41) after removing 1259 duplicates (Fig. 1). After screening by their titles and abstracts, 1581 articles were excluded. Full-text reviews led to exclusion of a further 148 articles, including one head-to-head comparison study between ECA- and HBA-MRI [8] that had a population overlapping with another head-to-head study [16]. A total of 31 studies were subsequently analyzed, including 11 using only ECA-MRI, 16 using only HBA-MRI, and 4 using both ECA- and HBA-MRI. As one study [17] comparing ECA- and HBA-MRI had a population potentially overlapping with another eligible study using HBA-MRI, it was not included in the analysis of the diagnostic performance of HBA-MRI. Finally, 15 studies using ECA-MRI and 19 studies using HBA-MRI were included in the analyses.
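The study-selection counts in the paragraph above can be checked with simple arithmetic. The short Python sketch below uses only the numbers quoted in the text; the variable names are illustrative.

```python
# Records retrieved per database, as quoted in the text.
records = {"MEDLINE": 1087, "EMBASE": 1891, "Cochrane Library": 41}
duplicates = 1259
excluded_by_title_abstract = 1581
excluded_by_full_text = 148

after_deduplication = sum(records.values()) - duplicates
after_screening = after_deduplication - excluded_by_title_abstract
included = after_screening - excluded_by_full_text

# Breakdown of the included studies by contrast agent.
eca_only, hba_only, both = 11, 16, 4
eca_analyses = eca_only + both
# One comparative study [17] was left out of the HBA-MRI analysis.
hba_analyses = hba_only + both - 1

print(after_deduplication, after_screening, included)
print(eca_analyses, hba_analyses)
```

The totals reproduce the 1760 records after deduplication, the 31 included studies, and the 15 ECA-MRI and 19 HBA-MRI analyses reported in the text.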

Fig. 1 Flow diagram of study selection

The characteristics of the 31 eligible studies are summarized in Table 1 (also see Appendix 1). Of these 31 studies, 10 were prospective studies, and 21 were retrospective studies. Fourteen studies were from western countries, and 17 were from eastern countries. Hepatitis B was the most common underlying liver disease in 16 studies, hepatitis C in 8 studies, and alcoholic hepatitis in 2 studies. Regarding the reference standard, 17 studies used pathology only as the reference standard for malignancy. The diagnostic performance of MRI for diagnosing HCC ≤ 2 cm was available in 15 studies. LI-RADS criteria were the most commonly used imaging criteria (60% [9 of 15] in ECA-MRI and 52.6% [10 of 19] in HBA-MRI), and EASL 2018 criteria were the second most commonly used (26.7% [4 of 15] in ECA-MRI and 21.1% [4 of 19] in HBA-MRI). Of note, six HBA-MRI studies used modified EASL 2018 or LI-RADS criteria by applying extended washout.

Table 1 Characteristics of the eligible studies

Quality of the included studies

The overall risk of bias was low to moderate in 22 articles. Of the four domains, a risk of bias frequently occurred in the domains of the reference standard and flow and timing because of the use of a composite reference standard (Supplementary Fig. 1). The overall concerns regarding applicability were low to moderate in 28 articles. In the reference standard domain, unclear or high concerns were frequently found because of the use of different reference standards and inappropriate or unknown intervals between MRI and the reference standard.

Diagnostic performance in the studies with ECA-MRI and HBA-MRI

In the 15 studies using ECA-MRI with 2890 focal hepatic lesions including 1763 HCCs, the sensitivity and specificity ranged from 35% to 90% and from 75% to 100%, respectively. The pooled sensitivity and specificity were 72% (95% CI 65–79%) and 92% (95% CI 89–95%), respectively (Fig. 2), with an area under the HSROC curve of 0.92 (95% CI 0.89–0.94; Supplementary Fig. 2).

Fig. 2 Coupled forest plots of sensitivity and specificity using extracellular contrast agent (ECA)

In the 19 studies using HBA-MRI with 3893 hepatic lesions including 2830 HCCs, the sensitivity and specificity ranged from 41% to 96% and from 71% to 100%, respectively. The pooled sensitivity and specificity were 76% (95% CI 68–83%) and 92% (95% CI 87–95%), respectively (Fig. 3), with an area under the HSROC curve of 0.92 (95% CI 0.90–0.94; Supplementary Fig. 2). Compared with ECA-MRI, no significant difference was found in the pooled sensitivity (72% [ECA-MRI] vs. 76% [HBA-MRI]) or specificity (92% vs. 92%; p = 0.72; Table 2).

Fig. 3 Coupled forest plots of sensitivity and specificity using hepatobiliary contrast agent (HBA)

Table 2 Comparison of pooled diagnostic performance between ECA-MRI and HBA-MRI

In both ECA-MRI studies and HBA-MRI studies, substantial study heterogeneity was noted in the sensitivity (I² = 85.9% and 94.1%, respectively) and specificity (I² = 62.7% and 78.8%, respectively), but there was no significant threshold effect (Spearman correlation coefficients = 0.536 and 0.154, respectively). Borderline publication bias was found in ECA-MRI studies (Supplementary Fig. 4), although it did not reach statistical significance in either the ECA- or HBA-MRI studies (p = 0.08 and 0.45, respectively; Supplementary Fig. 5).

Comparison of diagnostic performance in the subgroups

The results of the subgroup analyses are summarized in Table 2. In the subgroup analysis based on the study design, HBA-MRI had higher sensitivity and lower specificity than ECA-MRI in retrospective studies (77% vs. 70% for sensitivity and 90% vs. 95% for specificity), but lower sensitivity and higher specificity in prospective studies (72% vs. 77% for sensitivity and 93% vs. 89% for specificity). However, there was no significant difference between ECA- and HBA-MRI (p = 0.11 [retrospective studies] and 0.65 [prospective studies]). Regarding underlying liver disease, we found no difference between ECA- and HBA-MRI (78% vs. 75% for sensitivity, and 95% vs. 90% for specificity in hepatitis B, p = 0.09; 71% vs. 59% for sensitivity, and 92% vs. 91% for specificity in hepatitis C or alcoholic hepatitis, p = 0.38). Similarly, there was no significant difference between ECA- and HBA-MRI in the subgroup analysis of lesion size ≤ 2 cm (71% vs. 69% for sensitivity, and 93% vs. 93% for specificity; p = 0.97), reference standard of pathology only for malignancy (77% vs. 81% for sensitivity, and 94% vs. 92% for specificity; p = 0.70), and imaging criteria using EASL 2018 or LI-RADS category 5 (68% vs. 66% for sensitivity, and 94% vs. 91% for specificity; p = 0.33). In the HBA-MRI studies using modified EASL 2018 or LI-RADS criteria by applying extended washout, the pooled sensitivity and specificity were 83% (95% CI 71–90%) and 85% (95% CI 75–95%), respectively.

Meta-regression analysis

In the meta-regression analysis, the three factors of study design (p = 0.02), study location (p = 0.02), and number of lesions (p < 0.01) were significantly associated with heterogeneity across the studies using ECA-MRI. The two factors of number of lesions (p = 0.02) and diagnostic cutoff (p < 0.01) were significantly associated with heterogeneity across the studies using HBA-MRI. Detailed results of the meta-regression analyses are summarized in Supplementary Table 2.

Discussion

For the diagnosis of HCC, this study found no significant difference between ECA- and HBA-MRI in pooled sensitivity (72% vs. 76%) or specificity (92% vs. 92%) (p = 0.72), with similar areas under the HSROC curve (0.92 vs. 0.92). In addition, no significant difference was found between ECA- and HBA-MRI according to study design (p ≥ 0.11), underlying liver disease (p ≥ 0.09), lesion size (p = 0.97), reference standard (p = 0.70), or imaging criteria (p = 0.33).

HBA is reported to have many advantages such as providing valuable information for detecting small HCCs and differentiating small HCCs from arterial-enhancing pseudolesions with high lesion-to-liver contrast and lesion conspicuity during the hepatobiliary phase [18]. However, it also has disadvantages such as weak arterial-phase hyperenhancement, arterial-phase image quality degradation by acute transient dyspnea, and masking of the enhancing capsule by the relatively strong enhancement of background liver parenchyma [3, 19]. Therefore, we suspect that the improved sensitivity of HBA-MRI from the advantages might be mitigated by the disadvantages, resulting in no statistically significant difference between ECA- and HBA-MRI.

The results of this meta-analysis differ from those of two head-to-head prospective studies, which reported higher sensitivity and accuracy of ECA-MRI in comparison with HBA-MRI [8, 9]. Although these previous studies reached similar conclusions, strictly speaking, the reported data varied between the two studies, with sensitivities of 46.6% vs. 66.3% for HBA-MRI, and specificities of 83.3% vs. 100% for both ECA- and HBA-MRI [8, 9]. These differences might be associated with different patient characteristics (hepatitis B vs. alcoholic hepatitis) and different lesion characteristics (the proportion of HCC: 81% vs. 68%; the proportion of lesions < 2 cm: 53% vs. 68%). Therefore, the results of those head-to-head studies might not generalize to clinical practice. In contrast, our meta-analysis included 2890 lesions in 15 ECA-MRI studies and 3893 lesions in 19 HBA-MRI studies, covering a wide spectrum of clinical features, and found no significant difference between ECA- and HBA-MRI according to underlying liver disease and lesion size; therefore, we suggest that our meta-analysis may be more relevant to clinical practice.

Two previous meta-analyses that compared ECA- and HBA-MRI using subgroup analyses reported that HBA-MRI had higher sensitivity than ECA-MRI (87% [HBA-MRI] vs. 74–75% [ECA-MRI]) [4, 7], a tendency similar to that in our results. However, our meta-analysis yielded a lower sensitivity for HBA-MRI (76%), which may explain the absence of a significant difference between ECA- and HBA-MRI. The difference in sensitivity from the previous studies might be explained by the difference in eligible studies, probably caused by the different purposes: unlike the previous meta-analyses, which aimed to assess the diagnostic performance of MRI per se or in comparison with CT, the main purpose of our study was to compare the diagnostic performance of ECA- and HBA-MRI on a per-lesion basis. Indeed, the majority (more than 90%) of the eligible studies in the previous meta-analyses were excluded from our study because of their publication year (i.e., studies before the commercial release of HBA) or study design (i.e., per-patient analysis among patients with HCC). The imaging criteria for HCC are another possible cause of the lower pooled sensitivity of HBA-MRI in this meta-analysis in comparison with the previous meta-analyses [4, 7]. When we considered the imaging criteria used for HCC in each individual study, the following difference was noticeable: none of the included studies in the previous meta-analyses used the imaging criteria suggested by EASL 2018 or AASLD 2018 [4, 7], whereas 63.2% (12/19) of the included HBA-MRI studies in this meta-analysis used them for the diagnosis of HCC. As both EASL 2018 and AASLD 2018 criteria restrict the assessment of washout on HBA-MRI to only the portal venous phase, the reduced sensitivity of HBA-MRI in this meta-analysis might be attributed to the inclusion of studies complying with the EASL 2018 or AASLD 2018 guidelines. In fact, ECA- and HBA-MRI had almost the same sensitivity (68% vs. 66%, respectively) and specificity (94% vs. 91%, respectively) in the subgroup analysis of imaging criteria using EASL 2018 or LI-RADS category 5. In addition, HBA-MRI studies using modified EASL 2018 or LI-RADS criteria by applying extended washout had a higher pooled sensitivity (83%) and a lower pooled specificity (85%) than those using conventional imaging criteria. This result is consistent with previous studies reporting that extended washout as a major feature could reduce the specificity of HBA-MRI while increasing sensitivity [6]. However, considering that several studies using ancillary features such as marked T2 hyperintensity or a targetoid appearance as exclusion criteria demonstrated mitigation of the decreased specificity caused by extended washout [20], further studies are needed to refine the imaging criteria for HCC on HBA-MRI so as to make full use of the main advantages of HBA-MRI while maintaining high specificity.

The comparable diagnostic performance of ECA-MRI and HBA-MRI implies that neither contrast agent can be categorically preferred on diagnostic grounds, and the clinical context needs to be considered. For example, given the improved sensitivity for HCC in well-compensated cirrhosis, HBA-MRI would be advantageous in Asia, where surgical resection or locoregional treatments are the preferred treatment options [21]. By contrast, considering that the conspicuity of HCC in the HBP of HBA-MRI can be decreased in patients with poor hepatic function due to diminished parenchymal enhancement [21, 22] and that ECA-MRI has a relatively higher sensitivity than HBA-MRI in decompensated cirrhosis [16], ECA-MRI might be favorable for patients with poor liver function. Further investigation is needed regarding the selection of contrast agent tailored to the patient’s individual characteristics. Besides the diagnostic performance of MRI, the selection of contrast agent should also take into account the likely improvement of clinical outcomes such as survival gain or reduction of mortality. The survival benefit of HBA-MRI after CT was shown in a single-center retrospective study of 700 patients with a single early-stage HCC [23], and the benefit of HBA-MRI over ECA-MRI was shown in a recent cohort study of more than 30,000 patients with localized HCCs [24]. Given the comparable diagnostic performance of HBA-MRI to ECA-MRI in this meta-analysis, the selection of contrast agent might be determined with consideration of the benefits of each agent. However, because no randomized multicenter trial has assessed whether ECA- and HBA-MRI result in different clinical outcomes, further study is necessary.

This study has several limitations. First, substantial study heterogeneity was noted, which could preclude the creation of solid meta-analytic summary estimates of the diagnostic performance of ECA- and HBA-MRI. To explore the sources of this heterogeneity, we performed subgroup analyses and meta-regression analyses. Second, the comparison between the two contrast agents according to liver function was limited by the information available in the published studies, as subgroup data according to liver function were not separately reported in each individual study, which might leave some ambiguity in the results. Third, a borderline publication bias was noted in ECA-MRI studies. Because studies with significant results are more likely to be published than those without significant results, the summary estimates might be biased upward [25].

In conclusion, for the diagnosis of HCC, ECA-MRI had similar sensitivity and specificity to HBA-MRI. In addition, no significant difference in performance was found according to study design, underlying liver disease, lesion size, reference standard, and imaging criteria. Therefore, given the comparable diagnostic performance of ECA- and HBA-MRI, the selection of contrast agent might be determined with consideration of the advantages of each agent.