Introduction

The value of contrast-enhanced ultrasound (CEUS) for diagnosing hepatocellular carcinoma (HCC) in high-risk patients has been controversial in recent years [1,2,3]. In the 2005 edition of the American Association for the Study of Liver Diseases (AASLD) guidelines for the management of HCC, CEUS was introduced as one of the dynamic diagnostic imaging modalities [4]. However, it was removed from the HCC diagnostic flowchart in the updated 2011 guidelines [5]. This removal was prompted mainly by a study [6] which concluded that intrahepatic cholangiocarcinoma (ICC) in cirrhosis can show an enhancement pattern similar to that of HCC on CEUS (47.6%, 10/21), which may lead to a false-positive diagnosis of HCC. The decision has caused worldwide controversy and has not gained acceptance in some countries [3, 7,8,9] because the study was based on a small sample and included no differential diagnostic analysis between ICC and HCC.

In 2016, the American College of Radiology (ACR) released the CEUS Liver Imaging Reporting and Data System (LI-RADS) [10]. It was designed so that the LR-5 category provides the highest specificity for diagnosing HCC, allowing HCC to be diagnosed by the LR-5 criteria and patients to proceed to treatment without biopsy. Terzi E et al found a high positive predictive value (PPV) of 98.5% in predicting HCC [11]. The CEUS LI-RADS also includes a category of lesions that are either definitely or probably malignant but not specific for HCC, named LR-M. Lesions in this category may be ICC, mixed hepatocellular and cholangiocellular carcinoma, or other malignancies. However, studies have shown that 35–48% of LR-M lesions were HCCs, indicating a high false-negative rate and low sensitivity for HCC diagnosis [11, 12]. In the CEUS LI-RADS algorithm, besides the major imaging features for diagnosis, ancillary features are also important for refining the final category [12]. Because CEUS allows real-time scanning, more dynamic and transient features could be explored to improve the system.

In light of these diagnostic issues with HCC and ICC, we sought to validate the diagnostic performance of the CEUS LI-RADS algorithm for HCC and ICC in our institute. Moreover, we identified important imaging predictors of ICC on CEUS in order to develop an M-score that could be used to predict the risk of ICC in high-risk patients, and we investigated the diagnostic performance of the modified CEUS LI-RADS for HCC and ICC.

Materials and methods

Patients

This study was approved by the institutional review board, and informed consent was obtained from each patient. From November 2003 to December 2017, 3315 consecutive patients with HCC and 457 consecutive patients with ICC who underwent both baseline US and CEUS were enrolled. The inclusion criteria were (1) a pathologically confirmed diagnosis of HCC or ICC and (2) high-risk status, defined as chronic hepatitis B infection confirmed via laboratory tests [13] and cirrhosis of any cause confirmed by pathological examination via liver biopsy or surgery. The exclusion criteria were (1) mixed hepatocellular cholangiocarcinoma (n = 49) and (2) unqualified or missing imaging data (n = 308).

Next, 1932 patients with HCC and 105 patients with ICC were included for propensity score matching, a technique used to reduce the effect of selection bias in retrospective observational studies [14]. HCC and ICC patients were matched 1:1 using the nearest-neighbour method. The matching variables were the size and number of nodules. Finally, 105 patients with HCC and 105 patients with ICC were analysed in our study (Fig. 1). Basic clinical data including age and sex, as well as laboratory tests including hepatitis status, alpha-fetoprotein (AFP) levels, and CA19-9 levels, were recorded.
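The 1:1 matching step can be illustrated with a minimal sketch. It assumes that the matching was a greedy nearest-neighbour match without replacement on a propensity score that has already been estimated (for example, from nodule size and number); the text does not specify these details, and the function name `match_nearest` and the example scores are hypothetical.

```python
def match_nearest(case_scores, control_scores):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    case_scores:    propensity scores of the smaller group (e.g. ICC)
    control_scores: propensity scores of the larger group (e.g. HCC)
    Returns a list of (case_index, control_index) pairs.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, score in enumerate(case_scores):
        if not available:
            break  # no unmatched controls left
        # Pick the unmatched control with the closest score (ties: lowest index).
        j = min(available, key=lambda k: (abs(control_scores[k] - score), k))
        pairs.append((i, j))
        available.remove(j)
    return pairs


# Hypothetical propensity scores, for illustration only.
icc_scores = [0.30, 0.62, 0.81]
hcc_scores = [0.10, 0.33, 0.58, 0.60, 0.95]
pairs = match_nearest(icc_scores, hcc_scores)
print(pairs)
```

Each ICC case is paired with the closest remaining HCC case, so the matched groups end up the same size, as with the 105 versus 105 patients analysed here.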

Fig. 1 Diagram of the HCC and ICC patient selection

Image techniques

The ultrasound equipment was as follows: (1) Acuson Sequoia 512 with a 4V1 vector transducer (frequency range, 1.0–4.0 MHz) and a contrast-specific mode of contrast pulse sequencing; (2) Aplio SSA-770 or Aplio 500 with a 375BT convex transducer (frequency range, 1.9–6.0 MHz) and a Contrast Harmonic Imaging mode; and (3) Aixplorer Ultrasound system equipped with the SC6-1 convex probe (frequency range, 1.0–6.0 MHz). All examinations were performed by one of two skilled radiologists (W.W. and X.Y.X.), each of whom had at least 15 years of experience in liver CEUS. First, the entire liver was scanned with baseline ultrasound (BUS). Then, the imaging mode was changed to CEUS, and 2.4 mL of SonoVue was administered. For patients with multiple nodules, only the largest lesion was selected. The target lesion was observed continuously for at least 3 min, and all imaging data were recorded. The CEUS process was divided into the arterial (6–40 s after contrast agent injection), portal venous (41–120 s), and late (121–300 s) phases.
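The phase definitions above can be written as a small lookup, shown here as a minimal sketch. The function name `ceus_phase` is hypothetical, and treating each upper bound as inclusive (so that fractional times such as 40.5 s fall into the next phase) is an assumption not stated in the text.

```python
def ceus_phase(seconds_after_injection):
    """Map time since contrast injection to the CEUS phase used in this study.

    Returns None for times outside the 6-300 s window.
    """
    t = seconds_after_injection
    if t < 6 or t > 300:
        return None
    if t <= 40:
        return "arterial"
    if t <= 120:
        return "portal venous"
    return "late"


for t in (25, 60, 180):
    print(t, ceus_phase(t))
```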

All BUS and CEUS images were anonymised, randomised, and independently reviewed in two separate review sessions by two radiologists (L.D.C. and J.Y.L.), who had at least 10 years of experience in liver CEUS. Neither the patient details nor the clinical or pathological results were available to them. In cases of discordance, a third investigator (M.D.L., with at least 15 years of experience in liver CEUS) reviewed the images to make the final decision. The readers were asked only to describe the enhancement appearances of the lesion, not to make a diagnosis. Thus, although the readers knew that all patients had either ICC or HCC, they were blinded to the final diagnosis of the target lesion.

Image analysis

The CEUS features were characterised as follows (Fig. 2): (1) the number of lesions; (2) maximum diameter of the target nodule; (3) shape of the nodule; (4) boundary of the lesion; (5) enhancement level in the arterial/portal/late phase (hyper-/iso-/hypo-); (6) enhancement pattern of the lesion in the arterial phase (rim/homogeneous/inhomogeneous/others); (7) time of enhancement onset; (8) washout time (within 60 s or not) [15]; (9) duration of enhancement (washout time minus time of enhancement onset, if the lesion showed hyperenhancement followed by washout); (10) tumour supply artery (defined as an artery extending from the surrounding liver parenchyma into the tumour) [16]; (11) peripheral circular artery (defined as an annular strip artery around the tumour in the arterial phase) [16]; (12) intratumoural vein (defined as straight vessel branches extending through the mass during the portal venous and late phases) [17, 18]; (13) boundary of the intratumoural non-enhanced area (if present); and (14) marked washout (defined as the lesion appearing as a uniform black defect within the enhanced liver parenchyma) [15]. Histopathological examination was the reference standard of this study.

Fig. 2 Specific CEUS features of ICC

Development of M-score for predicting ICC

Because of the multicollinearity of the CEUS features, we used least absolute shrinkage and selection operator (LASSO) regularised regression [19] to select the independent features for predicting ICC. A CEUS M-score for identifying ICC was then developed as a linear combination of the selected features, each weighted by its respective coefficient.

Validation of modified CEUS LI-RADS with M-score

Using a modified CEUS LI-RADS diagnostic algorithm, in which LR-M is replaced by the M-score, we then investigated the diagnostic performance for HCC and ICC. ICC was classified using the M-score, with the cutoff value determined by receiver operating characteristic (ROC) analysis. The remaining lesions were assigned to LR-3, LR-4, and LR-5 using the ACR CEUS LI-RADS Diagnostic Table. HCC was defined using the LR-5 criteria of the ACR CEUS LI-RADS.

Validation of CEUS LI-RADS with LR-M

According to the CEUS LI-RADS, we classified the lesion as ICC using the definition of LR-M: rim enhancement in the arterial phase and/or early onset washout (< 60 s) and/or a marked (punched-out) appearance [20]. After the LR-M nodules were categorised, the rest of the lesions were assigned to the LR-3, LR-4, and LR-5 categories by referring to the CEUS LI-RADS Diagnostic Table. HCC was defined using the definition of LR-5: size ≥ 10 mm, typical arterial phase hyperenhancement (not rim or peripheral discontinuous globular), and late washout onset (≥ 60 s) with mild washout degree [15].
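The LR-M and LR-5 rules described above can be expressed as a short decision function. This is a minimal sketch of the rules as worded in this section, not a full implementation of the ACR CEUS LI-RADS; the function and parameter names are hypothetical, "mild washout" is assumed to mean washout that is not marked, and the split between LR-3 and LR-4 is left to the Diagnostic Table, which is not reproduced here.

```python
def classify_acr(size_mm, aphe, rim, peripheral_globular,
                 washout_onset_s, marked_washout):
    """Assign LR-M, LR-5, or 'LR-3/LR-4' from the criteria stated in the text.

    aphe:                arterial phase hyperenhancement present
    rim:                 rim enhancement in the arterial phase
    peripheral_globular: peripheral discontinuous globular enhancement
    washout_onset_s:     washout onset in seconds, or None if no washout
    marked_washout:      marked (punched-out) washout
    """
    has_washout = washout_onset_s is not None
    early_washout = has_washout and washout_onset_s < 60
    # LR-M: any one of the three features is sufficient ("and/or").
    if rim or early_washout or marked_washout:
        return "LR-M"
    # LR-5: all criteria must hold.
    late_mild_washout = has_washout and washout_onset_s >= 60
    if size_mm >= 10 and aphe and not peripheral_globular and late_mild_washout:
        return "LR-5"
    # Remaining lesions need the Diagnostic Table (not reproduced here).
    return "LR-3/LR-4"


print(classify_acr(25, True, False, False, 90, False))
print(classify_acr(25, True, False, False, 45, False))
```

Because LR-M is checked first, a lesion with early washout is never assigned LR-5, which mirrors the order of categorisation described in the text.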

Statistical analysis

Statistical analysis was performed using R software (R Foundation for Statistical Computing, version 3.2.5, http://www.r-project.org/) and MedCalc (version 11.2). Significance was set at a two-tailed p < 0.05.

Categorical variables were compared with the χ2 test. Continuous variables were compared with the independent-samples t test. The diagnostic performance for HCC and ICC was assessed by sensitivity, specificity, PPV, and negative predictive value (NPV). The area under the ROC curve (AUC) was calculated.
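As a worked example of the χ2 test, the sketch below applies it to the sex distribution reported in the Results (72 men and 33 women with ICC; 91 men and 14 women with HCC). The use of Yates' continuity correction is an assumption: the text does not state it, although R's `chisq.test` applies it by default to 2 × 2 tables. The function name `chi2_test_2x2` is hypothetical.

```python
import math


def chi2_test_2x2(a, b, c, d, yates=True):
    """Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction (an assumption; not stated in the text).
        diff = max(diff - n / 2, 0.0)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: ICC, HCC. Columns: men, women (counts from the Results).
stat_sex, p_sex = chi2_test_2x2(72, 33, 91, 14)
print(f"chi2 = {stat_sex:.2f}, p = {p_sex:.3f}")
```

Under this assumption the p value rounds to 0.003, in line with the value reported for the sex comparison.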

Results

Patients

One hundred and five ICC and 105 HCC nodules were observed. The study group comprised 210 nodules in 210 patients (72 men and 33 women; mean age ± standard deviation, 54 years ± 11 for the ICC group; 91 men and 14 women; mean age ± standard deviation, 55 years ± 10 for the HCC group) (Table 1). There were more male patients with HCC than with ICC (p = 0.003). Hepatitis B was confirmed in 101 (96.2%) ICC patients and in 103 (98.1%) HCC patients. Hepatitis B and C were confirmed in 4 (3.8%) ICC patients and in 2 (1.9%) HCC patients. Alpha-fetoprotein (AFP) was elevated (> 20 μg/L) in 16 (15.2%) ICC patients and 49 (46.7%) HCC patients (p < 0.0001). CA19-9 was elevated (> 35 U/mL) in 43 (41.0%) ICC patients and 10 (9.5%) HCC patients (p < 0.001). The numbers of nodules measuring ≤ 3.0 cm, 3.1–5.0 cm, and > 5.0 cm were 10, 19, and 76 in the ICC group and 15, 28, and 62 in the HCC group, respectively (p = 0.126). There were 76 ICC patients (72.4%) and 82 HCC patients (78.1%) with one nodule, and 29 ICC patients (27.6%) and 23 HCC patients (21.9%) with multiple nodules (p = 0.424, Table 1).

Table 1 Demographics of patients with HCC or ICC

Specific CEUS features of HCC or ICC

In HCC, the following features were observed more frequently than in ICC: hyperenhancement in the arterial phase (n = 102, 97.14%), tumour supply artery (n = 64, 60.95%), and peripheral circular artery (n = 29, 27.62%) (all p < 0.05). On the other hand, the following features were observed more frequently in ICC than in HCC: poorly circumscribed boundary (n = 73, 69.52%), iso- or hypoenhancement in the arterial phase (n = 14, 13.33%), hypoenhancement in the portal phase (n = 103, 98.10%), rim enhancement (n = 67, 63.81%), early washout (n = 97, 92.38%), short duration of enhancement (n = 84, 80.0%), intratumoural vein (n = 59, 56.19%), obscure boundary of the intratumoural non-enhanced area (n = 60, 57.14%), and marked washout (n = 62, 59.05%) (all p < 0.05) (Fig. 3).

Fig. 3 Forest plot of CEUS features of HCC and ICC

M-scoring for predicting ICC

The most useful independent CEUS variables for predicting ICC selected by LASSO regression were as follows: poorly circumscribed boundary, rim enhancement in the arterial phase, rapid washout (within 60 s), intratumoural vein, unclear boundary of the intratumoural non-enhanced area, and marked washout. The M-score for diagnosing ICC was developed based on these independent variables:

M-score = − 1.876 + 0.716 × poorly circumscribed + 1.222 × rim enhancement + 1.017 × rapid washout + 0.321 × intratumoural vein + 0.707 × unclear boundary of the intratumoural non-enhanced area + 1.039 × marked washout.
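The formula can be evaluated directly, as in the minimal sketch below. It assumes that each feature is coded as 1 when present and 0 when absent, which the text implies but does not state; the dictionary keys and the function name `m_score` are hypothetical labels, while the coefficients are those given above.

```python
M_SCORE_INTERCEPT = -1.876
M_SCORE_WEIGHTS = {
    "poorly_circumscribed": 0.716,
    "rim_enhancement": 1.222,
    "rapid_washout": 1.017,
    "intratumoural_vein": 0.321,
    "unclear_nonenhanced_boundary": 0.707,
    "marked_washout": 1.039,
}


def m_score(features):
    """Compute the M-score from a dict of feature name -> present (bool).

    Features missing from the dict are treated as absent (assumed 0/1 coding).
    """
    return M_SCORE_INTERCEPT + sum(
        weight for name, weight in M_SCORE_WEIGHTS.items()
        if features.get(name, False)
    )


# Hypothetical lesion showing rim enhancement and rapid washout only.
example = {"rim_enhancement": True, "rapid_washout": True}
print(round(m_score(example), 3))
```

With no features present the score equals the intercept, and each additional feature raises the score by its coefficient, so higher values point towards ICC.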

For diagnosing ICC, the sensitivity, specificity, PPV, NPV, and accuracy using the M-score as criterion were 89.52%, 88.57%, 88.68%, 89.42%, and 89.05%, respectively. The sensitivity, specificity, PPV, NPV, and accuracy using LR-M as criterion were 95.24%, 63.81%, 72.46%, 93.06%, and 79.52%, respectively (Table 2).
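These figures follow from the 2 × 2 counts implied by the classification results reported below (M-score: 94 ICC and 12 HCC nodules flagged; LR-M: 100 ICC and 38 HCC nodules flagged, out of 105 nodules in each group). The sketch below recomputes them; the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV, NPV and accuracy as fractions."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# ICC is the positive class; HCC is the negative class.
m_score_icc = diagnostic_metrics(tp=94, fn=11, fp=12, tn=93)
lr_m_icc = diagnostic_metrics(tp=100, fn=5, fp=38, tn=67)

for label, result in (("M-score", m_score_icc), ("LR-M", lr_m_icc)):
    print(label, {k: round(100 * v, 2) for k, v in result.items()})
```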

Table 2 Diagnostic performance of the LR-M and M-score for diagnosing ICC

Classification of HCC by ACR LI-RADS with LR-M or modified LI-RADS with M-scoring

Using the ACR CEUS LI-RADS diagnostic flowchart, 38, 1, 6, and 60 HCC nodules were categorised as LR-M, LR-3, LR-4, and LR-5, respectively, and 100, 0, 1, and 4 ICC nodules were categorised as LR-M, LR-3, LR-4, and LR-5, respectively.

When the M-score (cutoff value = − 0.837) was used instead of LR-M in the modified LI-RADS algorithm, 12, 1, 7, and 85 HCC nodules were categorised as M-score, LR-3, LR-4, and LR-5, respectively, and 94, 0, 1, and 10 ICC nodules were categorised as M-score, LR-3, LR-4, and LR-5, respectively (Table 3).

Table 3 Classification of HCC and ICC by ACR LI-RADS or modified LI-RADS

Diagnostic performance of the ACR LI-RADS and modified LI-RADS

The sensitivity and specificity of the ACR LI-RADS for diagnosing HCC were 57.14% and 96.19%, respectively. The sensitivity and specificity of the modified LI-RADS for diagnosing HCC were 80.95% and 90.48%, respectively. The AUC of the modified LI-RADS (AUC = 0.857) was significantly higher than that of the ACR LI-RADS (AUC = 0.767, p = 0.0001).

Because we used 1:1 propensity score matching for patient selection, the prevalence of HCC was 50% in this study. At this prevalence, the PPV and NPV of the ACR LI-RADS for diagnosing HCC were 93.75% and 69.18%, respectively, and the PPV and NPV of the modified LI-RADS were 89.47% and 82.61%, respectively. However, according to our previous study, the actual prevalence of HCC in our institution was 92.0% [21]. Adjusted to this prevalence, the PPV and NPV of the ACR LI-RADS for diagnosing HCC were 99.42% and 16.33%, respectively, and the PPV and NPV of the modified LI-RADS were 98.99% and 29.23%, respectively (Table 4).
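The adjustment to a different prevalence can be reproduced with Bayes' rule from the sensitivity and specificity alone. The sketch below is one way to do this and is not necessarily the authors' exact procedure; the function name `ppv_npv` is hypothetical, and the sensitivities and specificities are taken from the LR-5 counts reported above (ACR: 60 of 105 HCC and 4 of 105 ICC assigned LR-5; modified: 85 of 105 HCC and 10 of 105 ICC).

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Predictive values at a given prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Sensitivity and specificity for HCC from the reported LR-5 counts.
acr = (60 / 105, 101 / 105)
modified = (85 / 105, 95 / 105)

for prevalence in (0.50, 0.92):
    for label, (sens, spec) in (("ACR", acr), ("modified", modified)):
        ppv, npv = ppv_npv(sens, spec, prevalence)
        print(f"prevalence {prevalence:.2f} {label}: "
              f"PPV {100 * ppv:.2f}%, NPV {100 * npv:.2f}%")
```

At a prevalence of 92%, almost any positive result is a true HCC, which is why both PPVs approach 99% while the NPVs fall sharply.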

Table 4 Diagnostic performance of the ACR LI-RADS and modified LI-RADS for diagnosing HCC

Discussion

This study proposed a CEUS M-score as a substitute for LR-M, as well as a modified CEUS LI-RADS. Compared with the ACR CEUS LI-RADS, the modified CEUS LI-RADS had a higher sensitivity for diagnosing HCC and a higher specificity for diagnosing ICC.

The differentiation between ICC and HCC in high-risk patients has been a challenging issue in the identification of HCC among focal liver lesions. The CEUS LI-RADS is designed to give LR-5 100% specificity for HCC so that HCC can be diagnosed without biopsy. In our study, the specificity of LR-5 for HCC reached 96.19%. However, this high specificity came at the cost of low sensitivity and false-negative diagnoses of HCC. Of the LR-M nodules in our study, 27.5% (38/138) were HCCs misclassified as non-HCC malignancy. We consider it inappropriate for so many HCC nodules to be classified as LR-M, because the LR-M category is intended to identify malignant nodules that are not necessarily HCC. Therefore, it is also necessary to improve the diagnostic sensitivity for HCC. Although the specificity of the modified LI-RADS for HCC was slightly lower than that of the ACR LI-RADS, the sensitivity improved considerably, from 57.14% to 80.95%. The overall diagnostic performance of the modified LI-RADS for HCC was clearly better than that of the ACR LI-RADS. When the actual prevalence of HCC in our institution was used, the PPV of the modified LI-RADS remained high (98.99%).

One of the advantages of CEUS is real-time scanning. It can capture transient vascular architecture in the enhancement process along with the enhancement time and dynamic changes. We found that the vascular architecture of the two tumours is different. The unique vasculature development of HCC was the basket pattern or peripheral circular artery around the tumour [3]. Although the sensitivity of this feature was only 27.6%, the specificity for the identification of HCC was quite high (97.1%). Therefore, it could be added as an important ancillary feature in CEUS LI-RADS.

Although the LR-M category in the CEUS LI-RADS represents various non-HCC malignancies, the most common malignancy aside from HCC in patients at risk for HCC is ICC [22]. A retrospective study showed that 40% (6/15) of LR-M lesions were HCCs, 13% (2/15) were hepatocholangiocarcinomas, and 47% (7/15) were cholangiocarcinomas [12]. Therefore, apart from HCC, most LR-M nodules are ICC. LR-M is not a specific criterion for diagnosing ICC; rather, it is designed mainly to capture all ICCs so that no ICC is misdiagnosed as HCC. In our study, the sensitivity of LR-M was 95.2% when LR-M was used as the diagnostic criterion for ICC, but the specificity was low (63.8%). In contrast to the study by Terzi E et al [11], in which no ICC was misclassified as LR-5, four ICCs were assigned to LR-5 in our study. Differentiating between ICC and HCC therefore remains challenging. A criterion that preserves high specificity for diagnosing ICC is necessary, and a more detailed definition of ICC is needed to resolve this dilemma.

In this study, we investigated more specific CEUS features for ICC. In addition to rim enhancement, rapid washout, and marked washout proposed by ACR LI-RADS, we found that a poorly circumscribed boundary, intratumoural vein, and unclear boundary of the intratumoural non-enhanced area were also useful for diagnosing ICC. Intratumoural vein is a unique CEUS feature that we reported in a previous study [18]. The unclear boundary of the intratumoural non-enhanced area may reflect the pathological finding that the centre of the tumour is abundant in fibrous connective tissue with sparse tumour cells and blood vessels [23]. Based on these features, we constructed an individualised M-score for predicting ICC. Compared with LR-M, the M-score had a lower sensitivity but a much higher specificity (88.6% vs 63.8%) and a higher accuracy (89.1% vs 79.5%).

Our study has several limitations. First, we did not enrol benign lesions or other rare liver cancers to validate the diagnostic accuracy of the CEUS LI-RADS. Second, our study is a single-centre study. A multicentre prospective study is necessary to validate the applicability of CEUS LI-RADS in China. Finally, we did not compare the diagnostic performance of CEUS with that of MRI/CT. Although contrast-enhanced MRI and CT are proposed as the diagnostic imaging standards by the AASLD guidelines, their specificity or PPV for diagnosing HCC does not reach 100% (95.2–97.3%) [24, 25]. Therefore, the optimal diagnostic algorithm or combination of MRI, CT, and CEUS is worth exploring in depth instead of simply removing CEUS.

In conclusion, we developed a CEUS M-score for predicting ICC in high-risk patients. The modified LI-RADS with the M-score had higher sensitivity for diagnosing HCC and higher specificity for diagnosing ICC than the ACR LI-RADS. The M-score is complementary to the ACR LI-RADS, and the diagnostic performance of the modified LI-RADS for HCC was improved in comparison with the ACR LI-RADS.