Introduction

Hepatocellular carcinoma (HCC) is the most common primary liver malignancy and the third leading cause of cancer-related mortality worldwide [2]; by incidence, it ranks fifth among men and seventh among women [3, 4]. It typically arises in patients with chronic liver disease (HBV, HCV, autoimmune, genetic and metabolic conditions), with or without cirrhosis, and generally presents late, with a 5-year survival < 16% [5, 6]. Early detection of HCC is therefore important to improve overall survival, with 5-year survival rates reaching up to 93% when patients are able to receive potentially curative therapies, such as resection or orthotopic transplantation [2, 7, 8].

For this reason, surveillance programs and imaging play a crucial role in the diagnosis of HCC: in fact, it is the only malignancy that can be diagnosed on imaging findings alone, as suggested by the American College of Radiology, the American Association for the Study of Liver Diseases and the Organ Procurement and Transplantation Network/United Network for Organ Sharing [9]. Advances in imaging techniques and growing communication between specialists have led to the development of noninvasive methods to characterize lesions in at-risk cirrhotic patients, among which the Liver Imaging Reporting and Data System (LI-RADS) has recently become a benchmark. The hallmark imaging features of HCC are arterial phase hyperenhancement (APHE) and portal/delayed washout, which are essential for a noninvasive diagnosis on either contrast-enhanced computed tomography (CT) or magnetic resonance imaging (MRI) [10]. The current trend is to avoid, where possible, histological confirmation with biopsy, owing to its inherent risks.

However, the rate of misdiagnosis is currently 10–20% [11] and, although lower than previously reported [12], it is still too high. To minimize this, LI-RADS aims to create uniformity both in imaging techniques and in reporting terminology; such standardization strengthens communication between radiologists and other medical disciplines, allowing more efficient multidisciplinary collaboration for research endeavors and quality assurance initiatives [13].

The purpose of this study was to determine the effectiveness of the Liver Imaging Reporting and Data System (LI-RADS) in diagnosing hepatocellular carcinoma (HCC) and to retrospectively evaluate its impact on the adopted therapeutic strategy. Moreover, we evaluated the agreement between the two radiologists and, secondarily, between each of them and the final pathology examination.

Materials and methods

This was a retrospective study, approved by our internal review board, and informed consent was obtained from each patient. The population consisted of 350 patients who underwent liver resection for suspected HCC in our surgical unit between January 2008 and August 2019.

The protocols used for the diagnosis and treatment of HCC patients have been described elsewhere [14]. According to LI-RADS v2018 [1], exclusion criteria were cirrhosis due to congenital hepatic fibrosis or vascular disorders, and age < 18 years. To ensure more homogeneous data, recurrent lesions or patients without at least one contrast-enhanced MRI or CT were excluded. Thus, only 40/350 patients were selected from picture archiving and communication systems (PACS), for a total of 21 CT and 19 MRI (Fig. 1).

Fig. 1 Flowchart with exclusion criteria

Two abdominal radiologists, with 15 and 5 years of experience, respectively, named RAD1/senior and RAD2/junior, reinterpreted independently the preoperative CT and MRI scans, scoring all major and ancillary features and assigning an overall LI-RADS category. Both radiologists were blinded to the clinical, laboratory and pathology results.

Every lesion was included in the study (54 nodules), but the comparison was made only on the major nodule surgically treated, thus resulting in a final cohort of 40 nodules in 40 patients.

Statistical analyses

Diagnostic accuracy for each LI-RADS category was described by sensitivity, specificity, and positive and negative predictive values, comparing the radiological evaluation with the reference standard (histological diagnosis on the surgical specimen) for each observer and imaging modality. To assess the agreement of the two radiologists and of the imaging features with histology, positive likelihood ratios (PLR) were calculated for IOUS and histological evaluation using cross-tabulations. For PLR, values above 10 indicate high predictive power, values between 5 and 10 indicate good power and values below 5 indicate low predictive power; values around 1 indicate that no useful information for ruling the diagnosis in or out was produced. Agreement was also reported using Cohen's kappa (κ): slight agreement 0.01–0.20, fair agreement 0.21–0.40, moderate agreement 0.41–0.60, substantial agreement 0.61–0.80 and almost perfect agreement 0.81–0.99. Receiver operating characteristic (ROC) analyses were also performed to determine the potential diagnostic performance for detecting the presence of MI. The corresponding areas under the ROC curve (AUC) with 95% confidence intervals (95% CI) were calculated. An area of 1 represents a perfect test; an area of 0.5 represents a worthless test. A rough guide for classifying the accuracy of a diagnostic test is the traditional academic point system: 0.90–1 = excellent (A); 0.80–0.90 = good (B); 0.70–0.80 = fair (C); 0.60–0.70 = poor (D) and 0.50–0.60 = fail (F).
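As a rough illustration of how the measures listed above follow from a 2×2 cross-tabulation of imaging result against histology, the following Python sketch may help. It is not the software used in the study: the function names (`diagnostic_metrics`, `plr_power`, `auc_grade`), the example counts, the handling of band boundaries and the "below chance" label are all assumptions made for illustration.

```python
import math


def safe_div(num: float, den: float) -> float:
    """Divide, returning inf (or nan for 0/0) when the denominator is zero."""
    if den == 0:
        return math.nan if num == 0 else math.inf
    return num / den


def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard 2x2 diagnostic accuracy measures (test result vs. histology)."""
    sensitivity = safe_div(tp, tp + fn)
    specificity = safe_div(tn, tn + fp)
    false_positive_rate = safe_div(fp, fp + tn)  # 1 - specificity
    false_negative_rate = safe_div(fn, fn + tp)  # 1 - sensitivity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
        "plr": safe_div(sensitivity, false_positive_rate),
        "nlr": safe_div(false_negative_rate, specificity),
        "accuracy": safe_div(tp + tn, tp + fn + fp + tn),
        "dor": safe_div(tp * tn, fp * fn),  # diagnostic odds ratio
    }


def plr_power(plr: float) -> str:
    """PLR bands as described in the text. Values around 1 carry no useful
    information; no numeric cut-off is given for that band, so it is not
    encoded separately here (assumption)."""
    if plr > 10:
        return "high"
    if plr >= 5:
        return "good"
    return "low"


def auc_grade(auc: float) -> str:
    """Academic point system for AUC; lower bounds are treated as inclusive
    (assumption, since the published bands share their end points)."""
    if auc >= 0.90:
        return "excellent (A)"
    if auc >= 0.80:
        return "good (B)"
    if auc >= 0.70:
        return "fair (C)"
    if auc >= 0.60:
        return "poor (D)"
    if auc >= 0.50:
        return "fail (F)"
    return "below chance"  # label not in the text; added for completeness


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study tables.
    metrics = diagnostic_metrics(tp=45, fn=5, fp=10, tn=40)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
    print(plr_power(metrics["plr"]), auc_grade(0.85))
```

The likelihood ratios are computed from the error rates directly rather than from `1 - specificity`, which avoids small floating-point differences when the counts are round numbers.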

Results

The main patient characteristics are summarized in Table 1.

Table 1 Patients' characteristics

Overall, 108 nodules were detected by the radiologists, with a median diameter of 24.48 mm (range 7–20 mm): RAD1 detected more, with 58 nodules (1.45 nodules/patient, 29 on MRI and 29 on CT), versus 50 detected by RAD2 (1.25 nodules/patient, 25 on MRI and 25 on CT). Subsequently, the largest lesion suitable for surgical resection for suspected HCC was considered for each of the 40 patients.

RAD1 assigned LR-3 to 3 lesions (7.5%; 3/40), LR-4 to 8 lesions (20.0%; 8/40), LR-5 to 24 lesions (60.0%; 24/40) and LR-M to 5 lesions (12.5%; 5/40). RAD2 assigned LR-3 to 3 lesions (7.5%; 3/40), LR-4 to 14 lesions (35.0%; 14/40), LR-5 to 19 lesions (47.5%; 19/40) and LR-M to 4 lesions (10.0%; 4/40). Exact agreement between readers was 62.5% (25/40), with a κ value of 0.41 (95% CI 0.15–0.67); agreement was better for LR-5 (16/25), with a Cohen's κ of 0.46 (95% CI 0.18–0.73), and higher for MRI examinations (68%; 13/19). Histological examination confirmed the preoperative suspicion of HCC in 82.5% (33/40) of cases, while in 17.5% of cases (7/40) it revealed 3 intrahepatic cholangiocarcinomas (ICC), 1 hepato-cholangiocarcinoma (HCC-ICC), 1 primary hepatic lymphoma (PHL), 1 focal nodular hyperplasia (FNH) and 1 regenerative liver nodule (RLN). LI-RADS categories compared with the final histological results, for each radiologist, are summarized in Table 2.
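For readers less familiar with Cohen's κ, the sketch below shows how the statistic corrects raw agreement for the agreement expected by chance, and how a value maps onto the verbal bands listed in the statistical analysis. The paired ratings are invented for illustration (the per-lesion cross-tabulation of the two readers is not reproduced here), and the function names and the "no agreement" label for κ ≤ 0 are assumptions.

```python
from collections import Counter


def cohen_kappa(ratings_a: list, ratings_b: list) -> float:
    """Cohen's kappa for two readers rating the same lesions."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both readers must rate the same non-empty set of lesions")
    n = len(ratings_a)
    # Observed agreement: fraction of lesions given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the readers' marginal proportions,
    # summed over categories (a missing category counts as zero).
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both readers use one identical category")
    return (observed - expected) / (1 - expected)


def agreement_label(kappa: float) -> str:
    """Verbal bands from the statistical analysis section; the label for
    kappa <= 0 is an assumption, as the text gives no band for it."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical paired LI-RADS categories for 8 lesions.
    reader_1 = ["LR-5", "LR-5", "LR-4", "LR-3", "LR-5", "LR-M", "LR-4", "LR-5"]
    reader_2 = ["LR-5", "LR-4", "LR-4", "LR-3", "LR-5", "LR-M", "LR-5", "LR-5"]
    kappa = cohen_kappa(reader_1, reader_2)
    print(f"kappa = {kappa:.2f} ({agreement_label(kappa)})")
```

In this invented example the readers agree on 6 of 8 lesions (75%), but because both assign LR-5 to half of the lesions, about 34% agreement would be expected by chance alone, so κ is noticeably lower than the raw agreement.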

Table 2 LI-RADS categories compared with gold standard histological results (Hy) from the surgical specimen, according to the two radiologists

For our statistical analysis, we considered patients with radiological features indicating a high probability of HCC (category LR-4 or LR-5) and histological confirmation as “true positives”; conversely, we considered patients with radiological features not attributable to HCC and a histological diagnosis other than HCC as “true negatives.”
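The dichotomization described above can be sketched as a small classification rule. This is an illustrative reading of the definitions, not the authors' code: the names `POSITIVE_CATEGORIES`, `confusion_cell` and `tally_cases`, the exact-match test on the string "HCC" (so that a mixed HCC-ICC counts as non-HCC) and the example cases are all assumptions.

```python
from collections import Counter

# Categories treated as "test positive" for HCC, per the definition above.
POSITIVE_CATEGORIES = {"LR-4", "LR-5"}


def confusion_cell(category: str, histology: str) -> str:
    """Map one lesion (LI-RADS category, histological diagnosis) to TP/FP/FN/TN."""
    test_positive = category in POSITIVE_CATEGORIES
    # Assumption: only a pure HCC diagnosis counts as disease-positive.
    disease_positive = histology == "HCC"
    if test_positive and disease_positive:
        return "TP"
    if test_positive and not disease_positive:
        return "FP"
    if not test_positive and disease_positive:
        return "FN"
    return "TN"


def tally_cases(cases: list) -> Counter:
    """Count lesions per confusion-matrix cell."""
    return Counter(confusion_cell(category, histology) for category, histology in cases)


if __name__ == "__main__":
    # Hypothetical lesions, not the study data.
    example = [
        ("LR-5", "HCC"),
        ("LR-4", "HCC"),
        ("LR-3", "HCC"),
        ("LR-M", "ICC"),
        ("LR-5", "PHL"),
        ("LR-3", "FNH"),
    ]
    print(tally_cases(example))
```

The resulting counts can then be fed into any standard 2×2 calculation of sensitivity, specificity and predictive values.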

When considering only lesions at high risk of HCC (LR-4 and LR-5), the LI-RADS score assigned by RAD1 reached a sensitivity, specificity, positive and negative predictive values, positive and negative likelihood ratios (LR+ and LR-), diagnostic accuracy, diagnostic OR and AUC of 90.9%, 29.0%, 93.8%, 62.5%, 3.18 (LR+), 0.13 (LR-), 87.5%, 25 and 0.81, respectively, slightly higher than those of RAD2 (Table 3).

Table 3 Diagnostic accuracy of LI-RADS for high-risk HCC patients (LR4-5), for each radiologist

Focusing on the 7 patients with a histological diagnosis other than HCC, RAD1 was more accurate than RAD2. Both readers correctly assigned LR-3 to the nodule with a final diagnosis of FNH, but both incorrectly assigned LR-5 to the same two patients, whose lesions proved to be 1 PHL and 1 RLN.

Regarding imaging modality, according to RAD1, the more accurate reader, MRI proved to be a better modality than CT for imaging diagnosis, with APHE as the most sensitive and accurate of the major features. By contrast, portal/delayed washout appearance proved to be the best major feature on CT (Table 4).

Table 4 Diagnostic performance of LI-RADS major features by CT and MRI, according to RAD1

Discussion

HCC is the only tumor that can benefit from surgical treatment on the basis of imaging findings alone; the hallmark imaging features of HCC are APHE and portal/delayed washout, which represent the characteristic vascular pattern of HCC on dynamic CT or MRI [10].

However, various studies underline the persistence of a high percentage of false positives at preoperative staging, reaching up to 21% in patients undergoing transplantation [12]. This could be explained by the fact that approximately 40% of HCCs demonstrate atypical behavior and that many HCC mimickers, such as ICC and HCC-ICC, can be present in at-risk patients, thus representing a major diagnostic challenge for radiologists [14]. LI-RADS was conceived precisely to improve the consistency and clarity of radiologists' interpretation and reporting, providing a standardized lexicon, strict diagnostic criteria, an easy-to-follow diagnostic algorithm, and reporting guidelines, in order to prioritize patient care and optimize medical outcomes [3].

As the first fundamental point of our study, both radiologists were able to recognize all major lesions, with a Cohen's κ of 0.41 (95% CI 0.15–0.67), expressing moderate agreement, which was higher for MRI examinations, in accordance with the study by Davenport et al. [15]. Among imaging methods, and according to RAD1, MRI proved to be more effective in the diagnosis of HCC for categories LR-4 and LR-5, with a positive predictive value of 100% vs 82.4% for CT. This can be explained by the possibility of applying a larger number of ancillary features, which helps to reach a diagnosis in doubtful cases. Among the major features, APHE for MRI and portal/delayed washout for CT proved to be the most sensitive and accurate, but they were less specific than enhancing capsule, because APHE can be present even in benign entities (hemangiomas), dysplastic nodules and non-HCC malignancies and, in patients with advanced cirrhosis, parenchymal heterogeneity may obscure areas of washout [16].

Regarding the assignment of the LI-RADS categories, the greatest agreement was reached for LR-5 (LR-5 vs. non-LR-5: κ = 0.46, 95% CI 0.18–0.73), with PPVs of 91.7% and 89.5% for RAD1 and RAD2, respectively, demonstrating a good degree of reproducibility for the diagnosis of HCC [17]. However, RAD1 recognized 4 more nodules and confidently attributed LR-5 to 24 nodules; by contrast, RAD2 made a greater number of uncertain diagnoses (LR-4: 8 nodules for RAD1 vs 14 nodules for RAD2). As Liu et al. reported in 2017, the accuracy of radiologists increases after 8 years of experience and at least three months of specific training in the LI-RADS criteria, underlining that the management of patients with HCC requires specific knowledge and consolidated expertise [18].

However, both radiologists mistakenly assigned the LR-5 category to the same two cases, which proved to be 1 PHL and 1 RLN; these are therefore the Achilles' heels of the LI-RADS classification in our experience.

PHL, known as “the great mimicker”, is the main pitfall, and its diagnosis is often difficult, especially in patients with chronic liver disease and viral hepatitis. At CT, PHL commonly has soft tissue attenuation and variable contrast enhancement, lower than that of the liver parenchyma in all dynamic phases. It may demonstrate hemorrhage, necrosis or a rim enhancement pattern [19,20,21,22,23]. At MR imaging, the nodules tend to be hypo- or isointense on T1-weighted images (T1WI) and moderately hyperintense on T2-weighted images (T2WI), with an enhancement pattern similar to that seen at CT. A “target” appearance on T2WI, with a hyperintense poorly enhancing center and peripheral enhancement, has been described in about 15% of lesions. Diffusion-weighted imaging (DWI) is an important component of the imaging protocol for characterization of suspected lymphomatous lesions, because the highly cellular nature of lymphoma typically results in restricted diffusion, allowing earlier identification of disease than with traditional MRI sequences.

In general, imaging findings of lymphadenopathy below the level of the renal veins, poor lesion enhancement in all contrast-enhanced phases and vascular encasement without thrombosis favor a diagnosis of lymphoma [20, 24]. APHE, delayed washout with capsular enhancement and vascular thrombosis suggest HCC (Fig. 2); moreover, hepatic lymphomas are generally avidly hypermetabolic at PET, while most HCCs are not [19, 25, 26].

Fig. 2 Typical HCC at contrast-enhanced CT in the arterial (a) and delayed (b) phases

Nevertheless, these characteristics are not always unequivocally evident, as reported in a 2018 literature review by Bohlok et al. [27], which shows that hepatic lymphoma may appear as a single lesion or as multiple lesions with variable imaging characteristics. On CT it can present arterial hyperenhancement and portal/delayed washout (Fig. 3), while on MRI it may appear hypointense on T1WI and iso- or hyperintense on T2WI, and can therefore mislead the diagnosis toward HCC.

Fig. 3 Atypical lymphoma at contrast-enhanced CT in the arterial (a) and delayed (b) phases

Moreover, early HCCs, defined by WHO as a well-differentiated tumor < 2 cm, with poorly defined margins and vaguely nodular type, tend to show hypovascularity on dynamic imaging studies due to decreased portal supplies and insufficient neovascularization. Furthermore, HCCs expressing stemness-related markers, scirrhous, sarcomatoid and large (> 5 cm) HCCs may show targetoid appearance due to central areas of fibrosis or necrosis [10].

RLNs can also present confounding radiological characteristics: they are islands of hepatic parenchyma surrounded by fibrosis, generally showing the same enhancement as normal hepatic parenchyma. When > 15 mm, and especially in hepatitis C cirrhosis, they can show early enhancement in the arterial phase and low signal intensity (SI) on the hepatobiliary phase, erroneously steering the diagnosis toward HCC (LR-4 or LR-5) [17, 28, 29], as in our case (Fig. 4). Nevertheless, several studies have reported that ancillary features such as low SI on T1WI, high SI on T2WI and diffusion restriction on DWI may improve radiologists’ diagnostic accuracy in distinguishing early HCCs from dysplastic nodules (DNs) [29,30,31,32]. In particular, diffusion restriction on DWI was shown to be strongly related to progression to hypervascular HCC within a few years [33].

Fig. 4 Atypical regenerative nodule at contrast-enhanced CT in the arterial (a) and delayed (b) phases

Among the 7/40 cases in which the diagnosis of HCC was not confirmed by postoperative histological evaluation, there were also 3 ICC, 1 HCC-ICC and 1 FNH; in 4 of these cases (10%), LI-RADS would have significantly modified the treatment strategy.

In particular, for the 3 ICCs and 1 HCC-ICC (Fig. 5), the senior radiologist was more accurate in assigning the LR-M category (RAD1: 3 LR-M and 1 LR-3; RAD2: 1 LR-M and 3 LR-4); however, both radiologists attributed categories that expressed the need for multidisciplinary assessment. This would have allowed a more patient-tailored strategy, modifying the adopted therapeutic procedure by adding a biopsy before surgery and a lymphadenectomy to the surgical procedure, as foreseen in cases of pure cholangiocarcinoma. According to a 2014 review [33], lymphadenectomy allows better pathological staging, without an impact on long-term survival.

Fig. 5 Cholangiocarcinoma with (a, b) T2W targetoid appearance; (c) targetoid restriction; (d) no APHE; (e) delayed central enhancement; (f) targetoid HPB appearance

Finally, both radiologists assigned the focal nodular hyperplasia to LR-3, in line with the classification itself [1]: although not easily distinguishable from HCC with atypical behavior, this entity is rare in cirrhotic patients. In this case, to characterize the lesion correctly, diagnostic imaging would have been repeated after 3–6 months or, after a multidisciplinary discussion, a biopsy would have been performed; this would have established the benign nature of the lesion, thus avoiding liver resection and its possible complications.

From this perspective, every effort should be made to increase adherence to LI-RADS and to promote its standardized use internationally. For successful implementation, several concrete steps can be followed in a radiology practice: enlist a radiologist who will become a LI-RADS reference and a leader for change among colleagues; advocate the use of LI-RADS within the department and, with other clinicians, as a communication standard for reporting liver examinations in at-risk patients; promote a LI-RADS reporting template for patients at risk of HCC within the department or institution; use LI-RADS terminology and observation categories in multidisciplinary discussions (“tumor boards”), focusing on LR-4 and LR-5; and consider using mobile applications and online resources to help assign observation categories, such as downloadable templates or training webinars from the ACR web site [13].

Limitations

Our study has some limitations: its single-center, retrospective design and its relatively small sample size.

Conclusions

The LI-RADS classification increases diagnostic accuracy, showing good inter-observer reproducibility and excellent concordance with the anatomopathological data, and positively influences the therapeutic procedure, underlining the importance of a multidisciplinary approach. The diagnostic power of LI-RADS is greater when the imaging method used is MRI, which is more sensitive and accurate than CT in the diagnosis of HCC, in particular in the detection of the “washin” among the major features.

The interpretative ambiguities of the features included in the classification and the complexity of its compilation constitute the limits of the LI-RADS score, as they reduce the concordance between different radiologists, making it therefore still strongly operator dependent. Furthermore, in cases such as lymphoma or regenerative nodules, although correctly using the LI-RADS classification, it is often not possible to make a correct diagnosis, thus representing the weak point of the classification.

However, in our experience, even though based on a small cohort of patients, LI-RADS would have modified the treatment strategy in 4 cases (10%, 4/40), avoiding an unnecessary laparotomy in the case of FNH and extending the surgical procedure (with lymphadenectomy) for the three ICCs.

These data are significant because they underline how crucial diagnostic imaging is in the management of patients at risk of HCC, and show that LI-RADS has become an additional diagnostic tool to guide the therapeutic choices of the surgeon and the hepatologist. By minimizing errors, improving patient care and establishing a universal set of conventions for evaluating patients at risk of HCC, LI-RADS ensures the radiologist's value to the health-care team and strengthens communication among physicians, allowing more efficient multidisciplinary collaboration for research endeavors and quality assurance initiatives.