The previous articles in this special issue of Abdominal Radiology reviewed in detail the content and conceptual foundation of the Liver Imaging Reporting and Data System (LI-RADS) in 2017. This article provides a glimpse into the future, discussing the immediate and long-term plans for the continuing improvement and expansion of LI-RADS. While the article emphasizes goals directly relevant to LI-RADS, it also provides examples of far-reaching future directions that will require large multidisciplinary research efforts. Our hope is that this article will reinforce the reader’s understanding of LI-RADS and how it was developed, and will help prioritize the research, technical innovation, and political harmonization needed to secure its continued refinement, validation, and adoption.

Immediate goals

The Core and Essentials components of the latest version of LI-RADS were released in 2017 [1]. Currently in preparation is a comprehensive manual intended to augment these components. In addition to an updated atlas and lexicon, the manual will include technical recommendations, management guidance, and reporting instructions and templates. To improve radiologists’ application of LI-RADS, the manual will elaborate on details of the various algorithms, clarify and illustrate the LI-RADS categories, and provide background education on relevant topics such as cirrhosis, hepatocarcinogenesis, tumor biology, contrast agents, and treatment mechanisms of action. Additional immediate goals are to translate LI-RADS into other languages, prepare supplementary slides and flashcards to be made publicly available, and organize and offer LI-RADS workshops at radiology meetings such as the 2017 Radiological Society of North America Annual Meeting and the 2018 Society of Abdominal Radiology Annual Meeting.

Long-term goals

A dynamic system, LI-RADS will be updated every 3 to 5 years through a rigorous approval process that incorporates new knowledge, advances in technology, and user feedback. To understand how this will occur, it helps to first review the developmental process until now.

Historical development

Historically, LI-RADS was developed by a variable blend of data, expert opinion, and a desire for congruency with other diagnostic systems in North America, including Organ Procurement and Transplant Network (OPTN) [2] and American Association for the Study of Liver Diseases (AASLD) [3]. For example, the LI-RADS criteria for hepatocellular carcinomas (HCCs) ≥ 20 mm were supported strongly by data. Due to the lack of high-quality informative evidence, however, much of the LI-RADS content was developed without empirical substantiation. An example is the set of criteria for 10–19 mm LI-RADS 5 (LR-5). According to prior studies, the combination of arterial phase hyperenhancement and “washout” on CT or MRI was highly specific for HCC in 10–19 mm nodules detected at antecedent ultrasonography (US) [4, 5], but there were no validated criteria for establishing this diagnosis in 10–19 mm observations undetected by US. Since LI-RADS was intended to apply to all observations, not just sonographically discernible lesions, more general criteria were needed for the 10–19 mm size range. These criteria were created by starting with the validated criteria for larger (i.e., ≥ 20-mm) lesions and then imposing greater stringency (i.e., requiring additional features) to compensate for the smaller size. Similarly, the LR-4 and LR-3 criteria were created by a stepwise reduction approach from LR-5, loosening the criteria incrementally from LR-5 to LR-4 and then from LR-4 to LR-3. While the development of a system without hard evidence may seem flawed, it was a necessary compromise. Fundamentally, LI-RADS had to be created before it could be tested.

Future development

Now that the system has been created, deployed, and allowed to mature, we anticipate that future refinements will benefit from the high-quality scientific data starting to emerge from the broader research community. Two LI-RADS working groups will contribute most directly to the process: the Evidence Working Group will review the emerging literature, and the Research Plan Working Group will identify and highlight remaining gaps in knowledge. Examples of gaps in knowledge are provided below, organized into categories mirroring the elements in Table 2 in the article by Tang et al. in this issue [6].

Patient population

LI-RADS proposes different patient populations for screening and surveillance and for diagnosis and staging, but the ideal populations are not yet well understood. For example, the screening and surveillance population, adopted from the AASLD guidelines, is defined as the group in which HCC surveillance is assumed to be cost effective, typically when the estimated risk of HCC is 1.5% per year or greater. Since the modeling assumptions may be imperfect, however, the cost effectiveness of surveillance is difficult to estimate, which introduces unavoidable uncertainty. Another important gap that merits further research and has not yet been clarified by the AASLD guidelines is whether surveillance should be offered to non-Asian and non-African chronic hepatitis B carriers without cirrhosis and, if so, whether these surveillance recommendations should be constrained by age or other demographic or clinical characteristics [3].

As compared to the screening and surveillance population, LI-RADS defines the diagnostic and staging population conceptually as the group in which the pre-test probability of HCC and of non-HCC lesions resembling HCC is sufficiently high and low, respectively, that a lesion meeting HCC criteria can be presumed to be HCC. Operationally, this is assumed to include adults with cirrhosis, in whom the supporting evidence is reasonably strong, as well as non-cirrhotic chronic hepatitis B carriers. Since the use of LI-RADS in this latter group is supported by theoretical arguments rather than empirical corroboration, research is needed to provide supporting data. Other gaps in knowledge are whether the LI-RADS diagnostic algorithm can be applied in non-cirrhotic individuals with multiple HCC risk factors and in adults with cirrhosis due to vascular disorders, such as Budd–Chiari syndrome and cardiogenic cirrhosis.

Screening and surveillance algorithm

Despite the universal endorsement of US for HCC screening and surveillance by various clinical HCC-related practice guidelines worldwide, no organization has previously developed an algorithm for sonographic interpretation and reporting. To address this gap, LI-RADS v2017 introduced a screening and surveillance algorithm for US. Although this scheme represents an advance, the proposed US LI-RADS Category and Visualization Score are based on expert consensus with little direct data, and therefore require prospective validation. In particular, research is needed to evaluate inter- and intra-reader agreement for both scores. Research also is needed to understand how US sensitivity is affected by sonographic visualization of the liver: this knowledge will inform the possible integration of the newly developed Visualization Score into the follow-up recommendations.
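As an illustration of how reader agreement for a categorical score might be quantified, the following minimal sketch computes Cohen's kappa for two sets of ratings. Cohen's kappa is one common agreement statistic and is used here as an assumption; the article does not specify which statistic should be used. The function name, the three-level score labels, and the ratings themselves are hypothetical.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Cohen's kappa for two lists of categorical ratings of the same exams."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(reader_a)
    # Proportion of exams on which the two ratings agree.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance, from each reader's category frequencies.
    counts_a = Counter(reader_a)
    counts_b = Counter(reader_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores assigned by two readers to the same six exams.
reader_1 = ["A", "A", "B", "B", "C", "A"]
reader_2 = ["A", "B", "B", "B", "C", "A"]
kappa = cohen_kappa(reader_1, reader_2)
print(f"inter-reader kappa = {kappa:.3f}")
```

The same function could be applied to two reading sessions by a single reader to estimate intra-reader agreement. For ordinal scores, a weighted kappa may be more appropriate than the unweighted form sketched here.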

Another gap in knowledge is the ideal surveillance interval. While two studies have reported that a 6-month surveillance interval prolongs survival compared to a 12-month interval, even after adjusting for lead time [7, 8], a third study found no difference in survival benefit between these intervals [9]. Despite the inconsistency of these results, the AASLD recommends semi-annual surveillance for at-risk patients, a time interval that the US LI-RADS algorithm adopted for congruity [3]. Further research is needed to validate the 6-month interval as well as the slightly expedited follow-up interval (3 to 6 months) recommended by LI-RADS after detection of a subthreshold (< 10 mm) observation.

Perhaps the most important barrier to effective screening at the population level is inconsistent conformity with surveillance recommendations. Several studies have shown that only a minority of patients with cirrhosis undergo regular surveillance, in clear non-observance of practice guidelines. Effective reminders and alert systems are needed to facilitate compliance with surveillance recommendations. Although not created for this purpose, the standardized reporting enabled by LI-RADS may positively contribute to the development and successful deployment of such procedures.

Diagnostic imaging modalities, techniques, and contrast agents

LI-RADS provides a single diagnostic algorithm for multiphase CT, MRI with extracellular agents (ECA), and MRI with gadoxetate disodium, a type of hepatobiliary agent (HBA). While initially combined for simplicity, the use of a common algorithm for all three imaging methods has a potentially important drawback. Emerging evidence suggests that the assigned categories are modality-dependent, with different modalities assigning different categories to the same observation. Research is needed to understand the sources of discordance, knowledge that could potentially be used to modify the LI-RADS criteria and improve cross-modality reproducibility. A related gap in knowledge is how to integrate information across modalities. Since each imaging modality has its own advantages and disadvantages, some major and ancillary features may be characterized better or even uniquely by particular modalities, for example T2 hyperintensity and hepatobiliary phase hypointensity. Research is needed to inform how imaging features across modalities could be combined to further improve sensitivity for HCC diagnosis without sacrificing specificity.

Diagnostic scope

Unlike most other HCC imaging systems, which focus on the diagnosis of definite HCC in the absence of macrovascular invasion, LI-RADS provides guidance on the diagnosis of non-HCC malignancies (LR-M) and presence of tumor in vein (LR-TIV). The LR-M criteria were selected based on descriptive retrospective case series published in small, single-center studies, with unavoidable selection biases arising from their retrospective design. Thus, prospective research is needed to rigorously test and appropriately refine the LR-M criteria. Similarly, despite its relevance to staging, prognosis, and treatment planning, the imaging-based diagnosis of macrovascular invasion is poorly understood. Research is needed to rigorously assess inter- and intra-reader reliability and diagnostic accuracy of the LR-TIV category and to refine its criteria as needed. Additionally, research is needed to identify and validate criteria for differentiating the potential causes of tumor in vein (including HCC, intrahepatic cholangiocarcinoma (ICC), and combination cholangiocarcinoma-HCC).

Diagnostic technical expertise

Unlike other systems, which specifically recommend that HCC imaging be performed in centers of excellence, LI-RADS is intended for both community and academic settings and for use by experts and non-experts alike. Although this is a pragmatic necessity, studies are needed to confirm that LI-RADS can be applied reliably and appropriately across the spectrum of radiology practices that participate in the care of patients with liver disease.

Terminology

While other HCC diagnostic systems classify lesions in a binary fashion, as definitely HCC or not definitely HCC, LI-RADS is more granular, stratifying the spectrum from benign to HCC into five levels, or categories, of relative likelihood. These LI-RADS categories were designed to separate observations into clinically distinct groups, so that each category would have meaningful impact, different from that of other categories, on prognosis and clinical decision making. A single-center retrospective study provided preliminary validation for this scheme by showing that LR-2, LR-3, and LR-4 observations have different probabilities as well as different cumulative incidences of progression to LR-5 during clinical follow-up [10]. These results need to be confirmed in independent cohorts of patients, ideally in multicenter prospectively conducted studies with clinical endpoints.

Major imaging features

LI-RADS endorses four major imaging features for the diagnosis of HCC: arterial phase hyperenhancement, “washout,” “capsule,” and threshold growth [11, 12]. For LR-5 categorization, an observation must be ≥ 10 mm in size, show arterial phase hyperenhancement, and, depending on observation size, show various combinations of the other features. Since the prior literature used the preceding terms ambiguously or inconsistently, LI-RADS undertook to formulate standardized definitions, a process which by necessity involved arbitrary decisions. Examples of arbitrary decisions were the requirement that an observation be brighter than liver in the arterial phase to qualify for the feature arterial phase hyperenhancement, the inclusion of the “capsule” (if present) in the size measurement, and the evaluation of “washout” relative to composite background liver tissue rather than to background nodules. Research is needed to understand the effect of the multiple arbitrary decisions on LI-RADS categorization and to refine the definitions as appropriate to improve reader reliability and/or to improve sensitivity for HCC while maintaining specificity.

Also controversial is the 20 mm size threshold for stipulating the number of additional major features needed for LR-5 categorization. The choice of 20 mm as a size stratifier was born from a desire for congruency with other imaging systems such as OPTN as well as with the histopathological convention of classifying HCCs as small (< 20 mm) or large (≥ 20 mm), but there is little if any evidence to suggest that this is the optimal threshold. Research is needed to assess whether a smaller threshold, say 15 mm, would increase sensitivity for HCC without impairing other performance parameters.
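To make the role of the size stratifier concrete, the following sketch shows how changing the threshold could alter which observations qualify under a simplified rule. It is not the published LI-RADS algorithm. It assumes, purely for illustration, that observations at or above the threshold need one additional major feature and that smaller observations need two; the function name, parameter names, and example observations are hypothetical.

```python
def meets_lr5_sketch(size_mm, aphe, n_additional_features, size_threshold_mm=20.0):
    """Simplified illustration of a size-stratified rule (not the LI-RADS algorithm).

    Assumption: observations at or above the threshold require one additional
    major feature; smaller observations require two.
    """
    # Per the text: LR-5 requires size >= 10 mm and arterial phase hyperenhancement.
    if size_mm < 10 or not aphe:
        return False
    required = 1 if size_mm >= size_threshold_mm else 2
    return n_additional_features >= required


# Hypothetical observations: (size in mm, APHE present, additional major features).
observations = [
    (12, True, 1),
    (16, True, 1),
    (18, True, 2),
    (25, True, 1),
    (22, False, 2),
]


def count_qualifying(threshold_mm):
    """Count observations that satisfy the sketch rule at a given threshold."""
    return sum(meets_lr5_sketch(s, a, n, threshold_mm) for s, a, n in observations)


at_20 = count_qualifying(20.0)
at_15 = count_qualifying(15.0)
print(f"qualifying at 20 mm: {at_20}; at 15 mm: {at_15}")
```

In this toy example, lowering the threshold to 15 mm reclassifies the 16 mm observation. Whether such reclassified observations are truly HCC, and hence whether sensitivity improves without a loss of specificity, is the question that would need to be answered with reference-standard data.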

Unlike AASLD, LI-RADS advocates threshold growth as a major feature, adopting from OPTN the requisite growth rate for defining this feature. Since growth is a non-specific characteristic of malignancy, its inclusion as a major feature of HCC may be unjustified. Moreover, as growth is assessable only in patients with prior exams, it introduces randomness into the categorization related to whether or not a particular patient has a prior examination available. Finally, the growth rate advocated by OPTN is unsupported by scientific evidence. Research is needed to validate threshold growth as a major feature of HCC.

Ancillary imaging features

In addition to major imaging features, LI-RADS also advocates the use of ancillary imaging features for category adjustment [13]. These features were selected based on a variable combination of evidence from single-center retrospective studies, biological plausibility, and expert opinion [12]. While the level of evidence supporting the ancillary features individually is relatively low, there is even less data to inform exactly how the features should be applied to adjust LI-RADS categories. Research is needed to understand the incremental impact of ancillary features, both alone and in various combinations, on sensitivity and specificity.
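The incremental impact of a feature set can be expressed as the change in sensitivity and specificity when the same observations are categorized with and without it. The sketch below illustrates the calculation only. All data are invented for the example, and the function and variable names are hypothetical.

```python
def sensitivity_specificity(predicted, truth):
    """Return (sensitivity, specificity) for paired boolean calls and reference results."""
    tp = sum(p and t for p, t in zip(predicted, truth))
    fn = sum((not p) and t for p, t in zip(predicted, truth))
    tn = sum((not p) and (not t) for p, t in zip(predicted, truth))
    fp = sum(p and (not t) for p, t in zip(predicted, truth))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")
    return tp / (tp + fn), tn / (tn + fp)


# Invented reference standard: True means HCC was confirmed.
truth = [True, True, True, True, False, False, False, False]
# Invented positive calls using major features only.
major_only = [True, True, False, False, False, False, False, False]
# Invented positive calls after adjustment with ancillary features.
with_ancillary = [True, True, True, False, True, False, False, False]

sens_major, spec_major = sensitivity_specificity(major_only, truth)
sens_anc, spec_anc = sensitivity_specificity(with_ancillary, truth)
print(f"major only:     sensitivity {sens_major:.2f}, specificity {spec_major:.2f}")
print(f"with ancillary: sensitivity {sens_anc:.2f}, specificity {spec_anc:.2f}")
```

In this invented dataset the adjustment raises sensitivity at the cost of specificity, which is the kind of trade-off that studies of individual ancillary features and their combinations would need to quantify.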

Staging, management, and transplant eligibility

Although the OPTN system is the current state-of-the-art for imaging-based staging and transplant prioritization, radiologists and other specialists should be aware of some limitations and gaps in knowledge [2]. The current tumor staging system ignores suspicious lesions that fail to meet HCC criteria (e.g., LR-4), even though patients with multiple such lesions may have multifocal HCC and should not undergo transplantation. Research is needed to determine whether and how such lesions should be incorporated in HCC staging and transplant prioritization. Although imaging is frequently used to establish the presence of tumor in vein (macrovascular invasion), the accuracy of imaging for this purpose is not well established owing to the paucity of published data. Research is needed to prospectively assess the accuracy and reader reliability of each imaging modality [CT, MRI with ECA, MRI with HBA, and contrast-enhanced US (CEUS)] for this diagnosis and to develop and validate high-specificity criteria. Until such criteria are validated, radiologists should be cautious in reporting the presence of tumor in vein, reserving the LR-TIV category for unequivocal cases. If there is any doubt, the presence of tumor in vein may be suggested along with an approximate confidence level, but the LR-TIV category should not be assigned.

Treatment response

LI-RADS v2017 introduced a system using CT and MRI for assessing HCC response to several different types of locoregional treatment. Adapted from the Modified RECIST system, this new system includes several elements developed through expert opinion. There is now a need to prospectively evaluate the performance of the system, including a critical and rigorous assessment of the proposed criteria for different types of locoregional therapies and at different time points after their application. Additionally, the current system has gaps that need to be filled. For example, the system does not apply to LR-M or histopathologically proven ICC, since the mechanisms of action of locoregional treatments in these tumors may differ from those in HCC. Development and validation of tumor response criteria for LR-M and path-proven ICC therefore are needed. Additionally, the recently released first version of CEUS LI-RADS did not include a treatment response algorithm for CEUS, a gap that is planned to be addressed in an upcoming LI-RADS update.

Future adoption

A major long-term goal for LI-RADS is to promote the adoption of a single imaging system for HCC diagnosis among community and academic radiologists for clinical care, research, and education. Such adoption will enhance communication with clinicians, improve the consistency and clarity of radiology reports, facilitate quality assurance, reduce diagnostic errors, facilitate meta-analysis of the published literature, enable the creation of imaging-based registries, and provide highly vetted consensus-generated material for education. The existence of competing systems for HCC diagnosis in North America and abroad presents a barrier to such adoption and introduces confusion, since the systems are inconsistent with each other. Ultimately, it is hoped that these various systems will be unified. LI-RADS may play an important role in that unification by providing a common framework whose core components can be adjusted as appropriate to match local and regional variation in healthcare systems, clinical practices, and resources. An immediate step in this direction is to translate LI-RADS into other languages.

The diagnostic rigor and complexity of LI-RADS are among its major strengths, but also are major barriers to broad adoption. To make LI-RADS simpler to apply in daily clinical care, the American College of Radiology has partnered with industry to develop radiology reporting systems that seamlessly integrate LI-RADS into radiologists’ normal workflow. It is anticipated that these systems will allow radiologists to use LI-RADS efficiently and accurately, while automatically structuring report content to permit quality assurance, data mining, and ultimately the creation of a LI-RADS registry.

Future expansion in scope

Since its first release in 2011, the scope of LI-RADS has expanded from two modalities (CT and MRI with extracellular agents) to multiple modalities (US, CEUS, and MRI with HBA) and from one set of clinical indications (diagnosis and staging) to a wider range now including screening and surveillance as well as treatment response assessment. Future potential expansions in scope include HCC treatment response with CEUS, interpretation and reporting of benign liver lesions, pediatric liver lesions, and quantitative liver imaging as well as the development of radiology-pathology consensus terminology.

Conclusion

As a comprehensive and standardized system for imaging HCC, LI-RADS has been developed over many years through a rigorous process of refinement and consensus-based approval. Decisions to date have been grounded on evidence when available and on expert opinion and conformity to existing guidelines otherwise. Unfortunately, evidence has been either unavailable or reported using inconsistent or ambiguous terminology, which has created challenges for the interpretation and incorporation of those data. Additionally, most diagnostic accuracy studies have failed to report results with sufficient granularity to allow extraction of all relevant information. For these reasons, much of the current LI-RADS content has been developed without robust supportive evidence. Now that LI-RADS is operational, we anticipate that its future refinement will be guided by emerging high-quality data.

In this article, we have reviewed many of the knowledge gaps to be filled by forthcoming research. We hope that future research projects and manuscripts will adopt the LI-RADS terminology to enable pooling of data across studies and allow for robust meta-analysis. Similarly, we urge researchers to report performance results for individual features and for relevant feature combinations or, when possible, to publish deidentified datasets that allow other investigators to extract the needed information. The development of structured reporting tools that seamlessly integrate into routine clinical workflows will further enhance the adoption of LI-RADS, facilitate the creation of large registries, and allow the potentially transformative collection of “Big Data” related to HCC. In parallel with these research endeavors, multidisciplinary cooperative efforts are needed to unify the various diagnostic systems. By providing a rigorous and comprehensive foundation, LI-RADS may play a central role in this desired unification.