Introduction

Tremendous progress has been made over the last few decades in our understanding of chronic hepatitis B virus (HBV) infection and in its prevention and treatment; however, the global burden of HBV infection remains high. Expanding coverage of universal HBV vaccination and improving diagnosis and linkage to care are essential to meet the World Health Organization goal of eliminating HBV infection by 2030, a strategy that contributes to the proposed targets of reducing chronic viral hepatitis incidence and mortality by 90% and 65%, respectively [1]. To achieve a 65% reduction in chronic viral hepatitis mortality, preventing deaths from the complications of chronic HBV infection is a top priority. Hepatocellular carcinoma (HCC) is the fifth most common cancer in men and the ninth most common in women worldwide [2]. It carries a high mortality rate and is the third most frequent cause of cancer death globally (782,000 deaths in 2018) [2]. Chronic HBV infection is a key risk factor for HCC development, accounting for approximately 50% of cases worldwide and 70–80% of cases in regions where HBV is highly endemic [3•]. Up to 30–40% of chronically infected persons will die of complications of chronic liver disease, including cirrhosis and HCC [4]. The majority of the HCC disease burden (85%) is found in low- and middle-income countries with a high prevalence of HBV, such as those in the Asia-Pacific region [5]. This pattern of disease burden places a heavy financial strain on areas where resources for antiviral therapy, HCC surveillance, diagnosis, and treatment are often limited. There is therefore an urgent need to develop accurate risk scores for HBV-related HCC to guide patient selection for antiviral therapy and HCC surveillance.

Developing HCC Risk Scores

Existing HCC risk scores were mostly developed using traditional regression methods. The most commonly used method is Cox proportional hazards regression, which models the relationship between covariates and the time to HCC development. It is a semi-parametric model that does not require stringent assumptions about the underlying distribution of time to HCC development, unlike parametric survival models such as the Weibull model. In the regression model, cutoffs for continuous covariates are commonly adopted to simplify the calculation of the risk score while preserving the overall prediction accuracy, although this simplification may be less relevant today given the availability of application software and online calculators. The risk score is usually built by first assigning a weight to each of the selected factors according to its regression coefficient. The risk score is then formulated as the weighted sum of the selected factors or as a transformation of the weighted sum. The calibration and discrimination of the risk score are validated internally or, preferably, externally in an independent cohort. Single or multiple cutoff values are then determined to stratify patients into different levels of HCC risk.
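The weighting and stratification steps described above can be sketched in a few lines of Python. This is only an illustration: the factor names, the coefficients, the rounding rule (each coefficient divided by the smallest one), and the cutoff of 4 are all hypothetical assumptions and are not taken from any published HCC risk score.

```python
# Hypothetical Cox regression coefficients (log hazard ratios) for
# dichotomized risk factors. These are NOT values from any published score.
HYPOTHETICAL_COEFS = {
    "age_over_50": 1.10,
    "male": 0.55,
    "cirrhosis": 1.65,
    "hbv_dna_high": 0.80,
}


def build_points(coefs):
    """Divide each coefficient by the smallest one and round to integer points."""
    base = min(abs(b) for b in coefs.values())
    return {name: round(b / base) for name, b in coefs.items()}


def risk_score(patient, points):
    """Weighted sum: add the points of every factor present in the patient."""
    return sum(pts for name, pts in points.items() if patient.get(name, False))


def risk_group(score, cutoff):
    """Single-cutoff stratification (scores at or above the cutoff are high risk)."""
    return "high" if score >= cutoff else "low"


points = build_points(HYPOTHETICAL_COEFS)
patient = {"age_over_50": True, "male": True, "cirrhosis": False, "hbv_dna_high": True}
score = risk_score(patient, points)
print(points, score, risk_group(score, cutoff=4))
```

In practice the cutoff would be chosen from the development cohort (for example, to reach a target negative predictive value) and then checked in a validation cohort rather than fixed in advance as it is here.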

An extension of Cox regression is time-dependent Cox regression, which incorporates covariates that change over time. The natural history of chronic hepatitis B (CHB) is dynamic and involves complicated interactions between viral and host factors. In addition, long-term antiviral treatment modifies the natural history of CHB: it can prevent disease progression and improve hepatic function and liver fibrosis [6, 7•]. Thus, the risk of HCC is reduced with long-term effective antiviral treatment [8, 9•]. Time-dependent Cox regression is a reasonable approach to improve the prediction of HCC risk by incorporating liver function and liver fibrosis status as they change over time, especially for patients who receive antiviral treatment. On the other hand, competing events of HCC, including non-HCC-related death and liver transplantation, can lead to overestimation of the cumulative incidence of HCC if they are ignored, and thus to errors in the estimated association of risk factors with HCC development. Cox regression-based competing risk models, such as the Fine-Gray subdistribution hazard model and the cause-specific hazard model, may have a role if a significant proportion of patients experience competing events, for example, among cirrhotic patients. Hsu et al. used the Fine-Gray model to identify HCC risk factors (Table 1) and examined the impact of treatment duration on the reduced annual incidence of HCC [10]. They subsequently developed an HCC risk score, the CAMD score, for Asian CHB patients on entecavir or tenofovir disoproxil fumarate (TDF) [11].
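The overestimation caused by ignoring competing events can be illustrated with a small, self-contained sketch. It compares the naive estimate (one minus Kaplan-Meier, with competing events treated as censoring) against the Aalen-Johansen cumulative incidence estimate. It does not fit a Fine-Gray or cause-specific regression model, and the ten follow-up records are invented toy data.

```python
# Each record is (time, event): 0 = censored, 1 = HCC,
# 2 = competing event (non-HCC death or liver transplantation).
# Toy data invented for illustration only.
TOY_DATA = [(1, 2), (2, 1), (3, 2), (4, 1), (5, 0),
            (6, 1), (7, 0), (8, 2), (9, 1), (10, 0)]


def one_minus_km(data, cause=1):
    """Naive estimate: 1 - Kaplan-Meier, treating every other event as censoring."""
    surv, at_risk = 1.0, len(data)
    for t in sorted({time for time, _ in data}):
        events = sum(1 for time, e in data if time == t and e == cause)
        surv *= 1 - events / at_risk
        at_risk -= sum(1 for time, _ in data if time == t)
    return 1 - surv


def aalen_johansen_cif(data, cause=1):
    """Cumulative incidence of `cause` that accounts for competing events."""
    event_free = 1.0  # probability of no event of any type before time t
    cif, at_risk = 0.0, len(data)
    for t in sorted({time for time, _ in data}):
        d_cause = sum(1 for time, e in data if time == t and e == cause)
        d_any = sum(1 for time, e in data if time == t and e != 0)
        cif += event_free * d_cause / at_risk
        event_free *= 1 - d_any / at_risk
        at_risk -= sum(1 for time, _ in data if time == t)
    return cif


naive = one_minus_km(TOY_DATA)
cif = aalen_johansen_cif(TOY_DATA)
print(f"1 - KM: {naive:.3f}, cumulative incidence: {cif:.3f}")
```

On these toy data the naive estimate is about 0.70, whereas the cumulative incidence that accounts for competing events is 0.48, because patients who died or underwent transplantation can no longer develop HCC.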

Table 1 Risk factors for HBV-related HCC

Missing data is another less noticeable but potentially important issue in risk score development. Many HCC risk scores involve laboratory measurements, including alanine aminotransferase (ALT), platelet count, albumin, and total bilirubin, and viral markers, including HBV DNA and hepatitis B e antigen (HBeAg) status, which may not be measured in all patients. Moreover, the development cohort of a risk score may come from multiple centers that have their own protocols for collecting clinical data; this causes missing data if not every center records the same set of measurements. Risk score development based on complete cases can introduce selection bias and reduce the precision of the effect estimates of risk factors for HCC. Methods to handle missing data, such as multiple imputation, can help overcome the bias arising from missing data. It is worth noting that, except for the PAGE-B score, most existing HCC risk scores did not clearly report how missing data were handled in model development, which may affect their reliability and generalizability.
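As a sketch of the final step of multiple imputation, the snippet below pools estimates from several imputed datasets with Rubin's rules (pooled estimate = mean of the estimates; total variance = within-imputation variance plus (1 + 1/m) times the between-imputation variance). The imputation and model fitting steps are not shown, and the five log hazard ratios and standard errors are hypothetical numbers, not results from any study.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates and standard errors with Rubin's rules."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log hazard ratios for one risk factor, one per imputed dataset.
log_hrs = [0.50, 0.60, 0.55, 0.45, 0.65]
ses = [0.20, 0.20, 0.20, 0.20, 0.20]

pooled_log_hr, pooled_se = pool_rubin(log_hrs, ses)
print(f"pooled HR = {math.exp(pooled_log_hr):.2f}, pooled SE = {pooled_se:.3f}")
```

The pooled standard error is larger than the per-imputation standard error of 0.20, which reflects the additional uncertainty due to the missing values.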

HCC Risk Scores in Untreated Patients (Table 2)

REACH-B and mREACH-B Scores

The risk estimation for hepatocellular carcinoma in chronic hepatitis B (REACH-B) score was developed in a cohort of 3584 non-cirrhotic, treatment-naive CHB patients from the landmark, community-based Taiwanese REVEAL-HBV study [12]. During a median follow-up of 12 years, 131 patients developed HCC. The score is composed of gender, age, ALT, HBeAg status, and HBV DNA level and ranges from 0 to 17. Another cohort of 1505 treatment-naive patients from Hong Kong and South Korea constituted the validation cohort, of whom 111 developed HCC. The areas under the receiver operating characteristic curve (AUROCs) of the REACH-B score for predicting HCC risk in the validation cohort were 0.811, 0.796, and 0.769 at 3, 5, and 10 years, respectively. As this score was derived from a non-cirrhotic cohort, its performance was generally better among the non-cirrhotic patients in the validation cohort, with AUROCs of 0.902, 0.783, and 0.806 at 3, 5, and 10 years, respectively.
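For readers less familiar with the AUROC, the sketch below computes it as the probability that a randomly chosen patient who developed HCC has a higher score than a randomly chosen patient who did not, counting ties as one half. The scores are invented toy values, and the sketch ignores follow-up time and censoring, which the time-specific AUROCs reported at 3, 5, and 10 years have to account for.

```python
def auroc(scores_with_hcc, scores_without_hcc):
    """Fraction of case-control pairs in which the case has the higher score."""
    concordant = 0.0
    for case in scores_with_hcc:
        for control in scores_without_hcc:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5  # a tie counts as half a concordant pair
    return concordant / (len(scores_with_hcc) * len(scores_without_hcc))


# Toy risk scores, invented for illustration only.
hcc_scores = [12, 9, 14, 7]
no_hcc_scores = [5, 8, 9, 3, 6]

toy_auroc = auroc(hcc_scores, no_hcc_scores)
print(f"AUROC = {toy_auroc:.3f}")
```

An AUROC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, so values around 0.8 indicate that the score ranks most patients who develop HCC above those who do not.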

Table 2 HCC risk prediction scores in untreated patients

However, the importance of HBV DNA diminishes in patients with a complete virological response, and the REACH-B score does not perform well in CHB patients treated with nucleos(t)ide analogues. Hence, the modified REACH-B (mREACH-B) score was developed by replacing the suppressed HBV DNA level with liver stiffness measurement (LSM) in a cohort of 192 entecavir-treated CHB patients in South Korea who had achieved a complete virological response [13]. During a median follow-up of 43 months, 15 patients developed HCC. In the mREACH-B I score, LSM values of < 8.0 kPa, 8.0–13.0 kPa, and > 13.0 kPa were assigned 0, 1, and 2 points, respectively; its AUROC at 3 years was 0.805, compared with 0.629 for the REACH-B score. In the mREACH-B II score, the weighting was doubled to 0, 2, and 4 points for the same LSM categories, which slightly improved the AUROC at 3 years to 0.814.
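The LSM weighting of the two mREACH-B versions can be written as a small lookup. The sketch covers only the LSM component as described above; the points for the remaining components are not given in this text and are therefore not implemented. The function name and the `version` argument are illustrative choices.

```python
def lsm_points(lsm_kpa, version="I"):
    """Points for the LSM component of mREACH-B I or II, as described in the text."""
    if version not in ("I", "II"):
        raise ValueError("version must be 'I' or 'II'")
    if lsm_kpa < 8.0:
        category = 0          # < 8.0 kPa
    elif lsm_kpa <= 13.0:
        category = 1          # 8.0-13.0 kPa
    else:
        category = 2          # > 13.0 kPa
    # mREACH-B I assigns 0/1/2 points; mREACH-B II doubles them to 0/2/4.
    return category * (1 if version == "I" else 2)


for lsm in (6.5, 10.0, 20.0):
    print(lsm, lsm_points(lsm, "I"), lsm_points(lsm, "II"))
```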

CU-HCC and LSM-HCC Scores

The Chinese University HCC (CU-HCC) score was developed in a cohort of 1005 CHB patients from a prospective study at the Prince of Wales Hospital [14•]. During a median follow-up of nearly 10 years, 152 patients (15.1%) received antiviral therapy, and 105 patients (10.4%) developed HCC. The score consists of age, serum albumin, total bilirubin, HBV DNA, and cirrhosis and ranges from 0 to 44.5. Using cutoff values of 5 and 20, patients were stratified into low-risk, intermediate-risk, and high-risk groups; 12 (2.2%), 41 (14.5%), and 52 (29.4%) patients in these groups developed HCC, respectively. The corresponding sensitivity and negative predictive value (NPV) of the cutoff value of 5 were 88.6% and 97.8%. The CU-HCC score was validated in a cohort of 424 CHB patients, of whom 45 (10.6%) developed HCC during a median follow-up of 10.53 years. The numbers of patients developing HCC in the low-risk, intermediate-risk, and high-risk groups were 8 (2.7%), 20 (31.8%), and 17 (26.6%), respectively. The sensitivity and NPV of the cutoff value of 5 were 82.2% and 97.3%, which remained satisfactory.
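The reported sensitivity and NPV at the cutoff value of 5 follow directly from the counts above, as the short calculation below shows. Sensitivity is the share of all HCC cases that fall at or above the cutoff (intermediate- plus high-risk groups), and NPV is one minus the HCC proportion in the low-risk group. Because the text gives the low-risk group only as a percentage, the NPV is computed from that rounded percentage rather than from raw counts.

```python
def sensitivity(hcc_above_cutoff, hcc_total):
    """Proportion of all HCC cases classified above the low-risk cutoff."""
    return hcc_above_cutoff / hcc_total


def npv(hcc_proportion_in_low_risk):
    """Proportion of low-risk patients who did not develop HCC."""
    return 1 - hcc_proportion_in_low_risk


# Training cohort: 12, 41, and 52 HCC cases in the low-, intermediate-,
# and high-risk groups; 2.2% of the low-risk group developed HCC.
sens_train = sensitivity(41 + 52, 12 + 41 + 52)
npv_train = npv(0.022)

# Validation cohort: 8, 20, and 17 HCC cases; 2.7% in the low-risk group.
sens_val = sensitivity(20 + 17, 8 + 20 + 17)
npv_val = npv(0.027)

print(f"training:   sensitivity {sens_train:.1%}, NPV {npv_train:.1%}")
print(f"validation: sensitivity {sens_val:.1%}, NPV {npv_val:.1%}")
```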

As cirrhosis cannot be diagnosed accurately by ultrasonography and its clinical diagnosis carries a degree of subjectivity, the LSM-HCC score was refined from the CU-HCC score by using LSM instead of clinical cirrhosis [15•, 16]. A total of 1555 CHB patients in Hong Kong were included and randomly assigned to training (1035 patients) and validation (520 patients) cohorts. Thirty-eight patients (3.7%) in the training cohort and 17 patients (3.4%) in the validation cohort developed HCC during a mean follow-up of 69 months. The LSM-HCC score is composed of LSM, age, serum albumin, and HBV DNA level (i.e., one component fewer than the CU-HCC score, as total bilirubin is omitted) and ranges from 0 to 30. Using 11 as the cutoff value, 4 patients (0.6%) in the low-risk group and 29 patients (8.8%) in the high-risk group developed HCC at 5 years in the training cohort. The corresponding sensitivity and NPV at 5 years were 87.9% and 99.4%. In the validation cohort, 1 (0.3%) and 12 (7.6%) patients in the low- and high-risk categories developed HCC at 5 years. The AUROCs at 5 years were 0.83 in both the training and validation cohorts.

Combination of Serum Biomarkers and LSM (e.g., ELF-LSM-HCC Score)

Combining the enhanced liver fibrosis (ELF) score, a panel of proprietary serum biomarkers for liver fibrosis, with LSM reduces the number of patients in whom the diagnosis of advanced liver fibrosis by LSM is uncertain [17]. A novel ELF-LSM-HCC score was developed in a cohort of 453 CHB patients in Hong Kong who were at intermediate or high risk as defined by the LSM-HCC score [18]. All patients were treated with nucleos(t)ide analogues; 45 patients (9.9%) developed HCC during a mean follow-up of 56 months. Patients at intermediate risk of HCC as defined by the LSM-HCC score had the ELF score measured to further stratify them into low-risk or high-risk groups. HCC occurred in 6 (4.7%) and 15 (11.7%) of the low- and high-risk patients, respectively, as defined by the combined ELF-LSM-HCC score. The sensitivity and NPV for predicting HCC were 86.7% and 95.3%.

Other HCC Risk Scores—GAG-HCC, NGM1-HCC, and NGM2-HCC

The Guide with Age, Gender, HBV DNA, Core Promoter Mutations and Cirrhosis-HCC (GAG-HCC) score was derived from 820 CHB patients at the Queen Mary Hospital, Hong Kong [19]. No patient was receiving treatment for CHB at baseline. During a mean follow-up of 76.8 months, 40 patients (4.9%) developed HCC. The original version of the GAG-HCC score was composed of gender, age, HBV DNA, core promoter mutations, and cirrhosis. The cutoff value of 101 had a sensitivity and NPV of > 84% and > 98% for both 5- and 10-year prediction. The accuracy was not validated in an independent cohort but was assessed by leave-one-out validation. As core promoter mutation results may not be readily available in many laboratories, a simplified version of the GAG-HCC score was developed by excluding core promoter mutations. Using cutoff values of 100 and 82, the sensitivity and NPV were > 67% and > 98% at 5 and 10 years. The AUROCs of both versions were ≥ 0.87 for 5- and 10-year prediction.

Nomogram 1-HCC (NGM1-HCC) and nomogram 2-HCC (NGM2-HCC) were derived from 3653 Taiwanese CHB patients in the REVEAL-HBV study cohort [20]. These patients were randomly divided into training and validation cohorts in a 2:1 ratio. Gender, age, family history of HCC, alcohol consumption habit, serum ALT level, serum HBeAg status, serum HBV DNA level, and HBV genotype were all independent predictors of HCC and made up the risk scores. Nomograms were developed to predict the risk of HCC from the total risk score of each patient. The AUROCs of the risk prediction nomograms were ≥ 0.82 in all training and validation cohorts. The risk prediction tools were well calibrated, with correlation coefficients greater than 0.9 between the observed and estimated HCC risk. Yet the lack of HBV genotype information may be a concern, as genotyping is not a routine assay in many centers.

HCC Risk Scores in Treated Patients

Performances of Untreated-Derived Risk Scores in Treated Cohorts

Reducing the risk of HCC is a main goal in managing patients with CHB. As primary prevention of HCC in CHB patients, long-term effective antiviral treatment is initiated for patients with active CHB or liver cirrhosis to lower the risk of disease progression [21,22,23]. However, antiviral treatment can reduce but not eliminate the risk of HCC (Fig. 1). Accurate HCC risk scores are needed for treated patients, as they remain at risk and will benefit from accurate prediction of disease progression and HCC development. Among nucleos(t)ide analogue (NA)-treated patients, the performance of untreated-derived risk scores, including the CU-HCC, GAG-HCC, and REACH-B scores, has been shown to be modest in Asian patients [24] and unsatisfactory in Caucasian patients [25]. In a long-term study of patients of mixed ethnicity in the registration trials of TDF, the incidence rate of HCC decreased in non-cirrhotic patients on TDF and gradually deviated from that predicted by the REACH-B score [26]. Among untreated patients, high serum HBV DNA is a well-known risk factor for HCC [27]. All untreated-derived HCC risk scores include HBV DNA level as an important predictor. However, HBV DNA is suppressed in the majority of NA-treated patients and is thus less discriminative after treatment initiation [28]. Meanwhile, antiviral therapy can improve patients’ necroinflammation and liver function, which is reflected by alanine aminotransferase normalization as well as improved serum albumin and total bilirubin levels. HCC risk scores that rely on these laboratory parameters are expected to be less predictive after treatment commencement [29•]. While untreated-derived risk scores retain some predictive value owing to the significant weighting on age and cirrhosis, HCC risk scores derived from virally suppressed CHB patients on antiviral treatment are necessary. After all, the historical patient populations used to derive the untreated-derived risk scores may be less clinically relevant now, as some of those at-risk patients fulfill treatment criteria and would receive treatment under current management.

Fig. 1 Risk of hepatocellular carcinoma in the next 3 years by risk scores after antiviral treatment (results adopted from Wong et al. [24])

PAGE-B and Modified PAGE-B

In view of the modest performance of untreated-derived risk scores in treated patients, the PAGE-B score was derived as a simple risk score composed of age, gender, and platelet count to predict HCC risk for up to 5 years among Caucasian patients on entecavir/tenofovir treatment [30]. Subsequently, Korean investigators modified the weighting of age, gender, and platelet count and included serum albumin as an additional factor in the modified PAGE-B (mPAGE-B) score for Asian patients with CHB on entecavir/tenofovir treatment [31]. Both the PAGE-B and mPAGE-B scores show good discriminatory ability (AUROC > 0.8) for HCC development in treated patients and have been well validated in independent treated cohorts [32, 33•, 34, 35]. As an increasing number of patients have received long-term effective antiviral treatment, the investigators who developed the PAGE-B score recently examined liver stiffness measurement by transient elastography as a surrogate marker of the severity of liver fibrosis in patients who had completed 5 years of entecavir/tenofovir treatment. They incorporated liver stiffness measurement at year 5 into an updated PAGE-B score for this patient population [36].

HCC Risk Scores and Surveillance Recommendation

To date, 7 and 12 HCC risk scores have been proposed for treated and untreated patients with CHB, respectively [37]. Nonetheless, the clinical role of these scores remains undetermined. So far, none of the regional clinical practice guidelines has made a recommendation on the optimal use of HCC risk scores in clinical practice among different subgroups of CHB patients [21,22,23]. Most of the low cutoffs of HCC risk scores have undergone independent validation and give a high negative predictive value, excluding a meaningful proportion of patients with low HCC risk [38]. Based on these cutoffs, HCC risk scores may have a role in guiding HCC surveillance in the clinical setting, especially among non-cirrhotic patients. Current clinical practice guidelines suggest that HCC surveillance should continue in at-risk/all patients under effective long-term NA treatment [21,22,23]. Regular HCC surveillance can facilitate early detection of HCC that is still amenable to curative treatment, leading to improved survival [39]. Yet surveillance relies heavily on patient adherence and may not be cost-effective in patients who are truly at low risk of HCC development. CHB patients on long-term effective NA treatment have a reduced risk of HCC development [8]. Therefore, we expect that some patients whose CHB is well controlled, as indicated by long-term virological, biochemical, and possibly histological responses to NA treatment, may have an HCC risk low enough to delay HCC surveillance. These patients may benefit from delayed participation in an HCC surveillance program. In contrast, high-risk patients, including cirrhotic patients, can continue to benefit from HCC surveillance.

A recent retrospective study showed that among treated patients classified as low risk by the PAGE-B or mPAGE-B score, HCC incidence can be less than 0.2% annually, i.e., the cost-effectiveness threshold for HCC surveillance in non-cirrhotic CHB patients [23, 33•]. It is thus possible to delay HCC surveillance in these patients if they do not have advanced liver disease, cirrhosis, or a strong family history of HCC, although more concrete evidence from prospective studies is warranted. The PAGE-B and mPAGE-B scores are based on simple calculations using objective demographic and routinely available laboratory parameters. They have the potential to be widely used in clinical practice to reassess non-cirrhotic patients regularly and identify those with low HCC risk who require no surveillance in the near future. On the other hand, most of the HCC risk scores for treated patients were derived from clinical characteristics at the time of entecavir/tenofovir initiation and validated only in patients who had received 2 or 3 years of treatment. Studies that validate the accuracy of HCC risk scores in patients who have received long-term effective antiviral treatment are warranted. A recent study suggested that HCC risk scores need adjustment after 5 years of effective antiviral treatment [36]. Moreover, validation of HCC risk scores for non-cirrhotic patients, for patients of different ethnicities, or separately for entecavir- and tenofovir-treated patients may help to further guide the use of HCC risk scores, in view of the recent controversy over the effect of different first-line NA treatments on HCC prevention [40].
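The comparison against the 0.2% annual threshold amounts to a simple incidence calculation, sketched below. The event counts and person-years are hypothetical, and the sketch ignores the confidence interval around the estimate, which matters when a low-risk subgroup contains few events.

```python
SURVEILLANCE_THRESHOLD = 0.002  # 0.2% per year, the threshold cited in the text


def annual_incidence(hcc_events, person_years):
    """Crude annual HCC incidence: events divided by person-years of follow-up."""
    return hcc_events / person_years


def below_surveillance_threshold(hcc_events, person_years,
                                 threshold=SURVEILLANCE_THRESHOLD):
    """True if the crude annual incidence is below the cited threshold."""
    return annual_incidence(hcc_events, person_years) < threshold


# Hypothetical low-risk subgroup: 3 HCC cases over 2000 person-years.
low_risk = below_surveillance_threshold(3, 2000)
# Hypothetical higher-risk subgroup: 10 HCC cases over 2000 person-years.
higher_risk = below_surveillance_threshold(10, 2000)

print(f"{annual_incidence(3, 2000):.2%} per year, below threshold: {low_risk}")
print(f"{annual_incidence(10, 2000):.2%} per year, below threshold: {higher_risk}")
```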

Conclusions and Future Perspective

There is a need to develop accurate risk scores for HBV-related HCC to prioritize patient care, as current international guidelines vary widely in their definitions of high-risk patients. Risk prediction for HCC in chronic HBV infection continues to be a dynamic and evolving field. Current risk scores can accurately predict HCC in specific populations, both in treatment-naive patients and in those receiving antiviral therapy. Different levels of care and different intensities of HCC surveillance should be offered according to the risk profile of patients. Patients in the high-risk category should receive antiviral therapy as well as appropriate HCC surveillance. For patients receiving antiviral therapy, maintained virologic response should be the treatment target, particularly in patients with cirrhosis. Patients at risk of HCC should receive regular HCC surveillance even when they are receiving antiviral treatment.

However, more work is needed to optimize the existing risk scores in non-Asian populations and in patients on antiviral therapy. So far, the PAGE-B score works best in non-Asian populations. To better stratify risk in these populations, the risk score with the best performance in a given population should be rechecked regularly, for example, annually or biannually. Improving the diagnosis of cirrhosis in risk scores by integrating noninvasive markers of liver fibrosis, such as transient elastography or serum biomarkers, is promising and will continue to be an area of further study. Finally, translating HCC risk into clinical practice by redefining surveillance intervals or modalities for patients at different levels of risk, in order to achieve a survival benefit, will also be a challenge.