Introduction

Approximately 250 million people are hepatitis B surface antigen (HBsAg) positive globally [1, 2]. Chronic hepatitis B virus (HBV) infection is one of the most common causes of liver cirrhosis. Patients with cirrhosis are at higher risk of developing liver-related events (LRE), including hepatic decompensation, hepatocellular carcinoma (HCC), and liver-related death. Compensated HBV-induced cirrhosis patients with detectable HBV DNA are recommended to initiate long-term antiviral therapy with effective nucleos(t)ide analogues (NAs). Antiviral agents have been shown to reduce the risk of disease progression and should be initiated as early as possible in patients with HBV-induced cirrhosis [3]. Since the risk of disease progression is not completely eliminated [4, 5], early identification of patients at high risk of LRE will benefit them by closer screening of early HCC or timely management of portal hypertension, e.g. screening for varices and prophylactic beta blockers for those with large varices.

Many models have been developed to predict LRE. Age, male gender, HBV DNA, alanine aminotransferase (ALT), total bilirubin (TB), albumin (ALB), core promoter mutation, hepatitis B e antigen (HBeAg), and presence of cirrhosis have been shown to be independent predictors of HCC [6,7,8]. With dramatic improvements in patients on long-term NA therapy, HBV DNA, ALT, and cirrhosis at the onset of treatment are less informative in predicting LRE than in untreated patients. New evaluation systems, such as the Albumin–Bilirubin (ALBI) grade [9, 10], and models incorporating on-treatment non-invasive tests, notably liver stiffness measurement (LSM), have been developed to predict clinical outcomes in hepatitis B patients receiving NA therapy [11,12,13,14]. However, the vast majority of these models focused on prediction of HCC. Although HCC is the most dreaded outcome in patients with HBV-induced cirrhosis receiving NA, hepatic decompensation can also occur in some patients. Our previous studies showed that the 3-year incidence of LRE in compensated HBV-induced cirrhosis patients receiving NA was 6.1–7.7% [15, 16]. However, only a few studies have reported the long-term incidence of LRE [17,18,19], and on-treatment independent risk predictors for LRE are unknown.

Therefore, we leveraged long-term follow-up data from two clinical studies to describe the long-term incidence of LRE and to determine on-treatment independent risk predictors for LRE in compensated HBV-induced cirrhosis patients.

Patients and methods

Study population

Prospective cohorts of adults with HBV-related compensated cirrhosis were enrolled from March 2012 to October 2015 in two clinical studies [15, 16]. Inclusion criteria were: age 18–70 years; treatment-naïve; HBV DNA > 2000 IU/mL for HBeAg-positive patients or > 200 IU/mL for HBeAg-negative patients; and evidence of cirrhosis. Cirrhosis was defined based on biopsy, presence of esophageal or gastric varices on endoscopy, or meeting at least two of the following four criteria: (a) imaging [abdominal ultrasonography (US), contrast-enhanced computed tomography (CT) or magnetic resonance imaging (MRI)] findings of liver surface nodularity and echogenicity; (b) platelet count (PLT) < 100 × 10^9/L with no other cause; (c) serum albumin (ALB) < 35.0 g/L, or international normalized ratio (INR) > 1.3; (d) LSM > 12.4 kPa [when ALT < 5 × upper limit of normal (ULN)]. Patients were excluded if they had decompensated cirrhosis (ascites, variceal bleeding or hepatic encephalopathy), HCC, other concomitant liver disease, other malignancies, severe systemic diseases, or were pregnant.
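The cirrhosis definition above can be written as a simple rule check. The following is a minimal sketch for illustration only, not the study's actual screening tool; the class and field names are hypothetical, and the "no other cause" qualifier for low platelets is assumed to have been assessed clinically before the value is entered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CirrhosisWorkup:
    """Hypothetical container for one patient's screening data."""
    biopsy_cirrhosis: bool = False
    varices_on_endoscopy: bool = False
    imaging_nodularity: bool = False          # criterion a
    plt_1e9_per_l: Optional[float] = None     # criterion b (no other cause assumed)
    alb_g_per_l: Optional[float] = None       # criterion c
    inr: Optional[float] = None               # criterion c
    lsm_kpa: Optional[float] = None           # criterion d
    alt_times_uln: Optional[float] = None     # ALT as a multiple of ULN


def meets_cirrhosis_definition(w: CirrhosisWorkup) -> bool:
    # Biopsy or endoscopic varices are sufficient on their own.
    if w.biopsy_cirrhosis or w.varices_on_endoscopy:
        return True
    criteria = [
        w.imaging_nodularity,
        w.plt_1e9_per_l is not None and w.plt_1e9_per_l < 100,
        (w.alb_g_per_l is not None and w.alb_g_per_l < 35.0)
        or (w.inr is not None and w.inr > 1.3),
        # LSM only counts when ALT is below 5 x ULN.
        w.lsm_kpa is not None and w.lsm_kpa > 12.4
        and w.alt_times_uln is not None and w.alt_times_uln < 5,
    ]
    # At least two of the four non-invasive criteria are required.
    return sum(criteria) >= 2
```

Missing values are treated as "criterion not met", which is an assumption of this sketch rather than something stated in the protocol.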

Treatment allocation

Cohort 1 was from a randomized controlled trial that began in June 2013, in which eligible patients were randomized 1:1 to (i) Entecavir (ETV, 0.5 mg/day, po) monotherapy or (ii) ETV monotherapy lead-in for 6 months then Thymosin-alpha1 (Thy-α1, 1.6 μg twice a week, subcutaneous injection) plus ETV monotherapy for 1 year, followed by ETV monotherapy. Cohort 2 was from a real-world observational study that began in March 2012, in which eligible patients chose ETV monotherapy or Lamivudine (LAM 100 mg/day, po) plus Adefovir (ADV 10 mg/day, po) after detailed explanation of the pros and cons of the two choices.

All patients remained on the same treatment at the time of data analysis, except for those who had primary nonresponse or virological breakthrough. Primary nonresponse was defined as less than 2 log10 decrease in serum HBV DNA after 6 months of treatment. Virological breakthrough was defined as increase in HBV DNA from nadir by more than 1 log10 IU/mL on-treatment. As tenofovir disoproxil fumarate (TDF) was not approved for hepatitis B in China until June 2014, the rescue therapy for patients with primary nonresponse or virological breakthrough was ETV 1.0 mg/day initially, and ETV 1.0 mg/day or TDF 300 mg/day after TDF became available.
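The two virological definitions above translate directly into log10 comparisons. The sketch below is illustrative only; the function names are hypothetical, and it assumes all HBV DNA values are positive numbers in IU/mL (an undetectable result would have to be entered at the assay's lower limit of detection).

```python
import math
from typing import Sequence


def primary_nonresponse(baseline_iu_ml: float, month6_iu_ml: float) -> bool:
    """Less than a 2 log10 decrease in HBV DNA after 6 months of treatment."""
    drop = math.log10(baseline_iu_ml) - math.log10(month6_iu_ml)
    return drop < 2.0


def virological_breakthrough(on_treatment_iu_ml: Sequence[float]) -> bool:
    """Any on-treatment value more than 1 log10 above the preceding nadir."""
    nadir = None  # lowest log10 HBV DNA seen so far
    for value in on_treatment_iu_ml:
        log_value = math.log10(value)
        if nadir is not None and log_value - nadir > 1.0:
            return True
        nadir = log_value if nadir is None else min(nadir, log_value)
    return False
```

For example, a series of 1,000,000, 1,000, 20, 20 and 5,000 IU/mL would be flagged, because the last value is about 2.4 log10 above the nadir of 20 IU/mL.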

Follow-up and clinical evaluation

Patients were evaluated at baseline and every 6 months thereafter, and had the following tests at each visit: blood count, liver biochemistries, HBV DNA, alpha-fetoprotein (AFP), INR, LSM, and abdominal US. HBV DNA, HBsAg, HBeAg, and hepatitis B e antibody (anti-HBe) were tested at a central lab and other tests were done at local centers. HBV DNA was tested using a polymerase chain reaction assay with a linear range of detection from 20 to 1.7 × 10^8 IU/mL (Abbott Laboratories, North Chicago, United States). HBV serologies were tested using chemiluminescent immunoassay (Abbott GmbH & Co. KG, Wiesbaden, Germany). LSM was performed with a Fibroscan520 (Echosens, Paris, France) or Fibrotouch (Wuxi Hisky Medical Technology Co., Ltd., Wuxi, China) following the manufacturers’ instructions. A measurement was considered reliable with at least ten valid readings and an interquartile range to median ratio ≤ 30%. The normal range of ALT and aspartate aminotransferase (AST) was defined as ≤ 25 U/L in women and ≤ 35 U/L in men; normal values of the other tests were TB ≤ 17.1 μmol/L, ALB ≥ 35 g/L, gamma-glutamyl transferase (GGT) ≤ 50 U/L, PLT ≥ 100 × 10^9/L, AFP ≤ 15 ng/mL, and LSM ≤ 12.5 kPa. ALBI, AST to PLT ratio index (APRI) and FIB-4 were calculated as previously reported [9, 20, 21].
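For readers unfamiliar with the three indices, the sketch below uses the commonly published formulas for ALBI, APRI and FIB-4. It is an illustration under that assumption; the paper itself only cites the original references, so the exact implementation used in the study (for example, which AST upper limit of normal was applied for APRI) is not confirmed here. Function names are hypothetical.

```python
import math


def albi_score(bilirubin_umol_l: float, albumin_g_l: float) -> float:
    """ALBI = 0.66 * log10(bilirubin, umol/L) - 0.085 * albumin (g/L)."""
    return 0.66 * math.log10(bilirubin_umol_l) - 0.085 * albumin_g_l


def albi_grade(score: float) -> int:
    """Grade 1: <= -2.60; grade 2: > -2.60 and <= -1.39; grade 3: > -1.39."""
    if score <= -2.60:
        return 1
    if score <= -1.39:
        return 2
    return 3


def apri(ast_u_l: float, ast_uln_u_l: float, plt_1e9_per_l: float) -> float:
    """APRI = (AST / ULN of AST) / platelets (10^9/L) * 100."""
    return (ast_u_l / ast_uln_u_l) / plt_1e9_per_l * 100.0


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         plt_1e9_per_l: float) -> float:
    """FIB-4 = age * AST / (platelets (10^9/L) * sqrt(ALT))."""
    return age_years * ast_u_l / (plt_1e9_per_l * math.sqrt(alt_u_l))
```

As a worked example, a 50-year-old with AST 40 U/L, ALT 25 U/L and PLT 100 × 10^9/L has FIB-4 = 50 × 40 / (100 × 5) = 4.0.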

LRE was a composite of hepatic decompensation, HCC, or death. Hepatic decompensation was defined as presence of ascites, hepatic encephalopathy (HE), esophageal variceal bleeding (VB), spontaneous bacterial peritonitis, or hepatorenal syndrome. The diagnosis of HCC was confirmed by at least two radiological methods such as abdominal US, contrast-enhanced CT, MRI, or angiography.

Statistical analyses

For descriptive analysis, continuous variables are expressed as median and interquartile range (IQR) and categorical variables as number and percent. Comparisons between groups for continuous variables were conducted with Mann-Whitney U test and for categorical variables with Chi-square test.

The cumulative incidences of LRE, decompensation, and HCC were estimated with the Kaplan-Meier method, and survival curves were compared between groups with the log-rank test. Data were censored when any of the following occurred first: HCC, decompensation, death, loss to follow-up, or June 30, 2019. Missing values of HBV DNA were imputed with local lab results if available. The lower limit of detection (LLD) of HBV DNA in local labs varied from 200 to 1000 IU/mL; if a result was reported as < LLD, we imputed the value with the corresponding LLD.
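To make the estimator concrete, the following is a minimal pure-Python sketch of the Kaplan-Meier product-limit calculation, reported as cumulative incidence (1 minus the survival estimate). It is not the SPSS procedure used in the study; the function name and input layout are assumptions, and confidence intervals and the log-rank test are omitted.

```python
from typing import List, Sequence, Tuple


def km_cumulative_incidence(times: Sequence[float],
                            events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, cumulative incidence) at each distinct event time.

    times  : follow-up time for each patient
    events : True if the event occurred at that time, False if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_censored = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events > 0:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, 1.0 - survival))
        n_at_risk -= n_events + n_censored
    return curve
```

With five patients followed for 1, 2, 3, 4 and 5 years, of whom the first, third and fifth have an event, the estimate steps to 0.20 at year 1, about 0.47 at year 3, and 1.0 at year 5.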

Because the most pronounced lab changes occur during the first year of treatment, and to allow early prediction of long-term prognosis, we chose baseline, month 6 and month 12 on treatment as representative time points for prediction. Univariate Cox regression was used to evaluate predictors of LRE. The routinely tested variables HBV DNA, AST, TB, ALB, GGT, PLT, INR, AFP, LSM and model for end-stage liver disease (MELD) score were evaluated in univariate analysis. Variables with p values < 0.1 were included in multivariate analysis by a forward stepwise approach. Age and sex were forced into the models. In multivariate models, each variable was analyzed in four ways at three time points: a. absolute value at baseline, month 6 or 12; b. delta (change in absolute value) from baseline to month 6 or 12; c. delta percent from baseline (change in value as a percent of baseline value) at month 6 or 12; d. norm (normal or undetectable) at month 6 or 12. If the MELD score was included in multivariate analysis, TB and/or INR were not evaluated again.
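The four ways of expressing each on-treatment variable can be illustrated with a small helper. This is a sketch with hypothetical names, not the study's analysis code; the normality rule is passed in as a function because the cut-off differs by test.

```python
from typing import Callable, Dict, Optional, Union


def derived_forms(baseline: float, on_treatment: float,
                  is_normal: Callable[[float], bool]
                  ) -> Dict[str, Union[float, bool, None]]:
    """Express one lab variable as absolute value, delta, delta percent and norm."""
    delta = on_treatment - baseline
    # Delta percent is undefined when the baseline value is zero.
    delta_percent: Optional[float] = (
        delta / baseline * 100.0 if baseline != 0 else None
    )
    return {
        "absolute": on_treatment,
        "delta": delta,
        "delta_percent": delta_percent,
        "norm": is_normal(on_treatment),
    }


# Example: LSM falling from 18.0 to 12.8 kPa at month 6, with normal <= 12.5 kPa.
lsm_month6 = derived_forms(18.0, 12.8, lambda v: v <= 12.5)
```

In this example the delta is -5.2 kPa, the delta percent is about -28.9%, and the value has not yet normalized.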

The area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were used to describe the performance of each model. The final model was the one with the highest AUROC. Comparisons of AUROC among prediction models were performed using the method of DeLong et al. [22]. The Akaike information criterion (AIC) was used to compare model fit, with lower values indicating better fit. The results are presented as adjusted hazard ratios (HR) with their 95% CI. A p value < 0.05 was considered statistically significant. Internal validation of the models was performed by bootstrapping with 1000 random samples of the same size as the original dataset. Data analyses were performed with SPSS (IBM SPSS Statistics Version 27.0).
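The performance measures named above can be computed from predicted risks and observed outcomes as sketched below. This is an illustrative pure-Python version with hypothetical function names; it computes AUROC through its rank (Mann-Whitney) interpretation and does not implement the DeLong comparison or the bootstrap validation used in the study.

```python
from typing import Dict, Optional, Sequence


def classification_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Dict[str, Optional[float]]:
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: int, den: int) -> Optional[float]:
        return num / den if den > 0 else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random event case scores higher than a random
    non-event case; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in sample size, which is acceptable for a cohort of this size; a rank-based implementation would be preferable for much larger datasets.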

Results

Study population, demographics and baseline characteristics

From both cohorts, a total of 937 treatment-naïve patients with compensated HBV-induced cirrhosis were enrolled in this long-term follow-up study. Baseline characteristics of the patients who received ETV monotherapy in the two studies were similar and combined as one group. Thus, altogether 608 (64.9%) patients were treated with ETV monotherapy, 252 (26.9%) with ETV + Thy-α1, and 77 (8.2%) with LAM + ADV (Fig. 1). Demographic and baseline characteristics of the patients who received ETV monotherapy, ETV + Thy-α1, and LAM + ADV were similar among the three groups, except for lower HBV DNA, higher INR and longer follow-up duration in LAM + ADV group (Supplementary Table 1).

Fig. 1

Flowchart. RCT randomized controlled trial, ETV entecavir, Thy-α1 thymosin-alpha1, LRE liver-related events, HCC hepatocellular carcinoma, LAM lamivudine, ADV adefovir, VB variceal bleeding, HE hepatic encephalopathy

Table 1 Demographics and clinical characteristics of combined cohort

Liver-related events and clinical features on-treatment

After a median follow-up of 4.5 years, 88 patients developed LRE, including 48 patients with HCC (one subsequent liver transplantation and four subsequent deaths), and 40 patients with a total of 45 decompensation complications: 24 ascites, 20 VB, and 1 HE (Fig. 1). The 5-year cumulative incidence (95% CI) of LRE, hepatic decompensation, and HCC were 12.7% (10.4–15.5%), 5.8% (4.2–7.9%), and 7.4% (5.5–9.7%) respectively. Three patients died of lung cancer, colon cancer and endometrial cancer, respectively, before LRE. There was no significant difference between incidence of hepatic decompensation and HCC (p = 0.59). Subgroup analysis of the three treatment groups showed no significant differences in cumulative incidence of LRE (Supplementary Fig. 2).

Fig. 2

Cumulative incidence of LRE, decompensation and HCC. LRE liver related events, HCC hepatocellular carcinoma

At baseline, patients with LRE were more likely than those without LRE to be men and smokers, and had lower PLT, higher INR, and higher MELD score. Baseline HBV DNA, HBeAg status, liver biochemistries, AFP, LSM and Child-Turcotte-Pugh (CTP) score were similar in the two groups (Table 1).

After initiation of antiviral treatment, HBV DNA decreased rapidly during the first 6 months and then remained stable in all patients, with no significant difference between the LRE and non-LRE groups. Four patients (2 in the ETV group and 2 in the LAM + ADV group) had poor treatment compliance during the first year of the study; two of them (1 in the ETV group and 1 in the LAM + ADV group) withdrew informed consent, and two remained in the cohort with good compliance thereafter. A total of 14 (1.5%) patients received rescue therapy with ETV (1.0 mg) or TDF, due to primary nonresponse in 2 patients in the ETV group and virological breakthrough in 12 patients, including 7 (1.2%) in the ETV group, 3 (1.2%) in the ETV + Thy-α1 group and 2 (2.6%) in the LAM + ADV group. In both the LRE and non-LRE groups, the routinely tested parameters ALT, AST, ALB, GGT, PLT, INR, AFP and ALBI improved after initiation of antiviral therapy, whereas bilirubin, CTP score and MELD score showed no significant change because these values were not high at baseline (Fig. 2). LSM, a non-invasive indicator of liver fibrosis, improved rapidly during the first 6 months (median decreased from 20.6 to 16.3 kPa in the LRE group versus from 18.0 to 12.8 kPa in the non-LRE group, p < 0.01), then continued to decline steadily and significantly through 3 years (to a median of 10.1 kPa in the LRE group versus 8.8 kPa in the non-LRE group, p < 0.01) (Supplementary Fig. 2, 3 and 4).

Prediction models of LRE, HCC and decompensation

By univariate analysis, the following variables had p < 0.1 for prediction of LRE: at baseline, age, sex, AST, GGT, PLT, and MELD; at month 6, AST, ALB, GGT, PLT, AFP, LSM, and MELD; and at month 12, ALB, GGT, PLT, AFP, LSM, and MELD (Supplementary Tables 2, 3 and 4).

By multivariate analysis, independent predictors of outcomes at month 6 included GGT, PLT, and AFP for LRE; GGT and PLT for hepatic decompensation; and age, sex, GGT, and AFP for HCC (Table 2). Independent predictors at month 12 included sex, PLT, and AFP for LRE; GGT, PLT and LSM for hepatic decompensation; and age, sex, and AFP for HCC (Supplementary Table 5). Independent risk predictors at baseline are shown in Supplementary Table 6.

Table 2 Multivariate analysis of independent predictors at month 6 for LRE

For prediction of LRE, decompensation, or HCC, all models using variables at month 6 or month 12 had better fit and higher accuracy than models using baseline variables. For prediction of LRE and HCC, models at month 6 had the best fit and highest accuracy. For prediction of hepatic decompensation, the model at month 12 had the best fit though the AUROC was not significantly different compared with the model at month 6 (p = 0.82) (Table 3; Fig. 3a, b, c).

Table 3 Performance of prediction models for liver-related events
Fig. 3

Comparison of AUROC for the prediction of LRE, hepatic decompensation and HCC. 3a, b and c compare models at baseline, month 6 and month 12. Models at month 6 and month 12 have better discrimination than models at baseline. The prediction model for hepatic decompensation at month 12 had better discrimination than that at month 6, and the prediction model for HCC at month 6 had better discrimination than that at month 12, but neither difference is statistically significant. 3d, e and f compare models at month 6 with previous risk stratification systems or models for prediction of LRE, hepatic decompensation and HCC. Models at month 6 perform better than CTP, MELD and ALBI for prediction of LRE and hepatic decompensation, and better than CU-HCC, LSM-HCC and PAGE-B for prediction of HCC. AUROC area under the receiver operating characteristic curve, LRE liver-related events, HCC hepatocellular carcinoma, CTP Child-Turcotte-Pugh, MELD model for end-stage liver disease, ALBI albumin-bilirubin

In general, models using on-treatment variables at month 6 or 12 performed better in predicting LRE, hepatic decompensation and HCC than models using baseline values, regardless of how each variable was assessed (absolute value, delta, delta percent, or norm; Supplementary Table 8). All models had high negative predictive values (94.0–98.8%), but positive predictive values were low (9.3–32.3%) (Table 3). Bootstrap validation of the Cox regression gave similar results; beta and 95% CI for variables at month 6 are shown in Supplementary Table 7. Models at month 6 performed better than CTP, MELD and ALBI for prediction of LRE and hepatic decompensation, and better than the previous models CU-HCC, LSM-HCC, and PAGE-B for prediction of HCC (Fig. 3d, e, f).
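The combination of high NPV and low PPV is expected when events are uncommon, and can be seen from Bayes' rule. The sketch below is illustrative: the 12.7% event rate is the 5-year cumulative incidence of LRE reported here, but the sensitivity of 0.75 and specificity of 0.70 are hypothetical values chosen for the example, not results from Table 3.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given event prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical sensitivity and specificity; event rate taken from the text.
ppv, npv = predictive_values(sensitivity=0.75, specificity=0.70, prevalence=0.127)
```

Under these assumed inputs the PPV is about 27% and the NPV about 95%, which shows how a model with reasonable discrimination mainly serves to rule out, rather than confirm, high risk in a low-incidence population.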

Discussion

In this large cohort of treatment-naïve patients with compensated HBV-induced cirrhosis, we found that despite rapid suppression of HBV replication on antiviral treatment, LRE still occurred in some patients, with a 5-year cumulative incidence of 12.7% for any LRE, 5.9% for decompensation alone, and 7.4% for HCC alone. We developed several models using routinely available parameters to predict LRE, decompensation, and HCC with good accuracy and high NPV.

Our cohorts began 8 years ago, when LAM and ADV and later ETV were the most common antivirals available in China. Lamivudine was demonstrated to significantly reduce the incidence of LRE in patients with compensated HBV cirrhosis [23], but it is no longer first-line therapy due to the high risk of resistance. A meta-analysis in patients with LAM resistance showed that rescue therapy with ADV added on to LAM was as effective as switching to ETV [24]. Our previous studies also showed that initial combination of LAM + ADV had similar efficacy to ETV in terms of 3-year cumulative incidence of LRE in patients with compensated HBV cirrhosis [16]. In addition, we showed that ETV and ETV + Thy-α1 had similar 3-year cumulative incidence of LRE, HCC or decompensation [15]. In this study, we extended our follow-up to 5 years and showed similar 5-year cumulative incidence of LRE, hepatic decompensation, and HCC among the LAM + ADV, ETV and ETV + Thy-α1 groups. As most patients achieved sustained undetectable HBV DNA (< 200 IU/mL) after initiation of antiviral treatment, we combined the three groups in our analysis to show the clinical efficacy and to predict LRE.

In CHB patients, ETV and TDF have demonstrated long-term efficacy and safety for more than 10 years [25]. In patients with HBV-induced cirrhosis, several studies have shown clinical benefits of long-term sustained viral suppression. Six years of treatment with ETV or 5 years of treatment with TDF led to regression of cirrhosis [26,27,28]. During a median follow-up of 6 years (range 1–14 years) of treatment with ETV or TDF in compensated HBV-induced cirrhosis patients, the overall and liver-related 8-year survival rates were 89.3% and 94.1%, and the standard mortality rate was similar to that of the general population [19]. Wong et al. reported in a retrospective-prospective cohort study that ETV reduced the risk of hepatic events, HCC, liver-related and all-cause mortality in patients with HBV-induced cirrhosis [17]. However, the 5-year cumulative incidence of LRE and HCC was still higher in Wong’s study than in our cohort (25.5% vs. 12.7% and 13.8% vs. 7.4%, respectively). Because we used stricter diagnostic criteria for liver cirrhosis, the patients in our study had lower ALB, lower PLT, and higher MELD score at baseline than patients in Wong’s study. The lower incidence of LRE and HCC in our cohort may therefore be explained by its younger age (mean age 47 ± 10 vs. 55 ± 11).

In this large prospective cohort study, we determined independent risk predictors for LRE, hepatic decompensation and HCC from routinely available lab parameters. Regarding prediction of disease progression or regression, an increase in PLT at 1.5 years on ETV treatment was observed to be associated with improvement of liver fibrosis [29], and on-treatment changes in LSM at 0.5 year were reported to predict 2-year clinical outcomes in compensated HBV-induced cirrhosis [13]. Only one study reported 5-year serial improvement of LSM during LAM or ETV treatment in histologically diagnosed advanced fibrosis (METAVIR > F3), reflecting reversal of liver fibrosis [30]; however, few LRE occurred and the association of LSM improvement with LRE could not be confirmed. In this study, we found that GGT, PLT, and AFP at month 6 were independent predictors of LRE. All models using variables at month 6 or 12 had better fit than models using baseline values. The best models for prediction of LRE, hepatic decompensation and HCC had good discrimination and high NPV, and performed significantly better than the previous complex indices of liver fibrosis or the previous HCC prediction models CU-HCC, LSM-HCC, and PAGE-B.

Risk prediction is very important in the management of patients with chronic hepatitis B. Antiviral therapy decreases but does not eliminate the risk of disease progression in patients with HBV-induced cirrhosis. This is the first study to focus solely on patients with compensated HBV-induced cirrhosis for prediction of LRE. The CTP and MELD scores are widely accepted risk prediction systems for patients with liver cirrhosis. However, as shown in this study, few patients with compensated HBV-induced cirrhosis had a significant change in CTP or MELD score after initiation of antiviral therapy; therefore, the CTP and MELD scoring systems are not sensitive enough to reflect the gradual change in risk in this population. The ALBI score is another promising grading system for prediction of long-term prognosis in patients with HBV-induced cirrhosis [10, 31]. However, bilirubin in most compensated patients is within the normal range and changed little after initiation of antiviral therapy, so the application of the ALBI score is also limited. We proposed relative risk prediction models in this study, and they showed better performance than the CTP, MELD and ALBI scoring systems in treatment-naïve patients with compensated HBV-induced cirrhosis.

Besides a general prediction of LRE, we also investigated predictive models for hepatic decompensation and HCC separately. As demonstrated, both the independent predictors and the optimal time point for prediction differ between hepatic decompensation and HCC: age and sex do not affect hepatic decompensation as much as HCC; GGT is an independent predictor for both hepatic decompensation and HCC; PLT is an independent predictor for hepatic decompensation but not for HCC; AFP is an independent predictor for HCC but not for hepatic decompensation; and the optimal prediction time point is month 12 for hepatic decompensation and month 6 for HCC. In addition, the study systematically evaluated the predictive value of routinely tested parameters before and after initiation of antiviral treatment and illustrated that on-treatment prediction is more accurate than baseline prediction. In the antiviral era of chronic hepatitis B, these results confirm antiviral efficacy and emphasize the importance of on-treatment follow-up in patients with compensated liver cirrhosis.

There are several limitations in our study. First, this was a combined cohort study including three different treatment groups, and one study was not a randomized trial; however, subgroup analysis showed similar outcomes across the three treatment groups. TDF, which is widely used nowadays, was not included in this study due to price and reimbursement issues, and further investigation is warranted to verify its efficacy in improving long-term clinical outcomes. Second, as only around 10% of our patients had baseline upper gastrointestinal endoscopy, we were unable to do subgroup analysis according to baseline varices. Third, five years of observation in CHB patients on treatment may still not be long enough to elucidate long-term outcomes. Fourth, quantitative changes in HBsAg or HBeAg may play a role in prediction; however, central tests were not available for all patients and local tests showed great disparity in methods and reference ranges, so they were not included in our analysis. Fifth, we developed the prediction models with parameters at baseline, 6 months and 12 months on treatment; longitudinal or dynamic prediction models need further investigation. Finally, all the patients were Asians from China, and external validation of our models is needed.

In conclusion, despite effective suppression of HBV replication in treatment-naïve patients with compensated liver cirrhosis, LRE occurred with a 5-year cumulative incidence of 12.7%. Models using on-treatment values are more accurate in prediction of LRE, hepatic decompensation, and HCC than models using baseline values.