Introduction

Acute-on-chronic liver failure (ACLF) is a distinct entity in the spectrum of chronic liver disease, with a rapid downhill course and a poor outcome in response to an acute insult. ACLF patients have underlying chronic liver disease which becomes aggravated due to an acute precipitant. The mitigation of the acute insult may lead to spontaneous recovery in a proportion of cases. The potential of reversibility without liver transplantation or early recognition of the need for transplantation is the main reason for classifying these patients into distinct groups [1].

The two most widely used definitions for ACLF are by the European Association for the Study of the Liver-Chronic Liver Failure (EASL-CLIF) consortium and the Asian Pacific Association for the Study of the Liver (APASL) ACLF Research Consortium (AARC). According to the former, in the CANONIC study, ACLF was defined as “an acute deterioration of pre-existing chronic liver disease, usually related to a precipitating event and associated with increased mortality at 12 weeks due to multisystem organ failure” [2]. As per the latter, ACLF is an acute hepatic insult manifesting as jaundice (serum bilirubin ≥5 mg/dl, i.e., ≥85 μmol/l) and coagulopathy (INR ≥ 1.5 or prothrombin activity <40%) complicated within 4 weeks by clinical ascites and/or encephalopathy in a patient with previously diagnosed or undiagnosed chronic liver disease/cirrhosis, and is associated with a high 28-day mortality [3]. One difference between the two is the issue of homogeneity of the patient population. The CLIF definition originally proposed assessment of mortality at 3 months while the APASL suggested assessment at 4 weeks. However, subsequently, the CLIF definition has been revised to include assessment of survival at 4 weeks, like APASL [4].

The core difference between the definitions is, however, the hepatic versus non-hepatic insult. While the EASL-CLIF definition includes patients with a hepatic or non-hepatic insult, the AARC definition includes only patients with a hepatic insult. The basis for the restriction to hepatic insults was to include a more homogeneous group. Whether other organ failures develop concomitantly with, or as a consequence of, liver failure is a controversial issue [1]. The western definition of ACLF considers organ failure as an integral part of the definition, and includes extra-hepatic organ failure independent of liver failure as part of the definition [2, 4]. It is not known which of the approaches is relevant and truly predictive of the outcome of these seriously ill patients.

Disease severity scores such as the Model for End Stage Liver Disease (MELD) have been considered for organ allocation. However, the MELD score does not take into account cerebral, circulatory and/or respiratory failures, thus giving no priority to patients with ACLF. The various ICU scores like Sequential Organ Failure Assessment (SOFA) and Acute Physiology and Chronic Health Evaluation (APACHE II) have also been evaluated for ACLF patients [5]. A study in 2012 by Garg et al. from our group showed the predictability of these scores and also the relevance of one, two or more organ failures [5]. Subsequently, the CLIF consortium has developed the CLIF-SOFA score for assessing disease severity and prognostication in ACLF. The CLIF-SOFA and the CLIF-OF (Organ Failure) scoring and the cut-off were developed arbitrarily and included patients with hepatic and non-hepatic insults [4]. Organ failure was solely derived based on a consensus opinion by experts [4]. The score is a bit cumbersome and becomes predictive of mortality only when extra-hepatic organ failures are included. We had earlier shown that patients with ACLF have a high mortality in the presence of HE and hyponatremia in addition to high MELD, APACHE II and SOFA scores [5], necessitating inclusion of these parameters. A recent study [6] showed that a simple score considering only the number of organ failures is easier to recall and superior to the CLIF SOFA score in predicting mortality in ACLF patients. Furthermore, the available prediction scores have been validated at baseline, but none has been evaluated in a dynamic manner for prognostication in ACLF patients. A dynamic model that could predict the reversibility or need for liver transplant is urgently required [7]. Early prediction of transplant-free survival, decision for transplant before onset of sepsis or multi-organ failure and prioritization for liver transplant could help improve the outcome of these patients.
The objectives of the present study were to analyze the time events and clinical courses of ACLF patients and to derive a simple and dynamic prognostic model in predicting the short-term mortality and/or need for liver transplant in patients of ACLF.

Patients and methods

Study design

Patients with a diagnosis of ACLF as per the APASL definition seen between April 2009 and April 2015, initially from a single center, and after 2012, from multiple centers under the APASL-ACLF Research Consortium (AARC) were prospectively included. The data were collected into a pre-defined, web-based proforma in the AARC database (http://www.aclf.in). Approval of the institutional ethics committees was obtained. The data were annotated and encrypted before analysis. The members of the AARC working party assume full responsibility for the accuracy and completeness of the data and subsequent analyses. All authors had access to the study data and reviewed and approved the final manuscript.

Patients

Patients with a diagnosis of ACLF as per the APASL definition, between 18 and 70 years of age, consecutively enrolled and prospectively followed until 90 days were included. However, for the short-term outcome, we analyzed the patients at 28 days for various events, i.e. variceal bleed, hepatic encephalopathy (HE), sepsis, other extra-hepatic organ failures (i.e. renal, respiratory and cardiac) and mortality. Patients who survived <24 h after presentation, or who had acute decompensation of cirrhosis with a history of prior decompensation, acute liver failure, co-existing pregnancy, HCC or extra-hepatic malignancy, were excluded. Informed consent was taken from the patient or the next of kin. The primary end-point of the study was death or liver transplant at day 28. The secondary end-points were events like variceal bleed, HE, sepsis, and other extra-hepatic organ failures, i.e. renal, respiratory or cardiac.

Statistical analysis

Data from a total of 1402 patients were divided into the ‘Derivation Cohort’ and the ‘Validation Cohort’ in an approximately 1:2 ratio. The derivation set included ACLF patients enrolled between 2009 and 2012. Validation was carried out in patients enrolled between 2012 and 2015. The baseline parameters in both sets were compared. Comparison of continuous variables was carried out by Student’s t test and categorical variables were compared by Chi square test. The predictors for 28-day mortality were analyzed by Cox regression in univariate and multivariate analysis in the derivation cohort. The predictors obtained from the derivation cohort were tested in the validation cohort. Harrell’s C index was used to define the concordance of all patient pairs for different predictions and outcomes. Somers’ D coefficient was used to provide an estimate of the rank correlation of the observed response variable and the predicted probabilities. The prognostic model was tested and calibrated in the validation cohort and the total cohort. AUROC of more than 0.75 in the derivation cohort, equivalent or more in the validation set, was considered for deriving the prognostic model. With the purpose of deriving a simple, specific and dynamic prognostic score for ACLF patients, we included clinically relevant characteristics and laboratory parameters of mortality observed at baseline and on day 4, 7 and 28. An ordinal grading (1–3) was carried out for individual parameters with predicted 28-day mortalities of <20, 50 and >80% (individual values rounded to nearest whole figure). A score was obtained by combining the individual grade of all the significant parameters. The score was further used for a new ‘class’ system of grading for liver failure by using the proportion of probability for: (1) 28-day mortality, and (2) at least with <20, 50 and >80% margins across the grade for the outcome. Comparison of the score with existing models, i.e. CTP, MELD, SOFA, CLIF SOFA and APACHE II, was carried out by AUROC. The scores as well as the grade were evaluated in a dynamic manner with the respective values at day 4 and 7 to predict the day 28 outcome by repeated measures analysis (GEE) followed by post hoc comparison. The cumulative probability of survival was depicted by Kaplan–Meier (KM) curve and compared by the Mantel Cox log-rank test. A p value of <0.05 was considered as significant. All statistical tests were performed using SPSS for Windows v.22 (IBM, Armonk).
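
The two concordance measures named above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in the study: the function names and the example data are hypothetical, pairs with tied follow-up times are simply skipped, and Somers' D is obtained from the standard relation D = 2C − 1.

```python
from itertools import combinations


def harrell_c(times, events, risk):
    """Harrell's C: share of comparable patient pairs in which the patient
    with the higher predicted risk has the shorter observed survival.

    times  -- follow-up time per patient (e.g. days)
    events -- 1 if the patient died at that time, 0 if censored
    risk   -- predicted risk per patient (higher = worse), e.g. a score
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times; a pair is comparable only
        # when the patient with the shorter follow-up actually died.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk[a] > risk[b]:
            concordant += 1.0
        elif risk[a] == risk[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def somers_d(c_index):
    """Somers' D rescales C from [0, 1] to [-1, 1]."""
    return 2.0 * c_index - 1.0


# Made-up illustrative data, not study data.
days = [5, 10, 20, 28]
died = [1, 1, 0, 0]
score = [14, 11, 9, 6]
c = harrell_c(days, died, score)
print(c, somers_d(c))
```

A C index of 0.5 (D = 0) corresponds to chance-level ranking, and 1.0 (D = 1) to perfect ranking of patients by risk.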

Results

The patients presented to various centers across Asia at a mean of 24.1 ± 8.1 days of symptoms mostly with jaundice and fatigue. Complications from ascites [jaundice to ascites median duration of 10 (0–28) days] and encephalopathy at admission were seen in 619 (44.1%) cases. Of the prospectively enrolled patients in the AARC database, 1402 patients with complete data for clinical events, severity scores and outcomes were considered. The derivation set with 480 patients was analyzed for the prognostic model and tested in the validation set of 922 patients.

Study populations/patient characteristics

The derivation and validation sets were comparable for age, gender, etiology of acute and chronic insult, laboratory parameters, disease severity and overall mortality (Supplementary Table 1). The etiology of acute insult commonly included ethanol (~50% of cases), followed by HBV reactivation, HEV infection and drug-induced liver injury. The etiology of the acute precipitant could be determined in approximately 95% of cases. Similarly, the etiology of chronic liver disease was ascertained, the commonest being ethanol, HBV and NASH. Patients in the derivation and validation cohorts were comparable for severe liver failure (mean bilirubin 22.1 ± 10.4 mg/dl and 21.5 ± 10.16 mg/dl, p = 0.32), coagulopathy (INR of 2.6 ± 1.1 and 2.5 ± 1.2, p = 0.65), HE (50 and 51%, p = 0.42) and platelets (median 133,000 and 134,000/cmm, p = 0.21), with a median serum creatinine of 1.0 mg/dl and MELD of 29, suggesting dominant hepatic insult and fewer extrahepatic organ failures. Renal failure was noted in only 5–6% of cases at presentation.

Development of the AARC ACLF prognostic model in the derivation cohort

The derivation cohort was analyzed for the predictors of outcome, i.e. 28-day mortality, on the basis of baseline parameters by Cox regression analysis. In multivariate analysis (Table 1), parameters related to liver failure (total bilirubin, HE, coagulopathy) and renal failure (serum creatinine) and markers for tissue perfusion (serum lactate) were found to be the independent predictors of short-term mortality. Both Harrell’s C index and Somers’ D showed good prediction for the prognostic model.

Table 1 Predictors of 28-day mortality in derivation cohort

Calibration and testing of the AARC ACLF prognostic model in the validation cohort

Before developing a prognostic score, the independent predictors were confirmed in the validation set. The parameters, i.e. total bilirubin, HE, coagulopathy, serum creatinine and lactate, were also found to be the independent predictors of 28-day mortality both in univariate and multivariate analysis. The concordance of all patient pairs in the derivation and validation sets showed a good fit (Harrell’s C, Somers’ D) (Table 2). The expected and observed frequencies matched well in both the derivation set (R² = 0.97) and the validation set (R² = 0.94). The prognostic model had a good predictability with an AUROC of 0.80 (derivation cohort) and 0.78 (validation cohort) (Supplementary Fig. 1).

Table 2 Calibration and testing in the validation model: prediction of 28-day outcome

Development of the AARC ACLF score and ACLF grade

With good applicability of the prognostic model both in the derivation and validation patient sets, the individual parameters were scored from 1 to 3 considering: (1) 28-day mortality, and (2) with at least <15, 50 and >80% margins for the prediction of mortality. The AARC ACLF score ranges from a minimum of 5 to a maximum of 15. Further, the predictors were used for a new ‘grade’ system, i.e. Grade-I for a score of 5–7, Grade-II for 8–10 and Grade-III for 11–15 with 28-day mortality of 12.7, 44.5 and 85.9%, respectively (Table 3). The CANONIC study considered <15% mortality as acceptable for ACLF. Our data showed a linear correlation and that the ordinal grading for each parameter as well as the score significantly increased from the reference at each cut-off point (Supplementary Table 2). The scoring and grading use easy-to-recollect laboratory parameters and clinical features, each with a distinct HR on Cox regression, and separate mortality into three clear grades (I, mild; II, severe; III, very severe).
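
The arithmetic of the score and grade described above can be sketched in a few lines of Python. This is only an illustration: the function and dictionary names are hypothetical, and the per-parameter cut-offs that turn a laboratory value into 1, 2 or 3 points are given in Table 3 and are deliberately not reproduced here, so the sketch starts from points that have already been assigned.

```python
# The five predictors retained in the model (names are illustrative).
PARAMETERS = ("bilirubin", "he_grade", "inr", "lactate", "creatinine")

# Baseline 28-day mortality (%) reported for each grade in the text.
BASELINE_28_DAY_MORTALITY = {"I": 12.7, "II": 44.5, "III": 85.9}


def aarc_score(points):
    """Sum the points (1-3 each) of the five parameters; result is 5-15.

    `points` maps each parameter name to its already-assigned points.
    """
    total = 0
    for name in PARAMETERS:
        value = points[name]
        if value not in (1, 2, 3):
            raise ValueError(f"{name}: points must be 1, 2 or 3, got {value}")
        total += value
    return total


def aarc_grade(score):
    """Map a score to its grade: 5-7 -> I, 8-10 -> II, 11-15 -> III."""
    if not 5 <= score <= 15:
        raise ValueError("AARC score must lie between 5 and 15")
    if score <= 7:
        return "I"
    if score <= 10:
        return "II"
    return "III"


# Hypothetical patient with already-graded parameters.
patient = {"bilirubin": 2, "he_grade": 1, "inr": 2, "lactate": 2, "creatinine": 2}
score = aarc_score(patient)          # 9
grade = aarc_grade(score)            # "II"
print(score, grade, BASELINE_28_DAY_MORTALITY[grade])
```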

Table 3 AARC score and ACLF grade

AARC score was compared with the existing disease severity scores, namely CTP, MELD, SOFA, CLIF SOFA and APACHE-II. It had a predictive accuracy for 28-day mortality (AUROC = 0.80) with 72% sensitivity and 78% specificity, and positive and negative predictive values of 67 and 77%, respectively. AARC score was found to be better than the currently existing MELD, SOFA and CLIF SOFA scores (Fig. 1).
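
For readers less familiar with the AUROC used in this comparison, the following Python sketch shows one standard way to compute it: as the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor. The function name and the example values are hypothetical and are not study data.

```python
def auroc(scores, died):
    """Area under the ROC curve via pairwise comparison.

    Every (non-survivor, survivor) pair is compared; the AUROC is the
    fraction of pairs in which the non-survivor has the higher score,
    with tied scores counted as one half.
    """
    non_survivors = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not non_survivors or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in non_survivors:
        for n in survivors:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(non_survivors) * len(survivors))


# Made-up scores and 28-day outcomes (1 = died) for illustration only.
example_scores = [12, 11, 9, 7, 10, 10]
example_died = [1, 1, 0, 0, 1, 0]
print(round(auroc(example_scores, example_died), 3))
```

Applying the same function to each competing score on the same patients gives the kind of side-by-side comparison reported here.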

Fig. 1 Discrimination ability of AARC score

Evaluation of AARC ACLF score as a dynamic predictor of mortality

The AARC score and grading were used for dynamic assessment, i.e. the value at presentation and the degree of change at day 4 or 7 of admission were correlated with the outcome, i.e. 28-day mortality.

AARC score and its dynamic change

The AUROC showed a cut-off score of 10.5 (i.e. more than 10, as the score takes only whole numbers) at admission with 78% specificity and 72% sensitivity, with positive and negative predictive values of 67 and 77%, respectively, for 28-day survival. A score of 9.5, i.e. 9 or less (as the score takes only whole numbers), and a persistently declining trend in the first week were seen among survivors, whereas a score of >10 and the persistence of the same or an increasing trend were observed among non-survivors (p = 0.001; GEE model) (Fig. 2b). Increase in the score by one unit at any time in the first week increased the risk of 28-day mortality by 10.2%, whereas a score above 10 showed a sharp increase in mortality. Each unit increase in the score above 10 (any time within 7 days) increased the mortality by 20%, i.e. a patient with a score of 10 and having an increase to 15 by day 7 is unlikely to survive without liver transplant (Fig. 2a).
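
How sensitivity, specificity and predictive values follow from a single cut-off can be sketched as below. The rule "score above 10 predicts death by day 28" is taken from the text; the function name, the returned keys and the example data are hypothetical and do not reproduce the study counts.

```python
def cutoff_metrics(scores, died, cutoff=10):
    """Classify each patient as 'predicted death' when score > cutoff and
    compare with the observed 28-day outcome (1 = died, 0 = survived)."""
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, died):
        predicted_death = score > cutoff
        if predicted_death and outcome:
            tp += 1
        elif predicted_death and not outcome:
            fp += 1
        elif not predicted_death and outcome:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios are returned as NaN rather than raising.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # deaths correctly predicted
        "specificity": ratio(tn, tn + fp),  # survivors correctly predicted
        "ppv": ratio(tp, tp + fp),          # predicted deaths that occurred
        "npv": ratio(tn, tn + fn),          # predicted survivals that occurred
    }


# Made-up admission scores and outcomes, for illustration only.
print(cutoff_metrics([12, 11, 9, 7, 10, 13], [1, 1, 0, 0, 1, 0]))
```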

Fig. 2 Dynamicity of AARC Score and Grade. a The dynamicity of AARC score (the mortality increases with a score of 10 or more by 20% for each unit increase in score from D0 to D7). b GEE (Generalised Estimating Equation) model showing the trend of score among survivors and non-survivors (score above 10 at any point of time). c The AARC grade, i.e. I, II and III, in a static scale at days 0, 4 and 7. In those having Grade I at any time point in the first week, the 28-day mortality risk is reduced from 12 to 7% (p = ns) but those remaining in Grade II until the end of the first week showed a reduced 28-day mortality risk (44.5 to 28.1%, p < 0.001). Grade III at any time point is associated with high mortality. d The dynamic change in AARC Grade, i.e. from Grade I at admission to Grade II by day 7, increases the 28-day mortality risk from 12 to 22% (p = ns), but the mortality risk significantly increased with a change from Grade II to Grade III (44 to >77%, p < 0.001). The change from Grade II to Grade I by day 7 significantly reduces the 28-day mortality risk (44.5 to 16%, p < 0.01) as does the change from Grade III to Grade II by day 7 (85 to 63%, p = 0.03). None of the patients showed sufficient improvement from Grade III to become Grade I

Grades of liver failure predict likelihood of mortality in a dynamic manner

The degree of liver failure was graded as Grades I, II and III. Baseline ACLF Grades I, II and III had a 28-day mortality of 12.7, 44.5 and 85.9%, respectively, in the absence of transplant. Change from one grade to another is dynamic and correlates with mortality. Grade I liver failure at admission and persistence of the same grade at day 4 and 7 did not affect the mortality (p = 0.32). But a change from Grade I to Grade II at day 4 and 7 or Grade II to Grade III at day 4 and 7 significantly increased the mortality (p = 0.01, p < 0.001, respectively). AARC Grade II at admission and persistence or change to Grade I showed a decrease in mortality (p < 0.001) (Fig. 2c). Patients in AARC Grade III were the most seriously ill, with uniformly high mortality at baseline as well as at day 4 and 7 (85.9, 87.2 and 91.7%, respectively, p = 0.84). A change from Grade III to Grade II at day 4 and 7 showed a significant reduction in mortality (85.9, 77.5 and 63.15%, respectively, p = 0.03). None of the patients recovered sufficiently to come to Grade I from Grade III within the first week (Fig. 2d).
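
The grade transitions described in this paragraph amount to comparing the admission grade with the grade at day 4 or day 7. A minimal Python sketch of that comparison follows; the function name and the labels it returns are hypothetical, and it classifies the direction of change only, without estimating mortality.

```python
# Ordinal position of each AARC grade.
GRADE_ORDER = {"I": 1, "II": 2, "III": 3}


def grade_trend(admission_grade, later_grade):
    """Describe the change in grade between admission and day 4 or 7."""
    change = GRADE_ORDER[later_grade] - GRADE_ORDER[admission_grade]
    if change > 0:
        return "worsened"
    if change < 0:
        return "improved"
    return "persistent"


# Example: a patient in Grade II at admission who is in Grade III on day 7.
print(grade_trend("II", "III"))  # worsened
print(grade_trend("III", "II"))  # improved
print(grade_trend("I", "I"))     # persistent
```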

ACLF cohort and survival: AARC ACLF score and grade

The AARC cohort is a homogeneous one, i.e. all had liver failure due to acute hepatic insult. The historical derivation and validation cohorts were comparable in their baseline characteristics (Supplementary Table 1); however, there was a trend towards reduced 90-day mortality (52.9 vs. 47.3%, p = 0.05), a difference which could be because of improved outcome of ACLF over time.

The mortality for the ACLF cohort was a dynamic one as shown in the life table (Table 4), with scores of 5 or 6 having good outcomes and scores of 14–15 being nearly fatal; with the existing score at each point, the survival for 28 days as well as 90 days could be well predicted. The median survival in AARC Grade III was only 12.6 (95% CI 10.6–14.6) days in comparison to >28 days for Grade II and Grade I. At presentation to the health care facility, AARC Grades II and III were associated with 3.6 times (HR 95% CI 2.2–5.9) and 13.5 times (HR 95% CI 8.3–21.9) increased mortality for 28 days in the absence of liver transplantation (Supplementary Fig. 2). In the present study, the conditional survival improved over time (p = 0.03) in the ACLF cohort in contrast to the actuarial survival based on the Kaplan–Meier curve (Fig. 3).
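
The contrast between actuarial and conditional survival can be made concrete with a short Python sketch. It assumes the standard definition of conditional survival, CS(t | s) = S(t) / S(s) for t ≥ s, where S is the Kaplan–Meier estimate; the source does not spell out its exact computation, and the function names and the five-patient data set are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, S(time)) at event times."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve, t):
    """S(t): the last estimated value at or before time t (1.0 before
    the first event)."""
    value = 1.0
    for time, s in curve:
        if time > t:
            break
        value = s
    return value


def conditional_survival(curve, s, t):
    """Probability of being alive at t given being alive at s (t >= s)."""
    if t < s:
        raise ValueError("t must not precede s")
    baseline = survival_at(curve, s)
    if baseline == 0:
        raise ValueError("no survivors at time s")
    return survival_at(curve, t) / baseline


# Made-up follow-up in days; event 1 = died, 0 = censored.
days = [5, 10, 20, 28, 90]
died = [1, 1, 0, 1, 0]
km = kaplan_meier(days, died)
print(survival_at(km, 28))               # actuarial survival to day 28
print(conditional_survival(km, 10, 28))  # given alive at day 10
```

In this toy example the chance of reaching day 28 is higher for a patient already alive at day 10 than the actuarial estimate from admission, which is the pattern the text describes.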

Table 4 Life table: dynamic prediction of estimated survival at time points predicting expected survival for 90 days (in %)

Fig. 3 Conditional and actuarial survival. The conditional survival improved in the present study over time, and is better than the actuarial survival (KM curve). In the initial few weeks, the mortality is higher in ACLF and the cumulative survival decreased over time, but at each time point, the conditional survival improves, which was due to recovery in organ failure and overall clinical condition

Discussion

The basic premise for defining a syndrome is to identify a group of patients who have a distinct presentation, course and outcome. The present study from AARC includes patients from across Asia with liver failure in response to an acute hepatic insult. The results bring forth a simple ‘liver failure grading system’ based on five variables, namely serum bilirubin, INR, serum lactate, serum creatinine and grade of HE. The existing scoring systems are either for acute liver failure, such as King’s College Criteria, for decompensated cirrhosis, such as CTP or MELD [5, 8], or for organ failure, such as SOFA [5] or CLIF SOFA [2, 9]. There is no score dedicated to liver failure in a cirrhotic patient, commonly recognized as a distinct entity of ACLF.

The present study included more than 1400 patients from several centers across Asia, and used a large derivation cohort of 480 patients to develop a dynamic prognostic model, which was validated in a subsequent 922 patients to predict mortality. The patient population in the present study was homogeneous, with chronic liver disease/cirrhosis presenting for the first time as acute hepatic decompensation in response to an acute hepatic insult, suggesting wider applicability. It was interesting to note that severe alcoholic hepatitis was the most common acute insult in Asia, unlike the earlier belief that HBV was the predominant etiology in Asia.

The AARC prediction model was developed using a large derivation cohort of 480 patients with follow-up of up to 28 days. The model was constructed based on the five most significant variables in the univariate analysis. These variables included bilirubin, creatinine and INR (currently used for MELD score) and grade of HE, and serum lactate as the marker of liver failure, tissue perfusion and systemic hemodynamics. A simple scoring system, like the Child–Turcotte–Pugh (CTP) score, was developed with predictable mortalities of <20, 50 and >80% being awarded 1, 2 or 3 points. A grading system, i.e. Grade I for a score of 5–7, Grade II for 8–10 and Grade III for 11–15, was associated with 28-day mortalities of 12.7, 44.5 and 85.9%, respectively. The grades show a potentially recoverable group (Grade I), a group that needs special monitoring (Grade II) and a group that demands immediate interventions for improved outcome (Grade III).

The AARC ACLF score was further tested and calibrated using a validation cohort of 922 cases. This showed good agreement, with a good matching of the expected and observed frequencies from the derivation set (R² = 0.97) and the validation set (R² = 0.94). The AARC model was found to be better than existing models for ACLF, with good predictability, i.e. an AUROC of 0.80 (derivation cohort) and 0.78 (validation cohort). It is even more robust than recently reported models [4] in which the AUROC is below 0.80.

The AARC ACLF model is a simple bedside tool which does not require sophisticated variables. The availability of serum bilirubin, creatinine, INR and lactate and evidence of HE could help in predicting the outcome in ACLF. It has a distinct advantage over the cumbersome CLIF-SOFA (6 parameters) or confusing terms like CLIF-C [4], CLIF-AD [9] or MELD [8].

The AARC ACLF score is dynamic in nature. It can predict 28-day survival at presentation (those with 9 or below) and at day 7 (score of 9 or below). For a score of 10 or above, with each unit increase, mortality increases sharply compared with those below 10 at initial presentation (20 vs. 4%). The life table also explains the applicability of this model for 90-day survival. This provides us with a window of opportunity to explore definitive therapy including liver transplantation. A shift from Grade I to Grade III liver failure at day 4 and 7 increased the mortality and, at the same time, the persistence of Grades I or II until 7 days predicted improved survival, whereas persistence in Grade III failure is uniformly severe and warrants early consideration for transplantation.

The current AARC model considered the parameters used to determine MELD, i.e. bilirubin, creatinine and PT-INR, which have been used for organ allocation. Applying MELD, an AUROC of 0.76 was derived in the present study. However, MELD has the disadvantage of relying on creatinine, which is an extra-hepatic component, while ignoring encephalopathy and hemodynamic alterations subsequent to liver failure [1–3, 10]. The SOFA and CLIF SOFA scores had lower predictability in the present analysis owing to different subgroups with hepatic failure being at the core.

In patients with ALF, the presence of HE is a major determinant of the urgency of intervention, including liver transplantation. By the same analogy, the addition of HE is justifiable in a model to predict the outcome of patients [11]. In fact, patients with ACLF presenting with HE have been shown to have a significantly higher mortality [11, 12]. Serum lactate levels are elevated in relation to portal pressure and correlate with hepatic dysfunction and stage of cirrhosis [13, 14].

The existing disease severity scores have many limitations when used as dynamic prognostic models. The currently used CTP system has a “ceiling effect”, and studies suggest the need for modification for the rising bilirubin and prothrombin time [15]. Shi et al. showed that the hepatic ACLF, as a distinct entity in the CANONIC cohort, where the MELD or CLIF SOFA scores were inaccurate, rather than age, the presence of HE, in addition to serum bilirubin, creatinine and INR, could predict the outcome better [16]. The widely used MELD for organ allocation is inappropriate when used in ACLF as it underestimates the illness severity and mortality in the absence of prioritization for hemodynamics and HE. Further use of MELD-Na has improved the predictability but it still does not consider HE or tissue perfusion, which are important and distinct determinants of progressive liver failure [16–18]. The authors have reported a model including lactate, INR, total bilirubin and creatinine, i.e. MELD lactate, as superior to MELD in predicting 30-day mortality after liver transplantation [14]. In fact, serum lactate is a good marker of liver failure, tissue perfusion and systemic hemodynamics. These observations support the inclusion of HE and lactate into the proposed AARC model and scores. Zheng et al. [19] had shown a good prediction model considering the Conditional Survival Estimate (CSE) for HBV-ACLF cases. ACLF is an aggressive critical condition with high short-term mortality; the hazard of death did not remain constant over time. The CSE is better than the actuarial survival (based on the Kaplan–Meier curve) for survival prediction, because conditional survival improved over time but not cumulative survival. In our present study, the CSE model was also a good fit for tailoring patient-specific treatment [19, 20].

To conclude, ACLF syndrome is distinct from acute liver failure and decompensated cirrhosis, and the distinction is clear only in the absence of prior decompensation, with acute hepatic insult and presenting as hepatic failure with or without extra-hepatic organ failure. A model that considers liver failure as measured by total bilirubin, coagulation and encephalopathy, the most common extra-hepatic organ failure (AKI, i.e. serum creatinine) and lactate as a surrogate marker of liver failure as well as tissue perfusion/hemodynamics is an excellent prognostic model. The AARC score and grading is a validated severity score and a dynamic model for prognostication and timely referral for liver transplantation.