Introduction

Hepatocellular carcinoma (HCC) is the seventh most common cancer but the second leading cause of cancer-related deaths in both sexes worldwide, according to GLOBOCAN [1]. Unfortunately, it is diagnosed in most cases in advanced stages that are not amenable to curative treatments, such as surgical resection or hepatic transplantation. In those cases, palliative loco-regional treatments, such as transarterial chemoembolization (TACE) or radioembolization (RAE), systemic treatments, or best supportive care are usually given. In two randomized phase III clinical trials, conducted in Asian and Western populations, sorafenib, an oral multikinase inhibitor of the vascular endothelial growth factor receptor (VEGFR), the platelet-derived growth factor receptor (PDGFR), and RAF, was shown to improve overall survival (OS) compared to placebo in patients with preserved liver function and advanced hepatocellular carcinoma (aHCC), defined as the presence of extra-hepatic metastases or unsuitability for, or progression after, standard loco-regional treatments [2,3,4,5].

Unfortunately, the survival benefit seen with sorafenib is modest and limited to patients with good to fair performance status and normal or near-normal liver function, as defined by a Child–Pugh (C–P) A stage and an Eastern Cooperative Oncology Group (ECOG) performance status (PS) of 0–2. In the real-life setting, it has also been difficult to replicate the results seen in the randomized trials [6]. Differences in the aetiology of the baseline hepatic impairment and the inclusion of patients with poorer performance status or worse liver function may explain these differences. On the other hand, sorafenib, although in general well tolerated, can be associated with troublesome toxicities, such as diarrhoea, hand-foot syndrome (HFS) and fatigue, which often lead to treatment discontinuations and/or dose reductions and perhaps to a lower efficacy of the drug if these toxicities are not managed adequately. Finally, a significant percentage of patients tend to deteriorate quickly after progression on sorafenib, a situation where second-line options are scarce and limited at the moment to the use of regorafenib, another oral multikinase inhibitor, in selected patients after sorafenib progression, as recently reported in a placebo-controlled randomized phase III trial [7]. Thus, there is a need to optimize the selection of patients with aHCC who will benefit most from the use of sorafenib and to recognize those patients who are benefitting from the continued use of the drug.

Despite this need, most prognostic models in patients with HCC have focused on the adequate selection of candidates for surgical resection or loco-regional therapies and have not been validated to discriminate between patients in advanced stages [8,9,10,11,12,13,14]. A few models have been constructed for aHCC but have been developed without specifically considering treatment with sorafenib or have focused on markers that were costly and more difficult to obtain, making them impractical for daily clinical practice [15]. However, perhaps of more interest and easier to obtain in the clinical setting, several studies have explored the possible association of sorafenib-linked toxicity (mainly skin rash and diarrhoea) with a better prognosis [16, 17]. In this sense, a recent study showed a positive correlation between skin rash and a better prognosis after conducting a time-dependent analysis [18]. Other studies have shown a prognostic value of new factors linked with inflammation, such as the neutrophil-to-lymphocyte ratio (NLR) [19,20,21,22].

With this in mind and recognizing the need for a prognostic tool that reflects outcomes from treatment with sorafenib, we explored factors associated with improved OS in patients with aHCC treated with first-line sorafenib at our Institution. We report a new prognostic model based on factors that can be obtained rapidly in our routine patient management.

Materials and methods

Study design

A retrospective database analysis was performed in April 2015. The database was created in 2008 and since then, clinical data from all aHCC patients treated at La Fe University and Polytechnic Hospital have been entered and prospectively maintained. Confidentiality of patients’ data was kept throughout the study. The study was approved by the Local Ethics Committee of Clinical Research and was conducted in accordance with the Declaration of Helsinki for studies in humans.

Patients

Patients included in this retrospective analysis must have had a diagnosis of aHCC and have started first-line treatment with sorafenib between January 2008 and April 2015. Treatment with sorafenib was begun in these patients according to the Barcelona Clinic Liver Cancer (BCLC) staging system [12]; namely, in those patients with stage C disease (C–P score A–B, PS 1 or 2, portal invasion, lymph node involvement or distant metastases) or in those patients with stage B disease (C–P score A–B, PS 0 with multinodular disease not amenable to resection or liver transplantation) where loco-regional therapies such as RAE or TACE had failed or were technically not feasible. Patients with stage 0 or A disease, who were candidates for curative local therapies, and patients with stage D disease (PS >2 or C–P score C) were not considered for treatment with sorafenib.

Data collected included patient age, sex, year of diagnosis, aetiology of HCC, ECOG PS, C–P score, BCLC stage, baseline full blood count (including absolute neutrophil and lymphocyte counts) and biochemistry, baseline alpha-fetoprotein (AFP) levels, prior loco-regional treatments, as well as the presence of portal vein thrombosis or distant metastases. An AFP cut-off level of 200 ng/ml was chosen for further statistical analysis, in accordance with previous studies [23]. Patients with no baseline blood counts or with less than 30 days of follow-up from the start of sorafenib, unless the follow-up ended because of death, were excluded from the analysis.

Treatment-related data retrieved included initial sorafenib dosage, start and end of sorafenib treatment, treatment discontinuations and/or dose reductions, early (defined as less than 30 days after the start of sorafenib) and overall toxicity, grades of early and overall toxicity, number of sorafenib cycles and reason for discontinuation of sorafenib. Standard initial dosing was, as per label, 400 mg q12 h every day; however, at the physician's discretion, a lower dosage of 400 mg q24 h every day could be started if there were concerns about possible unacceptable toxicity, especially in older patients and in patients with poor PS or C–P score B. In those cases, the introduction of the full dosage of sorafenib was encouraged after the first month of treatment if tolerance was acceptable. Each cycle of sorafenib was defined as the continued administration of sorafenib for 28 days. Lastly, the final day of follow-up along with any event (progression of disease or death) that occurred during the follow-up period was also noted. Patients without baseline blood test results available were excluded.

Statistical analysis

The NLR was defined as the baseline absolute peripheral neutrophil count (cells/mm³) divided by the baseline absolute peripheral lymphocyte count (cells/mm³). For baseline NLR values, counts from day 1 of the first cycle of sorafenib were used. The optimal cut-off level for baseline NLR was determined by receiver operating characteristic (ROC) analysis: the value giving the maximum sensitivity and specificity was calculated, and patients were divided into two groups based on this NLR cut-off value.
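The NLR calculation and cut-off search described above can be illustrated with a minimal sketch. It assumes that "maximum sensitivity and specificity" corresponds to maximizing Youden's J (sensitivity + specificity - 1) and that the ROC outcome is a binary event flag per patient; neither detail is stated in the text, and the function names are hypothetical.

```python
def nlr(neutrophils, lymphocytes):
    """Neutrophil-to-lymphocyte ratio from absolute counts in the same units."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def youden_cutoff(values, events):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    Assumption: a patient is 'test positive' when value > cutoff, and
    events[i] is True when the outcome of interest occurred.
    """
    positives = sum(1 for e in events if e)
    negatives = len(events) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, events) if e and v > cutoff)
        tn = sum(1 for v, e in zip(values, events) if not e and v <= cutoff)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        # Keep the first (lowest) cut-off that reaches the best J.
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    # Invented illustrative values, not study data.
    ratios = [1.5, 2.0, 3.0, 4.0, 5.0, 6.5, 8.0]
    died = [False, False, False, False, True, True, True]
    print(youden_cutoff(ratios, died))
```

In practice the same search is usually delegated to a statistics package; the loop is written out here only to make the criterion explicit.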

OS was defined as the time from the onset of sorafenib to date of death from any cause with censoring at date of last contact for patients alive at the cut-off date. The modified RECIST criteria were used for assessment of disease response. Progression-free survival (PFS) was defined as the time from the onset of sorafenib to the date of progression by radiological criteria. Comparisons between two groups were conducted using the Student’s t test for continuous data and the χ² test for categorical data. A Kaplan–Meier survival analysis was performed to compare the OS and the PFS of the patients in different groups, and the significance of the intergroup difference was evaluated using the log-rank test.
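As an illustration of the Kaplan–Meier estimate with censoring mentioned above, the following is a minimal standard-library sketch, not the SPSS or R procedure actually used in the study. The function names and the example data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times: follow-up in months; events: True = death, False = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


if __name__ == "__main__":
    # Invented illustrative data, not study data.
    months = [2, 4, 4, 6, 9, 12]
    died = [True, True, False, True, False, True]
    km = kaplan_meier(months, died)
    print(km, median_survival(km))
```

Censored patients contribute to the number at risk until their last contact but never reduce the survival estimate themselves, which is the property the OS definition above relies on.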

The prognostic variables in predicting OS were assessed by a multivariate Cox proportional hazards regression analysis. Variables that proved to be significant in the univariate analysis were subsequently tested with the multivariate Cox proportional hazard model using a forward selection method. The hazard ratio of survival by Cox proportional hazard model was calculated to compare the strength of predictors of survival.
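For reference, the Cox proportional hazards model referred to above has the following standard form. The symbols are introduced here for illustration and do not appear in the original text: h_0(t) is the unspecified baseline hazard, x_1, ..., x_p are the covariates (for example PS, C–P score, baseline NLR and early-onset diarrhoea), and each hazard ratio is the exponential of the corresponding coefficient.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right),
\qquad
\mathrm{HR}_j = e^{\beta_j}
```

A hazard ratio below 1 for a factor therefore indicates a lower instantaneous risk of death, and hence improved OS, when that factor is present and the other covariates are held fixed.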

To develop the prognostic score, we compared models with all possible variable combinations based on Harrell’s concordance index (HCI, or C-index), the Akaike information criterion (AIC) and ROC curve analysis. The HCI was used to rank the score’s ability to discriminate patients with regard to OS. An HCI of 0.5 suggests no predictive discrimination power, while an index of 1.0 indicates perfect discriminatory power. The AIC, for which lower values are better, was used to assess the relative goodness of fit. The ROC curve analysis was used to test the discriminative ability of the models. The model with the highest HCI and area under the curve (AUC), and with the smallest AIC value, was selected as the final model. Finally, patients were categorized into risk groups based on the number of baseline risk factors, and the median OS was calculated for each group. Internal validation of the predictive performance of the final model was assessed by a bootstrap resampling procedure and by calculating the pseudo-R² index.
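The two model-comparison measures described above can be sketched as follows. This is a minimal illustration under stated assumptions: the pairwise definition of Harrell's index is the usual one for right-censored data (pairs with tied follow-up times are skipped, ties in risk score count as one half), the AIC is written in its general form 2k - 2 ln L, and the function names are hypothetical. It is not the routine used in the original analysis.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair is usable when the patient with the shorter follow-up died.
    It is concordant when that patient also has the higher risk score.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def aic(log_likelihood, n_parameters):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_parameters - 2 * log_likelihood


if __name__ == "__main__":
    # Invented illustrative data, not study data.
    months = [3, 5, 8, 12]
    died = [True, True, False, True]
    points = [5, 3, 2, 0]  # higher score = worse expected prognosis
    print(harrell_c_index(months, died, points))
```

With a points-based score, many patients share the same value, so the handling of tied scores has a visible effect on the index; the half-credit convention used here is one common choice.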

All statistical tests were performed against a two-sided alternative hypothesis using a significance level of 0.05 and a 95% CI. SPSS Statistics version 21.0 (IBM, Inc.) and RStudio Version 0.98.501 (RStudio, Inc.) were used for the statistical analyses. The statistical analysis was performed in October 2016.

Results

Patients’ characteristics

A total of 145 patients with aHCC treated with sorafenib from 2008 to 2015 at our Institution were included in this analysis. Baseline patients’ characteristics are summarized in Table 1.

Table 1 Baseline characteristics of the 145 patients

The patient population consisted of 109 men and 36 women, with a mean age of 62 years (range 26–82 years). Most patients (70%) had symptoms at diagnosis and presented with an ECOG PS 1, while only 7% had an ECOG PS 2. A Child–Pugh score of A was seen in 72% of patients, while the remainder had a Child–Pugh score of B. With regard to the BCLC criteria, 83% were stage C patients and 17% were stage B patients. Distant metastases were present at diagnosis in 49% of patients. Underlying liver cirrhosis was present in 82% of patients, while portal vein tumour thrombosis was diagnosed in 37%.

More than half of the patients had previously received some kind of treatment for HCC. The most common prior treatment was TACE, used in 49% of patients. Radiofrequency ablation (RFA) was used in 18% of patients. A prior liver transplantation had been performed in 12% of patients, while curative surgical resection had been performed in 10%.

Treatment characteristics: safety and toxicity

The median duration of sorafenib treatment was 6 months, with a range from 1 to 40 months. At the time of this analysis, 16 patients (11%) were still receiving sorafenib. The main reason for treatment discontinuation was disease progression in 90 patients (62%), whereas 39 patients (27%) discontinued the treatment because of unacceptable adverse events. There were no treatment-related deaths.

Ninety-six patients (66%) started with the dose of 800 mg daily, while 49 patients (34%) started with the reduced dose of 400 mg daily. The initial reduced dosage was significantly more frequent in patients with PS 2 than in patients with PS 0 or 1, and in patients with a C–P B score than in patients with a C–P A score. However, there was no statistically significant relationship between initial dosage and older age, distant metastases, sex, portal thrombosis, aetiology of the cirrhosis or the presence of symptoms (PS 1 and 2 compared to PS 0 patients).

Eighty-three patients (57%) required dose reductions to manage adverse events during treatment. Table 2 summarizes grade 3–4 and early (all grades) toxicity linked with the use of sorafenib. Early toxicity was defined as any side effect (any grade) that appeared during the first 30 days of treatment. The most frequent early side effects related to sorafenib (any grade) were diarrhoea in 42% of patients and HFS in 20% of patients. On the other hand, overall grade 3 or 4 toxicities were observed in 62 patients (43%). The most frequent grade 3–4 toxicities were fatigue (19%), HFS (8%) and diarrhoea (7%). Overall, but not early-onset, grade 3–4 toxicity was significantly more frequent in patients who began treatment with the lower dosage of 400 mg every day than in those who began with the standard dosage (54.8% vs 37.5%, p = 0.048).

Table 2 Safety and toxicity with the use of sorafenib

Treatment outcomes: response rate and survival

With a median follow-up of 43 months, median PFS and OS were 4.15 months (95% CI 2.5–5.8) and 6.7 months (95% CI 4.5–8.8), respectively, with a 1-year OS of 33%. The overall response rate (ORR) was 11% (all partial responses), while disease stabilization was seen in 26.2% of patients, for an overall disease control rate of 37.2%.

Univariate analysis for overall survival

The NLR cut-off for OS prediction by ROC analysis was 4 (area under the curve, AUC = 0.62). In the univariate analysis, an ECOG PS of 0 (p = 0.0001), a Child–Pugh score of A (p = 0.001), a baseline NLR ≤4 (p = 0.027) and early-onset diarrhoea (p = 0.038) were significantly associated with an improved OS (Table 3). In contrast, the aetiology of the hepatic dysfunction, the presence of cirrhosis, the initial dosage of sorafenib, the presence of portal thrombosis, an elevated baseline AFP level and early-onset HFS did not have an impact on survival in the univariate analysis (all p values >0.05).

Table 3 Univariate and multivariate analyses for improved overall survival

Multivariate analysis for overall survival

In the multivariate analysis of the factors that were significant in the univariate analysis, a PS of 0 (p = 0.001), a Child–Pugh score of A (p = 0.0001), a baseline NLR ≤4 (p = 0.013) and early-onset diarrhoea (p = 0.0001) retained their significance as independent factors linked to an improved OS (Table 3). The hazard ratios and 95% confidence intervals for these independent factors are outlined in Table 3.

Risk grouping

As shown above, four factors were identified as significant prognostic factors linked to improved OS. We assigned 0 points to each good prognostic factor and 1 point to each poor prognostic factor, except for a PS of 1–2, which was assigned 2 points because of its higher coefficient in the multivariate analysis. We calculated the total number of points for each patient and defined the sum as the La Fe score, which ranges from 0 to 5. Patients were categorized into four risk groups based on the number of points of the La Fe score. Different OS rates were found for each group (Table 4).
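The point assignment described above can be written out as a short sketch. It reflects our reading of the text, in which the poor prognostic factors are the complements of the four favourable ones (PS 1–2, C–P score B, baseline NLR >4, and absence of early-onset diarrhoea); the function name and argument names are hypothetical.

```python
def la_fe_score(ecog_ps, child_pugh, baseline_nlr, early_diarrhoea):
    """Sum the points for the four factors of the score (range 0 to 5).

    ecog_ps: 0, 1 or 2
    child_pugh: "A" or "B"
    baseline_nlr: neutrophil-to-lymphocyte ratio at day 1 of cycle 1
    early_diarrhoea: True if diarrhoea appeared in the first 30 days
    """
    if ecog_ps not in (0, 1, 2):
        raise ValueError("ECOG PS must be 0, 1 or 2")
    if child_pugh not in ("A", "B"):
        raise ValueError("Child-Pugh class must be 'A' or 'B'")
    points = 0
    points += 2 if ecog_ps >= 1 else 0       # PS 1-2 carries 2 points
    points += 1 if child_pugh == "B" else 0  # C-P B is the poor factor
    points += 1 if baseline_nlr > 4 else 0   # NLR <= 4 is favourable
    points += 0 if early_diarrhoea else 1    # assumption: absence scores 1
    return points


if __name__ == "__main__":
    # Hypothetical patient: PS 1, C-P A, NLR 3.0, early diarrhoea present.
    print(la_fe_score(1, "A", 3.0, True))
```

Because early-onset diarrhoea is only known after the first month, a score computed this way cannot be obtained at baseline.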

Table 4 Median overall survival times after categorization in the four-tier and three-tier systems

Patients with zero points were in the very low-risk category (n = 9, 6%); the median OS was 48.6 months (95% CI 14.8–82.4). Patients with one point were in the low-risk category (n = 36, 25%); the median OS was 11.6 months (95% CI 3.3–19.9). Patients with two to three points were in the intermediate-risk category (n = 61, 42%); the median OS was 8.3 months (95% CI 6.1–10.5). Patients with four to five points were in the poor-risk category (n = 39, 27%); the median OS was 4.4 months (95% CI 1.4–7.6) (p = 0.0001) (Fig. 1). In this model, HCI for OS was 0.659, while AIC was 1.180, with a value of 0.69 for the ROC curve analysis.
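The mapping from points to risk categories reported above can be summarized in a short sketch. The function and dictionary names are hypothetical; the median OS values are those reported in this paragraph and describe this cohort only, not a prediction for any individual patient.

```python
# Median OS in months per risk category, as reported in the text above.
REPORTED_MEDIAN_OS_MONTHS = {
    "very low risk": 48.6,
    "low risk": 11.6,
    "intermediate risk": 8.3,
    "poor risk": 4.4,
}


def la_fe_risk_group(points):
    """Map a total score of 0 to 5 points to the four-tier risk category."""
    if not isinstance(points, int) or not 0 <= points <= 5:
        raise ValueError("score must be an integer from 0 to 5")
    if points == 0:
        return "very low risk"
    if points == 1:
        return "low risk"
    if points <= 3:
        return "intermediate risk"
    return "poor risk"


if __name__ == "__main__":
    for score in range(6):
        group = la_fe_risk_group(score)
        print(score, group, REPORTED_MEDIAN_OS_MONTHS[group])
```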

Fig. 1

Kaplan–Meier curve for median overall survival among the four risk categories of our four-tiered prognostic model. Blue very low risk, green low risk, yellow intermediate risk, purple high risk

We also evaluated a theoretical prognostic model constructed by using only the three baseline factors, excluding early-onset diarrhoea (Table 4). The main advantage of this model is that it would allow a baseline assessment, with no need to wait for the first month of treatment before making the risk stratification. However, its ability to discriminate and its calibration, as measured by the ROC curve analysis, the C-index and the AIC, were worse than those of the model including early-onset diarrhoea (data not shown). Thus, the four-tier system was shown to have greater accuracy for OS prediction in our population.

Discussion

There have been several reports of prognostic models for HCC patients. Most have included tumour extension and baseline liver function, as it is well known that a baseline hepatic impairment frequently has a significant impact on prognosis, regardless of the tumour stage. To date, the best known and most widely used scores include the Okuda [8], the Cancer of Liver Italian Program (CLIP) [9], the BCLC [12], the Groupe d’Étude et de Traitement du Carcinoma [11], the TNM [7], the Chinese University Prognostic Index [10] and the Japan Integrated Staging Score (JIS score) [14] scores. Unfortunately, most of these staging systems are useful to select candidates for local or loco-regional treatments, but may be suboptimal to classify patients treated specifically with sorafenib, since none of them has been validated specifically in this subgroup of patients.

In our study, a prognostic index model was developed in our population of sorafenib-treated aHCC patients using four factors that were highly associated with an improved OS. These factors are readily available during routine patient management and enable patients to be categorized into four distinct risk groups (very low, low, intermediate and poor risk). Furthermore, the model has been internally validated, showing encouraging AUC values and pseudo-R² coefficients. However, external validation in a larger population in a prospective manner is still pending.

With regards to the factors included in our score, both PS and the C–P score are already considered classical prognostic factors; however, more weight is given in our model to the presence of disease-related symptoms (PS 1 and 2) due to its higher coefficient in the regression analysis.

Regarding the baseline NLR, our cut-off was defined by the ROC curve analysis (NLR <4 vs NLR ≥4), which is consistent with other published papers such as the work of Zheng et al. [24]; however, other authors have proposed other cut-off points (da Fonseca et al., with a NLR <3.5 vs ≥3.5 [19], Wei et al. with a NLR <3 vs ≥3 [21] or Ji et al. with a NLR <2 vs ≥2 [25]) and there is still a lack of consensus about the optimal baseline NLR threshold in this setting. Despite these discrepancies, the NLR remains a promising prognostic biomarker in several cancer subtypes, including HCC, and is easily available by flow cytometry in the routine clinical setting. Recent studies have also demonstrated that neutrophils and their molecular products have an important role in the tumorigenesis and invasion-metastasis cascades, and could be considered legitimate candidates as potential therapeutic targets in cancer patients [26]. These findings, considered together with the recent encouraging reports of the activity of immune-checkpoint blockade agents in patients with aHCC [27], should pave the way for new immunotherapy strategies for the treatment of HCC [28].

In terms of toxicity, our data showed that the development of early-onset diarrhoea had a positive and statistically significant impact on OS and played a relevant role in our model, improving its calibration and discrimination. In view of this, we decided to incorporate early-onset diarrhoea into our final prognostic model, despite the disadvantages it brings, mainly the delay in its implementation until one month after the start of sorafenib treatment. On the other hand, we did not observe any differences in survival in those patients who experienced early HFS, although its potential prognostic value has also been reported in the literature [16,17,18, 29]. Of note, our overall toxicity rates are consistent with those previously published, both in other real-world experiences [30] and in clinical trials. Paradoxically, those patients in whom treatment was begun with a lower dosage of sorafenib experienced higher overall, but not early-onset, grade 3–4 toxicity. This may relate to the poorer PS and worse liver function seen in these patients, who perhaps were not the best candidates for starting sorafenib treatment. A better selection of patients, especially with regard to liver function, is necessary and, if treatment with sorafenib is decided, it should be started with the full dosage as recommended by most guidelines.

It is also noteworthy that our survival data in terms of median PFS and median OS (4.15 and 6.7 months, respectively) are similar to those reported from daily clinical practice outside the setting of clinical trials [30]. Our poorer outcomes compared to the randomized phase III SHARP trial [2], which was conducted in a Western population fairly similar to ours, could be explained by the more restrictive inclusion criteria of the clinical trial, which only included good PS patients with preserved liver function and few or no comorbidities.

The present study has several limitations. First, this is a retrospective study. Second, the initial sorafenib dose varied among individual patients, which could have led to bias. In addition, several off-label therapies were administered after discontinuation of sorafenib, also potentially leading to bias concerning the survival benefit. Finally, this is a single-centre study and prospective studies are needed to determine the utility of the model for selecting the treatment for patients with aHCC.

Conclusion

We have developed a prognostic index model, composed of four routinely available and readily assessable factors, that categorizes patients with aHCC treated with first-line sorafenib into four distinct risk groups. This model could be useful in our daily clinical practice to aid in the assessment of the prognosis of our patients, and may help in planning their follow-up. However, an external validation in a larger population is required.