Introduction

In patients with hepatocellular carcinoma (HCC), tumor size is closely related to survival and recurrence rates and remains a significant prognostic factor even after hepatic resection [1]. Patients with chronic hepatitis B and C are followed closely and many cases of HCC are detected in the early stages, although some are still diagnosed at advanced stages. Percutaneous ethanol injection and radiofrequency ablation are treatment modalities used for patients with small HCC, in conjunction with surgery. However, HCC larger than 10 cm in diameter (large HCC) is resistant to these non-surgical therapies [2, 3], and since liver transplantation is not an accepted treatment modality for large HCC [4], only hepatic resection offers the chance of long-term survival [5]. Compared with HCC smaller than 10 cm, large HCC requires major hepatectomy [6, 7], and the risk of postoperative liver failure is higher in such patients [8]. The postoperative recurrence rate in patients with large HCC is reported to range from 67 to 72% [7, 9], and recurrence is more frequent than in patients with HCC of any other size [10–12]. Nevertheless, some patients with large HCC show a favorable surgical outcome. This suggests that the selection criteria have a positive effect on outcome [5, 7, 13–16], though this finding remains controversial.

At present, the TNM staging system for HCC is generally used worldwide. In Japan in particular, the TNM classification by the Liver Cancer Study Group of Japan (LCSGJ), which is similar to the TNM classification for HCC by the American Joint Committee on Cancer/International Union Against Cancer (AJCC/UICC) [17], is used across the country. Recently, new classification systems have been developed for patients with HCC, such as the Cancer of the Liver Italian Program (CLIP) scoring system and the Japan Integrated Staging (JIS) scoring system [18, 19]. These staging systems cover not only tumor progression but also liver function and tumor marker(s). Many groups have reported that the new classification systems reflect the prognosis of patients better than the conventional classifications, including the AJCC/UICC TNM classification [20, 21]. To our knowledge, however, these new classification systems have not been evaluated for predicting prognosis after surgery in patients with HCC larger than 10 cm in diameter.

The present study was designed to determine the markers of poor overall survival (OS), and evaluate the accuracy of prediction of 1-year, 3-year and 5-year mortality by the existing staging systems. The results showed that the CLIP system provides the best prediction of prognosis after surgery for patients with large HCC.

Patients and methods

Patients and follow-up

From the beginning of January 1981 to the end of December 2001, 928 patients with HCC underwent hepatic resection in the Department of Surgery, Osaka Medical Center for Cancer and Cardiovascular Diseases. Large HCC was defined as HCC larger than 10 cm in diameter. Of these 928 patients, 42 (4.5%) had large HCC. These patients included 36 men and 6 women with a mean age of 57 years (range 35–82). The median follow-up period was 132 months.

Tumor resectability was assessed by ultrasonography (US), computed tomography (CT) and angiography. Intraoperative US was performed in all patients. Eighteen patients received transarterial chemoembolization (TACE) preoperatively. Major hepatectomy was defined as lobectomy, extended lobectomy or trisegmentectomy, while minor hepatectomy was defined as segmentectomy, subsegmentectomy or partial resection. The operative procedures comprised 30 major hepatectomies and 12 minor hepatectomies. Curative resection was defined as surgery in which all tumors were resected both macroscopically and microscopically with a clear margin. There were 25 curative resection cases and 17 non-curative resection cases. Hospital death was defined as death within the same hospital admission as the hepatectomy, while operative death was defined as death within 30 days after surgery. The patients were followed up closely after hepatic resection, including physical examination, biochemical liver function tests, and analysis of tumor markers such as the serum levels of α-fetoprotein (AFP) and protein induced by vitamin K absence or antagonist II (PIVKA-II). US and dynamic CT were performed every three months up to five years after surgery, and every six months thereafter.

Statistical analysis of prognostic factors

All data were expressed as mean ± standard deviation. The survival curves were estimated using the Kaplan–Meier method and compared by the log rank test. Univariate analysis was conducted using 21 prognostic factors: gender, age, albumin, total bilirubin, indocyanine green (ICG) retention rate at 15 min (ICG-R15), prothrombin time (PT), AFP, hepatitis B virus surface antigen (HBs-Ag), hepatitis C virus antibody (HCV-Ab), Child–Pugh classification, preoperative TACE, liver cirrhosis, number of tumors, tumor diameter, macroscopic portal invasion, macroscopic venous invasion, intraoperative blood loss, blood transfusion, curability, histological differentiation by Edmondson–Steiner [22] and microscopic portal invasion.
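As an illustration of the Kaplan–Meier method named above, the following is a minimal sketch of the product-limit estimator. It is not the StatView or SPSS procedure used in the study; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: five patients, two of them censored.
toy_curve = kaplan_meier([6, 12, 12, 20, 30], [1, 1, 0, 1, 0])
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a step in the curve, which is why the estimate drops only at observed death times.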

Cox’s proportional hazards model was used for multivariate analysis employing 10 factors found to be significant in univariate analysis. All statistical analyses were conducted with either StatView (version 5.0; SAS Institute, Inc., Cary, NC) or SPSS II (version 11.0.1; SPSS Inc., Chicago, IL) software packages. All P values less than 0.05 were considered statistically significant.

The 42 patients were classified according to the criteria of the AJCC/UICC TNM classification, the LCSGJ TNM classification, the CLIP scoring system and the JIS scoring system. Tables 1, 2 and 3 provide the definitions of these staging and scoring systems. The distribution of patients across stages and the associated survival rates were compared for each staging system. The accuracy of prediction of 1-year, 3-year and 5-year mortality for each system was evaluated by calculating the area under the receiver operating characteristic (ROC) curve (AUC) to assess the discriminatory ability for the prediction of death. The ROC curve is a graphical display of the false-positive rate against the true-positive rate over multiple classification rules [23, 24]. An AUC of 1 represents a perfect test; an AUC of 0.5 represents a test no better than random prediction. In this analysis, patients whose follow-up was shorter than 1, 3 and 5 years were excluded from the prediction of 1-year, 3-year and 5-year mortality, respectively.
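The AUC evaluation described above can be sketched as follows. This is a minimal illustration, not the software procedure used in the study: the AUC is computed by pairwise comparison (equivalent to the Mann–Whitney statistic), and the exclusion rule is read as dropping patients censored before the time horizon, which is an assumption. The function names and toy data are hypothetical.

```python
def labels_at_horizon(times, events, horizon):
    """Select patients evaluable at `horizon` and label them.

    A patient counts as dead (1) if death was observed at or before the
    horizon, and as alive (0) if followed at least to the horizon.
    Patients censored before the horizon are excluded (assumed rule).
    Returns (kept_indices, died_flags).
    """
    kept, died = [], []
    for i, (t, e) in enumerate(zip(times, events)):
        if e and t <= horizon:
            kept.append(i)
            died.append(1)
        elif t >= horizon:
            kept.append(i)
            died.append(0)
    return kept, died


def auc(scores, died):
    """Probability that a patient who died has a higher stage or score
    than a patient who survived, counting ties as one half."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: follow-up in months, death indicator, staging score.
times = [6, 10, 14, 40, 50]
events = [1, 0, 1, 1, 0]
scores = [3, 1, 2, 2, 0]
kept, died = labels_at_horizon(times, events, horizon=12)
auc_1yr = auc([scores[i] for i in kept], died)
```

Because staging systems produce few distinct values, ties are common; counting them as one half matches the area under the empirical ROC curve drawn with straight segments between thresholds.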

Table 1 Definition of the AJCC/UICC TNM Classification by the International Union Against Cancer (UICC)
Table 2 Definition of the LCSGJ TNM classification by the Liver Cancer Study Group of Japan and the JIS scoring system
Table 3 Definition of the CLIP scoring system

Results

Postoperative complications occurred in 10 of 42 cases (23.8%). These included ascites, pleural effusion, bile leakage, liver abscess and gastrointestinal bleeding, with ascites or pleural effusion accounting for 50% of the cases. All complications responded to conservative treatment and did not cause death or require surgery. There were no hospital deaths or operative deaths in this series. The 1-year, 3-year and 5-year cumulative OS rates for all 42 cases were 57%, 33% and 27%, respectively. For the curative resection group, the overall recurrence rate was 72%.

Prognostic factors for survival

Table 4 shows the clinical and pathological features and the results of the univariate analysis used to identify significant prognostic factors of OS for all 42 patients. Univariate analysis selected ten significant factors of poor OS. They included age (≥60 years), albumin (<3.5 g/dL), ICG-R15 (≥15%), AFP (≥1,000 ng/mL), number of tumors (multiple), macroscopic portal invasion (present), macroscopic venous invasion (present), intraoperative blood loss (≥3,000 mL), and curability (non-curative resection).

Table 4 Prognostic factors of overall survival by univariate analysis

Cox’s proportional hazards model using the factors assessed as significant by univariate analysis identified AFP (≥1,000 ng/mL, P = 0.023) and curability (non-curative resection, P = 0.003) as significant and independent factors of poor prognosis affecting OS (Table 5).

Table 5 Prognostic factors of overall survival by multivariate analysis

Figure 1a and b shows the survival curves according to curability and AFP level, respectively. The 1-year, 3-year and 5-year cumulative OS rates of the curative resection group were 82%, 52% and 42%, respectively, while those of the non-curative resection group were 29%, 6% and 6%, respectively. The latter rates were each significantly lower than the respective rate of the curative resection group (P < 0.0001). The 1-year, 3-year and 5-year cumulative OS rates of the low AFP group (AFP < 1,000 ng/mL) and high AFP group (AFP ≥ 1,000 ng/mL) were 78%, 47%, 42% and 41%, 18%, 12%, respectively (P = 0.0118).

Fig. 1

Survival curves for all 42 patients estimated by the Kaplan–Meier method. a Comparison of survival rates of patients with curative resection and those with non-curative resection (log rank test P < 0.0001). b Comparison of survival rates of patients according to serum AFP level using a cutoff level of 1,000 ng/mL (P = 0.0118)

Patients’ classification according to the four staging systems

Table 6 shows the stage of each patient according to the four staging systems (the AJCC/UICC TNM classification, LCSGJ TNM classification, CLIP scoring system and JIS scoring system). The distribution of patients according to each system and the associated survival rates were compared. In the AJCC/UICC TNM classification, the 42 patients were classified as stage I (n = 9), stage II (n = 0), stage IIIA (n = 31), stage IIIB (n = 2) and stage IVA–IVB (n = 0). Thus, in this study of patients with large HCC, the distribution of patients was unbalanced and none of the patients was classified as stage II by the AJCC/UICC TNM classification. In the LCSGJ TNM classification and the JIS scoring system, the distribution of patients showed similar patterns because the JIS score is the sum of the points assigned to the LCSGJ stage and to the Child–Pugh stage. In fact, most patients with stage II, III and IVA in the LCSGJ TNM classification were grouped into JIS scores 1, 2 and 3, respectively. In the CLIP scoring system, patients were evenly distributed across the five CLIP scores (0–4), so that survival rates could be calculated for each score. The cumulative OS rate in the CLIP score = 0 group was 100%, and the 1-year, 3-year and 5-year cumulative OS rates in the CLIP score = 1 group were 75%, 57% and 46%, respectively (Table 6). On the other hand, the 1-year, 3-year and 5-year cumulative OS rates in the CLIP score = 2 group were 50%, 13% and 0%, respectively. The OS rates of patients with CLIP scores of 0 and 1 were significantly higher than those of patients with a CLIP score ≥2.

Table 6 Comparison of distribution of patients and associated survival rates according to the four systems (the AJCC/UICC TNM classification, LCSGJ TNM classification, CLIP scoring system and the JIS scoring system)

Evaluation of accuracy of prediction

The discriminatory ability of the AJCC/UICC TNM classification, the LCSGJ TNM classification, the CLIP scoring system and the JIS scoring system to predict death at 1, 3 and 5 years after surgery was evaluated by AUC (Fig. 2). The CLIP scoring system was superior in discriminating death within 1 year (CLIP, AUC = 0.733; LCSGJ, AUC = 0.685; JIS, AUC = 0.672; AJCC/UICC, AUC = 0.671). The CLIP scoring system also had the highest ability to discriminate both 3-year and 5-year death (3-year, AUC = 0.865; 5-year, AUC = 0.956). Thus, the CLIP scoring system showed the best performance among the four staging systems in the prediction of death at all the time points examined in the present study.

Fig. 2

Discriminatory ability for 1-year, 3-year and 5-year death evaluated by ROC curves for the AJCC/UICC TNM classification, LCSGJ TNM classification, CLIP scoring system and the JIS scoring system

Discussion

The efficacy of surgical resection of large HCC has been questioned and debated. With the development of surgical techniques and improvement of perioperative care, there has been a significant improvement in the postoperative outcome of patients who undergo hepatic resection for large HCC [25, 26]. Several studies have reported prognostic factors that may influence the outcome of patients with large HCC. In 1997, Noguchi et al. [13] evaluated the clinicopathological features and long-term survival of 20 cases of large HCC (>10 cm in diameter) and reported that patients in the curative resection group had a favorable long-term prognosis when clinicopathological examination showed no macroscopic portal vein invasion and a non-aneuploid DNA ploidy pattern. However, their curative resection group comprised only nine cases. Lee et al. [7] analyzed 40 patients with large HCC and reported three factors, namely multiple tumors, venous invasion and impaired liver function, to be associated with postoperative recurrence. The authors concluded that resection offered the chance of long-term disease-free survival in selected patients. Poon et al. [14] evaluated the factors related to outcome following surgical resection in a single centre by univariate and multivariate analysis. Multivariate analysis identified three independent prognostic factors: macroscopic residual tumor, macroscopic venous invasion, and multiple tumors. In other studies, several factors were reported to influence prognosis, including tumor rupture, satellite lesions, high serum AFP level, intraoperative blood loss, vascular invasion, and cirrhosis [5, 15, 16]. These previous studies identified various factors related to recurrence or survival after resection, but the selection criteria for hepatic resection in patients with large HCC are still controversial.

By multivariate analysis, the present study identified non-curative resection and high serum AFP level as independent and significant factors of poor overall survival after hepatic resection in patients with large HCC (>10 cm in diameter). Noguchi et al. [13] also reported that curative resection was one of the prognostic factors: the 1-year survival rate of patients with non-curatively resected HCC larger than 10 cm in diameter was 27.3%, significantly lower than that of patients with curatively resected HCC. As explained above, our study identified serum AFP level as a significant and independent factor, in agreement with other reports [27–29]. In fact, AFP was reported to be a marker of a biologically more aggressive phenotype [19]. Thus, serum AFP level seems to be an independent marker of prognosis in patients with large HCC.

The AJCC/UICC TNM classification has been widely accepted as the standard classification and allows evaluation of prognosis of patients with malignant diseases [30, 31]. In the AJCC/UICC TNM classification for HCC, three factors: size of tumor, presence of vascular invasion, and distribution of tumor are the basis of the T classification. However, these three factors are not equal. For example, a solitary tumor lacking vascular invasion is classified by the AJCC/UICC TNM as T1 irrespective of the size of tumor. In our series, nine patients were classified as stage I, and 31 patients were stage III, and none of the patients was classified as stage II. Thus, in the AJCC/UICC TNM classification, patients with HCC larger than 10 cm are distributed unevenly.

In the LCSGJ TNM classification, the tumor stage is defined by the extent of tumor alone, similar to the AJCC/UICC TNM classification. The T factor of the LCSGJ TNM classification consists of three secondary factors: number of tumors, diameter (over 2 cm or not) and macroscopic vascular involvement. Thus, the cutoff diameter in this classification is 2 cm, suggesting that the LCSGJ TNM classification is not suitable for classification of HCC larger than 10 cm in diameter.

Recent studies have indicated that the prognosis of patients with HCC depends not only on the extent of tumor but also on the functional state of the non-cancerous liver [32, 33] or the level of tumor markers such as AFP. Thus, including non-tumor factors in a staging system for HCC is logical, since it seems to allow more accurate prognostication. The JIS scoring system was established by taking liver function into consideration, in addition to the criteria of the LCSGJ TNM classification [18]. The JIS score is simply the sum of the points assigned to the LCSGJ stage and to the Child–Pugh stage. However, the liver function of patients with large HCC who undergo surgical resection is relatively preserved and most such patients are classified as Child–Pugh stage A. Consequently, the patient distribution patterns according to the JIS score and the LCSGJ TNM classification were similar. In addition, no tumor marker such as AFP is considered in the JIS score. Hence, the JIS score and the LCSGJ TNM classification do not seem suitable for evaluating the prognosis of patients with large HCC.
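The JIS calculation described above can be sketched as a simple sum. The point values used here (LCSGJ stage I to IV scored 0 to 3, Child–Pugh A to C scored 0 to 2) follow the published JIS definition as we understand it and are an assumption of this sketch, since Table 2 is not reproduced in the text; the function and dictionary names are hypothetical.

```python
# Assumed point assignments (not reproduced from Table 2).
LCSGJ_POINTS = {"I": 0, "II": 1, "III": 2, "IVA": 3, "IVB": 3}
CHILD_PUGH_POINTS = {"A": 0, "B": 1, "C": 2}


def jis_score(lcsgj_stage, child_pugh):
    """JIS score = points for LCSGJ TNM stage + points for Child-Pugh stage."""
    return LCSGJ_POINTS[lcsgj_stage] + CHILD_PUGH_POINTS[child_pugh]


# With Child-Pugh A contributing 0 points, the JIS score mirrors the LCSGJ stage.
jis_for_child_a = {stage: jis_score(stage, "A") for stage in ("II", "III", "IVA")}
```

Under these assumed points, a cohort consisting mostly of Child–Pugh A patients maps LCSGJ stages II, III and IVA onto JIS scores 1, 2 and 3, which is consistent with the similar distributions noted above.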

The CLIP scoring system was established by the Cancer of the Liver Italian Program investigators and consists of four independent predictive factors: Child–Pugh stage, distribution of tumor, AFP level and portal vein thrombosis [17]. In the present study, patients were evenly distributed across the CLIP scores, and the discriminatory ability of the CLIP scoring system for death at 1, 3 and 5 years was the best among the four staging systems tested. These results suggest that the CLIP scoring system is the most suitable for classifying patients with HCC larger than 10 cm with regard to the prediction of prognosis after surgery. In fact, the survival rates of patients with low CLIP scores (score = 0 or 1) were significantly more favorable (1-year survival rate: CLIP score = 0, 100%; CLIP score = 1, 75%). On the other hand, patients with large HCC and a CLIP score ≥2 are less likely to have favorable survival if treated with surgical resection alone; thus these patients seem to need other adjuvant therapies to improve their prognosis.
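The four CLIP components listed above combine additively. The sketch below uses the point values of the published CLIP definition as we understand it (Child–Pugh A/B/C = 0/1/2; tumor morphology 0 to 2; AFP ≥ 400 ng/mL = 1; portal vein thrombosis = 1); these values are an assumption here, since Table 3 is not reproduced in the text, and the function and category names are hypothetical.

```python
# Assumed point assignments (not reproduced from Table 3).
CHILD_PUGH_POINTS = {"A": 0, "B": 1, "C": 2}
MORPHOLOGY_POINTS = {
    "uninodular_le50": 0,    # single nodule, extension <= 50% of the liver
    "multinodular_le50": 1,  # multiple nodules, extension <= 50%
    "massive_or_gt50": 2,    # massive tumor or extension > 50%
}


def clip_score(child_pugh, morphology, afp_ng_ml, portal_vein_thrombosis):
    """Sum of the four CLIP components; ranges from 0 to 6."""
    points = CHILD_PUGH_POINTS[child_pugh] + MORPHOLOGY_POINTS[morphology]
    points += 1 if afp_ng_ml >= 400 else 0
    points += 1 if portal_vein_thrombosis else 0
    return points


# Hypothetical patient: preserved liver function, single nodule, low AFP.
example = clip_score("A", "uninodular_le50", 120, False)
```

Note that the AFP cutoff assumed for the CLIP component (400 ng/mL) differs from the 1,000 ng/mL cutoff used in the survival analysis of this study.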

In conclusion, the CLIP scoring system seems the most suitable for classifying patients with HCC larger than 10 cm and predicting their prognosis after surgery. Patients with CLIP scores of 0 and 1 are expected to have a satisfactory surgical outcome, while those with scores ≥2 seem to require other adjuvant therapies in addition to surgical resection to improve their prognosis.