Introduction

Intrahepatic cholangiocarcinoma (ICC) is the second most common primary liver cancer worldwide, accounting for 10–15% of all primary liver malignancies with an increasing incidence and mortality.1,2 While surgical resection remains the only potentially curative therapy, reported 5-year overall survival (OS) after hepatectomy ranges widely from 10 to 80%.3,4,5 In part, as a result of these disparate reported outcomes, the ability to predict accurately patient prognosis after ICC resection remains poor.6,7 Recently, Buettner et al. compared the ability of multiple proposed nomograms8,9 and staging systems (Okabayashi 2001, American Joint Committee on Cancer [AJCC] 2017, the Liver Cancer Study Group of Japan [LCSGJ] 2016, the Society of Hepatobiliary Surgery Japan [SHPBSJ] 2014, and Nathan 2009)10,11,12,13 to predict long-term survival outcomes after resection of ICC.14 Most staging systems performed poorly, with the AJCC staging system having a c-index to predict survival of only 0.57.15 To date, the nomogram proposed by Wang et al. has had the best discriminatory prognostic ability with a modest c-index of 0.668 and 0.607 to predict overall and disease-free survival (DFS), respectively.14 Jeong et al. proposed a nomogram to predict recurrence after ICC resection based on seropositivity for hepatitis B surface antigen, tumor size > 5 cm, Child-Pugh class B, and lymph node metastasis.15 While the authors reported a c-index of 0.71 in the training dataset, the c-index of the nomogram decreased to 0.65 when applied to the external validation test set.

Accurate estimation of risk of recurrence and patient prognosis remains important to guide perioperative management of ICC patients undergoing resection, as well as counsel patients with ICC. Currently available nomograms and staging systems have demonstrated only a limited ability to predict DFS and OS. The objective of the current study was to develop a nomogram based on an analysis of a large international, multi-institutional dataset that would more accurately predict individual patient risk of recurrence after curative-intent hepatectomy for ICC. In addition, we sought to identify patients for whom the probability of recurrence was underestimated by the nomogram to better understand the limitations of standard clinicopathologic features to accurately risk-stratify prognosis after resection of ICC.

Patients and Methods

Study Population and Data Collection

Patients who underwent curative-intent hepatectomy for ICC at 1 of 14 participating international hepatobiliary centers between 1990 and 2015 were identified. The Institutional Review Board of each participating institution approved the study parameters. Only patients who underwent curative-intent R0 or R1 liver resections for non-metastatic ICC were included. Patients who underwent debulking or macroscopically positive (R2) resection, laparoscopic staging, ablation, or intra-arterial therapy were excluded.

Standard patient demographic and clinicopathologic characteristics were collected, including age, gender, and American Society of Anesthesiologists (ASA) class. Pathologic variables collected included tumor size and focality, presence of underlying cirrhosis, tumor invasion of adjacent organs, liver capsule involvement, margin status, tumor grade, morphological subtype, major vascular/lympho-vascular/perineural invasion, and nodal status. Patients who underwent only clinical and/or radiological lymph nodal evaluation without formal lymphadenectomy were included in the Nx group. Treatment-related variables such as extent of hepatic resection, lymphadenectomy, and receipt of neoadjuvant and adjuvant chemotherapy were also recorded. Tumor stage was categorized according to the eighth edition of the American Joint Committee on Cancer (AJCC).16

Statistical Analysis

Continuous variables were reported as medians with interquartile range (IQR) and categorical variables were reported as totals and frequencies. Comparisons between categorical variables were assessed using the chi-square test or Fisher’s exact test, as appropriate. The primary outcome measure for survival analysis was disease-free survival (DFS), defined as the time interval between the date of surgery and the date of recurrence; DFS was censored at the date of last follow-up for patients who remained disease-free. The secondary outcome measure for survival analysis was overall survival (OS). DFS and OS were estimated by Kaplan-Meier methodology and survival curves were compared using log-rank analysis. Cox proportional hazards regression analysis was used to evaluate any association among variables and survival outcomes, with coefficients reported as hazard ratios (HR) and corresponding 95% confidence intervals (CI). Variables with a p value <0.1 on univariable analysis were included in the final multivariable model; variables that remained statistically significant on multivariable analysis were in turn used to construct the nomogram to predict DFS. The nomogram assigned a score to each variable included in the model and calculated the risk of recurrence at 3 and 5 years associated with the sum of the partial scores calculated for each variable. Multivariable logistic regression analysis was used to identify variables potentially associated with the subset of patients in whom the model significantly underestimated the risk of recurrence. All analyses were performed using STATA version 12.0 (StataCorp LP, College Station, TX, USA) and R software for statistical computation, v. 3.0.2 34, with the additional packages: survival, Hmisc.
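To make the DFS definition and censoring rule above concrete, the following is a minimal sketch of the Kaplan-Meier product-limit estimator in plain Python. It is an illustration only, not the authors' STATA or R code; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed recurrence.

    times  : follow-up in months (to recurrence or to last follow-up)
    events : 1 if recurrence was observed, 0 if censored while disease-free
    """
    recurrences = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = recurrences.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction still disease-free.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Recurrences and censored patients both leave the risk set after t.
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort of 8 patients (months, recurrence indicator).
toy_times = [6, 12, 12, 20, 30, 36, 48, 60]
toy_events = [1, 1, 0, 1, 0, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for month, s in toy_curve:
    print(f"{month:>3} months: DFS = {s:.3f}")
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a step in the curve, which is why the estimate only changes at observed recurrence times.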

Results

Demographics and Clinicopathologic Features

Among 897 patients who underwent liver resection for ICC, most patients were male (n = 519, 57.9%), younger than 65 years old (n = 573, 63.9%), and ASA class 1–2 (n = 566, 63.1%) (Table S1). The majority of patients had a mass-forming ICC subtype (MF-ICC, n = 790, 88.1%), a solitary tumor (n = 749, 83.5%), tumor size > 5 cm (n = 535, 59.6%), no liver capsule involvement (n = 752, 83.8%), and no invasion of adjacent organs (n = 788, 94.7%). While 362 (43.5%) patients underwent segmentectomy or sectorectomy, 305 (36.6%) patients underwent hemi-hepatectomy and 166 (19.9%) patients underwent extended right or left hepatectomy. On final pathology, 105 patients (11.8%) had R1 microscopically positive surgical margins, 150 (16.7%) had poorly differentiated/undifferentiated ICC, 274 (30.8%) had lympho-vascular invasion (LVI), and 163 (18.5%) had perineural invasion (PNI). Lymphadenectomy was not performed in 503 patients (56.0%); among the 394 (44.0%) patients who had at least one lymph node examined, 232 (25.9%) patients had no lymph node metastasis whereas 162 (18.1%) had nodal metastases.

Univariable and Multivariable Disease-Free Survival Analyses

During a median follow-up of 29.7 months (IQR, 14.6–47.8), median DFS was 38.9 months, while 3- and 5-year DFS were 52.5% (95% CI 48.9–56.0) and 38.9% (95% CI 34.9–42.9), respectively. Variables associated with DFS on univariable analysis included age, ASA class, morphologic type, extent of resection, liver capsule involvement, invasion of adjacent organs, tumor size, number of lesions, grade of tumor differentiation, major vascular invasion, LVI, PNI, and lymph node metastasis (Table S1). On multivariable Cox regression analysis, tumor size > 5 cm (HR 1.98, 95% CI 1.44–2.13; p < 0.001), multifocal ICC (HR 1.64, 95% CI 1.32–2.03; p < 0.001), lymph node metastasis (HR 1.63, 95% CI 1.25–2.11; p < 0.001), poorly differentiated tumor grade (HR 1.50, 95% CI 1.21–1.89; p < 0.001), and periductal infiltrating type (PI) morphology (HR 1.42, 95% CI 1.09–1.83; p = 0.008) were independent adverse risk factors associated with decreased DFS (Table 1). Variables significant on multivariable analysis were included in the nomogram to predict DFS (Fig. 1). The Harrell’s c-index for the nomogram was 0.633 (with n = 5000 bootstrapping resamples) and plots comparing predicted and actuarial DFS demonstrated a good calibration of the model at 1, 2, 3, and 5 years (Supplemental Fig. 1a–d).
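For readers unfamiliar with the discrimination measure reported above, the following is a minimal sketch of Harrell's c-index for right-censored data in plain Python. It is not the authors' implementation (the paper reports using R with bootstrap resampling, which this sketch omits); the function name and toy data are hypothetical.

```python
def harrell_c_index(times, events, risk):
    """Harrell's concordance index for right-censored outcomes.

    A pair (i, j) is usable when patient i has the shorter time and an
    observed event. The pair is concordant when patient i also has the
    higher predicted risk; ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data: months to recurrence/censoring, event flag, risk score.
toy_times = [5, 10, 15, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [3.0, 1.0, 2.0, 2.0, 0.5]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"c-index = {toy_c:.3f}")
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which places the reported 0.633 closer to chance than to the 0.7 threshold discussed later in the paper.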

Table 1 Multivariate Cox proportional hazards regression analysis of risk factors associated with disease-free survival
Fig. 1 Nomogram for prediction of 3- and 5-year disease-free survival after resection of intrahepatic cholangiocarcinoma
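As a rough illustration of how partial scores in a nomogram can be derived, the sketch below converts the hazard ratios reported in the multivariable DFS model into points by a common convention: each point value is proportional to the log hazard ratio, scaled so that the strongest factor receives 100 points. This scaling is an assumption made for illustration; the actual point axes in Fig. 1 may differ, and the dictionary keys are hypothetical labels.

```python
import math

# Hazard ratios as reported in the text for the multivariable DFS model.
HAZARD_RATIOS = {
    "tumor_size_gt_5cm": 1.98,
    "multifocal": 1.64,
    "lymph_node_metastasis": 1.63,
    "poorly_differentiated": 1.50,
    "periductal_infiltrating": 1.42,
}


def nomogram_points(hazard_ratios, max_points=100.0):
    """Scale log hazard ratios so the largest coefficient maps to max_points."""
    betas = {name: math.log(hr) for name, hr in hazard_ratios.items()}
    largest = max(betas.values())
    return {name: max_points * beta / largest for name, beta in betas.items()}


def total_score(points, patient):
    """Sum the partial scores of the risk factors present in one patient."""
    return sum(points[name] for name, present in patient.items() if present)


POINTS = nomogram_points(HAZARD_RATIOS)

# Hypothetical patient: large tumor with nodal metastasis, no other factors.
example_patient = {
    "tumor_size_gt_5cm": True,
    "multifocal": False,
    "lymph_node_metastasis": True,
    "poorly_differentiated": False,
    "periductal_infiltrating": False,
}
example_score = total_score(POINTS, example_patient)

for name, pts in POINTS.items():
    print(f"{name:<26} {pts:6.1f}")
print(f"example total score: {example_score:.1f}")
```

Mapping a total score to a 3- or 5-year recurrence probability additionally requires the baseline survival function of the fitted Cox model, which is not reported in the text and is therefore left out of this sketch.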

Comparison Between Predicted and Actuarial Disease-Free Survival

Among the 548 (61.1%) patients who developed recurrent disease, 354 (64.6%) patients developed recurrence within the first 12 months after resection, 119 (21.7%) recurred between 12 and 24 months, 46 (8.4%) recurred between 24 and 36 months, and 29 (5.3%) patients recurred > 36 months after surgery. Using predictions based on the nomogram, differences in predicted versus actuarial DFS were calculated. Among patients without a recurrence, 136 (15.2%) patients were excluded because of insufficient follow-up data to rule out a recurrence > 6 months earlier than the nomogram-based estimated DFS. In turn, 761 patients were available for additional analyses to compare predicted versus actuarial DFS (Fig. 2). Among these 761 patients, a subset of patients had a DFS worse than predicted. Specifically, 282 patients (37.1% of 761) recurred > 6 months earlier than the DFS predicted by the nomogram (ΔPredicted DFS − Actuarial DFS > 6 months). In contrast, 266 patients (34.9% of 761) recurred within 6 months of the nomogram predicted DFS (ΔPredicted DFS − Actuarial DFS ≤ 6 months); 213 (28.0%) patients who did not recur had a follow-up time period long enough to exclude a recurrence > 6 months earlier than the estimated DFS (ΔPredicted DFS − Follow-up without recurrence ≤ 6 months).
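The grouping rule described above can be sketched as a small function. This is an illustrative reading of the text, not the authors' code: the function name and group labels are hypothetical, and how a nomogram score was converted into a predicted DFS in months is not stated in the text, so that value is taken as a given input. The second part simply reconciles the reported group counts.

```python
def classify_patient(predicted_dfs, observed_months, recurred, tolerance=6.0):
    """Assign a patient to a comparison group.

    predicted_dfs   : nomogram-based estimated DFS in months (assumed input)
    observed_months : months to recurrence, or to last follow-up if none
    recurred        : True if recurrence was observed
    """
    delta = predicted_dfs - observed_months
    if recurred:
        if delta > tolerance:
            return "dfs_worse_than_predicted"
        return "dfs_similar_to_predicted"
    if delta <= tolerance:
        # Follow-up is long enough to rule out an early recurrence.
        return "dfs_similar_to_predicted"
    return "excluded_insufficient_follow_up"


# Counts as reported in the text.
worse = 282
similar_recurred = 266
similar_no_recurrence = 213
excluded = 136

analyzed = worse + similar_recurred + similar_no_recurrence
print(analyzed, analyzed + excluded)
print(round(100 * worse / analyzed, 1))
print(round(100 * worse / (analyzed + excluded), 1))
```

The last two lines show why the same 282 patients appear as 37.1% in this section (denominator 761) and as 31.4% in the Discussion (denominator 897).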

Fig. 2 Stratification of patients by nomogram predicted versus actuarial disease-free survival

Univariable and multivariable analyses were performed to examine the association of clinicopathologic variables among patients who had a DFS worse than or similar to the DFS predicted by the nomogram. Of note, the incidence of clinicopathologic characteristics historically regarded as “favorable” (e.g., smaller lesion size, unifocal tumors, no lymph node metastasis) was higher among patients who had a DFS worse than predicted (Table S2). For example, among patients who had a DFS worse than predicted, 48.9% (n = 138) had an ICC tumor ≤ 5 cm compared with only 25.5% (n = 122) of patients who had DFS similar to the DFS predicted by the nomogram (p < 0.001). Moreover, 94.7% (n = 267) of patients in the DFS worse than predicted cohort had unifocal disease versus 73.9% (n = 354) of patients who had a DFS similar to the DFS predicted by the nomogram (p < 0.001). The incidence of well-to-moderately differentiated tumors was also greater among patients with a DFS worse than predicted (89.7 vs. 77.2%, p < 0.001), as was the incidence of MF ICC morphology (91.5 vs. 84.1%; p = 0.004). In addition, the incidence of lymph node metastasis was lower among patients who had a DFS worse than predicted (11.4 vs. 24.0%, p < 0.001).
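The counts quoted above are enough to compute unadjusted odds ratios, which can help in reading the adjusted estimates. The sketch below uses a standard 2×2 odds ratio with a Wald 95% confidence interval. Group sizes of 282 (DFS worse than predicted) and 479 (DFS similar to predicted) are inferred from the counts and percentages in the text and are an assumption of this sketch; the resulting crude values are not expected to match adjusted estimates from a multivariable logistic model.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio with a Wald confidence interval.

    a, b : patients with / without the feature in the first group
    c, d : patients with / without the feature in the second group
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


N_WORSE = 282    # inferred: DFS worse than predicted
N_SIMILAR = 479  # inferred: DFS similar to predicted (266 + 213)

# Tumor size <= 5 cm: 138 of 282 versus 122 of 479.
or_size = odds_ratio(138, N_WORSE - 138, 122, N_SIMILAR - 122)

# Unifocal disease: 267 of 282 versus 354 of 479.
or_unifocal = odds_ratio(267, N_WORSE - 267, 354, N_SIMILAR - 354)

for label, (est, lo, hi) in [("size <= 5 cm", or_size), ("unifocal", or_unifocal)]:
    print(f"{label:<13} OR {est:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```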

On multivariable analysis, the two “favorable” factors that were associated with the greatest risk of “overestimation” of prognosis were tumor size < 5 cm (OR 2.99, 95% CI 2.13–8.48, p < 0.001) and solitary ICC disease (OR 5.31, 95% CI 2.98–8.48, p < 0.001) (Table 2). Other factors associated with overestimation of a favorable prognosis (i.e., DFS worse than predicted) included well-to-moderate tumor grade (OR 2.33, 95% CI 1.44–3.78, p = 0.001), negative nodal status (OR 2.34, 95% CI 1.37–4.01, p = 0.002), lymph nodes not examined (not examined vs. nodal metastasis: OR 2.11, 95% CI 1.29–3.45, p = 0.003), and MF ICC subtype (OR 1.97, 95% CI 1.14–3.41, p = 0.016).

Table 2 Multivariate regression analysis of risk factors associated with underestimation of recurrence risk (DFS worse than predicted)

Heterogeneity in Patient Prognosis Independent of Disease Stage

Among the entire cohort of 897 patients, median OS was 37.8 months while 3- and 5-year OS were 52.0 and 38.5%, respectively. On Kaplan-Meier survival analysis, patients within the same T-stage still demonstrated markedly different risks of death, with worse outcomes among patients who had a DFS worse than predicted by the nomogram (Fig. 3). For example, among patients with stage T1a/T1b tumors, patients who had a DFS worse than predicted by the nomogram had a 5-year OS of 24.1% (95% CI, 16.3–32.7) versus 63.7% (95% CI, 54.4–71.6) for patients whose DFS was accurately predicted by the nomogram (p < 0.001) (Supplemental Fig. 2a). Similarly, among patients with more advanced stage T2–T4 tumors, patients who had a DFS worse than predicted had a 5-year OS of 12.7% (95% CI, 5.9–22.1) versus 39.7% (95% CI, 32.4–46.9) for patients who had a DFS similar to predicted (p < 0.001) (Supplemental Fig. 2b).

Fig. 3 Forest plot comparing 5-year overall survival analysis of patients who had a DFS worse than predicted versus patients who had a DFS similar to predicted stratified by T1a/T1b stage versus T2–T4 stage and patients with “favorable” versus “unfavorable” pathology characteristics

Patients were further divided into two subgroups based on clinical characteristics. One subset of patients (n = 138, 18.1%) was classified into a “favorable pathology” cohort, which consisted of patients who presented with clinicopathologic features historically associated with a favorable prognosis (i.e., unifocal disease, MF ICC morphology, no lymph node metastasis, tumor size < 5 cm, and well-to-moderately differentiated tumor grade). The second subset of patients (n = 623, 81.9%) consisted of individuals with historically “unfavorable pathology” (i.e., multifocal disease, PI ICC subtype, lymph node metastasis, tumor size > 5 cm, and poor-to-undifferentiated tumor grade). Of note, while the majority of patients who had traditional “unfavorable pathology” had a DFS similar to that predicted by the nomogram (n = 426, 68.4%), most patients who had “favorable pathology” had a DFS worse than would have been predicted (n = 85, 61.6%). Furthermore, although patients who had “favorable pathology” had homogeneous characteristics, those who had a DFS worse than predicted had a 5-year OS of 39.7% (95% CI, 27.9–51.2) versus 97.0% (95% CI, 80.9–99.6) for those who had a DFS similar to that predicted by the nomogram (p < 0.001; Supplemental Fig. 2c). Among patients who had unfavorable pathology, the 5-year OS for patients with a DFS worse than predicted was 6.7% (95% CI, 2.3–14.6) versus 40.7% (95% CI, 34.3–46.9) for patients with a DFS similar to that predicted by the nomogram (p < 0.001; Supplemental Fig. 2d).

Of note, among patients with ICC > 5 cm, the 5-year OS of patients in the cohort that had a DFS similar to that predicted by the nomogram was 39.2% compared with 10.5% for patients who had a DFS worse than predicted (p < 0.001). Among patients with lymph node metastasis (n = 147), 5-year OS was 31.8% among patients who had a DFS similar to that predicted by the nomogram versus 17.8% for patients with a DFS worse than predicted (p < 0.008) (Table S3). On multivariate analysis, PI morphology, tumor size > 5 cm, multifocal ICC, poorly-to-undifferentiated tumor grade, lymph node metastasis, and DFS worse than predicted were independent risk factors associated with decreased OS (Table 3). The c-index of the final multivariable model for OS incorporating underestimation of recurrence as represented by the variable “DFS worse than predicted” was 0.762.

Table 3 Multivariate Cox proportional hazards regression analysis of risk factors associated with overall survival

Discussion

A fundamental prerequisite for effective postoperative management and counseling of patients after curative-intent surgery for malignancy is accurate evaluation and estimation of patient prognosis.17 For most cancers, the ability to assess disease stage and predict long-term outcome has traditionally been based on the internationally recognized AJCC staging systems.18,19,20 Recently, nomograms or clinical scoring systems have been proposed as prognostic tools that may be better able to estimate individual risk of recurrence and death using specific clinicopathologic or molecular features of a patient’s tumor.21,22,23,24 For cholangiocarcinoma, however, these tools have not been effective in clinical practice, and the ability to accurately predict patient prognosis after liver resection for ICC has remained limited.14 In a recent analysis that compared the ability of multiple proposed ICC nomograms and staging systems to predict both OS and DFS, none of the prognostic tools demonstrated a c-index ≥ 0.7, which is considered the threshold value for good discriminative ability.14 Several authors have hypothesized that the suboptimal results of these prognostic tools have been due to the need to include other yet-to-be identified prognostic variables.8,9,14 To examine this hypothesis, we sought to develop a nomogram for DFS from a large multi-institutional cohort of patients undergoing curative-intent liver resection for ICC. More importantly, we specifically defined and characterized the subset of patients in whom the risk of recurrence was markedly underestimated by the nomogram tool. In turn, we were able to quantify and define the limitation of standard clinicopathologic features to accurately risk-stratify patient prognosis after resection of ICC.

The proposed nomogram to predict risk of recurrence after ICC resection incorporated standard variables including tumor size and focality, morphologic type, lymph node status, and grade of differentiation, which have been proposed by previous investigators.25 In particular, one or more of these variables have been associated with prognosis in each of the other seven previously proposed prognostic clinical tools for ICC. In the present model, lymph node metastasis was associated with increased risk of recurrence (HR 1.63), similar to the hazard ratio of 1.70 estimated by Wang et al.8 In the present study, multifocal ICC was associated with an increased risk of recurrence (HR 1.64); the number of ICC lesions has also been included in each of the seven previously proposed clinical tools, with the additional risk attributed to multifocal disease ranging from an HR of 1.58 in Hyder et al. to 4.60 in Okabayashi et al.9,12 In addition, morphologic type and tumor size have been incorporated into several prior clinical tools, confirming their suggested association with patient prognosis.14 In the current analysis, grade of tumor differentiation demonstrated a strong independent effect on patient prognosis, with a 50% increased risk of recurrence for patients with poorly/undifferentiated ICC compared with patients with well-to-moderately differentiated ICC. A recent prognostic scoring system for ICC developed by Raoof et al. reported that high grade/poorly differentiated tumors were associated with a nearly twofold increased hazard of death.26 Similar to previously proposed nomograms and staging systems, the current nomogram demonstrated only a modest ability to predict DFS (c-index = 0.633). 
Of note, even though the nomogram in the current study was based on one of the largest multi-institutional international cohorts of patients who underwent curative-intent liver resection for ICC (n = 897), the nomogram still failed to perform well and accurately predict risk of recurrence for many patients with ICC. To better define this result, several models were tested, including continuous, categorical, and transformed versions of the variables selected to develop the nomogram. In particular, even when tumor size (continuous value: c-index 0.623; categorical with cut-offs at 3, 6, and 10 cm: c-index 0.624) and lymph node status (number of positive nodes: c-index 0.636; lymph node ratio [LNR]: c-index 0.643; log odds ratio [LODDS]: c-index 0.641) were modeled in different ways, the ability of the nomogram to predict risk of recurrence did not improve.

Of note, when the DFS predicted by the nomogram was compared with actuarial DFS, nearly one third of patients (n = 282, 31.4%) had their risk of recurrence substantially underestimated. Moreover, the underestimation of a patient’s risk of recurrence was greater among patients with clinicopathologic features traditionally associated with a more favorable prognosis (e.g., unifocal disease, MF ICC morphology, negative lymph nodes, tumor size < 5 cm, and well-to-moderately differentiated tumor grade) and early T-stage (T1a/T1b) disease. To evaluate whether this underestimation of recurrence risk was associated with additional, as yet unknown, factors impacting patient prognosis, survival of patients who had a DFS worse than predicted was compared with that of patients who had a DFS similar to that predicted by the nomogram. Even within the seemingly homogeneous subset of patients with “favorable pathology” characteristics, which encompassed patients with no identifiable adverse pathologic features and early T-stage disease, OS was widely disparate. In fact, patients who had a DFS worse than predicted by the nomogram had a two- to threefold increased risk of death compared with patients who had a DFS similar to that predicted by the nomogram (Fig. 3). This additional risk of death was not predicted by the nomogram and was not associated with tumor size, number of ICC lesions, grade of tumor differentiation, morphologic type, or lymph node status, suggesting that perhaps one or more variables not captured in most nomograms were driving these divergent prognoses.

Other factors beyond the classical tumor features included in most nomograms have recently gained attention and may explain the heterogeneous clinical outcomes observed among patients after resection of ICC. For example, Farshidfar et al. performed an integrative genomic analysis of ICC and described an IDH mutant-enriched subtype of ICC, which might have a strong impact on the prognosis of patients with resected ICC.27 Previously, Zhu et al. identified IDH1 and KRAS as the most commonly mutated genes in ICC and suggested that genetic classification of ICC based on the pattern of gene mutations might have an impact on patients’ prognosis after surgery.28 Additionally, Jusakul et al. described the whole-genome and epigenomic landscapes of etiologically distinct subtypes of cholangiocarcinoma.29 The authors reported four distinct subtypes of ICC with different clinical and genomic characteristics and different associated prognoses.29 An increasing volume of evidence has also emphasized the complex relationship between cancer and the host immune system.30 Several recent studies have contributed to an improved understanding of the immunobiology of ICC and suggested that the host immune response against the tumor may have a significant impact on disease prognosis.31,32,33 ICC demonstrates the ability to modify the local microenvironment, expressing immune-checkpoint proteins, such as cytotoxic T lymphocyte-associated antigen-4 (CTLA-4) and programmed cell death protein-1 (PD-1), in order to halt the host antitumor immune responses.34,35 The neutrophil to lymphocyte ratio (NLR), an estimate of the patient’s immune response to a malignancy, has also been identified as one of the inflammatory parameters potentially associated with prognosis in some solid tumors, including ICC.36,37,38 In a recent meta-analysis, Tan et al. reported that elevated preoperative NLR was associated with an increased risk of death among patients who underwent resection of ICC.36,37,38

Limitations of the current study included its retrospective design and the inclusion of only those variables assessed in the multi-institutional clinical database. However, variables in the database were comparable to those factors included in previous nomograms, which allowed for a direct comparison. While the results suggest that traditional clinicopathologic variables were not able to adequately predict patient prognosis after surgery, it was not possible in a retrospective fashion to examine patterns of gene mutations or expression of immune-checkpoint biomarkers. Future studies will need to focus on the role of these factors to explain why a substantial number of patients with otherwise favorable clinicopathologic features and early stage disease had a poor prognosis and a DFS worse than predicted. Moreover, the role of adjuvant chemotherapy in patients undergoing surgery for ICC is still unclear.39,40,41 Recently, our group assessed the impact of adjuvant chemotherapy on survival of patients with ICC, using a cohort of patients similar to that of the present study.42 While adjuvant chemotherapy did not influence the prognosis of all ICC patients following surgical resection, it was associated with a potential survival benefit in subgroups of patients at increased risk for recurrence, such as those with advanced tumors. Moreover, the current National Comprehensive Cancer Network (NCCN) guidelines only recommend adjuvant chemotherapy following resection of ICC in the setting of nodal metastasis (N1)/positive surgical margins (R1).43 A sub-analysis was performed to reduce the possible risk of bias due to the effect of adjuvant chemotherapy in specific subsets of patients. In particular, among patients with “favorable” characteristics (e.g., smaller lesion size, unifocal tumors, no lymph node metastasis), receipt of adjuvant chemotherapy was not associated with prognosis (p = 0.46).
Interestingly, among the subset of patients with “favorable” characteristics and a DFS worse than predicted (ΔPredicted DFS − Actuarial DFS > 6 months), patients who received adjuvant chemotherapy tended to have a prolonged OS compared with those who did not receive adjuvant chemotherapy, even though this association was not statistically significant (5-year OS: chemotherapy 47.8% [95% CI, 22.5–69.7] vs. no chemotherapy 37.8% [95% CI, 24.5–50.9]; p = 0.29). Finally, given the multi-center and retrospective nature of the study, it was not possible to standardize the operative approach including performance and extent of lymphadenectomy.

In conclusion, a nomogram based on standard clinicopathologic characteristics was suboptimal in its ability to accurately predict risk of recurrence among patients with ICC after curative-intent liver resection. In particular, the risk of underestimating patient risk of recurrence was highest among patients with historically favorable characteristics. Specifically, over one third of patients recurred > 6 months earlier than the DFS predicted by the nomogram. Undefined variables may be important drivers of recurrence and survival, the exclusion of which limits the discriminatory ability of current stratification tools. Further research to investigate the potential impact of factors such as ICC genetic subtypes and immune-checkpoint biomarkers is needed to improve the ability to provide accurate prognostic information for patients after resection of ICC.