Introduction

Perihilar cholangiocarcinoma (PHC), defined as adenocarcinoma of the biliary tract originating between the second-degree bile ducts and the insertion of the cystic duct into the common bile duct [1, 2], accounts for 60–70% of all cholangiocarcinomas [1, 2]. The annual incidence of PHC is 1–2 per 100,000 individuals in the United States [3]. Most patients are ineligible for surgical resection at diagnosis because the tumor is locally unresectable or has metastasized [1, 4, 5]. Surgical resection is the only potentially curative treatment for patients with resectable PHC, resulting in a median overall survival (OS) of around 35–40 months [6,7,8].

Several clinical factors have been associated with long-term survival outcomes following curative surgical resection of PHC, including surgical resection margin, lymph node (LN) status, tumor staging, tumor size, tumor differentiation, perineural invasion and adjuvant chemotherapy [9, 10]. Of note, LN status has been reported to be one of the strongest predictive factors [11, 12]. Although patients with LN metastases (LN +) have worse long-term OS than patients with node-negative (LN −) disease, there is still no consensus on the optimal extent of lymphadenectomy in PHC [13,14,15].

Current practice guidelines from the National Comprehensive Cancer Network (NCCN) note that lymphadenectomy is “reasonable for staging purposes”. However, the reported rate of resection and pathologic evaluation of LNs at the time of surgery for PHC ranges only from 19 to 53% [1, 16, 17]. To be consistent with recommendations for gallbladder cancer, the 7th and 8th editions of the American Joint Committee on Cancer (AJCC) Cancer Staging Manual now recommend resection of 6 or more LNs for intrahepatic bile duct cancers, which is still controversial. Several studies have shown, however, that fewer than 25% of PHC patients had four or more nodes examined histopathologically [18, 19]. Insufficient identification of LN metastasis may lead to understaging of the disease, misclassification of patients as N0, and overestimation of survival expectancy [13]. The objective of the current study was to perform a systematic review and meta-analysis of the available evidence to determine the value and the optimum number of LNs to be removed and examined among patients undergoing surgical resection of PHC.

Patients and methods

A systematic review and meta-analysis of the existing published medical literature was conducted following the Cochrane Collaboration guidelines [20].

Literature search strategy

PubMed (OvidSP), Embase and the Cochrane Library were searched for studies published before July 2020 using the following terms and strategy: (“cholangiocarcinoma” or “perihilar cholangiocarcinoma” or “hilar cholangiocarcinoma” or “bile duct tumor”) AND (“lymph node resection” or “lymph node dissection” or “lymphadenectomy”) AND (“liver resection” or “bile duct resection” or “surgical resection”). The references of the included studies, relevant reviews and meta-analyses were manually screened for other eligible studies. Only studies written in English, regardless of patient population, were included. The study protocol was approved by the Institutional Review Board of Eastern Hepatobiliary Surgery Hospital, and informed consent from the patients was waived. Written consent for the data to be used for clinical research was obtained from the published articles.

Eligibility criteria

The inclusion criteria for the eligible studies were: (1) prospective and retrospective studies that reported patients with perihilar cholangiocarcinoma undergoing surgical resection, as well as the number of retrieved LNs; (2) information on the patient population with perihilar cholangiocarcinoma was provided; (3) information on the utility or prognostic value of LN resection was provided. Studies that met any one of the following criteria were excluded: (1) studies only on patients with intrahepatic cholangiocarcinoma or distal bile duct carcinoma; (2) studies only on patients with gallbladder carcinoma; (3) studies on recurrent perihilar cholangiocarcinoma; (4) abstracts, reviews, case reports, letters to the editor, and articles written in languages other than English.

Data extraction

Two reviewers (L.L. and C.L.) independently performed data extraction and a third author (T.Y.) cross-checked the data. Any disagreement was resolved through discussion. The data extracted included the surname of the first author, year of publication, period of patient inclusion, number of patients, mean age, gender of patients, entity of tumor, surgical procedures, AJCC staging, pT1-2, N0, surgical margin (R0), median survival, 5-year survival, patients with lymphadenectomy, median retrieved LN count, median positive LN count and cut-off value of lymphadenectomy. The hazard ratios (HRs) in the overall survival (OS) curves were extracted to assess prognosis if the data were not provided directly. The methods for data extraction and calculation, especially the data in the Kaplan–Meier curves, were adopted from the methods described in detail by Tierney et al. [21] and Parmar et al. [22]. A calculation spreadsheet in Microsoft Excel was developed to obtain the observed minus expected events (O-E), the variance (V), the HR, the log [hazard ratio], and its standard error (SE) for each of the individual trials.
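
As a rough illustration of the approach described by Tierney et al. and Parmar et al., the sketch below shows how the log hazard ratio and its standard error can be obtained either from O-E and V or from a reported HR with its 95% confidence interval. The function names and the example numbers are hypothetical and are not taken from the authors' spreadsheet.

```python
import math

def log_hr_from_oe_v(o_minus_e, variance):
    """Log hazard ratio and its SE from observed-minus-expected events and V."""
    log_hr = o_minus_e / variance      # ln(HR) = (O - E) / V
    se = 1.0 / math.sqrt(variance)     # SE[ln(HR)] = 1 / sqrt(V)
    return log_hr, se

def log_hr_from_ci(hr, lower, upper, z=1.96):
    """Log hazard ratio and its SE from a reported HR and its 95% CI."""
    log_hr = math.log(hr)
    # The CI is symmetric on the log scale: ln(upper) - ln(lower) = 2 * z * SE
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return log_hr, se

def oe_v_from_log_hr(log_hr, se):
    """Back-calculate V and O - E from ln(HR) and its SE."""
    variance = 1.0 / se ** 2
    return log_hr * variance, variance

# Hypothetical trial: O - E = 10, V = 25
log_hr, se = log_hr_from_oe_v(10, 25)
hazard_ratio = math.exp(log_hr)
```

Both routes yield the same two quantities, ln(HR) and its SE, which are the inputs needed for inverse-variance pooling.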

Quality assessment of the included studies

The modified Newcastle–Ottawa Scale (NOS) was used to assess the quality of the non-randomized studies included in the meta-analysis [23]. The maximum possible score was 9 and the minimum score was zero; a sum score ≥ 6 indicated high quality. The Cochrane methodology was used to assess the ‘risk of bias.’ The Grading of Recommendations Assessment, Development and Evaluation (GRADE) System was used to assess the quality of the evidence and the strength of the recommendations [24].

Data analysis

The total retrieved LN count was reported as median and range, as described in most studies. Non-normally distributed variables were expressed as median and interquartile range (IQR). The Mann–Whitney U test or the one-way analysis of variance test was used for multiple comparisons. Spearman’s rank correlation was computed among continuous variables. Significance was set at a P value of less than 0.05. Statistical analyses were performed with the software package SPSS 24 (SPSS, Inc, Chicago, IL) and GraphPad Prism Software Version 6.0.
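
For readers unfamiliar with rank-based correlation, the following minimal sketch computes Spearman's rho as the Pearson correlation of average ranks. It is not the SPSS or GraphPad procedure used in the study, and the study-level values in the example are invented for illustration only.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Invented example: median retrieved LN count vs. 5-year OS (%) per study
median_ln_count = [3, 5, 7, 12, 3]
five_year_os = [20, 31, 25, 40, 18]
rho = spearman_rho(median_ln_count, five_year_os)
```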

Review Manager (RevMan, the Cochrane Collaboration, Oxford, UK) version 5.3 was used for data pooling. The endpoint of the meta-analysis was overall survival. The effect measures for OS were expressed as HR. The HR and 95% confidence interval (CI) of the outcomes were calculated, depending on the level of heterogeneity, using either the fixed- or the random-effect model. The results of data pooling in the meta-analysis were presented as “forest plots.” Heterogeneity among studies was assessed using the I2 statistic and the chi-square (χ2)-based Q test. A P < 0.1 or I2 > 50% indicated significant heterogeneity [25], in which case the random-effect model was used; otherwise, the fixed-effect model was used. A P < 0.05 in the Z test on pooled data was considered statistically significant.
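
The pooling logic can be sketched as follows. This is a simplified illustration, not RevMan's implementation: it assumes generic inverse-variance weighting with a DerSimonian-Laird estimate of the between-study variance, selects the model on the I2 criterion alone (the P value of the Q test is omitted because it requires a chi-square distribution function), and uses invented inputs.

```python
import math

def pool_log_hrs(log_hrs, ses):
    """Inverse-variance pooling of log hazard ratios with Q, I2 and tau2."""
    w = [1.0 / se ** 2 for se in ses]                       # fixed-effect weights
    fixed = sum(wi * y for wi, y in zip(w, log_hrs)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_hrs))  # Cochran's Q
    df = len(log_hrs) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0     # I2 in percent
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0         # DerSimonian-Laird
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]           # random-effect weights
    random = sum(wi * y for wi, y in zip(w_re, log_hrs)) / sum(w_re)
    return {
        "fixed": fixed, "se_fixed": math.sqrt(1.0 / sum(w)),
        "random": random, "se_random": math.sqrt(1.0 / sum(w_re)),
        "q": q, "i2": i2, "tau2": tau2,
    }

def pooled_hr_with_ci(result, z=1.96):
    """Choose the model by I2 > 50% and return (model, HR, lower, upper)."""
    model = "random" if result["i2"] > 50 else "fixed"
    est, se = result[model], result["se_" + model]
    return model, math.exp(est), math.exp(est - z * se), math.exp(est + z * se)

# Invented log hazard ratios and standard errors for three studies
res = pool_log_hrs([0.60, 0.90, 0.75], [0.15, 0.20, 0.10])
model, hr, lower, upper = pooled_hr_with_ci(res)
```

When I2 is 0, the two estimates coincide; as heterogeneity grows, the random-effect weights become more equal across studies and the confidence interval widens.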

Results

Included studies

Through searches of the PubMed (n = 688), Embase (n = 119) and Cochrane Library (n = 16) databases, 823 articles were identified; 252 duplicate references were excluded. After abstract review, 528 of the 571 original articles were eliminated for failure to meet the inclusion criteria. Of the remaining 43 studies, 15 were excluded after full-text review because of incomplete data or non-English language. Eventually, 28 retrospective cohort studies were included in the systematic review [9, 11, 26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51]. No randomized controlled trial was identified. The searching and screening processes of the medical literature are summarized in Fig. 1.
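
The screening arithmetic can be re-checked in a few lines. The sketch below only recomputes the counts reported in the paragraph above; the variable names are illustrative.

```python
# Records identified per database, as reported in the text
identified = {"PubMed": 688, "Embase": 119, "Cochrane Library": 16}

total_identified = sum(identified.values())        # all records found
after_duplicates = total_identified - 252          # duplicates removed
after_abstract_review = after_duplicates - 528     # excluded on abstract
included = after_abstract_review - 15              # excluded on full text

screening_flow = [total_identified, after_duplicates,
                  after_abstract_review, included]
```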

Fig. 1 PRISMA flow diagram showing selection of articles for review

Characteristics and quality of the included studies

The 28 studies that reported lymphadenectomy or LN status among patients undergoing resection of PHC were published between 2001 and 2019. Among 7748 patients with a median age of 65 (range 53–70), 5775 patients (75%) had perihilar cholangiocarcinoma. The remaining 1973 patients (25%), reported in 8 studies [9, 26, 34, 40, 41, 45, 46, 50], had heterogeneous tumors that included ampullary cancer, gallbladder cancer, distal bile duct cancer (dBDCA), middle bile duct cancer (mBDCA) and extrahepatic bile duct cancer (exHBDCA). In addition, in 13 studies [9, 11, 30, 32, 34, 36, 39, 40, 44,45,46, 49, 51], 382 patients (5%) underwent pancreaticoduodenectomy. None of these studies included patients with distant metastases. The characteristics of all 28 included studies are listed in Table 1. Of these studies, 21 [11, 27, 28, 30,31,32, 34,35,36,37,38, 40,41,42,43,44,45, 47, 48, 50, 51] were of relatively high quality with overall NOS scores ranging from 6 to 8, and 7 [9, 26, 29, 33, 39, 46, 49] were of moderate quality with overall NOS scores ranging from 4 to 5 (Supplement Table 1).

Table 1 Characteristics of the included studies

Clinicopathologic and prognosis characteristics

Most studies [9, 27, 28, 35, 36, 38, 39, 41, 44,45,46,47,48,49,50,51] (16/28) used the 7th edition of the American Joint Committee on Cancer (AJCC) Cancer Staging Manual. T stage (pT1–T2, n = 1600, 42% and pT3–T4, n = 2220, 58%) was reported for 3820 patients (50%) from 17 studies [9, 11, 27, 28, 31, 32, 34, 36, 37, 40, 43,44,45,46,47, 50, 51]. N stage (N0, n = 2653, 58% and N1–2, n = 1938, 42%) was reported for 4591 patients (59%) from 26 studies [9, 11, 27,28,29,30, 32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48, 50, 51]. Meanwhile, the surgical margins (R0, n = 2309, 74% and R1, n = 797, 26%) were reported for 3106 patients (40%) from 21 studies [9, 27,28,29,30, 32, 34,35,36,37,38,39,40,41,42,43,44,45,46,47,48, 51]. Overall median survival was 22 months (range 17–49) among 17 studies [9, 27, 29,30,31,32,33,34,35, 38, 40,41,42, 44, 46, 48, 49], while the median 5-year overall survival was 25% (range 7–52%) among 20 studies [9, 11, 27, 28, 30, 33,34,35,36, 38,39,40, 43,44,45,46,47,48,49, 51]. The baseline oncologic data of the patients are shown in Table 2.

Table 2 Clinicopathologic and prognosis characteristics

The number of retrieved lymph nodes

In the entire cohort, 6649 (86%) of 7748 patients underwent lymphadenectomy. The median number of retrieved LNs, reported in 22 studies (79%), ranged from 1 to 24. The most commonly reported median retrieved LN count was 3, in 6 studies [28, 29, 34, 41, 43, 46] (27%). Only 11 studies [9, 11, 35,36,37,38,39,40, 44, 48, 51] reported a median retrieved LN count of over 6, which was the target number recommended by the 7th AJCC edition. The median number of metastatic LNs, reported in 13 studies [9, 11, 26, 28, 31, 34, 36, 37, 39, 41, 43, 44, 48], ranged from 0 to 4. Most patients underwent regional lymphadenectomy including removal of lymph nodes along the cystic duct, common bile duct, hepatic artery, and portal vein. Only 4 studies [11, 27, 36, 50] reported patients who underwent extended lymphadenectomy, which was defined as lymphadenectomy involving the periaortic, pericaval, superior mesenteric artery, and/or celiac artery LNs. The types of lymphadenectomy and the number of retrieved LNs in the studies are described in Table 3.

Table 3 The number of retrieved lymph nodes

Primary outcomes

The utility value of lymph node dissection

Only one study, by Morine et al., indicated that routine LN dissection did not improve 3-year OS among PHC patients [33]; however, only 22 patients were included in this study. To evaluate whether the number of LNs examined was associated with OS, patients were divided into 3 groups according to the median LN count. Of note, an increase in retrieved LN count was not associated with median survival or 5-year OS (Fig. 2a, b). Meanwhile, an increase in the number of metastatic LNs retrieved at the time of surgery was also not associated with improved median survival or OS (Fig. 2c, d). However, among patients with node-negative disease, several studies reported that an increasing number of LNs retrieved was associated with improved OS [31, 35, 37]. For example, Mao et al. [31] reported that, among node-negative patients, those with a total retrieved lymph node count (TLNC) ≥ 13 had better overall survival than patients with TLNC ≤ 12 (5-year OS: 52.8% vs. 39.7%, respectively, P = 0.001). In contrast, Oshiro et al. [40] reported no significant difference between patients who had fewer than 12 LNs removed and patients who had 12 or more LNs removed (P = 0.484). A possible explanation for the difference in these findings is the increased chance of detecting metastatic LNs when the total retrieved LN count increases. There was no evidence to suggest that an increase in the retrieved LN count was per se associated with a survival benefit in patients with PHC.

Fig. 2 The utility value of lymph node dissection. a, b The median LN count retrieved was divided into 3 groups and compared with the median survival time and 5-year OS. c, d The median LN (+) count retrieved was divided into 2 groups (multiple vs. single) and compared with the median survival time and 5-year OS. P < 0.05 was considered statistically significant

Prognostic value of lymph node status on overall survival

Although LN dissection did not improve OS, metastatic LN status was an important predictor of OS among patients undergoing R0 PHC resection. The pooled HR for OS among the 20 studies that included 4591 patients demonstrated an increased risk of death with metastatic LNs (n = 2653, 58% for LN (−) vs. n = 1938, 42% for LN (+); HR 2.07, 95% CI 1.65–2.59, P < 0.001; I2 = 90%, P < 0.001). Because the heterogeneity test demonstrated significant heterogeneity (I2 = 90%, P < 0.001), the pooled effect was estimated with the random-effect model, as shown in the forest plot (Fig. 3). The funnel plot of OS is depicted in Supplement Fig. 1; no significant asymmetry of the funnel plot was detected (Begg’s and Egger’s P > 0.05). A subgroup analysis using different eligibility criteria was subsequently performed to decrease the heterogeneity. As demonstrated in Fig. 4, there was no significant heterogeneity in the PHC-only subgroup (HR 2.16, 95% CI 1.94–2.41, P < 0.001; I2 = 40%, P = 0.07) and no significant asymmetry of the funnel plot was detected (Begg’s and Egger’s P > 0.05).

Fig. 3 The overall meta-analysis comparing overall survival between patients with negative lymph nodes and patients with positive lymph nodes

Fig. 4 The subgroup meta-analysis of the studies with different eligibility criteria. Latest 5 year: studies published in the latest 5 years; N0 > 50%: studies in which more than 50% of patients had negative node status; R0 > 50%: studies in which more than 50% of patients had a negative resection margin; without PD: studies without patients who underwent pancreatoduodenectomy; Westerner: studies published from Western countries; Asians: studies published from Asian countries; PHC only: studies that included only perihilar cholangiocarcinoma. P < 0.05 was considered statistically significant

The best number of harvested lymph nodes

Inadequate examination of LNs can lead to inaccurate staging and understaging of the cancer. Extended lymphadenectomy was associated with an increased median retrieved LN count, as described in 4 studies [11, 27, 36, 50]. Extended lymphadenectomy was also performed more frequently when pancreaticoduodenectomy was done concurrently (Table 3), with a higher median LN retrieval count (P = 0.025) among these patients. However, no significant difference was reported in the median number of metastatic LNs (P = 0.122) (Fig. 5a, b). To assess the minimum number of LNs required for histological examination in patients with PHC, several cut-off values were created for comparison. The median retrieved LN count was reported in 22 (79%) studies and ranged from 1 to 21 LNs. Two studies were excluded for the following reasons: Schwarz et al. [26] did not describe the numbers of LN-positive and LN-negative patients, and Lee et al. [46] only provided data on node-negative patients (Tables 2 and 3). As a consequence, 20 studies were included for further analyses.

Fig. 5 The number of harvested lymph nodes. a Comparison of the median lymph node count retrieved between regional and extended lymphadenectomy. b Comparison of the ratio of LN (+) between regional and extended lymphadenectomy. c The median LN count retrieved was divided into 5 groups, and the ratio of LN (+) was compared among the 5 groups. d–f Comparison of the ratio of LN (+) at different cut-off values. P < 0.05 was considered statistically significant

The ratio of LN (+) was divided into five groups according to the median number of LNs retrieved (Fig. 5c). The ratio of LN (+) in the group (6–8) was higher than in the group (0–2) or (3–5), although the differences were not significant (P = 0.055 and P = 0.061). Increasing the number of LNs retrieved was not associated with an increase in the LN (+) ratios (P = 0.865 and P = 0.480). As no study included median numbers of LNs of 4 or 6, the cut-off values were set at 3, 5, and 7 in the meta-analysis. At the cut-off value of 3, there was no significant difference (P = 0.056, Fig. 5d). At the cut-off values of 5 and 7, significant differences in the number of positive LNs were noted (P = 0.002 and P < 0.001, respectively, Fig. 5e, f).

Discussion

The current meta-analysis demonstrated that lymphadenectomy itself and increased numbers of LNs harvested were not associated with improved OS among patients who underwent surgical resection of PHC. However, lymphadenectomy was important in that metastatic LN status was an important predictor of long-term survival as patients with PHC and LN ( +) had a significantly worse prognosis compared to those with LN ( − ). Based on the currently available data, the optimal number of LNs required for histological examination was 5. Collectively, data from this systematic review and meta-analysis provide evidence to define the important role of lymphadenectomy for patients with resectable PHC.

Only one study, by Morine et al. [33], reported that routine LN dissection did not improve 3-year OS. However, the number of patients included in this study was only 22. No randomized trials exist to address whether there is a survival benefit of LN dissection in patients with PHC. Controversy exists on the impact of the number of LNs retrieved on long-term survival. Guglielmi et al. [38] reported median survival of PHC patients to be 3.0, 18.5 and 29.0 months for patients with 0, 1–3 and > 3 LNs retrieved, respectively (P < 0.01). Oshiro et al. [40] demonstrated that there was no significant difference between patients who had fewer than 12 LNs removed and patients who had 12 or more LNs harvested (P = 0.484). In this meta-analysis, the median survival and the 5-year overall survival of patients with PHC were divided into three groups according to the median number of LNs retrieved. The results demonstrated that increasing the retrieved LN count was not associated with a survival benefit. Furthermore, some studies reported that for LN (+) patients, increasing the number of LN (+) removed at the time of surgery did not improve OS, while other studies reported that increasing the number of LNs removed at the time of surgery improved OS among patients with negative LNs [31, 35, 37]. One possible explanation may be that increasing the number of LNs retrieved increased the detection of patients with LN (+) disease. Currently, there is not enough evidence to suggest that lymphadenectomy or increasing the number of LNs retrieved results in a survival benefit for patients with PHC.

Although LN dissection did not improve the overall survival of patients with PHC, LN status was demonstrated to be an important predictor of OS after surgical resection for patients with PHC. In a pooled OS analysis calculated based on 20 studies with 4591 patients (n = 2653, 58% for negative LN vs. n = 1938, 42% for positive LN), the pooled estimate for the hazard of death was over two-fold higher among patients with LN metastasis.

Furthermore, although LN count had no association with long-term survival, inadequate examination of LNs led to insufficient nodal assessment with an increased false N0 rate due to missed metastatic nodes. In contrast, a higher number of LNs retrieved and examined led to a higher detection rate for patients with LN (+) and a lower detection rate for patients with N0 status. This phenomenon has been observed by several authors [52, 53]. In this systematic review, extended lymphadenectomy was used in 4 studies [11, 27, 36, 50]. The median numbers of LNs retrieved in these studies were significantly increased. However, a larger number of LNs removed does not necessarily correlate with a higher ratio of LN (+). In some studies that involved additional pancreaticoduodenectomy, although a higher median count of LNs was retrieved (P = 0.025), no significant difference was found in the median detection rate of metastatic LNs (P = 0.122).

Thus, to assess the minimum requirement for histological examination of LNs in PHC, several cut-off values were created for comparison. We divided the number of harvested LNs into five groups according to the median number; the ratio of LN (+) in group (6–8) was higher than in group (0–2) or (3–5), although this difference was not statistically significant (P = 0.055 and P = 0.061). Meanwhile, increasing the retrieved LN count did not increase the ratio of LN (+) (P = 0.865 and P = 0.480) when the cut-off values were 3 or 8. Given that there were significant differences when the cut-off values were 5 (P = 0.002) and 7 (P < 0.001), a cut-off value of 5 was chosen as the target LN count at the time of PHC surgery.

There are several limitations of the current study. Data from surgical resections performed in highly specialized HPB surgical centers were included in the meta-analysis. As most of the studies were retrospective and non-randomized, there was a high chance of selection bias. Furthermore, the method of histological examination of dissected lymph nodes likely differed between the institutions where the nodes were collected. In addition, data at the individual patient level were not available. All analyses were also based on the reported median lymph node count, as this was the only feasible strategy in a systematic review. Extended lymphadenectomy was associated with an increased retrieved LN count; HPD patients should therefore be strictly excluded. In addition, adjuvant chemotherapy may affect the overall survival of patients with PHC. However, most of the included studies that reported patients receiving adjuvant chemotherapy showed that adjuvant chemotherapy was not an independent risk factor for overall survival, and the remaining studies did not explicitly describe whether the included patients had received adjuvant chemotherapy. Thus, the effect of adjuvant chemotherapy could not be evaluated in the present study. Adjuvant chemotherapy regimens and the patient subgroups most suitable for them still need further study. Finally, some studies were of low quality with an insufficient number of patients, and some even included tumor types other than PHC.

In conclusion, this systematic review and meta-analysis demonstrated that increasing the retrieved LN count or performing extended LN dissection was not associated with a survival benefit among patients who underwent curative surgical resection for PHC. LN status, however, was prognostically important, as patients with LN metastasis had a markedly worse long-term prognosis. Retrieval of 5 LNs at the time of surgery was associated with the identification of more LN-positive patients and, therefore, should be used as the goal cut-off value to avoid understaging patients with PHC.