Introduction

Despite improvements in perioperative management and a decrease in postoperative mortality, pancreaticoduodenectomy (PD) is still associated with significant postoperative morbidity, mainly due to postoperative pancreatic fistula (POPF). The estimated incidence of POPF after PD is around 20% in high-volume centers, and POPF is associated with increased postoperative morbidity, mortality, length of hospital stay, and readmission rates.1 Thus, accurate prediction of POPF after PD remains a major concern.

Although certain POPF risk factors have been identified (such as soft pancreatic tissue and small main pancreatic duct (MPD) diameter), accurate preoperative prediction of POPF remains difficult. In recent years, several scores have been developed to predict the occurrence of POPF.2,3,4,5,6,7,8,9,10 The most frequently used scores are the validated fistula risk score (FRS),5 the NSQIP-modified FRS (mFRS),9 and more recently, the alternative fistula risk score (aFRS).10 However, perioperative management has evolved in this population, with changing patient characteristics, an updated definition of POPF by the International Study Group on Pancreatic Surgery (ISGPS),11 and the introduction of neoadjuvant treatment. Regarding the latter, preoperative chemotherapy is now routinely used, and some recent studies have also advocated the use of radiation therapy.12,13,14 Thus, these scores need to be constantly adapted to clinical practice.

The aim of this study was to develop and validate an accurate predictive score based on datasets from two high-volume centers, in order to optimize individual treatment decisions.

Methods

Study Population and Model Design

All patients who underwent PD at Beaujon (BJN) University Hospital from 2012 to 2017 were identified, and their data were retrospectively extracted from a prospective database. This first cohort (BJN training dataset) was used to develop the predictive model. A second cohort including patients who underwent PD at the Institute Paoli Calmettes (IPC) during the same study period was used for external validation, and data were retrospectively extracted from this separate prospective database (IPC testing dataset). This study was approved by the institutional review board (IRB 12-055) and performed in accordance with the Declaration of Helsinki.

Baseline Characteristics and Intraoperative Course

Usual preoperative demographic characteristics were obtained for all patients, as well as data on preoperative radiation therapy and/or chemotherapy. Preoperative chemotherapy was administered intravenously for 8 to 12 weeks according to guidelines in patients presenting with borderline resectable or locally advanced pancreatic adenocarcinoma, or more recently with resectable tumors if they were included in ongoing clinical trials (NCT02959879).12, 13, 15 Preoperative radiation therapy, usually associated with oral capecitabine (chemoradiation therapy), was given to patients with borderline resectable or locally advanced pancreatic adenocarcinoma according to guidelines16 or ongoing clinical trials (NCT02676349), and only to patients with stable or responsive disease under preoperative intravenous chemotherapy. Radiation (or chemoradiation) therapy was administered over 5 weeks. Resection was decided by multidisciplinary boards according to clinical and biological findings and imaging performed 4 weeks after completion of neoadjuvant treatment.

Patients underwent standard open PD for malignant and benign disease in both cohorts. Surgical reconstruction consisted of duct-to-mucosa pancreaticojejunostomy or pancreaticogastrostomy, depending on the surgeon’s preference. Other intraoperative data included pancreatic stump texture (soft or firm), remaining MPD diameter (measured intraoperatively), operative time, and total intraoperative blood loss. A drain was placed close to the pancreatic anastomosis in all patients.

Postoperative Outcome

The primary outcome was the development of clinically relevant POPF, based on the updated 2016 ISGPS definition.11 Other postoperative data included complications according to the Clavien-Dindo classification17 (severe postoperative complications were defined as Clavien-Dindo ≥ 3), 90-day postoperative mortality, hemorrhage, embolization, percutaneous or endoscopic procedures, delayed gastric emptying, signs of infection, the need for reoperation, intensive care unit requirement, overall in-hospital stay, and 90-day readmission rates.

Statistical Analysis and Predictive Modeling

Categorical variables were presented as absolute numbers (percentages) and quantitative variables as medians (range). The model was developed with internal and external validation based on TRIPOD guidelines18 for multivariable prediction models. Univariate analysis and multivariate logistic regression were performed in the BJN training dataset to develop the predictive model. P < 0.05 was considered to be statistically significant. There were no missing data. The model was identified based on threefold cross-validation with random splits. This procedure was repeated one thousand times to account for the inherent variability in the dataset and to obtain better estimates of the true distribution of the model parameters. The imbalance between the two classes of patients was taken into account. The intercept of the model (i.e., the constant parameter) was then corrected. Variables were selected using the Student t test to evaluate the null hypothesis based on parameter distributions. Selected variables were further cross-validated with L1-penalized logistic regression, i.e., logistic regression with a Laplace prior on the parameter distributions. It is important to note that these variables should not be considered separate predictive factors of the occurrence of clinically significant POPF but as combined factors that together allow the score to identify patients “at risk” or not of developing clinically significant POPF. All analyses were performed with scikit-learn.
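The correspondence between L1-penalized logistic regression and a Laplace prior mentioned above can be written out explicitly. This is the standard textbook relation, not a description of the authors' exact fitting procedure; the symbols $\beta_0$, $\beta_j$, $x_{ij}$, $y_i$, and $\lambda$ are notation introduced here for illustration, and leaving the intercept $\beta_0$ unpenalized is an assumption.

$$ p\left({\beta}_j\right)=\frac{\lambda }{2}\exp \left(-\lambda \left|{\beta}_j\right|\right),\kern1em P\left({y}_i=1\mid {x}_i,\beta \right)=\frac{1}{1+\exp \left(-\left({\beta}_0+\sum_j{\beta}_j{x}_{ij}\right)\right)} $$

$$ \hat{\beta}=\arg \underset{\beta }{\max}\left[\sum_i\log P\left({y}_i\mid {x}_i,\beta \right)+\sum_j\log p\left({\beta}_j\right)\right]=\arg \underset{\beta }{\min}\left[-\sum_i\log P\left({y}_i\mid {x}_i,\beta \right)+\lambda \sum_j\left|{\beta}_j\right|\right] $$

The constant term $\log \left(\lambda /2\right)$ does not depend on $\beta$ and drops out, so maximizing the posterior under a Laplace prior is the same optimization problem as minimizing the L1-penalized negative log-likelihood. The penalty tends to set weakly informative coefficients exactly to zero, which is why it can be used for variable selection.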

The predictive capacity of our model was also compared to that of the original FRS,5 mFRS,9 and aFRS.10 Discriminative performance was evaluated based on the area under the receiver operating characteristic curve (AUC) in all cases. Confusion matrices were also reported. While the AUC provides a general overview of a model’s discriminative performance, the confusion matrices allow the reader to visualize the quality of the model’s predictions for a given condition.
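As an illustration of what the AUC measures, the following minimal sketch computes it directly from its rank interpretation: the probability that a randomly chosen patient with POPF receives a higher score than a randomly chosen patient without POPF, with ties counted as one half. The function name `auc_from_scores` and the toy data are hypothetical and are not taken from the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 outcomes (1 = event, e.g. clinically significant POPF)
    scores: iterable of predicted scores, higher meaning higher predicted risk
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: two patients without and two with the event.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(auc_from_scores(toy_labels, toy_scores))
```

In the toy data, three of the four positive/negative pairs are ordered correctly. The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of a few hundred patients.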

Results

Population Characteristics

Baseline patient characteristics in both cohorts are presented in Table 1. A total of 448 consecutive patients who underwent PD were included in the BJN training dataset, of whom 103 (23.0%) developed clinically significant POPF. There were 251 (56.0%) men; median age was 64 (19–84) years and median body mass index (BMI) was 24.3 (15.4–46.1) kg/m². Two hundred and forty-three (54.2%) patients were treated for ductal adenocarcinoma, and 99 (22.1%) and 65 (14.5%) received preoperative chemotherapy and radiation therapy, respectively. The pancreatic stump texture was soft in 208 (46.4%) patients, and the median MPD diameter was 4.0 (0–27) mm. Pancreaticojejunostomy was performed in most patients (96%), and median intraoperative blood loss was 300 (50–4000) mL.

Table 1 Baseline characteristics and intraoperative course

The IPC testing dataset used for external validation included 213 patients, of whom 26 (12.2%) developed clinically significant POPF. A total of 153 (71.8%) patients were treated for ductal adenocarcinoma, and 62 (29.1%) and 11 (5.2%) received preoperative chemotherapy and radiation therapy, respectively. Compared with patients from the BJN dataset, those from the IPC dataset were slightly older (median age = 65 years, p < 0.001), were more frequently classified as ASA 3 (21.2% vs. 10.3%, p < 0.001) and operated on for ductal adenocarcinoma (71.8% vs. 54.2%, p < 0.001), less frequently received radiation (or chemoradiation) therapy (5.2% vs. 14.5%, p < 0.001), and more frequently underwent pancreaticogastrostomy (8% vs. 4%, p < 0.001). Their pancreas was more frequently soft (61.5% vs. 46.4%, p < 0.001), although the MPD was slightly more dilated (median = 5.0 vs. 4.0 mm, NS).

Postoperative Course

Detailed postoperative course data of patients in both cohorts are presented in Table 2. The incidence of clinically significant POPF (grade B–C) was 23.0% and 12.2% in the training (BJN) and the testing (IPC) cohorts, respectively. Patients in the IPC testing dataset more frequently developed grade C POPF (7.0% vs. 3.1%, p < 0.001), more frequently required admission to the intensive care unit (18.8% vs. 7.6%, p < 0.001), and had a longer median hospital stay (18 vs. 15 days, p < 0.001).

Table 2 Postoperative outcomes

Risk Factors for Clinically Significant POPF

In the BJN training cohort, preoperative and intraoperative characteristics associated with the development of clinically significant POPF in univariate analysis were an increased BMI (17% for BMI < 25 kg/m² vs. 29% for BMI ≥ 25 kg/m², p = 0.005), underlying disease (16% in the case of adenocarcinoma vs. 27% in the case of other etiologies, p = 0.007), the absence of preoperative intravenous chemotherapy (12% vs. 26% in patients with and without preoperative chemotherapy, respectively, p = 0.004), the absence of preoperative radiation or chemoradiation therapy (6% vs. 26% in patients with and without preoperative radiation therapy, respectively, p < 0.001), soft pancreatic stump texture (52% if soft vs. 8% if not, p < 0.001), and small MPD diameter (37% if MPD < 3 mm vs. 18% if MPD ≥ 3 mm, p < 0.001). A firm gland was significantly more often observed in patients who received preoperative radiation therapy (52/65 vs. 188/383, p < 0.001). Total bilirubin, total blood loss, increased operative time, and gender were not associated with increased POPF.
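The comparison of gland firmness by radiation status (52/65 vs. 188/383) can be checked by hand. The sketch below assumes a Pearson chi-square test on the 2×2 table without continuity correction; the text does not state which univariate test was used, so this is an illustration of the arithmetic rather than a reproduction of the authors' analysis. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # For 1 degree of freedom, the upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts taken from the text: firm gland in 52 of 65 irradiated patients
# and in 188 of 383 non-irradiated patients.
firm_rt, total_rt = 52, 65
firm_no_rt, total_no_rt = 188, 383
chi2, p = chi_square_2x2(firm_rt, total_rt - firm_rt,
                         firm_no_rt, total_no_rt - firm_no_rt)
print(f"chi2 = {chi2:.2f}, p = {p:.2e}")
```

Under this assumption the statistic is about 21.4 on one degree of freedom, which corresponds to a p value well below 0.001 and agrees with the reported significance level.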

In multivariate analysis, independent POPF risk factors were increasing age, the absence of preoperative radiation or chemoradiation therapy, a soft pancreatic stump, and a small MPD diameter (Table 3). The distribution of these 4 variables in both the training and testing cohorts is shown in Fig. 1.

Table 3 Perioperative factors associated with increased clinically significant POPF: univariate and multivariate analysis (BJN training dataset)
Fig. 1 Distribution of statistically relevant POPF risk factors in the training (a, BJN) and testing (b, IPC) datasets. POPF independent risk factors in training and testing datasets: “POPF” indicates clinically significant postoperative pancreatic fistula according to the ISGPS 2016 definition. “Texture” indicates pancreatic stump texture. “Radio” indicates preoperative radiation (or chemoradiation) therapy. “MPD” indicates main pancreatic duct diameter (mm). Age is presented in years. “Prob. dist.” indicates probability distribution

Predictive Model and External Validation

The final model included the following four POPF predictors: increasing age (OR = 1.029, 95% CI = 1.015–1.042), preoperative radiation or chemoradiation therapy (OR = 0.328, 95% CI = 0.116–0.787), soft pancreatic stump (OR = 5.367, 95% CI = 3.450–7.810), and increasing MPD diameter (OR = 0.827, 95% CI = 0.734–0.894, Table 3). A probabilistic approach was used to evaluate all four independent POPF risk factors to establish the following predictive score:

$$ P=\frac{1}{1+\exp \left(0.64-0.03\times \mathrm{age}+1.24\times \mathrm{radiotherapy}-1.65\times \mathrm{texture}+0.20\times \mathrm{MPD}\right)} $$
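The score above can be evaluated with a few lines of code. The following is a minimal sketch under stated assumptions: the binary coding (radiotherapy = 1 if preoperative radiation or chemoradiation therapy was given, texture = 1 if the pancreatic stump is soft) is inferred from the direction of the reported odds ratios and is not spelled out in the text; age is taken in years and MPD diameter in millimetres. The function name `popf_score` and the two example patients are hypothetical.

```python
import math


def popf_score(age_years, radiotherapy, soft_texture, mpd_mm):
    """Evaluate the published score P for one patient.

    Assumed coding (not stated explicitly in the text):
      radiotherapy: True/1 if preoperative radiation or chemoradiation was given
      soft_texture: True/1 if the pancreatic stump is soft, False/0 if firm
    """
    linear_term = (
        0.64
        - 0.03 * age_years
        + 1.24 * int(radiotherapy)
        - 1.65 * int(soft_texture)
        + 0.20 * mpd_mm
    )
    return 1.0 / (1.0 + math.exp(linear_term))


# Hypothetical patient 1: 65 years, no radiotherapy, soft gland, 3 mm duct.
print(round(popf_score(65, False, True, 3.0), 3))
# Hypothetical patient 2: 65 years, radiotherapy, firm gland, 6 mm duct.
print(round(popf_score(65, True, False, 6.0), 3))
```

The first patient obtains a score above 0.9 and the second a score below 0.25. Because class imbalance was taken into account during model development, the value of P is better read against the risk groups reported in this study than as a calibrated probability of POPF.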

The discriminative capacity of the model was found to be adequate in the training cohort, with an AUC of 0.79 (0.74–0.84) (Supplementary material 1). The IPC testing dataset was used for external validation, and the discriminative capacity of the model was also found to be adequate, with an AUC of 0.73 (Supplementary material 1).

Risk Groups (Fig. 2)

Patients were divided into four risk groups based on the present predictive score. The risk of developing clinically significant POPF was considered to be negligible when the score was < 0.25 (2% and 0% POPF rates in the training and testing cohorts), low when the score was between 0.25 and 0.5 (4% and 0% POPF rates in the training and testing cohorts), intermediate when the score was between 0.5 and 0.75 (12% and 4% POPF rates in the training and testing cohorts), and high when the score was above 0.75 (42% and 19% POPF rates in the training and testing cohorts). We also analyzed the risk groups in the training (BJN) and testing (IPC) cohorts according to the different scores and observed POPF rates, in non-radiated and radiated patients separately, and found results similar to those in the overall population (Supplementary material 2).
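The four risk groups can be expressed as a simple lookup. In the sketch below, the thresholds and observed POPF rates are taken from the paragraph above, whereas the handling of scores exactly equal to 0.5 (assigned here to the intermediate group) is an assumption, since the text does not say which side of that boundary is inclusive. The names `risk_group` and `OBSERVED_POPF_RATE` are hypothetical.

```python
# Observed clinically significant POPF rates per group as reported in the text:
# (training cohort, testing cohort).
OBSERVED_POPF_RATE = {
    "negligible": (0.02, 0.00),
    "low": (0.04, 0.00),
    "intermediate": (0.12, 0.04),
    "high": (0.42, 0.19),
}


def risk_group(score):
    """Map a score in [0, 1] to one of the four risk groups.

    Boundary handling is an assumption: 0.25 -> low, 0.5 -> intermediate,
    0.75 -> intermediate (the text defines high risk as "above 0.75").
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score < 0.25:
        return "negligible"
    if score < 0.5:
        return "low"
    if score <= 0.75:
        return "intermediate"
    return "high"


for s in (0.10, 0.30, 0.60, 0.90):
    group = risk_group(s)
    train_rate, test_rate = OBSERVED_POPF_RATE[group]
    print(f"score {s:.2f}: {group} (observed POPF {train_rate:.0%} / {test_rate:.0%})")
```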

Fig. 2 Risk groups in the training (BJN) and testing (IPC) cohorts according to the different scores and observed POPF rates. POPF risk groups according to different risk scores: “POPF” indicates clinically significant postoperative pancreatic fistula according to the ISGPS 2016 definition. (a) FRS, (b) mFRS, (c) aFRS, (d) present score

Comparison with Pre-existing Scores

The present risk score was compared to the FRS, mFRS, and aFRS in both the BJN training and IPC testing cohorts (Supplementary material 1). The AUCs of the 4 scores were similar for the training cohort: 0.79 (0.74–0.84) for the present score, 0.73 (0.68–0.78) for FRS, 0.74 (0.68–0.79) for mFRS, and 0.75 (0.70–0.80) for aFRS, as well as for the testing cohort: 0.73 (0.62–0.82) for the present score, 0.76 (0.59–0.81) for FRS, 0.75 (0.62–0.83) for mFRS, and 0.75 (0.62–0.82) for aFRS. The predictive value of each score for the training and testing cohorts was also analyzed and compared through confusion matrices (Fig. 3). The false-negative rates of the present score in the training and testing cohorts were 6% and 0%, respectively. The negative predictive value of the three available scores was between 67 and 91% while the negative predictive value of the present score ranged between 86 and 100%.

Fig. 3 Confusion matrices summarizing predictions of the different scores applied to the (a) training and (b) testing datasets. Confusion matrices: starting from the upper left corner and moving in a clockwise direction, confusion matrices provide true-negative, false-positive, true-positive, and false-negative rates, respectively. POPF, clinically significant postoperative pancreatic fistula according to the ISGPS 2016 definition
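The quantities read from a confusion matrix can be made explicit with a short sketch. The text does not state whether the reported false-negative rates are normalized by all patients or by patients with POPF only, so both variants are computed here alongside the negative predictive value. The function name `confusion_summary` and the counts in the example are hypothetical and are not study data.

```python
def confusion_summary(tn, fp, fn, tp):
    """Summary measures from the four cells of a binary confusion matrix."""
    total = tn + fp + fn + tp
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return {
        # False negatives as a share of all patients.
        "fn_share_of_all": fn / total,
        # Conventional false-negative rate: missed cases among true cases.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else float("nan"),
        # Negative predictive value: truly event-free among predicted negatives.
        "negative_predictive_value": tn / (tn + fn) if (tn + fn) else float("nan"),
    }


# Hypothetical counts for 100 patients (not taken from the study).
summary = confusion_summary(tn=50, fp=30, fn=5, tp=15)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

For a discharge or drain-omission decision, the negative predictive value is the most directly relevant of the three, because it describes how often a patient labeled as low risk is in fact free of clinically significant POPF.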

Discussion

Despite improvements in perioperative management in the past few decades, POPF following PD is still a challenge, and is the main determinant of postoperative morbidity, mortality, prolonged hospital stay, and readmission.1, 11 Accurately identifying patients who are at a low risk of developing clinically significant POPF could help decrease their in-hospital stay and encourage inclusion in enhanced recovery protocols.19, 20 In addition, it is crucial to identify high-risk patients who could benefit from increased postoperative care and prolonged in-hospital surveillance. Thus, accurately predicting the development of POPF is a major clinical issue. The predictive score developed in this study included four perioperative variables: age, preoperative radiation (or chemoradiation) therapy, pancreatic stump texture, and MPD diameter (Table 3). The score’s predictive value for POPF was satisfactory, and it was also associated with a lower false-negative rate (≤ 6%) than other preexisting scores in both the training and testing cohorts. Indeed, only 2% and 0% of patients who were considered to have a negligible risk of POPF with the present predictive score actually developed clinically significant POPF in the training and testing cohorts, respectively (Fig. 2).

As in all pre-existing predictive scores, the independent POPF risk factors identified and used to develop the present score included a soft pancreatic stump texture and a small MPD diameter.2, 3, 5, 9, 10 Our findings on age are similar to results published by certain authors who identified increasing age as an independent risk factor for clinically significant POPF following PD, possibly due to malnutrition or fatty infiltration of the pancreatic parenchyma, which are more frequent in older patients.2, 21 However, other studies have concluded that age per se should not be a contraindication to PD and was not associated with a poor postoperative outcome.22, 23 Similarly to results reported by Callery et al.,5 BMI was not found to be an independent POPF risk factor, unlike results found by Kantor et al.9 and Mungroop et al.10 These conflicting results might be explained by different methodologies, including expression of BMI as a categorical or continuous variable, and development of the present predictive model. In the present study, the statistical approach (detailed in “Methods”) was chosen to select variables relevant to POPF prediction more accurately. BMI was thus not retained as a predictive variable. Also, the initial FRS5 and some other authors24 reported that increased intraoperative blood loss was a risk factor for POPF. Like the study on aFRS,10 we did not identify intraoperative blood loss as an independent risk factor for clinically significant POPF. These apparently discordant results are probably due to the standardization of surgical techniques in tertiary referral centers allowing for a low intraoperative blood loss (median of 300 and 220 mL in the training and testing datasets, respectively) with a beneficial effect on perioperative outcome.

Several authors have already reported a reduced incidence of POPF following PD in patients who underwent neoadjuvant therapy.25,26,27,28,29 However, these studies did not separately analyze the influence of neoadjuvant chemotherapy and radiation therapy,25,26,27 or failed to identify radiation therapy as an independent protective factor against POPF in multivariate analysis.28, 29 In previously established scores, neoadjuvant therapy was either evaluated globally, without distinction between chemotherapy and radiation therapy,5, 10 or was not identified as an independent risk factor.9 In the present series, neoadjuvant chemotherapy alone was associated with a reduced risk of clinically significant POPF in univariate analysis but was not an independent protective factor in multivariate analysis. However, preoperative radiation (or chemoradiation) therapy was found to be an independent protective factor.

Although preoperative radiation can induce pancreatic fibrosis, other factors may contribute to gland firmness as well, including the more frequent pancreatic ductal obstruction observed in pancreatic ductal adenocarcinoma (PDAC) compared with other diseases, the restriction of radiation therapy to patients with PDAC, and a longer duration of ductal obstruction during the time needed for radiation therapy administration followed by appropriate reevaluation. Importantly, despite collinearity between radiation and pancreatic firmness, the two were independently identified as protective against POPF in multivariate analysis. Thus, the most likely mechanisms explaining the decrease in POPF rate could be both radiation-induced and obstruction-induced pancreatic fibrosis with decreased exocrine function, resulting in better healing of the pancreatic anastomosis.26, 28,29,30 Identifying radiation therapy as a protective factor against clinically significant POPF is relevant because of the increasing use of this treatment in borderline pancreatic adenocarcinoma with recent promising results.13, 31, 32 It is noteworthy that, although more patients were treated for adenocarcinoma in the IPC cohort, the rate of soft pancreatic stump texture was higher there than in the BJN training cohort, which is probably due to the higher number of patients who underwent radiation therapy in the latter cohort.

The different FRS-like scores were also compared in order to evaluate their clinical relevance. FRS and mFRS are discrete scores (i.e., the patient’s score is assigned based on a system of points), while aFRS is probabilistic (i.e., a probability between 0 and 1 of developing POPF is assigned to each patient) which is similar to the score presented in the present study. When comparing the 4 scores, similar AUCs were found, ranging between 0.73 and 0.79 in the training cohort and 0.73 and 0.76 in the testing cohort. We also identified a progressive rise of risk of POPF in the 4 different risk groups in non-radiated and radiated patients, suggesting that the newly proposed score performs as well in non-radiated patients (Supplementary material 2). However, ROC curves and the related AUC mainly emphasize both true- and false-positive rates. To identify patients who can be safely discharged, emphasis should be placed on obtaining a score with a very low false-negative rate. This was obtained in the present study by explicitly accounting for class imbalance (i.e., uneven POPF incidence) during the development of the model.
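One common way to account for class imbalance when fitting a logistic regression is inverse-frequency class weighting, which is the heuristic behind scikit-learn's `class_weight="balanced"` option (weight = n_samples / (n_classes × class count)). Whether the authors used this particular scheme is not stated, so the sketch below is only an illustration applied to the class sizes reported for the BJN training cohort (103 patients with and 345 without clinically significant POPF). The function name `balanced_class_weights` is hypothetical.

```python
def balanced_class_weights(n_negative, n_positive):
    """Inverse-frequency weights for a binary outcome:
    weight_c = n_samples / (2 * n_c), so both classes carry equal total weight."""
    if n_negative <= 0 or n_positive <= 0:
        raise ValueError("both classes must be present")
    total = n_negative + n_positive
    return {0: total / (2 * n_negative), 1: total / (2 * n_positive)}


# Class sizes from the text: 448 patients, 103 with clinically significant POPF.
n_total, n_popf = 448, 103
weights = balanced_class_weights(n_total - n_popf, n_popf)
print({label: round(w, 3) for label, w in weights.items()})
```

With these weights, each missed POPF case costs roughly three times as much in the loss function as a misclassified patient without POPF, which pushes the fitted model toward fewer false negatives at the price of more false positives.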

Further in-depth quantitative comparisons of the four scores evaluated in this study are shown in Fig. 2, which reports the prevalence of clinically significant POPF in the different risk groups (defined according to the guidelines suggested in each paper) in the training and testing cohorts. The original FRS did not classify any patients in the testing cohort as part of the high-risk group, which explains the 0% prevalence of POPF reported in Fig. 2a for this group and highlights one limitation of this risk score. It is interesting to note that mFRS, aFRS, and the present score all tended to predict a smaller prevalence of clinically significant POPF in the high-risk group for the IPC testing cohort. These slightly different prevalences are probably related to the different characteristics of the two cohorts (Table 1). However, the prevalence of clinically significant POPF in the low-risk group defined by mFRS ranged from 5 to 13%, while it ranged from 0 to 4% with the present score. Direct comparison with aFRS is difficult because the authors only defined three risk groups.10 Nevertheless, the prevalence of clinically significant POPF in the aFRS low-risk group ranged between 1 and 7% in both cohorts. These results suggest that the discriminative value of both probabilistic approaches (i.e., aFRS and the present score) and their definition of risk groups could be better than that of the discrete point-based scores (i.e., FRS and mFRS) for identifying low-risk patients.

Differences between the training and testing cohorts were found in terms of postoperative outcomes, mainly POPF. Possible explanations include the smaller size of the testing cohort, which included older patients with a higher operative risk; in addition, patients in the testing cohort were more frequently operated on for pancreatic cancer but less frequently received radiation (or chemoradiation) therapy. The fact that the present score has similar predictive ability with very low false-negative rates in two cohorts with inherent differences is noteworthy.

The present score may have important clinical applicability. If a practitioner must decide whether to omit peri-pancreatic drains or to discharge a patient based on whether or not the patient is in a low-risk group, the present score could help make a more reliable decision. Figure 3 reports the associated confusion matrices in both cohorts. The false-negative rate with the present score was 6% and 0% in the training and testing cohorts, respectively, whereas the performance in the same subset of patients with the other three scores was inconsistent, with false-negative rates ranging from 14 to 29% in the training cohort and 4 to 19% in the testing cohort (Fig. 3). With a superior negative predictive value ranging between 86 and 100%, the present score might be more reliable than the others in determining whether peri-pancreatic drainage can be omitted or when a patient can be safely discharged. Further analyses with prospective validation in larger cohorts, taking into account clinical presentation and drain amylase level when a drain is present, are needed to confirm these results.

The present study has some limitations. First, despite the fact that both departments are high-volume centers, experienced in surgery for pancreatic adenocarcinoma, preoperative patient selection and neoadjuvant radiation (or chemoradiation) therapy administration were probably heterogeneous. Second, the reason why POPF was graded B was not analyzed, so subgroups of patients with grade B POPF in both training and testing datasets were possibly different.33 Third, abdominal drain management policies were not compared, which can influence the rate of clinically relevant POPF.34 Fourth, the length of stay was rather long in both groups, so we could not appreciate the influence of a low risk score on the length of stay. Fifth, the present score was developed using data from high-volume French centers but whether it would apply to different countries, with different perioperative regimens, patient population, and surgeon’s preferences remains to be confirmed. Lastly, we cannot assume that pancreatic stump texture, which is difficult to quantify, was evaluated similarly in both centers.

Conclusions

The present study identified preoperative radiation therapy as an independent protective factor against POPF following PD. A POPF risk score taking into account patient characteristics and preoperative radiation therapy, which represents a recent therapeutic modality for pancreatic cancer, is associated with a very low false-negative rate (≤ 6%). This new score is clinically relevant since it allows accurate identification of patients unlikely to develop clinically relevant POPF after PD, so that perioperative management can be adapted accordingly.