Introduction

Acute pancreatitis (AP) is usually a self-limiting process; however, 25 % of patients present with or subsequently develop a severe form of the disease that is associated with a mortality of up to 50 %.1 Scoring systems aim to stratify the severity of AP, and this in turn guides management with the aim of improving outcomes. The scoring system currently regarded as the gold standard for assessment of AP, namely the Acute Physiology and Chronic Health Evaluation (APACHE II), is labor intensive and is not widely adopted for patients with acute pancreatitis outside of the intensive care setting.2,3 Other scoring systems such as the Sequential Organ Failure Assessment (SOFA) have been developed but are still suitable only in the intensive care setting and not for routine use in all patients presenting with AP. As such, they are not suitable for stratifying patients at the time of admission or shortly thereafter.4 Simplified tests using serum markers such as procalcitonin, interleukin-6, and interleukin-8 have been reported to predict the severity of AP, but these are expensive, not validated in the clinical arena, and not readily available.3,5–7

The white cell count (WCC) is a routine serum hematological test that is already incorporated in many of the current AP scoring systems, and routinely performed on all surgical emergency admissions. Components of the total WCC include neutrophils and lymphocytes, both of which can be used individually as markers of inflammation. Poor outcome in severe AP is said to be due to an uncontrolled systemic inflammatory response syndrome (SIRS), with progression to a multi-organ dysfunction syndrome (MODS). Indeed, the WCC is one of the criteria assessed in scoring of the SIRS.

Neutrophils propagate SIRS and the inflammatory cascade in AP, whereas lymphocyte depletion occurs in severe sepsis and is associated with a poor outcome.8–11 An increased neutrophil count with a concomitant decreased lymphocyte count has previously been associated with severe sepsis, bacteremia, and surgical stress.8,12 The neutrophil–lymphocyte ratio (NLR) is a measure of the divergence of these two WCC components, and may be more accurate than the total WCC or individual neutrophil/lymphocyte counts in predicting poor outcome in benign and malignant surgical conditions.13,14

The aim of the current study was to investigate the validity of NLR in predicting outcome, and to determine an optimal cut-off value that would allow division of patients into mild (MAP) and severe acute pancreatitis (SAP) groups based on NLRs within the first 48 h of hospitalization.

Materials and Methods

Patients

Permission for the study was obtained from the institutional research and ethics committee. Sequential patients admitted to our unit with a confirmed diagnosis of AP during the period January to December 2010 were included. Acute pancreatitis was defined as clinical findings consistent with a diagnosis of pancreatitis together with an elevation in serum amylase of three times the upper limit of normal.1,15 Patient outcomes were classified as mild (MAP) or severe acute pancreatitis (SAP) using the Atlanta classification.16 SAP was defined as objective evidence of organ failure (circulatory shock, acute renal failure, and acute pulmonary failure) as set out in the Atlanta classification and/or local complications of necrosis, abscess, or pseudocyst.16 Patients with mild pancreatitis experienced none of these complications. High dependency admission was defined as patients requiring invasive monitoring (central venous and/or arterial catheter), inotropic or renal support, or patients developing respiratory complications. Imrie scores17 were calculated at baseline, and at 24 and 48 h as per our clinical practice. The white cell differential count was analyzed and the NLR determined by calculating the ratio between the absolute neutrophil and lymphocyte counts.
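The NLR definition above can be illustrated with a minimal sketch. The function name and the example counts are hypothetical and are not taken from the study data; the only assumption is that both counts are absolute and expressed in the same unit.

```python
def neutrophil_lymphocyte_ratio(neutrophils: float, lymphocytes: float) -> float:
    """Return the NLR from absolute counts given in the same unit (e.g. 10^9/L)."""
    if neutrophils < 0 or lymphocytes <= 0:
        # A zero lymphocyte count would make the ratio undefined.
        raise ValueError("neutrophils must be >= 0 and lymphocytes must be > 0")
    return neutrophils / lymphocytes


# Hypothetical counts for illustration only (not study data).
example_nlr = neutrophil_lymphocyte_ratio(12.0, 1.5)
print(example_nlr)  # 8.0
```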

Statistical Methods

To compare patients with severe versus mild AP, chi-square and Fisher’s exact tests were used for categorical data, and Student’s t test for continuous data. The Mann–Whitney test was used for ordinal data.

The sensitivity and specificity of the NLR for the identification of patients with severe AP were calculated at baseline, and 24 and 48 h. Positive predictive values (PPV), negative predictive values (NPV), and accuracies of the NLR were also determined at each time point.
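As an illustration of how these measures follow from a single cut-off, the sketch below treats an NLR at or above the cut-off as a positive test and tabulates it against observed severity. It is not the analysis code used in the study (which was run in SPSS); the function name, the class name, and the eight example patients are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DiagnosticMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    accuracy: float


def evaluate_cutoff(nlr_values, is_severe, cutoff):
    """Treat NLR >= cutoff as a positive test and compare it with observed severity."""
    tp = fp = tn = fn = 0
    for nlr, severe in zip(nlr_values, is_severe):
        positive = nlr >= cutoff
        if positive and severe:
            tp += 1
        elif positive and not severe:
            fp += 1
        elif not positive and severe:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Return NaN rather than failing when a margin of the 2x2 table is empty.
        return numerator / denominator if denominator else float("nan")

    return DiagnosticMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
        accuracy=ratio(tp + tn, tp + fp + tn + fn),
    )


# Hypothetical patients for illustration only (not study data).
nlr_values = [12.0, 9.5, 4.0, 11.0, 3.2, 6.1, 2.5, 7.8]
is_severe = [True, True, True, False, False, False, False, False]
print(evaluate_cutoff(nlr_values, is_severe, cutoff=8.0))
```

With these made-up values the 2x2 table holds 2 true positives, 1 false negative, 1 false positive, and 4 true negatives, so accuracy is 6/8.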

Receiver operating characteristic (ROC) curves were generated with corresponding area under the curve (AUC) analysis and computation of 95 % confidence intervals (CI). These were used to determine the optimum cut-off for NLR by showing the trade-off between sensitivity and specificity. SPSS (Statistical Package for Social Sciences, v18) was used for analysis. A p value <0.05 indicated statistical significance.
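A minimal sketch of this kind of analysis is given below. The text does not state which criterion was used to pick the optimum point on the ROC curve, so the use of Youden's index (sensitivity + specificity − 1) here is an assumption, as are the function names and the example data; confidence intervals are not computed.

```python
def roc_points(scores, labels):
    """Return (cutoff, sensitivity, specificity) for each distinct score used as a >= cutoff."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both severe and mild cases are required")
    points = []
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def auc(scores, labels):
    """AUC as the probability that a severe case scores higher than a mild one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("both severe and mild cases are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Cut-off maximizing sensitivity + specificity - 1 (an assumed criterion)."""
    best = max(roc_points(scores, labels), key=lambda pt: pt[1] + pt[2] - 1.0)
    return best[0]


# Hypothetical NLR values and outcomes for illustration only (not study data).
scores = [12.0, 9.5, 4.0, 11.0, 3.2, 6.1, 2.5, 7.8]
labels = [True, True, True, False, False, False, False, False]
print(auc(scores, labels), youden_cutoff(scores, labels))
```

The pairwise form of the AUC is equivalent to the Mann–Whitney statistic scaled by the number of severe–mild pairs, which avoids any explicit integration of the curve.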

Results

One hundred forty-six patients were admitted with AP during the period of the study. Twenty-two patients had SAP: necrosis (6), pseudocyst (6), pancreatico-pleural fistula (1), organ failure (26), and surgery (1) with some patients fulfilling more than one criterion for SAP. Seven patients died.

There was no significant difference between patient age, gender, disease etiology, or amylase on admission between severe and mild AP (Table 1). The SAP group had significantly impaired renal function as indicated by higher urea levels, and lower glucose and albumin levels on admission. The WCC of the SAP cohort was also significantly increased compared to the favorable group. There was no difference in other serum markers of the Imrie score including liver function tests and calcium.

Table 1 Patient demographics, pancreatitis etiology, and components of the Imrie score comparing favorable and poor prognosis groups at day 0. Data expressed as mean ± standard deviation. Values indicated with Imrie score components indicate limits for scoring severity points

The differences in NLR between MAP and SAP groups are summarized in Table 2. The two groups were comparable at baseline, but at 1 and 2 days following admission there were highly statistically significant differences between the MAP and SAP groups. The sequential changes in the NLR are illustrated graphically (Fig. 1). This demonstrates that in MAP, the NLR is highest on the day of admission and falls rapidly towards normal over the subsequent 48 h indicating resolution of the inflammatory process. In SAP, the NLR peaks on day 1 then starts to fall but remains higher on day 2 than the baseline score in the MAP group indicating significant ongoing inflammation.

Table 2 Sequential changes in NLR in the favorable and poor prognosis groups

Fig. 1 Sequential changes in the NLR (a), neutrophil count (b), and lymphocyte count (c) in the mild and severe AP groups during the first 48 h following presentation

Examining the changes in the NLR in more detail, it is clear that the neutrophil counts are at their peak at presentation indicating early recruitment to the inflammatory milieu and then fall, although again the 2-day neutrophil count for the poor prognosis group is greater than the baseline level in the mild disease (Fig. 1b). Lymphocyte counts were suppressed to a similar magnitude in both groups at admission but were significantly lower on day 1 and 2 in the SAP group, whereas in the favorable outcome group levels of lymphocytes start to rise by 24 h (Fig. 1c).

The optimal NLR was generated for each day using ROC and corresponding AUC analysis. This demonstrated an optimal baseline NLR of 10.6 (day 0), with subsequent values on day 1 of 8.1 and 4.8 on day 2 (Fig. 2).

Fig. 2 Receiver operating characteristic (ROC) curves and corresponding area under the curve (AUC) analyses demonstrating accuracies of NLR on admission days 0, 1, and 2

These NLRs were compared against other possible NLR cut-off values by demonstrating trends in sensitivity, specificity, negative predictive value, positive predictive value, and accuracy (Table 3). All NLR cut-off values demonstrated increasing accuracy over the first 48 h. Similarly, the PPV of each NLR also increased over 48 h (22.1 % to 30.3 %). The NPV remained high: >90 % at admission and during the subsequent 48 h.

Table 3 Accuracy of the daily optimal NLR as reported by ROC analysis, compared with other NLR cut-off values at each 24-h time point, and with an NLR of 4.7 as suggested by Azab and colleagues23

Discussion

The primary finding of this study is that the NLR is elevated in patients presenting with AP, and is able to effectively differentiate patients who experience mild and severe acute pancreatitis. Furthermore, the NLR represents a dynamic process, with the optimal NLR for assessment varying with time from presentation, returning towards normal in those with favorable outlook and remaining elevated in patients with complicated AP.

The WCC is a marker of infection and inflammation, and is part of many AP prognostic scoring systems including Ranson, Imrie, APACHE II, and the Simplified Acute Physiology Score (SAPS II). Neutrophils and lymphocytes are important components of the WCC. Neutrophils propagate inflammation and tissue destruction in AP via activation of a cascade of inflammatory cytokines (IL-6, IL-8, and TNF-α), proteolytic enzymes (myeloperoxidase, elastase, collagenase, and β-glucuronidase), and oxygen free radicals.18 An increase in neutrophil numbers corresponds with the development of SIRS and progression to MODS, which are hallmarks of SAP.

Lymphocyte numbers increase following the initial stress and mediate the subsequent inflammatory response. The traditional view is that neutrophilia is the primary cause of an elevated NLR, SIRS, and poor prognosis, while lymphocyte count remains static. However, as can be seen in Fig. 1c, lymphopenia within 24 h of admission and persistent lymphopenia beyond this period is just as much a contributor to increased NLR and poor prognosis as neutrophilia. This is replicated in other studies where persistent lymphopenia is an independent marker of progressive inflammation, bacteremia, or sepsis in emergency admissions and intensive care patients.8,10 Uncontrolled inflammation is thought to precipitate lymphopenia by lymphocyte redistribution and accelerated apoptosis, and lymphopenia (3 % vs. 16 %) is associated with a higher mortality in patients with septic shock.9 The extent of lymphopenia, as with neutrophilia, also correlates with the severity of the insult. Further studies should also report the independent prognostic value of lymphopenia.

Similar to the current study, Pezzilli et al., comparing patients with AP to other acute abdominal conditions and healthy controls, reported a lymphopenia on day 1 in those with AP which persisted on days 3 and 5 following admission.19 Takeyama and colleagues studied lymphocyte populations in 48 patients with SAP and noted that in those developing subsequent infective complications, the counts were significantly lower.20 Furthermore, they noted that the CD8-positive lymphocytes were the main populations depleted as a result of apoptosis, leading to impaired cellular immunity, which in turn may lead to infective complications.

It is the divergence of these two components of the WCC, namely neutrophilia and lymphopenia, that led to the proposal of assessing the NLR as a single and more accurate predictive factor than either component alone. An increase in NLR has been shown to be associated with poor outcomes in both benign and malignant conditions. An elevated NLR predicted in-hospital death and 6-month mortality in acute coronary syndrome, early hospital death and heart failure following myocardial infarction,21 and poor survival after coronary artery bypass grafting.22 Similarly, an elevated NLR has been shown to predict cancer recurrence and poor disease-free and overall survival following hepatic resection for both hepatocellular cancer (HCC)23 and colorectal liver metastases,24 and following liver transplantation for HCC.13 Finally, the NLR has been shown to reflect SOFA and APACHE II scores in patients in the intensive care setting, these scores being used to predict AP severity.11

The majority of studies evaluating the NLR as an inflammatory marker have used a cut-off value for NLR of ≥5, with most existing data relating to benign disease21,22 or cancer surgery.13,23,24,26–28 We anticipated the NLR being higher in cases of pancreatitis associated with a poor outcome as a result of the combination of marked neutrophilia and concomitant lymphopenia seen in severe SIRS/MODS.

To date, only one paper has investigated the role of NLR in AP. Azab et al. reported the NLR to be superior to the total WCC or individual neutrophil and lymphocyte counts in predicting ICU admission and death in patients with AP.25 The authors recommend a cut-off value for NLR of ≥4.7 to identify poor outcome in AP. This was extrapolated from division of the highest neutrophil count by the lowest lymphocyte count in a healthy American population.29 However, neutrophil and lymphocyte counts of normal populations may not be representative of the AP population. This was reflected in their study population, which reported tertiles of <3.6 (n = 93), 3.6 to <7.6 (n = 96), and ≥7.6 (n = 94). Thus, two thirds of patients had an NLR ≥3.6, and a third had an NLR ≥7.6.24 When we investigated the utility of this proposed cut-off in our study, we found that an NLR ≥4.7 had the highest sensitivity (90.9 %) but was the least accurate (33.1 %) due to low specificity (22.5 %) and PPV (17.7 %). An NLR of ≥5, which is used in the majority of studies, showed similarly high sensitivity but again low specificity, PPV, and accuracy.23,24,26,27 It would therefore appear that an NLR of 4.7 or even 5 is too low to distinguish good and poor outcome in an AP population, as all members of this cohort exhibit some degree of inflammatory response. We considered a minimum 75 % sensitivity for clinical use in keeping with other scoring systems such as APACHE II, Imrie, Ranson, SOFA, Multiple Organ Dysfunction Score, and the Pancreatitis Outcome Prediction score, which report sensitivities of 60–90 %.1,2,30–32
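The reason a cut-off with 90.9 % sensitivity can still have an accuracy near one third is that accuracy is the prevalence-weighted mean of sensitivity and specificity, and severe disease is the minority outcome. The sketch below works through this arithmetic. Taking the prevalence as 22 of 146 (the counts given in the Results) is an assumption, since the denominators at each time point are not stated here, so the output only approximates the reported 33.1 % and 17.7 %.

```python
def accuracy_from(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def ppv_from(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumed prevalence of severe disease: 22 of 146 patients (see lead-in).
prevalence = 22 / 146

# Sensitivity and specificity reported in the text for NLR >= 4.7.
acc = accuracy_from(0.909, 0.225, prevalence)
ppv = ppv_from(0.909, 0.225, prevalence)
print(f"accuracy ~ {acc:.1%}, PPV ~ {ppv:.1%}")
```

Because roughly 85 % of the weight falls on specificity at this prevalence, raising specificity improves accuracy far more than raising sensitivity does.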

The majority of NLR studies to date have divided study populations into tertiles, and each tertile group was a categorical variable used to predict poor outcome. We did not divide our population into tertiles as we aimed to determine an optimal NLR cut-off value. Instead, we used ROC analysis to define the optimal NLR cut-off using NLR as a continuous variable. This optimal NLR was then compared with other potential cut-off values. Higher NLR cut-off values were more accurate in predicting poor outcome on all 3 days (day 0, 33.1 % to 57.7 %; day 1, 46.2 % to 62.3 %; day 2, 60.2 % to 72.4 %) but exhibited decreased sensitivity. Azab et al. recommended an NLR of 4.7 on the notion that sensitivity is of greater importance than specificity as conservative supportive management should be the initial level of care in all cases.25 However, we aimed to develop a tool with acceptable sensitivity but greater specificity. This is important as poor specificity and PPV (not sensitivity) are the weaknesses of current AP scoring systems. The incorporation of a variable with high specificity would enhance current scoring systems rather than one with high sensitivity but low specificity.

The cut-off values generated by ROC analysis were highest on admission and progressively decreased during admission to an NLR of 4.8 at 48 h. This is similar to the NLR of 4.7 derived from the healthy American population.25,29 This return to a normal NLR reflects the fact that the majority of our study population were patients with a favorable outcome (n = 122/142) who had recovered during the first 48 h of hospitalization. Thus, the overall study NLR would be expected to return to normal population levels except in those who had an ongoing inflammatory process. At this stage, a persistently high NLR would have greater predictive value than at admission, and this is demonstrated by the superior accuracy of all NLR cut-off values on day 2 compared to the same NLR on day 0. This could lead to a model where different NLR cut-offs are used at admission (NLR ≥ 8) and at 48 h (NLR ≥ 5). Imrie scores were also significantly different in the two groups, confirming the Imrie score as a valid tool in reflecting prognosis. The benefit of the NLR is that it only requires a full blood count and is thus easier to perform, more repeatable, and does not require arterial blood gas or extended biochemistry.
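The time-dependent model suggested above could be sketched as a simple lookup. This is purely illustrative and not a validated clinical rule: the function and dictionary names are hypothetical, and only the two time points named in the text (admission and 48 h) are supported, since no cut-off is proposed here for other times.

```python
# Cut-offs proposed in the text: NLR >= 8 at admission, NLR >= 5 at 48 h.
CUTOFF_BY_HOUR = {0: 8.0, 48: 5.0}


def flag_high_nlr(nlr: float, hours_since_admission: int) -> bool:
    """Return True if the NLR meets the cut-off proposed for that time point."""
    if hours_since_admission not in CUTOFF_BY_HOUR:
        # No cut-off is proposed in the text for other time points.
        raise ValueError("only 0 h and 48 h are supported in this sketch")
    return nlr >= CUTOFF_BY_HOUR[hours_since_admission]


# The same hypothetical NLR of 6.0 is read differently at the two time points.
print(flag_high_nlr(6.0, 0), flag_high_nlr(6.0, 48))  # False True
```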

Conclusions

The NLR can be determined simply from an element of the routine work-up of patients with AP, therefore incurs no additional cost, and appears to correlate with outcome. Daily NLR monitoring during admission provides a dynamic reflection of the variable course of AP, with optimal NLRs varying with changes in patient status. We aim to optimize the NLR and investigate whether its incorporation into current AP prognostic scoring systems increases the accuracy of current methods.