Introduction

Infectious diseases with bacteremia are increasing worldwide, with mortality rates as high as 18–42%, particularly when diagnosis is delayed and treatment is inappropriate [1,2,3,4,5,6,7]. Blood cultures (BC) are the gold standard for the detection of bacteremia, allowing pathogen identification and antimicrobial susceptibility testing [1, 4, 6]. The reported sensitivity of BC to detect bacteremia varies between 73 and 100%, depending on the number of BC drawn, the infectious source, patient characteristics, and antibiotic pretreatment [8,9,10,11]. The diagnostic yield of BC in routine clinical practice is limited: of all BC drawn in emergency departments, only 7–20% turned out to be positive [8, 12,13,14,15]. Despite this limitation, the threshold to draw BC is low, which increases health care costs with limited benefit and may even cause harm in the case of contaminated (false positive) BC [16]. As there are no universal guidelines on when to draw BC, sampling in routine clinical practice relies on the non-standardized clinical judgment of the treating physician [17, 18]. Several studies improved the prediction of bacteremia by applying clinical and laboratory scores, including biomarkers [8, 19,20,21,22]. A score published by Shapiro et al. combined three major and nine minor criteria (Table 1), suggesting BC sampling if at least one major or two minor criteria were present [19]. Its sensitivity of 98%, with a resulting negative likelihood ratio (LR) of 0.08, made it possible to safely forgo BC in a substantial number of patients.

Table 1 Modified Shapiro score

At our hospital, Laukemann et al. analyzed the performance of different bacteremia prediction scores in 1083 consecutive patients with BC collection in the medical emergency department [23]. In that study, BC had been collected according to non-standardized clinical judgment. The best predictor for the 104 (9.6%) bacteremia episodes was an adapted Shapiro score of ≥ 3 points (applying two points for each major and one point for each minor criterion) combined with an admission procalcitonin (PCT) level of > 0.25 µg/l in patients with a systemic inflammatory response syndrome (SIRS) and suspected infection [23]. Applying this algorithm, the number of BC could have been reduced by 42% while missing only 4 bacteremia episodes (4%).

The aim of the present study was to prospectively validate these criteria for predicting bacteremia and to analyze the potential for saving unnecessary BC.

Methods

Study design

Between 4 January 2016 and 28 February 2017, we conducted a prospective cohort study in the Medical and Neurological Emergency Department of the Kantonsspital Aarau, a 600-bed tertiary-care hospital in Switzerland. All adult patients who had at least one BC collected and were admitted to medical or neurological wards (including the intensive-care unit) were included (Fig. 1). The sample size calculation assumed a power of 90% and a one-sided significance level (α) of 5%. Accordingly, 1500 patients were planned to be included over a period of 1 year.

Fig. 1

Study population according to the presence of SIRS with suspected infection, SP criteria as well as overruling safety criteria (A). Presented are numbers and rates of positive blood cultures. SIRS systemic inflammatory response syndrome, SP modified Shapiro score ≥ 3 points AND Procalcitonin > 0.25 µg/l, SPA modified Shapiro score ≥ 3 points AND Procalcitonin > 0.25 µg/l, OR overruling safety criteria (A); BC blood culture

The study used a quasi-experimental design that compared patients meeting the predefined SPA criteria for BC collection (Shapiro score, procalcitonin, and overruling safety criteria) with patients who did not meet these criteria but had BC collected according to the non-standardized clinical judgment of the treating physician, which represents current clinical practice.

Definitions

Systemic inflammatory response syndrome (SIRS): ≥ 2/4 criteria of body temperature > 38°C or < 36°C, heart rate > 90 beats per minute (bpm), respiratory rate > 20/min or PaCO2 < 32 mmHg, and WBC count > 12 G/l or < 4 G/l.

Shapiro–Procalcitonin-Overruling criteria (SPA): “S” for modified Shapiro score ≥ 3 points (Table 1), “P” for admission PCT level > 0.25 µg/l, and “A” for overruling safety criteria including severe sepsis and septic shock (i.e., SIRS with hypotension and end-organ damage or refractory hypotension), suspected endovascular infection or bacterial meningitis, immunosuppression, neutropenic fever (neutrophil count < 0.5 G/l), and fever in hematologic stem cell (HSCT) and solid organ (SOT) transplantation patients.

Shapiro–Procalcitonin (SP): “S” for modified Shapiro score ≥ 3 points (Table 1) and “P” for admission PCT level > 0.25 µg/l.

Quick sepsis-related organ failure assessment (qSOFA score): 1 point each for respiratory rate ≥ 22 breaths/min, Glasgow coma scale < 15 points, and systolic blood pressure ≤ 100 mmHg.

Positive BC: Pathogen compatible with the clinical presentation.

False positive BC: Contaminated BC (counted as negative).
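The SIRS and qSOFA definitions above translate directly into small scoring functions. The following is a minimal sketch using the thresholds stated in the definitions; the function and parameter names are illustrative assumptions and not part of the study protocol.

```python
from typing import Optional


def sirs_count(temp_c: float, heart_rate: float, resp_rate: float,
               wbc_g_per_l: float, paco2_mmhg: Optional[float] = None) -> int:
    """Count the SIRS criteria met (0-4), using the thresholds from the definition."""
    temperature = temp_c > 38.0 or temp_c < 36.0
    tachycardia = heart_rate > 90
    # PaCO2 is optional because a blood gas analysis is not always available.
    respiratory = resp_rate > 20 or (paco2_mmhg is not None and paco2_mmhg < 32)
    leukocytes = wbc_g_per_l > 12 or wbc_g_per_l < 4
    return sum([temperature, tachycardia, respiratory, leukocytes])


def has_sirs(temp_c: float, heart_rate: float, resp_rate: float,
             wbc_g_per_l: float, paco2_mmhg: Optional[float] = None) -> bool:
    """SIRS is present when at least 2 of the 4 criteria are met."""
    return sirs_count(temp_c, heart_rate, resp_rate, wbc_g_per_l, paco2_mmhg) >= 2


def qsofa_score(resp_rate: float, glasgow_coma_scale: int, systolic_bp: float) -> int:
    """qSOFA: 1 point each for RR >= 22/min, GCS < 15, systolic BP <= 100 mmHg."""
    return (int(resp_rate >= 22)
            + int(glasgow_coma_scale < 15)
            + int(systolic_bp <= 100))


# Hypothetical patient: febrile and tachycardic, otherwise within the thresholds.
print(has_sirs(temp_c=38.5, heart_rate=95, resp_rate=18, wbc_g_per_l=8.0))
print(qsofa_score(resp_rate=18, glasgow_coma_scale=15, systolic_bp=120))
```

Note that the strict and non-strict inequalities differ between the two scores (for example, respiratory rate > 20 for SIRS but ≥ 22 for qSOFA), which is why they are kept as separate functions.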

Indication for BC sampling and control group

BC sampling was suggested in all patients with suspected infection and SIRS fulfilling SPA criteria, i.e., fulfilling either a modified Shapiro score ≥ 3 points plus admission PCT level > 0.25 µg/l, or fulfilling overruling safety criteria. The latter were defined to identify the most vulnerable patients.
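The decision rule described above can be written as a short function. This is a sketch under stated assumptions: the individual Shapiro criteria come from Table 1 and are passed in here only as counts of major and minor criteria (weighted two points and one point, as described in the Introduction), and the presence of an overruling safety criterion is taken as a single Boolean input. All names are hypothetical.

```python
def modified_shapiro_points(n_major: int, n_minor: int) -> int:
    """Modified Shapiro score: 2 points per major and 1 point per minor criterion."""
    return 2 * n_major + n_minor


def bc_suggested(sirs_with_suspected_infection: bool,
                 shapiro_points: int,
                 pct_ug_per_l: float,
                 overruling_criterion: bool) -> bool:
    """Return True if BC sampling is suggested by SPA.

    SIRS with suspected infection is treated as a prerequisite; BC are then
    suggested if SP is fulfilled (score >= 3 AND PCT > 0.25 ug/l) OR at least
    one overruling safety criterion (A) is present.
    """
    if not sirs_with_suspected_infection:
        return False
    sp_fulfilled = shapiro_points >= 3 and pct_ug_per_l > 0.25
    return sp_fulfilled or overruling_criterion


# Hypothetical example: one major and one minor criterion, PCT 0.4 ug/l.
points = modified_shapiro_points(n_major=1, n_minor=1)
print(points, bc_suggested(True, points, 0.4, overruling_criterion=False))
```

Keeping SIRS as an explicit gate in the function mirrors the text, in which patients without SIRS did not qualify for BC under SPA even when an overruling criterion was present.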

The study was approved by the local ethics committee (Ethikkommission Nordwest- und Zentralschweiz). Given that this was an observational quality-control study, the need for an individual informed consent was waived.

Data collection and laboratory data

Data on comorbidities and symptoms were collected using a case report form (CRF). In addition, laboratory and clinical parameters, data on prior hospitalizations, antibiotic pre-treatment within 30 days before admission, use of antipyretic drugs within 6 h before hospitalization, symptom onset (> 7 days and ≤ 1 day), and acute deterioration (< 24 h and > 2 days) before admission were recorded. Every completed CRF was reviewed by an infectious disease (ID) specialist.

The PCT measurement was performed on an ADVIA Centaur (Siemens Healthineers, D) using license partner reagents (B·R·A·H·M·S PCT).

Blood culture sampling

One BC included aerobic and anaerobic bottles containing 8–10 ml blood each. Routinely, two BC were collected per patient. After skin disinfection with 77% ethanol, BC were drawn from one venous puncture [24]. In suspected infective endocarditis, three BC were collected within 6 h in stable or within 20–30 min in unstable patients. For the bacterial culture, BACTEC FX (BD Becton Dickinson) and BacT/Alert (BioMérieux) were used according to the manufacturer's instructions. Pathogen identification and susceptibility testing were performed according to the European Committee on Antimicrobial Susceptibility Testing (EUCAST) [25]. Every positive BC was categorized by an ID specialist during CRF review as positive or false positive, i.e., contaminated. Low-virulent organisms of the skin flora (i.e., coagulase-negative staphylococci (CNS), Corynebacterium spp., Cutibacterium spp., and Bacillus spp.), without evidence of an endovascular infection, were classified as contaminants, particularly if cultured in one bottle only or after > 24 h of culture [26].
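The contaminant criteria described above can be illustrated as a simple screening helper. This is only a sketch: in the study, the final categorization was made by an ID specialist during CRF review, so the function below merely flags candidates and supporting features. The organism labels and all names are illustrative assumptions.

```python
# Low-virulence skin flora named in the text (labels are illustrative strings).
SKIN_FLORA = frozenset({
    "coagulase-negative staphylococci",
    "Corynebacterium spp.",
    "Cutibacterium spp.",
    "Bacillus spp.",
})


def contamination_assessment(organism: str,
                             endovascular_infection: bool,
                             positive_bottles: int,
                             hours_to_positivity: float) -> dict:
    """Flag a positive BC as a possible contaminant for specialist review."""
    # Candidate: skin-flora organism without evidence of endovascular infection.
    candidate = organism in SKIN_FLORA and not endovascular_infection
    # Supporting features: growth in one bottle only or after > 24 h of culture.
    supporting = candidate and (positive_bottles == 1 or hours_to_positivity > 24)
    return {"candidate_contaminant": candidate, "supporting_features": supporting}


print(contamination_assessment("Cutibacterium spp.", False, 1, 60.0))
```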

Statistical analysis

For the primary outcome, the diagnostic yield of SPA compared to non-standardized clinical judgment was calculated by logistic regression analysis and presented as odds ratios (OR) with 95% confidence intervals (95% CI) for positive BC. Secondary outcomes included missed bacteremia by SPA, individual performance of SIRS with suspected infection, modified Shapiro score ≥ 3 points (S), PCT > 0.25 µg/l (P), overruling safety criteria (A), and the combination of them (SP and SPA, respectively) to predict blood culture positivity. For each single item of the algorithm, logistic regression analysis was performed providing OR and 95% CI for positive BC. In addition to sensitivities and specificities, areas under the receiver-operating characteristic curves (AUROC) for every item of the algorithm were calculated, as well as negative (NPV) and positive (PPV) predictive values for SPA.

Continuous variables were presented as medians with interquartile ranges (IQR), and categorical variables as numbers and percentages. Continuous variables were compared with the Wilcoxon rank-sum test and frequencies by Chi-square test. All tests were two-tailed and a p-value < 0.05 was considered statistically significant. The statistical analysis was done using STATA 15.1 (Stata Corp, College Station, TX).

Results

Baseline characteristics

We included 1438 patients with BC sampling during the 14-month study period. The median age was 72 years (IQR 59–80), and 594 patients (41%) were female.

In Table 2, baseline characteristics of patients with and without true positive BC were compared. Patients with positive BC had more comorbidities; sepsis, severe sepsis, and septic shock were more common as was acute deterioration (< 24 h) before admission. Bacteremia patients had higher modified Shapiro scores with median 4 vs. 3 points, higher qSOFA scores with median 1 vs. 0 points, and higher inflammatory markers including median PCT with 2.89 vs. 0.32 µg/l, median C-reactive protein (CRP) with 120 vs. 80 mg/l and median neutrophil count with 9.6 vs. 8.2 G/l; platelet and lymphocyte counts were lower. The duration of symptoms before hospitalization was similar between the two groups, but patients with negative BC were pretreated with antibiotics more often. Urogenital, hepatobiliary, endovascular, and bone and joint infections were more common in patients with positive BC, while ear/nose/throat and pulmonary infections were more common in patients with negative BC.

Table 2 Baseline characteristics of the 1438 patients with blood culture sampling, comparing patients with positive and negative blood cultures

Blood culture collection according to SPA versus non-standardized clinical judgment

Of the 1438 patients with BC sampling, 989 (69%) presented with SIRS and suspected infection, of whom 555 (39% of all) met SP criteria and an additional 194 (13% of all) met overruling safety criteria (A) (Fig. 1). Thus, 749 (52%) fulfilled SPA criteria and qualified for BC. The remaining 689 patients (48%) were sampled according to non-standardized clinical judgment. Of the 215 patients (15%) with positive BC, 194 patients (13% of all) fulfilled SPA (173 (12% of all) fulfilled SP and 21 (1% of all) overruling safety criteria), and 21 patients (1% of all) were sampled according to non-standardized clinical judgment.

The logistic regression analysis, to test the efficiency of SPA for positive BC compared to non-standardized clinical judgment, is shown in Fig. 2. In patients fulfilling SP criteria (n = 555), the proportion of positive BC increased to 31% (173/555) with an OR of 9.07 (95% CI 6.34–12.97) for positive BC. When SPA was fulfilled (n = 749), the number of positive BC increased to 194 with a slightly decreased BC positivity rate (26%), but an increased OR of 11.12 (95% CI 6.99–17.69) for positive BC. Among the 194 patients not fulfilling SP, but with at least one overruling safety criterion (A), 21 patients (11%) had positive BC. In SPA-negative patients with BC collection, according to non-standardized clinical judgment, only 3% (21/689) had positive BC with an OR of 0.09 (95% CI 0.06–0.14) for positive BC. Similarly, the yield was low for patients without SIRS (OR 0.15, 95% CI 0.09–0.25) or with SIRS but not SPA, either due to a modified Shapiro score < 3 points (OR 0.26, 95% CI 0.08–0.82) or a PCT ≤ 0.25 µg/l (OR 0.13, 95% CI 0.03–0.54).
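For a single binary predictor, the odds ratio from a univariable logistic regression equals the cross-product ratio of the 2 × 2 table, and a Wald 95% CI can be obtained from the standard error of the log odds ratio. The sketch below applies this to the counts reported in the text (1438 patients, 215 positive BC; 749 SPA-positive with 194 positive BC; 555 SP-positive with 173 positive BC). It is an illustration of the arithmetic, not the authors' STATA code, and the function name is an assumption.

```python
import math


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with Wald CI from a 2x2 table.

    a: rule positive / BC positive    b: rule positive / BC negative
    c: rule negative / BC positive    d: rule negative / BC negative
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


TOTAL, POSITIVE = 1438, 215

# SPA: 749 rule-positive patients, 194 of them with positive BC.
spa = odds_ratio_wald(194, 749 - 194, POSITIVE - 194, (TOTAL - 749) - (POSITIVE - 194))
# SP: 555 rule-positive patients, 173 of them with positive BC.
sp = odds_ratio_wald(173, 555 - 173, POSITIVE - 173, (TOTAL - 555) - (POSITIVE - 173))

print("SPA: OR %.2f (95%% CI %.2f-%.2f)" % spa)  # SPA: OR 11.12 (95% CI 6.99-17.69)
print("SP:  OR %.2f (95%% CI %.2f-%.2f)" % sp)   # SP:  OR 9.07 (95% CI 6.34-12.97)
```

The inverse comparison (SPA-negative versus SPA-positive patients) is simply the reciprocal, 1/11.12 ≈ 0.09, which corresponds to the OR reported for BC collected according to non-standardized clinical judgment.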

Fig. 2

Performance of SP criteria with and without overruling safety criteria (A) for positive blood cultures compared to non-standardized clinical judgment: Logistic regression analysis. Analyzed are 1438 patients with blood culture sampling. SP modified Shapiro score ≥ 3 points AND Procalcitonin (PCT) > 0.25 µg/l, SPA modified Shapiro score ≥ 3 points AND Procalcitonin (PCT) > 0.25 µg/l, OR overruling safety criteria (A), BC blood culture, SIRS systemic inflammatory response syndrome, OR odds ratio, CI confidence interval

Performance of SPA and its individual components to predict positive blood cultures

Figure 3 shows the BC positivity rates and the number of patients qualifying for BC sampling, comparing SPA with its individual components, i.e., SIRS and suspected infection, overruling safety criteria (A), modified Shapiro score ≥ 3 points (S), PCT > 0.25 µg/l (P), and the combination of the latter two (SP). Sensitivity, specificity, and AUROC for predicting positive BC are shown for each single item of the algorithm. Adding the overruling safety criteria to SP increased the sensitivity from 81% (SP) to 90% (SPA) at the price of reducing the specificity from 69% (SP) to 55% (SPA). While the S and P criteria alone showed sensitivities (each 91%) similar to SP or SPA, they required a significantly higher number of BC collections [n = 926 for (S) and n = 900 for (P) vs. n = 555 for (SP) and n = 749 for (SPA), respectively]. By applying SPA, 48% (689/1438) of BC could have been saved, and by applying SP even 61% (883/1438).

Fig. 3

Performance of SP criteria with and without overruling safety criteria (A) for positive blood cultures in comparison to the performance of its individual components: sensitivity, specificity, ROC, and logistic regression analysis. Analyzed are 1438 patients with blood culture sampling. The height of the white columns indicates the number of patients requiring BC sampling. The black columns indicate the number of positive BC with the percentage of positive BC within the column. SP modified Shapiro score ≥ 3 points AND Procalcitonin (PCT) > 0.25 µg/l, SPA modified Shapiro score ≥ 3 points AND Procalcitonin (PCT) > 0.25 µg/l, OR overruling safety criteria (A), BC blood culture, SIRS systemic inflammatory response syndrome, ROC receiver-operating characteristic, CI confidence interval, OR odds ratio

The AUROC illustrates the higher diagnostic yield of SP (0.746) and SPA (0.724) compared to S (0.657) and P (0.667). Logistic regression analysis also yielded higher OR for BC positivity for the combinations SPA and SP, with 11.12 (95% CI 6.99–17.69) and 9.07 (95% CI 6.34–12.97), respectively, than for a Shapiro score ≥ 3 points and PCT > 0.25 µg/l alone, with 6.97 (95% CI 4.31–11.30) and 7.53 (95% CI 4.66–12.20), respectively. PPV and NPV of SPA were 25.9% (95% CI 22.8–29.2%) and 97% (95% CI 95.4–98.1%), respectively.
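The accuracy measures for SP and SPA can be re-derived from the counts given in the text (1438 patients, 215 positive BC; 555 SP-positive with 173 positive BC; 749 SPA-positive with 194 positive BC). The sketch below assumes that each rule is treated as a single binary test, in which case the area under the ROC curve reduces to the mean of sensitivity and specificity. Function and variable names are illustrative, and derived percentages may differ slightly from the printed ones because of rounding.

```python
TOTAL, POSITIVE = 1438, 215


def confusion_counts(rule_positive: int, rule_true_positive: int):
    """Derive (tp, fp, fn, tn) from the number of rule-positive patients."""
    tp = rule_true_positive
    fp = rule_positive - tp
    fn = POSITIVE - tp
    tn = TOTAL - rule_positive - fn
    return tp, fp, fn, tn


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy measures for a binary diagnostic rule."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # AUROC of a single binary test: mean of sensitivity and specificity.
        "auroc_binary": (sensitivity + specificity) / 2,
        # Share of BC that would not have been drawn under the rule.
        "bc_saved": (fn + tn) / (tp + fp + fn + tn),
    }


sp = diagnostic_metrics(*confusion_counts(555, 173))
spa = diagnostic_metrics(*confusion_counts(749, 194))

for name, m in (("SP", sp), ("SPA", spa)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Under this assumption, the computed values reproduce the reported AUROC of 0.746 for SP and 0.724 for SPA, as well as the PPV of 25.9% and the NPV of 97% for SPA.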

Patients with positive blood cultures missed by SPA and contaminated blood cultures

The 21 patients (10%) with positive BC missed by SPA most commonly suffered from urogenital infections (n = 5), cholangitis (n = 4), infective endocarditis or infected aneurysma spurium (n = 4), and skin and soft-tissue infections (n = 3) (Table 3). Sixteen of these patients were missed because SIRS was not fulfilled, 3 did not reach the modified Shapiro score threshold, and 2 had a PCT below the threshold of 0.25 µg/l. Eight pathogens could be detected alternatively (4 by urine, 2 by synovial fluid, and 1 by vertebral tissue culture; 1 by pneumococcal antigen testing in urine). If only SP criteria had been applied, 42/215 (20%) true positive BC would have been missed, which underlines the importance of the safety overruling criteria.

Table 3 Patients with positive blood cultures, but no formal indication for blood culture sampling (i.e., missed by SPA)

Of the 1438 BC, 52 (4%) were considered contaminated (false positive). The most common contaminants were CNS (65%), followed by viridans group streptococci (13%), Cutibacterium spp. (6%), and Corynebacterium spp. (4%).

Discussion

Without general guideline recommendations regarding indications, BC in clinical practice are drawn based on the physicians’ subjective judgment [27]. This is associated with a low diagnostic yield (positivity rates between 7 and 15% on average) and unnecessary costs [1, 12,13,14, 23, 28]. False positive BC may prompt unnecessary, expensive, and potentially harmful examinations [27, 28]. Therefore, we aimed to prospectively validate a prediction tool for bacteremia in unselected patients admitted to internal medicine and neurology wards by introducing SPA, an algorithm combining a modified Shapiro score (S) with PCT values (P) and overruling safety criteria (A) in patients admitted with SIRS and suspected infection. We explicitly decided to validate the algorithm against current clinical practice, i.e., BC sampling according to non-standardized clinical judgment, as this allowed us to demonstrate the net benefit.

A modified Shapiro score ≥ 3 points plus an admission PCT > 0.25 µg/l (i.e., SP) doubled the BC positivity rate in our cohort from 15 to 31%. Accordingly, SP criteria would have made it possible to forgo 61% of all BC while missing 20% of positive BC. The safety overruling criteria (A) add indications for supplementary BC in the most vulnerable patients. The 86% specificity of these criteria to predict bacteremia underlines their clinical benefit. The full algorithm increased the sensitivity from 81% (SP) to 90% (SPA), which would have allowed detection of 90% of bacteremia episodes despite omitting 48% of all BC. SPA thereby outperformed its individual components, with an OR of 11.1 for predicting BC positivity.

Selection of an algorithm

Several studies have identified clinical and laboratory predictors of bacteremia and established prediction scores for defined infections, such as community-acquired pneumonia or cellulitis [22, 29,30,31], and also for general internal and emergency medicine [8, 19, 32, 33].

Jones et al. demonstrated that 95% of bacteremia fulfilled SIRS criteria [20]. Given its negative LR of 0.09 (95% CI 0.03–0.26), we made SIRS a prerequisite for considering BC in our study [27]. Lee and Metersky retrospectively developed scores to predict bacteremia that included SIRS criteria, liver disease, and recent antibiotic treatment, amongst others [21, 22]. If BC had been omitted in patients with a low risk of bacteremia, 38% and 79% of BC could have been saved in these studies. However, in contrast to SPA, both scores applied only to patients with pneumonia, and the most vulnerable patients were not covered by overruling safety criteria.

Modified Shapiro score, procalcitonin, and SPA

Looking at individual biomarkers for bacteremia, PCT with an AUROC of 0.880–0.760 outperformed CRP with an AUROC of 0.600–0.650 [23, 34,35,36]. By combining a PCT > 0.4 µg/l and a Charlson comorbidity index ≥ 2, Tudela et al. identified a subgroup of 26% among 412 study patients with a bacteremia rate of 3% as compared to 10% overall, which is low enough to abstain from BC sampling if the criteria were not fulfilled [32]. Takeshima et al. included 11 clinical and laboratory parameters and separated 759 (38%) of their 1982 study patients with a bacteremia rate of 2–3% as compared to 16% overall, again low enough to abstain from BC sampling [33]. Unfortunately, no safety information regarding missed bacteremia was provided in that study. Shapiro et al. combined twelve clinical and laboratory parameters and evaluated them in 3730 emergency room patients with a bacteremia rate of 8% [19]. The sensitivity of Shapiro’s algorithm was 98%. The negative LR of 0.08 allowed a 27% BC reduction with only 2% of bacteremia missed. Still, important overruling safety criteria were missing.

Combination scores have shown AUROC values for bacteremia of 0.800–0.827 in derivation and 0.730–0.770 in validation cohorts, which are very similar to Laukemann’s algorithm (AUROC 0.827; derivation) and the SP criteria (AUROC 0.746) [19, 23, 32,33,34,35,36,37]. With a 48% reduction in BC sampling, SPA showed a similar efficiency as the retrospective analysis by Laukemann et al., while the overall BC positivity rate was comparable (i.e., 10% and, in our study, 15%) [23]. The relatively low sensitivity of SP criteria for bacteremia (81%) could be increased to 90% by introducing the overruling safety criteria and, therefore, was in the range of the 96% in the Laukemann study [23].

The very low additional yield (3%) of BC collected solely according to the physicians’ clinical judgment in patients not fulfilling SPA underlines the robustness of the algorithm. Other studies also found unguided clinical judgment to be inefficient in guiding BC sampling in febrile medical in-patients [17, 18].

Missed bacteremia

By applying SPA, 21 patients (10%) with bacteremia would have been missed, compared with 4% in the Laukemann study [23]. Two factors might have reduced the sensitivity of SPA. First, 29% of patients described a symptom duration ≤ 1 day and 53% an acute deterioration in the 24 h before admission, which may have been too short to reach the SIRS or PCT cut-offs. Indeed, 18/21 bacteremia episodes were missed for lack of inflammation criteria (16 × SIRS, 2 × PCT), with nine of these patients (50%) reporting a symptom duration < 24 h (Table 3). Consistently, both patients with normal PCT on admission reached the threshold on day 3 (data not shown). Therefore, BC may be considered in SIRS- or PCT-negative patients with a symptom duration < 24 h. This might be particularly true for patients presenting with a safety overruling criterion, as seen in 5 SIRS-negative patients in our analysis (4 × endovascular infection, 1 × neutropenic colitis). This was recently shown in patients with S. aureus central venous catheter infection, in whom the interval between clinical symptom onset and diagnostic confirmation by BC collection was very short [38]. Second, prescribed beta-blockers and antipyretics may have mitigated two of the SIRS criteria, tachycardia and fever.

In 8/21 patients (38%), the causative pathogen was identified by alternative testing, particularly by urine culture or antigen testing, thereby causing a rise in the overall pathogen detection rate to 94%. From a more general point of view, bacteremia detection might be dispensable if treatment decisions are not altered, such as in febrile urinary tract infections or in community-acquired pneumonia which often are also of viral etiology [39,40,41,42].
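The overall detection rate quoted above follows from the counts in the text: 194 bacteremia episodes captured by SPA plus 8 pathogens identified by alternative testing, out of 215 patients with positive BC. A minimal check of this arithmetic, with illustrative variable names:

```python
positive_bc = 215            # patients with positive BC
detected_by_spa = 194        # positive BC in patients fulfilling SPA
missed_by_spa = positive_bc - detected_by_spa
detected_alternatively = 8   # missed by SPA, pathogen found by other tests

share_alternative = detected_alternatively / missed_by_spa
overall_detection = (detected_by_spa + detected_alternatively) / positive_bc

print(f"Missed by SPA: {missed_by_spa}")
print(f"Identified by alternative testing: {share_alternative:.0%}")
print(f"Overall pathogen detection: {overall_detection:.0%}")
```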

Limitations and strengths

The main weakness is the lack of characterization of the SIRS-negative patients, for whom we only collected data if they underwent BC sampling. With our quasi-experimental study design, the control group of non-standardized clinical judgment is indeed heterogeneous. The algorithm might appear complex and some physicians might be reluctant to calculate SPA, although almost all parameters are routinely collected in patients with suspected infections. Finally, the availability, turnaround time, and costs of PCT may vary in different settings and could impair applicability. However, future point-of-care testing (POCT) could facilitate testing and decrease costs.

The main strength was the large, unselected and, therefore, representative cohort of internal medicine and neurologic patients presenting in an emergency department. However, the findings may not be generalizable to surgical patients and outpatients. Patient safety was a core component, with the inclusion of overruling criteria for the most vulnerable patients. With the study duration of 14 months, the seasonality of infectious diseases is represented. SPA was tested against non-standardized clinical judgment, which includes individualized decisions (i.e., the “gut feeling” of the treating physicians), and the comparison is, therefore, able to show the net benefit over current clinical practice.

Conclusions

As infectious diseases with bacteremia are associated with high morbidity and mortality, reliable diagnostics are essential. Rational diagnostics can limit unnecessary costs and potentially harmful false positive results. We prioritized patient safety by including overruling criteria for the most vulnerable patients. SPA outperformed SIRS, the modified Shapiro score, and PCT, both alone and in combination. Compared to non-standardized clinical judgment, which is the current clinical practice, SPA could reduce BC sampling by 48% while still detecting 194/215 (90%) of the relevant pathogens. This makes SPA a valuable diagnostic stewardship tool. Nevertheless, patients with an acute symptom onset or deterioration (< 24 h) should be considered for BC sampling even if they do not fulfill the algorithm, as SIRS and PCT could still be false negative. This applies especially to patients fulfilling overruling safety criteria.