Introduction

Preoperative diagnosis of periprosthetic joint infection (PJI) is a challenging yet critical task [42]. The distinction between failures occurring as a result of infection and aseptic etiologies is an important requisite for delivery of appropriate surgical care [28].

Lack of a uniform and standard definition for PJI makes investigations difficult to compare [3]. The criterion of a minimum of two positive cultures of periprosthetic tissue material has been the most common definition in previous studies [29]. However, microbiologic cultures are not always successful in isolating the infecting organisms and contamination of samples may result in false-positive results [6]. Some authors have suggested different adjunctive criteria [5, 27, 36, 38] to overcome shortcomings of bacteriologic culture, leading to discrepancy in their inclusion and exclusion criteria. To resolve this inconsistency, an expert panel from the Musculoskeletal Infection Society (MSIS) has reviewed existing evidence and published a set of diagnostic criteria for PJI [30]. This new definition integrates clinical, serologic, microbiologic, and histopathologic findings and joint aspirate analysis to distinguish between infected and aseptic failures.

The introduction of erythrocyte sedimentation rate (ESR) and C-reactive protein (CRP) as criteria for diagnosis of PJI emphasizes the need for precise definition of their thresholds. Despite a considerable volume of literature, the appropriate thresholds are still unclear. Thresholds of 12 to 40 mm/hour for ESR and 3 to 13.5 mg/L for CRP have been proposed, with no distinction being made between PJI occurring in knees versus hips or late versus early infection [4]. This wide range of thresholds makes use of ESR and CRP confusing for PJI diagnosis at least for the purpose of uniform research. The MSIS suggests the conventional thresholds of 30 mm/hour and 10 mg/L for ESR and CRP, respectively, which were selected arbitrarily due to lack of studies determining the threshold [4, 9, 20, 24, 34, 36].

An important yet unaddressed question is whether thresholds of ESR and CRP for hips and knees should be similar. Based on receiver operating characteristics (ROC) analysis, some investigations have suggested thresholds for ESR and CRP, evaluating hips and knees separately or in combination [8, 10, 13, 14, 35]. However, they did not compare the mean or median values of ESR and CRP between infected prosthetic hips and knees. Therefore, these investigations have never examined whether any difference should exist between hips and knees in ESR and CRP thresholds for diagnosing PJI. They consistently reported thresholds higher than the conventional threshold for CRP, but their proposed magnitudes for ESR were less consistent and were slightly higher [8, 10, 13] and lower [14, 35] than the conventional threshold.

Even after uncomplicated arthroplasty, ESR and CRP remain elevated for 3 to 8 weeks [7, 14, 21, 26]. Thus, time after index arthroplasty can have a confounding influence on ESR and CRP values. This may hinder interpretation of the results of ESR and CRP in the early-postoperative setting, implying different thresholds might be required for early-postoperative and late-chronic PJIs.

We therefore determined (1) whether there is a difference in the threshold value of ESR and CRP between hips and knees, (2) whether the threshold value for ESR and CRP should be different for early-postoperative and late-chronic PJI, and (3) the optimal thresholds for ESR and CRP in diagnosis of PJI.

Patients and Methods

After obtaining institutional review board approval, we used the institutional computerized infection database to identify 2203 patients who underwent revision arthroplasty at our institution between 2000 and 2009 and had adequate preoperative workup to confirm or refute PJI (Fig. 1). Medical records were reviewed to retrieve demographic details and associated comorbidities. We excluded 180 patients for the following reasons: (1) comorbid conditions with confounding effects on ESR and CRP (eg, inflammatory autoimmune disorders, malignancies, organ failure [kidney, liver, heart], or preexisting infectious diseases [12, 25]) and (2) revision surgeries indicated for periprosthetic or component fractures. Our aseptic cohort consisted of 1095 patients in the hip group and 594 patients in the knee group. Based on the MSIS criteria, our periodically updated retrospective infection database was accessed to identify our septic cohort, which comprised 108 patients with hip PJI and 165 patients with knee PJI. Briefly, the MSIS suggested two major and six minor criteria for PJI. Major criteria, with each being indicative of PJI, included presence of a draining sinus and isolation of a pathogen from two separate tissue or fluid cultures. The presence of at least four of six minor criteria was also proposed to suggest PJI. The six minor criteria proposed were elevated ESR and CRP, elevated synovial white blood cell (WBC) count, increased synovial fluid polymorphonuclear cell percentage, isolation of a pathogen from one culture only, presence of purulence, and positive microscopy of the frozen section of periprosthetic tissue samples [30]. We do not routinely utilize frozen section of the periarticular tissues, thus eliminating one of the six minor criteria proposed by the MSIS. 
Since the aim of this study was to determine optimal cutoff values of ESR and CRP for diagnosing PJI, we excluded elevated ESR and CRP as a minor criterion and considered patients as having PJI if they had all of the remaining four minor criteria. Sixty-one patients from the original cohort were disqualified for this study since their PJI could not be established independently from ESR and CRP levels according to the MSIS criteria. The patients included in the PJI cohort fulfilled the diagnostic criteria as follows. Among hips, of the 108 patients, 24 (22%) had draining sinuses, 94 (87%) had at least two positive cultures, and 20 (19%) had four minor criteria (Fig. 2A). Among knees, of the 165 patients, 21 (13%) presented with draining sinuses, 144 (87%) were diagnosed by at least two positive cultures, and 101 (61%) met four minor criteria (Fig. 2B).
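The modified case definition described above can be sketched in code. This is a minimal illustration rather than the study's actual database logic: the `Workup` record and its field names are hypothetical, and the synovial fluid findings are passed in as booleans because the numeric cutoffs used to call them elevated are not given here.

```python
from dataclasses import dataclass


@dataclass
class Workup:
    """Findings for one revision case (hypothetical field names)."""
    draining_sinus: bool
    positive_cultures: int            # separate cultures isolating the pathogen
    elevated_synovial_wbc: bool
    elevated_synovial_pmn_pct: bool
    purulence: bool


def meets_modified_msis(w: Workup) -> bool:
    """Return True if PJI is established without using ESR or CRP."""
    # Major criteria: either one alone is indicative of PJI.
    if w.draining_sinus or w.positive_cultures >= 2:
        return True
    # Four remaining minor criteria (ESR/CRP and frozen section excluded);
    # all four must be present.
    minor = [
        w.elevated_synovial_wbc,
        w.elevated_synovial_pmn_pct,
        w.positive_cultures == 1,     # pathogen isolated from one culture only
        w.purulence,
    ]
    return all(minor)
```

Under this sketch, a patient without a major criterion who lacks any one of the four remaining minor criteria is not classified as having PJI, which corresponds to the patients disqualified because infection could not be established independently of ESR and CRP.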

Fig. 1

A flow diagram shows method of inclusion and exclusion of patients. The MSIS definition for PJI was strictly applied. Thirteen patients were disqualified for subgroup analysis due to unclear date of their index arthroplasty.

Fig. 2A–B

Venn diagrams show the distribution of the MSIS criteria among patients with (A) hip PJI and (B) knee PJI. The majority of patients with PJI met the commonly used definition of two positive cultures, although PJI diagnosis in a minor proportion (13%) was based on other MSIS criteria.

The mean age of the patients in the PJI cohort was 67 years (range, 40–93 years) in the hip arthroplasty group and 68 years (range, 37–90 years) in the knee arthroplasty group. Patients with aseptic hip revisions were 65 years old on average (range, 26–96 years) while those with aseptic knee revisions were 65 years old on average (range, 28–89 years). Women constituted 56% and 48% of the aseptic and septic hip revision groups, respectively, and 61% and 47% of aseptic and septic knee revision groups, respectively.

During revision arthroplasty, three to five samples of periprosthetic fluid or tissue material from the prosthesis-bone interface were sent for culture. Samples from the draining sinus were not included. Isolates were considered significant if they grew on solid agar or when an indistinguishable strain grew on enrichment medium more than once. Prophylactic intraoperative antibiotics were administered after sample extraction for all patients except those in whom infection was suspected but no organism had been isolated. Gram-positive bacteria were the responsible pathogens in the majority of PJIs (65% in hips, 58% in knees). Staphylococcus aureus was encountered as the most common organism in hips and knees, although in knees it was closely followed by coagulase-negative staphylococci (Table 1).

Table 1 Microbiologic profile of patients with periprosthetic joint infection

Before revision arthroplasty, we routinely obtain ESR and CRP and aspirate the joint in patients with abnormal serology and/or high index of suspicion for PJI (ie, painful prosthetic joint in the context of predisposing risk factors, constitutional symptoms, or clinical signs of PJI). We included the ESR and CRP values within 1 month before revision surgery. In case of multiple measurements, the values closest to the date of surgery were accepted. The laboratories of our institution utilize the semiautomated method for measurement of ESR. Immediately after extraction, samples were transferred to vacuum tubes containing prefilled sodium citrate and mixed by a special mixer (ESR-657; Streck, Omaha, NE, USA). An automated analyzer (ESR-Auto Plus®; Streck) measured ESR by using the QuickMode method. In this method, an infrared scanner measured red blood cell sedimentation level after 30 minutes on two scans (forward and backward); these values were then converted to equivalent Westergren results in millimeters per hour using a mathematical formula. The turbidimetric method was utilized to measure CRP levels (Beckman Coulter Inc, Brea, CA, USA). Synovial WBC count and differential were measured using automated analyzers (Sysmex XE5000; Sysmex, Mundelein, IL, USA). In the PJI group, ESR and CRP values were missing in two patients (one hip and one knee) and two patients (two hips), respectively. ESR and CRP values were higher in the PJI groups than in the corresponding aseptic groups in both joints (Table 2).

Table 2 ESR and CRP values in the septic and aseptic groups

To control for the confounding effect of the time factor, we classified the patients into early-postoperative PJI if infection occurred within 4 weeks of the index arthroplasty (42 hips, 42 knees). This time frame is arbitrary but is clinically practical. It allows for making decisions since prosthesis salvage has a favorable prognosis in the early-postoperative period [39]. The rest of the patients with PJI were categorized as late chronic (57 hips, 119 knees) except for nine hips and four knees for which previous surgery dates could not be verified and therefore were excluded from this subgroup analysis (Fig. 1).
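The subgroup rule above can be written as a small helper. This is only a sketch under stated assumptions: the function name is hypothetical, the date taken to mark the infection is left to the caller because it is not specified here, and treating exactly 4 weeks as early-postoperative is an assumption.

```python
from datetime import date, timedelta
from typing import Optional

EARLY_WINDOW = timedelta(weeks=4)  # cutoff stated in the text


def classify_pji_timing(index_surgery: Optional[date],
                        infection_date: date) -> Optional[str]:
    """Assign a PJI case to a timing subgroup (hypothetical helper)."""
    if index_surgery is None:
        # Index arthroplasty date could not be verified:
        # the case is left out of the subgroup analysis.
        return None
    # Assumption: an interval of exactly 4 weeks counts as early.
    if infection_date - index_surgery <= EARLY_WINDOW:
        return "early-postoperative"
    return "late-chronic"
```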

We first compared hips and knees and then analyzed them separately and in combination. ESR and CRP were compared between septic and aseptic groups using a nonparametric Wilcoxon rank-sum test, with the significance level (alpha) set at 0.05. The Kolmogorov-Smirnov test was used to compare the distribution pattern of ESR and CRP values between aseptic hip and knee revisions. We calculated routine characteristics of the diagnostic tests, including true-positive rate (sensitivity), true-negative rate (specificity), false-negative rate (1 − sensitivity), false-positive rate (1 − specificity), positive predictive value (PPV) (the probability of PJI when the test result is positive), negative predictive value (NPV) (the probability of not being PJI when the test result is negative), and the positive and negative likelihood ratios (LR+ and LR−), by constructing 2 × 2 tables for ESR and CRP separately in hips and knees. LR+ (sensitivity divided by 1 − specificity) is the ratio of the proportion of patients with PJI who have a positive test result to the proportion of patients without PJI who have a positive test result; LR− (1 − sensitivity divided by specificity) is the corresponding ratio for negative test results. ROC curves were subsequently constructed by mapping true-positive rate (sensitivity) against false-positive rate (1 − specificity) for each test-joint combination. The ROC curve is a graphical statistical tool that illustrates the discriminative effectiveness for a diagnostic test [31]. It shows how the trade-off between sensitivity and specificity occurs when different cutoff thresholds are examined consecutively. Better performing tests demonstrate larger areas under the curve (AUCs) and are plotted farther from the diagonal line of indiscrimination. The optimal cutoff value is the threshold by which the test can best classify a maximum number of cases as true positive and true negative and was calculated by Youden’s index (J) [31]. 
This index is a function of both sensitivity and specificity (J = sensitivity + specificity − 1) and identifies the cutoff at which their sum is largest. Graphically, it corresponds to the point on the ROC curve with the greatest vertical distance from the diagonal line of nondiscrimination, which typically lies at the shoulder of the curve near the point of x = 0, y = 1 (ideal point with sensitivity and specificity of 100%). Subgroup analysis was also performed within each test-joint combination between early-postoperative and late-chronic cohorts using a Wilcoxon rank-sum test. Separate ROC curves were mapped only for those subgroups with significant differences. We performed the statistical analyses using SAS® statistical software (SAS Institute Inc, Cary, NC, USA).
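The calculations described in this paragraph can be illustrated with a short Python sketch. It is not the SAS code used for the study: the function names are hypothetical, a result is assumed to be positive when the value exceeds the cutoff, and taking candidate cutoffs at midpoints between consecutive distinct observed values is an assumption made for the example.

```python
from typing import Dict, Sequence, Tuple


def diagnostic_characteristics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Test characteristics from a 2 x 2 table (assumes no empty row or column)."""
    sensitivity = tp / (tp + fn)   # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # P(PJI | positive test)
        "npv": tn / (tn + fn),     # P(no PJI | negative test)
        # Likelihood ratios are unbounded when the denominator is zero.
        "lr_pos": sensitivity / (1 - specificity) if specificity < 1 else float("inf"),
        "lr_neg": (1 - sensitivity) / specificity if specificity > 0 else float("inf"),
    }


def youden_optimal_cutoff(values: Sequence[float],
                          infected: Sequence[bool]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Requires at least two distinct values and both infected and aseptic cases.
    """
    n_pos = sum(infected)
    n_neg = len(infected) - n_pos
    distinct = sorted(set(values))
    # Assumption: candidate cutoffs are midpoints between distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best_cutoff, best_j = candidates[0], -1.0
    for cutoff in candidates:
        tp = sum(1 for v, d in zip(values, infected) if d and v > cutoff)
        fp = sum(1 for v, d in zip(values, infected) if not d and v > cutoff)
        sensitivity = tp / n_pos
        specificity = (n_neg - fp) / n_neg
        j = sensitivity + specificity - 1
        if j > best_j:             # keep the first cutoff reaching the maximum
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

On a synthetic, perfectly separated sample such as values `[10, 20, 30, 40, 50, 60]` with the last three cases infected, `youden_optimal_cutoff` returns a cutoff of 35.0 with J = 1.0. Real ESR and CRP distributions overlap, so J falls below 1 and the chosen cutoff reflects a trade-off between sensitivity and specificity.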

Results

ESR values were not different (p = 0.31) between hip and knee PJI groups (median: 83 and 84 mm/hour, respectively). Likewise, ESR values were not different (p = 0.29) between aseptic hip and knee revisions (median: 19 and 20 mm/hour, respectively; mean: 24 and 26 mm/hour, respectively). Using the Kolmogorov-Smirnov test, there was no difference in the distribution of ESR values between aseptic groups (p = 0.49). CRP values were different (p = 0.02) between hip and knee PJI groups, with median values of 73 and 133 mg/L for hips and knees, respectively. Median CRP values for the aseptic hip and knee groups also differed, at 6 and 7 mg/L, respectively (p < 0.001).

Subgroup analysis comparing early-postoperative hip with early-postoperative knee PJI revealed no difference for ESR and CRP. However, ESR and CRP values were both higher in late-chronic knees than in late-chronic hips (p = 0.005 and p < 0.001, respectively). Therefore, separate ROC analysis was performed for late-chronic hip and knee PJIs while sole ROC analysis was done combining patients with early-postoperative hip and knee PJI.

We also compared early-postoperative and late-chronic PJIs in each joint. A difference was detected for CRP in hips and for ESR in knees (p < 0.001 and p = 0.012, respectively). Median values for CRP in early-postoperative and late-chronic hips were 143 mg/L (interquartile range: 75–255 mg/L) and 56 mg/L (interquartile range: 21–85 mg/L), respectively. Median values for ESR in early-postoperative and late-chronic knees were 78 mm/hour (interquartile range: 44–91 mm/hour) and 90 mm/hour (interquartile range: 61–104 mm/hour), respectively.

For ESR in late-chronic PJI, ROC analysis yielded optimal magnitude of 48.5 and 46.5 mm/hour for hip and knee, respectively, which are similar yet higher than the conventional threshold of 30 mm/hour. For CRP in late-chronic PJI, optimal thresholds were consistently higher than the commonly used threshold of 10 mg/L. ROC plots established cutoff points of 13.5 and 23.5 mg/L for hips and knees, respectively. For early-postoperative PJI, common thresholds for ESR and CRP were calculated as 54.5 mm/hour and 23.5 mg/L, respectively (Fig. 3).

Fig. 3A–B

ROC plots for (A) ESR and (B) CRP in early-postoperative (hips and knees combined) and late-chronic PJI (hips and knees separated) show cutoff points of optimum sensitivity and specificity. The AUC in all conditions approximated 1, supporting the accuracy of these tests in PJI diagnosis.

Comparison of the test characteristics between conventional (Table 3) and optimal (Table 4) thresholds for ESR and CRP consistently demonstrated improved specificity, PPV, and LR+ at the expense of slight worsening of sensitivity and LR−, while NPV remained practically unchanged. In both joints, the combination battery of ESR and CRP with new optimal magnitudes revealed improved specificity, PPV, and LR+ at the expense of worsening of sensitivity and LR− and with NPV unchanged between 97% and 98%.

Table 3 Test characteristics with conventional measures
Table 4 ROC analysis and test characteristics with new optimal thresholds

Discussion

Until recently, a consistent and uniform definition for PJI did not exist [29]. The variation in PJI definition has resulted in uncertain conclusions from the existing evidence. The introduction of the MSIS criteria was an important step toward uniform research regarding PJI [30]. However, regardless of the diagnostic criteria, measurement of ESR and CRP has been deemed an important part of the workup of patients suspected of PJI [42]. These valuable markers can be performed rapidly, inexpensively, and with minimal inconvenience. However, they are nonspecific inflammatory markers and their measurement is affected by innumerable factors, including demographics (age and sex), underlying diseases, medications, severity and stage of inflammation, and other unknown factors [12, 25, 32, 36, 37, 41]. Therefore, it seems to be impossible to convert these quantitative tests into absolute binary systems (ie, infected versus noninfected). Nevertheless, strategies such as combination testing and ROC analysis can improve their performance as diagnostic armamentarium in PJI. We therefore determined (1) whether there is a difference in the threshold value of ESR and CRP between hips and knees, (2) whether the threshold value for ESR and CRP should be different for early-postoperative and late-chronic PJI, and (3) the optimal thresholds for ESR and CRP in diagnosis of PJI.

We recognize some limitations to our investigation. First, due to the retrospective nature of the study, adequate data for confidently distinguishing between acute hematogenous and late-chronic PJI were unavailable. Clinical distinction between these two conditions is not obvious and requires rigorous criteria to rule out hematogenous spread of a primary source of infection to a prosthetic joint. Moreover, perioperatively acquired infections can remain silent for several months up to 2 years [33]. This could have distorted our subgroup analysis compared with a grouping that separated patients into early-postoperative, late-chronic, and acute hematogenous PJI. Second, in our institution, histologic analysis of intraoperative specimens is not performed; therefore, of the six minor criteria, only five were available for this study and elevated ESR and CRP was eliminated as a criterion since we were studying those parameters, which left only four. Moreover, joint aspiration was performed on the basis of the treating surgeon’s discretion and was not routinely performed for every patient with PJI. If these data had been available, we would have excluded fewer than 61 patients (14 hips) from our infection database. However, our relatively large study size permits us to be confident regarding our results. Third, many factors affect the level of inflammatory markers. This may cause some uncertainty regarding how the constellation of unknown and known factors could have biased our results. Finally, other useful diagnostic modalities, such as leukocyte esterase, IL-6, sonication of explanted prosthesis, and PCR, remain potential diagnostic tools for PJI [4, 11, 22, 28, 30, 38]. Therefore, the MSIS definition can be subject to future modifications, as are the thresholds for any criteria within new combinational algorithms. ESR and CRP are not exempted from this process.

Based on our findings, ESR and CRP levels were higher in knee PJIs than in hip PJIs. This fact was reflected in the comparison between late-chronic PJIs. Nevertheless, it seems index surgery conceals this difference in the early-postoperative period. For hips, CRP levels were higher in early-postoperative PJI than in late-chronic PJI, while for knees this difference was not detected. Moreover, ESR levels were unexpectedly slightly higher in late-chronic knee PJIs. It is possible some unrecognized acute hematogenous PJI cases could have skewed ESR levels in our late-chronic knee PJI subgroup.

This study suggests optimal thresholds for CRP should be different for late-chronic PJI in hips and knees (13.5 and 23.5 mg/L, respectively). This finding is additive to the existing evidence and may reflect the normal physiologic difference of inflammatory reaction to arthroplasty that is more intense in knees than in hips [7, 21, 24, 40]. The exact mechanism remains unknown, though Larsson et al. [21] suggested TKA is more traumatic to bone and marrow tissue, which has a higher content of inflammatory cells. Although ESR values were higher in late-chronic knees than in hips, thresholds were calculated at approximately similar points (48.5 and 46.5 mm/hour for hips and knees, respectively). We suggest similar thresholds for the two joints in the early postoperative period (54.5 mm/hour for ESR and 23.5 mg/L for CRP, respectively). However, pathophysiologic differences in ESR and CRP should also be considered. Two studies [21, 26] have shown, after arthroplasty, ESR increases more slowly and less prominently than CRP. It also decreases more slowly, reacts less consistently, and may have more frequent atypical patterns than CRP. Studies evaluating postoperative trends of these markers agree CRP is a more reliable indicator for detecting early-postoperative PJI [1, 19, 23, 24, 26, 34].

Our suggested cutoff values for ESR and CRP were uniformly higher than the conventional thresholds. Regarding CRP, this is relatively concordant with most previous studies (Table 5). However, our proposed threshold for ESR is higher than the conventional magnitude and is not consistent with the same studies reporting thresholds lower [14, 35] or slightly higher [8, 10, 13] than the conventional threshold. This discrepancy could be due to several reasons. Although relatively similar criteria were used for PJI, the value attributed to components, such as major or minor criteria, was not similar. Moreover, the factor of time after index surgery (early-postoperative versus late-chronic) was not taken into account in those studies. Finally, technical details, such as type of anticoagulant, type of collection tube (simple versus vacuum), mixture technique of sample with anticoagulants, and measuring method (manual versus automated), can potentially affect ESR measurements. The International Council for Standardization in Haematology [18] has published recommendations as reference for ESR measurement. Nevertheless, several studies [2, 15–17] have reported discordance between the traditional Westergren method and modern automated analyzers and even among different automated analyzers. Technical details of ESR and CRP measurement were not provided in previous studies [8, 10, 13, 14, 35]. We have been using a semiautomated method for ESR measurement in our institution and wonder how technical issues could have contributed to this inconsistency.

Table 5 Previous studies reporting optimal thresholds for ESR and CRP based on ROC analysis

In conclusion, it seems a similar threshold for ESR (that is higher than the conventional threshold) should be applied for PJI diagnosis in hips and knees. The optimal threshold for CRP seems to be higher than conventional thresholds, but magnitudes for hips and knees should be different. Moreover, different thresholds should be implemented for CRP in the early-postoperative and late-chronic PJI settings, at least for hips, while ESR thresholds are probably similar in both conditions. Conventional thresholds for these inflammatory markers are useful for screening of individual patients. However, they need to be refined to improve accuracy of this test battery as a diagnostic criterion for PJI.