Introduction

Acute osteomyelitis (OM) is often associated with clinical symptoms, such as fever, tenderness, redness, hyperthermia (at the infection site) and swelling, as well as increased erythrocyte sedimentation rate (ESR), left differential shift, increased white blood cells (WBC), and elevated C-reactive protein (CRP) [1]. In patients with chronic musculoskeletal infections, however, clinicians are often confronted with an atypical clinical presentation. Clinical symptoms and abnormal laboratory findings can be absent, nonspecific, or otherwise difficult to recognize. Indeed, the manifestations of chronic OM are heterogeneous, depending on the age of the patient, the specific causative microorganisms, the anatomical area of involvement, the segment of affected bone, the route of contamination, systemic and local host factors and the presence of underlying comorbidities [2], as well as the presence of an orthopaedic implant. The diagnostic markers of infection based on blood findings are not always reliable, being either uninformative or of uncertain importance. In particular, ESR and WBC both have relatively low sensitivity and specificity as markers of prosthetic joint infection [3], and the sensitivities of the inflammatory markers IL-6, WBC, ESR and CRP are all <0.25 in periprosthetic shoulder infections, which is insufficient for reliable diagnosis [4].

Morphological changes near the site of a prosthetic implant are often nonspecific, and are thus not sufficiently reliable for differentiating active infection from postoperative reparative effects [5]. Imaging modalities, such as plain radiography, CT, MRI and ultrasonography, can nonetheless detect focal infectious and inflammatory processes when substantial anatomical changes are present [6]. Plain radiography is often the first procedure to be used in suspected joint infection, as it provides an overview of the anatomy and the pathological state of the bone of the region of interest. However, X-ray images do not show specific periosteal reactions, and fail to depict osteolysis in OM until about 50 % of the bone mineral content has been lost. Consequently, radiography can only detect advanced infections which have proceeded to osteolytic lucency characteristic of chronic infections [7].

CT is not sensitive for the detection of bone infection, but it can be useful for detecting early osseous erosion and for documenting the presence of a sequestrum, foreign body or gas formation [2]. MRI provides anatomical detail and accurate information on the extent of the infectious process and soft tissues involved using T2-weighted imaging for the assessment of bone oedema and T1-weighted contrast-enhanced sequences for the evaluation of pathological enhancement. However, the sensitivity and specificity of MRI and CT are frequently inadequate for the detection of infections per se, as nonspecific findings might be caused by pathologies other than chronic infection. Correct diagnosis by these methods can be made more difficult when bone has been altered by prior trauma or surgery, and is particularly challenging when an orthopaedic implant is present [5, 8].

The timely identification and precise localization of OM are essential for surgical debridement and choice of long-term antimicrobial therapy, both of which can have a significant impact on patient outcome [9]. Despite optimal treatment, the recurrence rate of chronic OM in adults is about 30 % at 12 months [10]. With progression of the infection, OM can be complicated by septic arthritis and soft tissue infection even leading to fistula formation to the skin [11].

In this context, more accurate imaging is needed to diagnose OM. [18F]FDG PET has become an established diagnostic tool in oncology [12] because of the Warburg effect in which tumour cells frequently exhibit elevated glucose consumption. Similarly, leucocytes, especially granulocytes and macrophages, show increased glucose metabolism upon activation in response to infection or chronic inflammation. [18F]FDG uptake in immune cells correlates with metabolic rate and the number of glucose transporters. Conversely, [18F]FDG has low physiological uptake in bone, bone marrow and inactive muscles, and thus provides relatively high target-to-background ratios [5]. Degenerative bone alterations usually show only mildly increased [18F]FDG uptake [13], whereas [18F]FDG sensitively detects a variety of infections of the bone and soft tissues [14–22], as well as prosthetic joint infections [23]. Even poorly perfused tissues are accessible to the small [18F]FDG molecule in the circulation [5]. A meta-analysis of suspected OM showed that PET/CT using [18F]FDG has promising sensitivity, specificity and accuracy in comparison to bone scintigraphy and leucocyte scintigraphy [6]. However, only studies with small numbers of patients have been published so far.

In the study reported here we investigated a large and well-characterized patient cohort with only latent and minor clinical symptoms, often nonspecific laboratory findings, and inconclusive radiological imaging findings. The aim of the study was to assess the accuracy of [18F]FDG PET and [18F]FDG PET/CT for the detection of chronic OM and implant-associated infections, and for the prediction of the long-term outcome in this patient cohort.

Materials and methods

Patient population

This study was approved by the institutional ethics committee. We retrospectively identified 215 patients with suspected chronic OM treated in a level one trauma centre. The inclusion criteria were persistent or increasing pain and the absence of clear clinical markers for acute local infection, such as pyrophoric wounds, joint swelling, erythema and/or hyperthermia. At the time of the PET scan, the patients had only nonspecific clinical and laboratory findings. The results of plain radiography and MRI in patients without metallic bone prostheses prior to the PET scan had been ambiguous, resulting in the decision to perform an additional PET scan. The patients underwent either [18F]FDG PET (84 patients) or [18F]FDG PET/CT (131 patients).

[18F]FDG PET and PET/CT scan

PET scans (segmental scans or, in nine patients, whole-body scans, of which one was a stand-alone PET scan and eight were PET/CT scans) were acquired in three-dimensional mode on five different PET scanners with similar performance (listed in Table 1). In order to accommodate the various PET scanners used in the present study, we calculated a standardized uptake value conversion factor based on the results of phantom studies (NEMA NU2-2001), which allowed valid pooling of the scan results.

Table 1 Patient characteristics

Patients fasted for at least 6 h prior to scanning to ensure blood glucose levels below 150 mg/dl. Blood glucose levels were measured just prior to the scan (mean 94 ± 17 mg/dl). A diuretic was administered intravenously (furosemide, Furorese 20 mg; Hexal AG, Holzkirchen, Germany) and in patients undergoing a whole-body scan, an antispasmodic medication (butylscopolamine bromide, Buscopan 20 mg/1 mL; Boehringer Ingelheim, Ingelheim am Rhein, Germany) was also administered, followed by intravenous bolus administration of [18F]FDG at a weight-adapted dose (mean 252 ± 76 MBq). The emission sequence was initiated 60 min after tracer administration. Attenuation correction was based on CT or external 68Ge source transmission. Just prior to the PET scan, patients were asked to void their bladder, and then recline on the scanning bed. CT was performed either as a full-dose CT scan (automated dose modulation, mean 220 mAs; 120 kV; kernel B19f very smooth and kernel B30f medium soft) with (106 patients) or without (24 patients) administration of contrast agent or as a low-dose CT scan for attenuation correction and localization of the PET findings (35 mAs; 120 kV; 1 patient). If contrast agent was administered prior to the CT scan, a weight-adapted mean of 120 mL of iodine-containing contrast agent (iomeprol, Imeron 350 mg I/mL; Bracco Imaging Deutschland GmbH, Konstanz, Germany) was intravenously infused at a rate of 2.5 mL/s, with initiation of the CT scan 50 s after injection to depict the venous contrast-medium phase.

In the present retrospective study, images were interpreted by consensus of two nuclear medicine physicians experienced in PET/CT reading, who were blinded to the pathological, bacteriological and surgical data and to the patients’ clinical follow-up. A dedicated software package was used (Hermes Hybrid Viewer, version 2.0; Hermes Medical Solutions, Stockholm, Sweden). In an initial visual evaluation, [18F]FDG PET or PET/CT scans were rated as positive when focally increased uptake was seen within the bone, with higher intensity than the uptake in adipose tissue or inactive muscles. Scans were rated as negative when no focal uptake was seen.

In addition, [18F]FDG uptake was quantified to evaluate possible cut-off values for positivity. The maximum standardized uptake values corrected for body weight (SUVmax) of the lesions were measured. SUVmax ratios, defined as the ratio of the lesion SUVmax to the SUVmax measured in the contralateral noninfected site as the internal reference region, were also calculated. Receiver operating characteristic (ROC) curves were calculated to assess discriminatory power. For SUVmax and the SUVmax ratio, sensitivities and specificities were calculated at candidate cut-off values to identify the cut-off offering the best trade-off between sensitivity and specificity for discriminating between infection and noninfection. For technical reasons SUVmax values could not be calculated in 14 patients. For the SUVmax ratio, five further patients had to be excluded because of amputation of the contralateral extremity, suspected infection of unpaired organs, or simultaneous infections of the ipsilateral and contralateral extremities.
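As a rough illustration of the two quantities defined above, the following Python sketch computes a body-weight-corrected SUV and the SUVmax ratio against a contralateral reference region. The function names, the assumption of decay-corrected injected activity and a tissue density of 1 g/mL, and all activity concentrations and the body weight are hypothetical and are not taken from the study; only the injected activity of 252 MBq mirrors the reported mean dose.

```python
def suv_body_weight(activity_conc_bq_per_ml, injected_activity_bq, body_weight_kg):
    """Body-weight-corrected SUV.

    SUV = tissue activity concentration / (injected activity / body weight).
    Assumes the injected activity is already decay-corrected to scan time
    and that tissue density is 1 g/mL (both are simplifying assumptions).
    """
    if injected_activity_bq <= 0 or body_weight_kg <= 0:
        raise ValueError("injected activity and body weight must be positive")
    body_weight_g = body_weight_kg * 1000.0
    return activity_conc_bq_per_ml / (injected_activity_bq / body_weight_g)


def suvmax_ratio(lesion_suvmax, contralateral_suvmax):
    """Ratio of lesion SUVmax to SUVmax of the contralateral reference site."""
    if contralateral_suvmax <= 0:
        raise ValueError("reference SUVmax must be positive")
    return lesion_suvmax / contralateral_suvmax


# Hypothetical example: 80 kg patient, 252 MBq injected,
# hottest lesion voxel 20 kBq/mL, hottest contralateral voxel 5 kBq/mL.
lesion = suv_body_weight(20_000.0, 252e6, 80.0)
reference = suv_body_weight(5_000.0, 252e6, 80.0)
ratio = suvmax_ratio(lesion, reference)
print(f"lesion SUVmax {lesion:.2f}, reference {reference:.2f}, ratio {ratio:.2f}")
```

Because injected activity and body weight cancel in the ratio, the SUVmax ratio depends only on the two measured activity concentrations, which is one reason an internal reference region is attractive when scans from several scanners are pooled.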

Validation of PET results

PET findings were confirmed by means of intraoperative tissue cultures or long-term follow-up of more than 1 year. The clinical follow-up data were documented during routine visits to the Trauma Center, and included clinical assessment and evaluation based on the SF-36 questionnaires.

OM was diagnosed on the basis of:

  • Positive intraoperative bone microbiology

or

  • A clinically and radiologically eventful follow-up

OM was excluded on the basis of:

  • Negative intraoperative bone microbiology

or

  • A clinically and radiologically uneventful follow-up with no clinical symptoms, normal infection parameters and no further therapy for at least 1 year

Statistical analysis

Statistical analyses were performed using the SPSS software package (SPSS Statistics 23.0; IBM Corp., Armonk, NY). A p value of less than 0.05 was considered statistically significant. Variables were compared between clinically infected groups and noninfected groups using Student’s t test, the chi-squared test and the Mann-Whitney U test. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated by comparing the PET results with the intraoperative findings and the clinical follow-up findings.

Results

Patient cohort

A general overview of clinical and patient data is given in Table 1. The mean age of the patients was 47 ± 15 years. Of the 215 patients, 47 (22 %) were women and 168 (78 %) were men. In 16 patients (7 %) the suspected OM was located in the upper extremities, in 188 (87 %) in the lower extremities, in 9 (4 %) in the pelvis (including the hip joint), and in 2 (1 %) in the spine. The mean clinical follow-up was 69 ± 49 months. The clinical diagnosis was made on follow-up in 72 patients (33 %) who did not undergo surgery. Evidence of clinical infection was found during follow-up in 12 of these patients, and no evidence in 60.

The causative events prior to the PET scan were mainly bone injuries (173 patients; 80 %), soft-tissue trauma (13 patients; 6 %), or, less commonly, minor (bagatelle) trauma (6 patients; 3 %). Other less common causes are shown in Table 1. In five patients, there was no known accident leading to the complaints. The event prior to the PET scan dated back 12 ± 13 years in the clinically infected group, and 10 ± 12 years in the clinically uninfected group (p = 0.058). In 66 patients (31 %), the [18F]FDG PET scan was performed in the presence of metallic implants. In nine patients a whole-body scan was performed (one PET scan and eight PET/CT scans). In these nine patients, the whole-body scan did not reveal any additional or distal infection sites.

In 101 of the 215 patients (47 %), infection was confirmed by positive intraoperative microbial culture (89 patients) or by diagnosis on follow-up (12 patients). The pathogens consisted of gram-positive bacteria (89 %), gram-negative bacteria (5 %) and mixed infections (6 %). In the remaining 114 patients (53 %), neither intraoperative microbiological bone culture (54 patients) nor clinical follow-up (60 patients) was indicative of infection.

Results of [18F]FDG PET/CT

In the PET/CT imaging group, the results were true-positive in 53, true-negative in 54, false-positive in 17, and false-negative in 7, indicating sensitivity, specificity, PPV, NPV and accuracy of 88 %, 76 %, 76 %, 89 % and 82 %, respectively. The [18F]FDG PET/CT imaging findings in five example patients are shown in Figs. 1, 2, 3, 4, and 5. Sensitivity and accuracy were higher in the PET/CT group with contrast agent than in the group without contrast agent.
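The reported percentages follow directly from the four counts. The short Python sketch below recomputes them; the function name `diagnostic_metrics` is purely illustrative, and only the counts (53, 54, 17 and 7) come from the text.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Return the standard 2x2 diagnostic metrics as fractions."""
    total = tp + tn + fp + fn
    return {
        "sensitivity": tp / (tp + fn),  # infected patients correctly detected
        "specificity": tn / (tn + fp),  # noninfected patients correctly excluded
        "ppv": tp / (tp + fp),          # positive scans that were truly infected
        "npv": tn / (tn + fn),          # negative scans that were truly noninfected
        "accuracy": (tp + tn) / total,
    }


# Counts reported for the PET/CT group (131 patients).
petct = diagnostic_metrics(tp=53, tn=54, fp=17, fn=7)
petct_percent = {name: round(100 * value) for name, value in petct.items()}
print(petct_percent)
```

Rounded to whole percentages, the result reproduces the values stated above (88 %, 76 %, 76 %, 89 % and 82 %).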

Fig. 1

A 72-year-old patient with persistent bone pain. The [18F]FDG PET/CT images show the presence of osteitis in the distal femur. The cortical bone is interrupted with formation of a fistula to the skin. The muscles of the anterior compartment of thigh show signs of atrophy, probably due to immobilization. The PET/CT scan was rated true-positive based on a positive intraoperative microbial culture result (gram-positive bacteria)

Fig. 2

A 73-year-old male patient with pathological [18F]FDG PET findings along the left hip prosthesis in contrast to the nonsuspicious prosthesis of the right hip. However, in this patient the intraoperative microbial culture was negative, and the PET scan was therefore rated false-positive. The FDG uptake was probably due to sterile inflammation occurring with implant loosening

Fig. 3

A 39-year-old male patient with nonspecific laboratory findings. The patient complained of persistent pain, suggestive of osteomyelitis. The [18F]FDG PET/CT images show focally intense [18F]FDG uptake along the duct of the pin, which was subsequently removed from the marrow cavity. The zone of elevated [18F]FDG uptake reaches the skin through a fistula. The [18F]FDG PET/CT scan was rated true-positive for infection, and this was confirmed by culture showing gram-positive bacteria at the time of reoperation

Fig. 4

A 20-year-old patient with nonspecific laboratory findings and persistent pain 9 months after fracture of the tibia. The [18F]FDG PET/CT images show marginally elevated [18F]FDG uptake along the fracture line. This [18F]FDG uptake is reactive, and not due to an infection. The CT images reveal the fracture line with partially rounded edges and sclerosis of the bone fragments. The bone still had not consolidated at the time of this report, and the patient suffered from a hypertrophic pseudarthrosis. No signs of infection were detected, so the [18F]FDG PET/CT scan was rated true-negative

Fig. 5

A 42-year-old patient with persistent pain of the femur. The [18F]FDG PET/CT images show osteitis. The cortical bone is interrupted, and the muscles of the right distal femur are infected. The [18F]FDG PET/CT scan was rated true-positive for infection, and this was confirmed by culture showing gram-positive bacteria at the time of subsequent surgery

Results of [18F]FDG PET and pooled data

Taken together, [18F]FDG PET and [18F]FDG PET/CT yielded positive findings in 125 of the 215 patients (58 %). Of the 125 lesions, 18 were located in the soft tissue only. The PET and PET/CT scan results were true-positive in 87 patients, true-negative in 76, false-positive in 38, and false-negative in 14, indicating a sensitivity of 86 %, a specificity of 67 %, a PPV of 70 %, a NPV of 84 % and an accuracy of 76 %. The stand-alone PET scans (84 patients) showed a sensitivity of 83 %, a specificity of 51 %, a PPV of 62 %, a NPV of 76 % and a diagnostic accuracy of 67 %. In summary, the specificity, PPV, NPV and accuracy were higher with PET/CT than with stand-alone PET. The diagnostic yields of [18F]FDG PET and [18F]FDG PET/CT in the whole patient group are presented in Table 2 and Fig. 6.
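The counts for the stand-alone PET group are not listed explicitly, but they can be recovered by subtracting the PET/CT counts reported in the previous subsection from the pooled counts. The following Python sketch performs this subtraction and recomputes the percentages; the variable and function names are illustrative only, and the derived stand-alone counts are an inference from the reported totals rather than figures stated in the text.

```python
# Counts reported in the text.
pooled = {"tp": 87, "tn": 76, "fp": 38, "fn": 14}   # PET and PET/CT, 215 patients
pet_ct = {"tp": 53, "tn": 54, "fp": 17, "fn": 7}    # PET/CT only, 131 patients

# Derived (not reported) counts for stand-alone PET, 84 patients.
pet_only = {cell: pooled[cell] - pet_ct[cell] for cell in pooled}


def percentages(counts):
    """Whole-number percentages for the five reported diagnostic metrics."""
    tp, tn, fp, fn = counts["tp"], counts["tn"], counts["fp"], counts["fn"]
    fractions = {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }
    return {name: round(100 * value) for name, value in fractions.items()}


print(pet_only, sum(pet_only.values()))
print("pooled  ", percentages(pooled))
print("PET only", percentages(pet_only))
```

The derived counts sum to 84 patients, and the recomputed percentages agree with those given for the pooled data and for stand-alone PET, which serves as a consistency check on the reported figures.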

Table 2 Diagnostic yields of [18F]FDG PET and [18F]FDG PET/CT together and both modalities separately
Fig. 6

Diagnostic yields of [18F]FDG PET and [18F]FDG PET/CT. The specificity, PPV, NPV and accuracy were higher with PET/CT than with stand-alone PET

Patients with infection of the lower limb

Suspected infections were located in the lower limb in the majority of patients (165; 77 %), so a subgroup analysis was performed. The suspected infection was located in the foot in 20 patients, in the lower leg in 108 and in the upper leg in 37. [18F]FDG PET and [18F]FDG PET/CT yielded positive findings in 103 of these 165 patients (62 %). Sensitivity was comparable in the three groups: 80 % in those with infection in the foot, 90 % in those with infection in the lower leg and 86 % in those with infection in the upper leg. Specificity was likewise similar (60 %, 64 % and 64 %, respectively), and diagnostic yields did not differ significantly among the groups (p = 0.83). There were also no significant differences among these groups in relation to evidence or no evidence of clinical infection (p = 0.39), the method of clinical proof of infection (microbiology or follow-up; p = 0.34), gender (p = 0.77), age (p = 0.17) or the type of PET scan performed (p = 0.16). The diagnostic yields of [18F]FDG PET and [18F]FDG PET/CT in patients with suspected infection of the lower extremity are presented in Table 3.

Table 3 Diagnostic yields of [18F]FDG PET and [18F]FDG PET/CT in patients with suspected infection of the lower extremity

Patients with and without an orthopaedic implant

In 66 patients (31 %) a PET or PET/CT scan was performed in the presence of an implant. The implant was located in the foot (in 7 patients), the lower leg (in 26), the upper leg (in 9), the pelvis (in 2), the spine (in 1), the upper arm (in 2), the knee (in 12), the hip joint (in 6) and in the elbow (in 1). Infection was clinically diagnosed in 37 patients (56 %) of those with an implant and in 88 patients (59 %) of those without an implant (p = 0.19).

Considering the PET/CT scans only (131 patients), 15 were true-positive, 17 were true-negative, 2 were false-positive and 3 were false-negative, indicating a sensitivity of 83 %, a specificity of 89 %, a PPV of 88 %, a NPV of 85 % and an accuracy of 86 % in patients with an implant. Of the PET/CT scans in patients without an implant, 38 were true-positive, 37 were true-negative, 15 were false-positive, and four were false-negative, indicating a sensitivity of 90 %, a specificity of 71 %, a PPV of 72 %, a NPV of 90 % and an accuracy of 80 %. Thus, in the PET/CT group without an implant, the sensitivity and NPV were higher, but the specificity, PPV and accuracy were lower than in those with an implant. Of note, PET/CT was rated false-positive in only two patients (5 %) with an implant but in 15 patients (16 %) without an implant.

PET and PET/CT together (215 patients) showed a sensitivity of 81 %, a specificity of 66 %, a PPV of 68 %, a NPV of 79 % and an accuracy of 73 % in patients with an implant, and a sensitivity of 89 %, a specificity of 67 %, a PPV of 70 %, a NPV of 87 % and an accuracy of 77 % in those without an implant. Differences were not significant between these groups (p = 0.77). The diagnostic yields of [18F]FDG PET and [18F]FDG PET/CT in patients with and without an implant are presented in Table 4.

Table 4 Diagnostic yields of [18F]FDG PET and [18F]FDG PET/CT in patients with and without an orthopaedic implant

Results of [18F]FDG PET(/CT) in relation to clinical proof of infection

Infection status was established by microbiology in 143 patients, of whom 62 % had positive microbiology, or on follow-up in 72 patients, of whom 17 % were diagnosed with infection. In patients whose infection status was established by microbiology, PET(/CT) was true-positive in 76, true-negative in 35, false-positive in 19 and false-negative in 13, indicating a sensitivity of 85 %, a specificity of 65 %, a PPV of 80 %, a NPV of 73 % and an accuracy of 78 %. The pathogens that caused most of the false-negative results were Staphylococcus epidermidis and S. aureus. The pathogens did not differ from those found in the patients with true-positive findings. In patients whose infection status was established on follow-up, PET(/CT) was true-positive in 11, true-negative in 41, false-positive in 19 and false-negative in 1, indicating a sensitivity of 92 %, a specificity of 68 %, a PPV of 37 %, a NPV of 98 % and an accuracy of 72 %. Diagnostic yields differed significantly between these two groups (p < 0.001).
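The large gap in PPV between the two groups (80 % versus 37 %) is consistent with their very different proportions of infected patients (62 % versus 17 %), since predictive values depend on prevalence as well as on sensitivity and specificity. The Python sketch below illustrates this with Bayes' rule; the function name is hypothetical, the inputs are the counts reported above, and the final "what if" calculation is an illustration only, not an analysis performed in the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Microbiology group: TP 76, TN 35, FP 19, FN 13 (89 of 143 infected).
micro_ppv, micro_npv = predictive_values(76 / 89, 35 / 54, 89 / 143)

# Follow-up group: TP 11, TN 41, FP 19, FN 1 (12 of 72 infected).
follow_ppv, follow_npv = predictive_values(11 / 12, 41 / 60, 12 / 72)

# Illustration only: follow-up sensitivity and specificity applied at the
# prevalence of the microbiology group.
what_if_ppv, _ = predictive_values(11 / 12, 41 / 60, 89 / 143)

print(f"microbiology PPV {micro_ppv:.2f}, NPV {micro_npv:.2f}")
print(f"follow-up    PPV {follow_ppv:.2f}, NPV {follow_npv:.2f}")
print(f"follow-up test characteristics at microbiology prevalence: PPV {what_if_ppv:.2f}")
```

With sensitivity and specificity held at the follow-up values, raising the prevalence to that of the microbiology group lifts the PPV to a level similar to that of the microbiology group, suggesting that the low PPV on follow-up mainly reflects the small share of infected patients in that group.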

Quantitative analyses

The mean SUVmax values of the suspected sites were 6.6 ± 4.6 in the clinically infected group and 3.7 ± 2.6 in the clinically noninfected group (p < 0.05). The mean SUVmax ratios were 5.2 ± 4.5 in the clinically infected group and 2.8 ± 2.1 in the noninfected group (p < 0.05). Figure 7a shows the ROC curves for SUVmax and SUVmax ratio. The ROC analysis identified an optimal SUVmax threshold of 3.9 (AUC 0.717) for discriminating between clinically infected and noninfected patients, which yielded a sensitivity of 69 % and a specificity of 66 %. The optimal SUVmax ratio threshold was 3.0, which yielded a sensitivity of 60 % and a specificity of 65 % (AUC 0.702). In the patients with a PET/CT scan only, ROC analysis (AUC 0.706 and 0.733; Fig. 7b) identified an optimal SUVmax threshold of 3.7 for discriminating between clinically infected and noninfected patients, which yielded a sensitivity of 70 % and a specificity of 66 %. The optimal SUVmax ratio threshold was 2.7, which yielded a sensitivity of 60 % and a specificity of 66 %. In the patients with PET only, ROC analysis (AUC 0.733 and 0.688; Fig. 7c) identified an optimal SUVmax threshold of 4.0 for discriminating between clinically infected and noninfected patients, which yielded a sensitivity of 65 % and a specificity of 65 %. The optimal SUVmax ratio threshold was 3.1, which yielded a sensitivity of 65 % and a specificity of 65 %.
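The text does not state which criterion defined the "optimal" threshold. One common choice is Youden's index (sensitivity + specificity − 1), and the Python sketch below shows how such a threshold and the AUC could be obtained. The use of Youden's index is an assumption, the SUVmax values are invented for illustration, and the function names are hypothetical; none of this reproduces the study data or its SPSS analysis.

```python
def roc_points(values, labels):
    """(threshold, sensitivity, specificity) for every observed value used as cut-off.

    A scan counts as positive when its value is >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= threshold)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < threshold)
        points.append((threshold, tp / positives, tn / negatives))
    return points


def youden_optimal(points):
    """Cut-off maximizing Youden's J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1.0)


def auc(values, labels):
    """AUC as the probability that an infected value exceeds a noninfected one (ties count 0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented SUVmax values: first six noninfected (0), last six infected (1).
suvmax = [1.8, 2.2, 2.9, 3.1, 3.6, 5.2, 2.5, 4.0, 4.4, 5.8, 6.9, 8.3]
infected = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

best_threshold, best_sens, best_spec = youden_optimal(roc_points(suvmax, infected))
area = auc(suvmax, infected)
print(f"threshold {best_threshold}, sensitivity {best_sens:.2f}, "
      f"specificity {best_spec:.2f}, AUC {area:.2f}")
```

As in the study, a threshold chosen this way balances sensitivity against specificity, so overlapping SUVmax distributions in infected and noninfected patients necessarily cap both values.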

Fig. 7

ROC curves of SUVmax and SUVmax ratio for the detection of osteomyelitis by [18F]FDG PET and [18F]FDG PET/CT. a All patients. The AUCs are 0.717 and 0.702. For discriminating between clinically infected and noninfected patients, the optimal SUVmax threshold is 3.9 (sensitivity 69 %, specificity 66 %) and the optimal SUVmax ratio threshold is 3.0 (sensitivity 60 %, specificity 65 %). b Patients with PET/CT only. The AUCs are 0.706 and 0.733. For discriminating between clinically infected and noninfected patients, the optimal SUVmax threshold is 3.7 (sensitivity 70 %, specificity 66 %) and the optimal SUVmax ratio threshold is 2.7 (sensitivity 60 %, specificity 66 %). c Patients with PET only. The AUCs are 0.733 and 0.688. For discriminating between clinically infected and noninfected patients, the optimal SUVmax threshold is 4.0 (sensitivity 65 %, specificity 65 %) and the optimal SUVmax ratio threshold is 3.1 (sensitivity 65 %, specificity 65 %)

Evaluation based on previously published interpretation criteria

Several authors [24–28] have proposed quantitative, semiquantitative and qualitative visual assessment methods or the combination of these methods for the diagnosis of infection. The qualitative assessment of Familiari et al. [28] was based on asymmetry versus symmetry in comparisons with the other foot and with other parts of the same foot. We were able to use their criteria in 195 patients, and obtained positive findings in 124 of 195 patients (64 %). Sensitivity, specificity, PPV, NPV and diagnostic accuracy were 83 %, 54 %, 62 %, 77 % and 68 %, respectively. Another grading system was proposed by Stumpe et al. [24] for differentiation of infection and aseptic loosening in total hip replacement in which [18F]FDG PET uptake is graded on a five-point scale. Infection was only diagnosed if diffusely increased uptake was strong but distinctly less than the physiological uptake in the bladder (grade 3) or very strong and comparable to physiological urinary uptake in the bladder (grade 4). Because partial-body PET scans were performed in our patient group, the bladder was fully mapped in only 61 patients, so we were able to apply their interpretation criteria in only these 61 patients. According to their assessment, sensitivity, specificity, PPV, NPV and diagnostic accuracy were 84 %, 37 %, 58 %, 69 % and 61 %, respectively.

Reinartz et al. [25] based their evaluation on five different patterns of [18F]FDG PET uptake. Infection of hip arthroplasty was only assumed if [18F]FDG PET uptake was seen in the periprosthetic soft tissue. When their assessment was applied to all patients with an orthopaedic implant (66 patients), sensitivity, specificity, PPV, NPV and diagnostic accuracy were 87 %, 29 %, 52 %, 71 % and 56 %, respectively. Chacko et al. [26] and Love et al. [27] interpreted images as positive for periprosthetic infection of the hip or hip and knee, respectively, using two different diagnostic criteria. The first considered any increased [18F]FDG uptake at the prosthesis–bone interface as indicative of infection (criterion 1 of Chacko et al. and criterion 3 of Love et al.). The second considered any increased [18F]FDG uptake adjacent to the prosthesis as indicative of infection (criterion 2 of Chacko et al. and criterion 1 of Love et al.), whether at the bone–prosthesis interface, the tip of the prosthesis or in the soft tissues surrounding the prosthesis. According to Chacko et al. [26] and Love et al. [27], sensitivity, specificity, PPV, NPV and diagnostic accuracy were 77 %, 46 %, 56 %, 70 % and 61 %, respectively, for the first criterion, and 100 %, 20 %, 53 %, 100 % and 58 %, respectively, for the second criterion. For another criterion of Love et al. [27], a semiquantitative analysis was performed. Target-to-background ratios were calculated by taking the most intense activity at the bone interface and the uptake of the soft tissue of the corresponding contralateral extremity. The best cut-off value was 3.3, and the sensitivity, specificity, PPV, NPV and diagnostic accuracy were 61 %, 63 %, 57 %, 67 % and 62 %, respectively. The diagnostic yields of the different interpretation criteria are presented in Table 5.

Table 5 Diagnostic yield with different qualitative interpretation criteria

Discussion

OM and implant-associated infections may affect any patient who undergoes surgical treatment or has had a history of trauma. The incidence of OM and implant-associated infections is steadily increasing, leading to disability and increasing health costs [29]. Therefore, there is an increasing need for reliable diagnostic imaging methods.

To the best of our knowledge, this is the first study comparing [18F]FDG PET and [18F]FDG PET/CT findings with intraoperative bacterial culture results, and with an extended follow-up in a large cohort. Previous studies have investigated the diagnostic value of [18F]FDG PET(/CT) in the diagnosis of suspected bone infections only in patient cohorts with fewer than 50 patients [15, 17, 22, 30].

Our cohort of 215 patients presented with nonspecific symptoms, including intermittent bone pain and tenderness lasting from months to many years. Previous examinations, including evaluation of alterations in biochemical markers, plain radiography, CT and MRI, had not been definitive. The interval between the initial trauma and the [18F]FDG PET(/CT) examination was considerable, with a mean delay of more than 10 years. This very prolonged course indicates a high proportion of chronic disease or a low infection grade. These factors may have interfered with the final diagnosis, and justified the decision to resort to a PET scan. The diagnosis by [18F]FDG PET in these specific patients was likely to have been more challenging than in clinically more obvious cases.

In our highly selected and well-characterized patient cohort, the PET procedure showed high diagnostic accuracy, allowing a comprehensive diagnosis of OM, often after many years of uncertainty. In our patient group sensitivity, specificity, PPV, NPV and accuracy were 86 %, 67 %, 70 %, 84 %, and 76 %, respectively. In the PET/CT subgroup, specificity further increased to 76 % and accuracy to 82 %.

Other studies have likewise shown that OM imaging using [18F]FDG PET contributes significantly to obtaining a correct diagnosis and affords high diagnostic yield. In some previous studies, [18F]FDG PET was found to provide a sensitivity of up to 100 % and a specificity in the range of 87–93 % in the diagnosis of chronic musculoskeletal infections, including patients with and without metallic implants or prosthetic replacements [15, 22, 30, 31]. In a meta-analysis of suspected OM, PET/CT using [18F]FDG has shown a promising sensitivity of 92 % and a specificity of 92 %, in contrast to bone scintigraphy (83 %, 45 %) or leucocyte scintigraphy (74 %, 88 %) [6]. In our highly selected patient group with chronic, smouldering OM persisting for over 10 years and with nonspecific intermittent symptoms, [18F]FDG PET yielded an excellent sensitivity of 86 %.

Furthermore, the NPV of 84 % is remarkable, given the otherwise ambiguous clinical picture. Treatment of OM requires long and expensive medical therapies, sometimes extending to surgical resection for debridement of necrotic bone or to consolidate or substitute the compromised bones and joints. Therefore, excluding OM with high certainty by a noninvasive PET scan with negative findings should help avoid unnecessary therapy.

The specificity of 67 % in the overall group is relatively modest, although fusion of the CT and PET data provided greater specificity than PET alone. The present study also demonstrated the superiority of fused PET/CT images for accurate anatomical allocation of positive lesions: the fused images helped the surgeon to localize lesions and gave precise information on the extent of the suspected infection. In several patients, this information altered the intraoperative strategy with regard to the anatomical identification of the infectious hot spot and the selection of the surgical approach. Higher diagnostic yield with PET/CT imaging has also been found in a previous study investigating OM [32].

Importantly, we found that the presence of an orthopaedic implant did not reduce the diagnostic yield in patients undergoing PET/CT. PET/CT was falsely positive in 2 of 37 patients. One false-positive finding was related to bone fragments. These bone fragments might have led to sterile inflammation and consequently higher [18F]FDG uptake. However, we cannot confirm that the often-claimed metal-induced false-positives due to attenuation artefacts and/or sterile inflammation are clinically relevant. The lack of a standardized interpretation approach for defining an implant-associated infection remains a critical issue. The strengths of quantitative methods are high interobserver agreement and intraobserver reproducibility. Quantitative methods are more immediately performed and less time-consuming than semiquantitative methods. In our study quantitative and semiquantitative analysis of [18F]FDG uptake showed a significantly higher SUVmax and SUVmax ratio (infected to the contralateral noninfected site) in confirmed infections. However, no reliable SUV-based threshold for more accurate diagnosis in the individual patient emerged.

The mere presence of periprosthetic activity or activity close to postoperative changes cannot automatically be associated with infection. Reactive changes such as loosening, synovitis, postoperative changes and sterile inflammation can also lead to high uptake, which may be misinterpreted when only quantitative analysis is performed. These results are in line with those of other studies which have shown that, in addition to infection, periprosthetic activity may also occur in synovitis and aseptic loosening [26, 33, 34]. Taken together, our analysis and the proposed quantitative and semiquantitative analyses indicate that a single quantitative method underperforms qualitative methods. Using quantitative or semiquantitative methods as the only approach to diagnosing infection will result in a high number of false-positive findings. Consequently, qualitative assessment is needed.

Stumpe et al. [24] compared uptake of suspected infections with uptake in the bladder. In our patient cohort, activity in the bladder was extremely variable and probably depended on the diuretic administration, the time of scanning after injection and the time of emptying the bladder before the scan. Furthermore, in our patient cohort the number of patients with low-grade infection was high. Low-grade infections may be associated with low uptake, which may be definitely lower than in the bladder. The intensity of increased [18F]FDG uptake may be less important than its location. Familiari et al. [28] compared uptake in the infected side with that in the contralateral side. Our patient cohort consisted of posttraumatic patients with and without an orthopaedic implant. In this subgroup in particular, asymmetric [18F]FDG uptake between the extremities may be attributable to surplus stress, which cannot necessarily be defined as pathological. Furthermore, comparing uptake along an orthopaedic implant with uptake in the contralateral nonmanipulated extremity was found to be difficult. Asymmetry does not automatically imply infection in this special situation.

The qualitative assessments of Chacko et al. [26] and Love et al. [27], which take the mere presence of [18F]FDG uptake adjacent to the prosthesis as infection, whether the uptake is at the bone–prosthesis interface, at the tip of the prosthesis or in the soft tissues surrounding the prosthesis, seem to be too nonspecific. We can, however, recommend the assessment that considers [18F]FDG uptake at the prosthesis–bone interface as indicative of infection. Compared with these other assessments, our analysis provided better sensitivity, specificity, PPV, NPV and accuracy. However, our nuclear medicine readers are used to our interpretation criteria, and operating with different interpretation criteria may lead to poorer results due to lack of practice. Furthermore, most of the previous studies examined hip arthroplasty. It is questionable whether their assessments can be applied to orthopaedic implants in general.

Our study indicated that the best image interpretation was obtained from a combination of visual interpretation and assessment of the FDG uptake pattern, as well as judgment of the intensity of uptake in terms of SUVmax and SUVmax ratio, although absolute cut-off values alone were not particularly reliable. Hybrid whole-body PET/MRI has substantial potential in the evaluation of OM and should be investigated in future studies [11]. Structural MRI, as well as functional diffusion-weighted imaging and perfusion imaging, provide better anatomical and tissue contrast information than CT. Thus, simultaneous acquisition of PET and MRI could improve specificity in the detection of OM, but this remains to be formally demonstrated. For the surgeon, the implementation of PET/CT or PET/MRI scans in an intraoperative navigation system might increase sensitivity and specificity for OM detection by more accurately targeting the regions of increased [18F]FDG uptake in bone [35].

A major limitation of this cross-sectional retrospective study concerns the criteria for diagnosing infection. We compared PET findings with the results of swabs, tissue cultures and clinical follow-up. Indeed, the preferred diagnostic criterion for OM is either a positive culture from bone biopsy or histopathology consistent with necrosis. The microbial proof or exclusion of infection is without doubt the gold standard, as it cannot be influenced. In clinical routine, however, not all patients underwent surgery, and we decided not to exclude patients who had not undergone surgery.

Sensitivity and NPV were higher in patients who were followed up, while specificity and accuracy were comparable in those who were and were not followed up. On the other hand, PPV was markedly lower in those who were followed up (Table 3). The rate of false-positive findings was higher in patients who were followed up. It therefore appears that in patients with a positive PET scan other factors influenced the clinicians’ decision to operate or to follow up. It is noteworthy that both methods were comparably specific. However, their sensitivity was variable, so some patients might be clinically misjudged, despite an invasive procedure.

Conclusion

[18F]FDG PET has high sensitivity for identifying OM when conventional clinical and laboratory findings are inconclusive. A simultaneous PET/CT scan, particularly with intravenous contrast agent enhancement, provides higher accuracy than stand-alone PET. [18F]FDG PET may help diagnose infections noninvasively, thereby reducing the incidence of devastating complications. The remarkably high NPV (84 %) is important for clinical routine, as a reliable negative PET scan can exclude an infection with high probability so that unnecessary surgical or medical interventions can be avoided. However, these findings need to be confirmed in a prospective setting.