Introduction

Prostate cancer is the most frequently diagnosed cancer in men [1]. Bone is the second most common site of metastases in prostate cancer after lymph nodes [2, 3].

Osseous metastases from prostate cancer are typically osteoblastic and preferentially develop in the axial skeleton. However, a mixed osteoblastic/osteolytic pattern can also be seen in some patients [3]. Given the high incidence of osseous metastases in prostate cancer, accurate detection of these lesions can enhance early staging and is essential in decision-making for subsequent management.

For decades, detection of bone metastases has relied heavily on bone scintigraphy with 99mTechnetium-labeled phosphonate (99mTc-BS) despite its limited sensitivity and specificity [2]. 18F-Sodium fluoride (18F-NaF) is another bone-specific imaging radiopharmaceutical, which was initially approved for clinical use by the U.S. FDA in 1972 [4, 5]. Many studies support the clinical utility of 18F-NaF-PET/CT in assessing the extent of metastatic bone disease in oncologic patients [6,7,8,9,10,11,12,13,14,15,16,17,18,19]. In addition to its high diagnostic performance [20, 21], 18F-NaF-PET/CT has been shown to affect patient management and to provide prognostic information in multiple clinical scenarios [22,23,24]. However, there is still no clear estimate of the accuracy of 18F-NaF-PET/CT for the detection of bone metastases in prostate cancer, as most published studies consisted of small and heterogeneous groups of patients, sometimes with partially overlapping populations.

This meta-analysis aims to establish the summary diagnostic performance of 18F-NaF-PET/CT for the detection of bone metastases in staging and restaging of prostate cancer patients with high risk of bone metastases. The diagnostic performance of 18F-NaF-PET/CT is compared with other conventional and emerging imaging techniques in the same cohort of patients, where feasible.

Materials and methods

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement was followed [25].

Search strategy

A systematic search was performed in PubMed/Medline, Embase, and the abstract proceedings of major scientific meetings (SNMMI, EANM) to identify relevant published studies. The search strategy was based on the following combination of keywords: (A) “prostate” AND (B) “18F Fluoride PET” OR “18F Fluoride PET/CT” OR “18F NaF” OR “NaF” OR “sodium fluoride PET”. The search was last updated on September 28th, 2018, without any restrictions on language, publication date, or publication status.

Criteria for study consideration

Patients: Prostate cancer patients with prior clinical/laboratory/imaging suspicion of bone metastases (e.g., osteoarticular pain, elevated alkaline phosphatase or prostate-specific antigen, high Gleason score, known bone metastases or inconclusive prior imaging).

Index test: 18F-NaF-PET/CT as an adjunct to conventional imaging.

Reference standard: A combination of histopathologic result, where feasible, and clinical or imaging follow-up. In lesion-level analysis, since bone biopsy of all lesions was not routinely performed in patients with advanced disease, corresponding findings on follow-up imaging were usually considered as the reference standard.

Selection of studies, data extraction, and study outcome

All records identified through the electronic search were initially screened for eligibility on the basis of the title and abstract by one author. Review articles, editorials, case reports, and irrelevant citations were excluded in the initial assessment. The full texts of the potentially relevant publications were retrieved for further consideration. All potentially eligible articles were independently checked by two authors against predefined inclusion criteria.

To avoid double-counting of evidence, particular attention was paid to identifying abstracts/articles with potentially overlapping patient populations by comparing authors, institutions, study periods, and patient characteristics. When there was more than one published article from the same institution [12, 26], only the publication with the largest sample size was included [12].

Two authors independently extracted the following data from each included study: bibliographic details, patient demographics and disease characteristics, index tests, reference standard, and the number of patients or lesions with true-positive, false-positive, true-negative, and false-negative results. The study authors were contacted for additional information only when a subpopulation of a study fulfilled the eligibility criteria. All data extracted by the two review authors were compared at each step, and any discrepancies were resolved through consensus or by a third author.

Subgroup analysis was performed to assess the pooled comparative performance of 18F-NaF-PET/CT relative to other imaging techniques in the same cohort of patients, including planar 99mTc-BS, 99mTc-BS with SPECT, whole-body (WB)-MRI with diffusion-weighted imaging (DWI), 68Ga prostate-specific membrane antigen (PSMA)-PET/CT and 18F-FDG-PET/CT.

Assessment of methodological quality

A modified version of the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool was used to assess the methodological quality of the included studies and the likelihood of bias, as recommended by the Cochrane Collaboration [27].

Statistical analysis and data synthesis

The sensitivity, specificity and diagnostic odds ratio (DOR), along with the corresponding 95% confidence intervals (CIs), were recalculated for each primary study by cross-tabulating the index test results against the reference standard. Forest plots of sensitivity and specificity were used to display the variation in the results of the individual studies. Heterogeneity among the studies was assessed with a chi-square test (P < 0.05) and quantified using the I-squared index (I2). I2 ranges from 0 to 100%, and values around 25, 50, and 75% indicate low, moderate, and high heterogeneity, respectively [28]. In the presence of heterogeneity, a random-effects model was used for synthesizing data (DerSimonian–Laird) [29]. I2 is substantially biased when the number of studies is small and should be interpreted cautiously in our subgroup analysis [30].
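The heterogeneity and pooling steps described above can be sketched in a few lines of Python. This is only an illustration, not the Meta-DiSc or CMA implementation: the function names, the 2x2 counts and the choice to add a 0.5 continuity correction to every study are assumptions made for the example.

```python
import math

def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its variance.

    Assumption: 0.5 is added to every cell of every study for simplicity.
    """
    tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    return math.log((tp * tn) / (fp * fn)), 1 / tp + 1 / fp + 1 / fn + 1 / tn

def dersimonian_laird(effects, variances):
    """Return pooled effect, its SE, Cochran's Q, I2 (in %) and tau2."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Method-of-moments estimate of the between-study variance.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se, q, i2, tau2

# Hypothetical (TP, FP, FN, TN) counts for three imaginary studies.
studies = [(30, 2, 1, 20), (45, 5, 2, 40), (18, 1, 0, 15)]
effects, variances = zip(*(log_dor(*s) for s in studies))
pooled, se, q, i2, tau2 = dersimonian_laird(effects, variances)
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled DOR = {math.exp(pooled):.1f} (95% CI {low:.1f}-{high:.1f}), I2 = {i2:.1f}%")
```

Pooling is done on the log scale and back-transformed, which is why the confidence interval of a pooled DOR is asymmetric around the point estimate.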

Pooled estimates of sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and DOR were calculated. Diagnostic tests with a DOR greater than 25 and greater than 100 are considered moderately and highly accurate, respectively [31].
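As a reminder of how these measures relate to one another, the following minimal sketch derives them from a single 2x2 table. The counts are hypothetical and the function names are illustrative; the labels simply apply the thresholds of 25 and 100 quoted above.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Accuracy measures from one 2x2 table (FP and FN must be non-zero)."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    plr = sens / (1 - spec)   # positive likelihood ratio
    nlr = (1 - sens) / spec   # negative likelihood ratio
    dor = plr / nlr           # equals (TP * TN) / (FP * FN)
    return {"sensitivity": sens, "specificity": spec,
            "PLR": plr, "NLR": nlr, "DOR": dor}

def dor_category(dor):
    """Label a DOR using the thresholds quoted in the text (25 and 100)."""
    if dor > 100:
        return "highly accurate"
    if dor > 25:
        return "moderately accurate"
    return "below the moderate-accuracy threshold"

# Hypothetical counts, not taken from any included study.
m = accuracy_measures(tp=45, fp=5, fn=5, tn=45)
print(m, dor_category(m["DOR"]))
```

Note that a pooled PLR or NLR from a random-effects model need not equal the value computed from the pooled sensitivity and specificity, because each measure is pooled separately.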

A summary receiver operating characteristic (SROC) curve was generated. Each data point indicates a particular study, and the size of each point is proportional to the sample size. The overall diagnostic test performance was summarized by calculating the area under the SROC curve (AUC) and the Q* index. An AUC value of 1.0 (100%) indicates perfect discriminatory ability for a diagnostic test. The statistical significance of the difference between AUC values was determined with the method described by Hanley [32]. A two-tailed P value of < 0.05 was considered statistically significant.
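The summary indices can be illustrated with a short sketch. It assumes a symmetric SROC curve, that is, a constant DOR along the curve, for which closed-form expressions for Q* and the AUC exist; the fitted curve in Meta-DiSc need not be symmetric. The AUC comparison is written as a z-test with an optional correlation term r for paired designs; the function names and the example standard errors are hypothetical.

```python
import math

def q_star(dor):
    """Point on a symmetric SROC curve where sensitivity equals specificity."""
    return math.sqrt(dor) / (1 + math.sqrt(dor))

def symmetric_sroc_auc(dor):
    """Area under a symmetric SROC curve with constant DOR."""
    if dor == 1:
        return 0.5
    return dor / (dor - 1) ** 2 * ((dor - 1) - math.log(dor))

def compare_auc(auc1, se1, auc2, se2, r=0.0):
    """z statistic and two-tailed P value for the difference of two AUCs.

    r is the assumed correlation between the two AUC estimates
    (0 for independent samples).
    """
    z = (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal P value
    return z, p

print(q_star(123.2), symmetric_sroc_auc(123.2))
# Hypothetical AUCs and standard errors for two tests.
print(compare_auc(0.95, 0.03, 0.85, 0.04))
```

With a DOR of 123.2, these symmetric-curve formulas give a Q* of about 0.92 and an AUC of about 0.97; values produced by dedicated meta-analysis software may differ slightly because the fitted curve and weighting are not identical.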

For the assessment of publication bias, funnel plots of standard error (SE) and Egger’s regression intercept were examined. Analyses were performed using Meta-DiSc software (version 1.4; Hospital Universitario Ramon y Cajal, Madrid, Spain) and Comprehensive Meta-Analysis software (CMA version 2, Biostat, Englewood, NJ, USA).
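Egger's test regresses the standardized effect (effect divided by its SE) on precision (1/SE); an intercept far from zero suggests funnel plot asymmetry. The sketch below is a plain least-squares version under the assumption that the effect is the log DOR of each study; the input values are hypothetical and the function name is illustrative, not taken from CMA.

```python
import math

def egger_intercept(effects, std_errors):
    """Intercept, its SE and t statistic from Egger's regression.

    Needs at least three studies with differing standard errors.
    """
    x = [1 / se for se in std_errors]                    # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_int = math.sqrt(rss / (n - 2) * (1 / n + mx ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else math.nan
    return intercept, se_int, t

# Hypothetical log DORs and standard errors for five imaginary studies.
log_dors = [5.2, 4.1, 4.8, 3.9, 6.0]
ses = [1.4, 0.8, 1.1, 0.6, 1.6]
intercept, se_int, t = egger_intercept(log_dors, ses)
# The P value comes from a t distribution with n - 2 degrees of freedom.
print(f"intercept = {intercept:.2f}, SE = {se_int:.2f}, t = {t:.2f}")
```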

Results

Search results

Using the comprehensive search strategy outlined in the Methods section, 453 records were identified, of which 417 were excluded by initial screening of titles and abstracts. After careful consideration, 14 studies met our criteria and were included in this meta-analysis [6, 7, 9,10,11,12,13,14, 17, 18, 33,34,35,36]. Details of the study selection are shown in Fig. 1.

Fig. 1 Flowchart of systematic literature review

Study characteristics and methodological quality assessment

Fourteen studies on prostate cancer patients who were referred for staging or restaging of high-risk disease were included, with publication years ranging from 2006 to 2018. Patients were enrolled prospectively in 13 studies and retrospectively in 1 study [33]. In each study, at least two readers visually interpreted the imaging findings as negative, positive or equivocal. In this meta-analysis, indeterminate/equivocal image findings were classified as positive, that is, suggestive of metastases, across all studies. While the reference standard was generally acceptable in all studies, the definition of the reference standard varied widely. The characteristics of the included studies are summarized in Table 1. Figure 2 depicts the risk of bias and applicability concerns across the included studies.
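The handling of equivocal readings determines how each study's 2x2 table is built. The following sketch, with a hypothetical function name and made-up readings, shows the rule used here (equivocal counted as positive) next to the alternative, so that its effect on the counts is visible.

```python
from collections import Counter

def tally_2x2(readings, equivocal_as_positive=True):
    """Count TP/FP/FN/TN from (index_reading, has_metastasis) pairs.

    index_reading is 'positive', 'negative' or 'equivocal';
    has_metastasis is the reference-standard result (True/False).
    """
    counts = Counter({"TP": 0, "FP": 0, "FN": 0, "TN": 0})
    for reading, has_metastasis in readings:
        test_positive = reading == "positive" or (
            reading == "equivocal" and equivocal_as_positive
        )
        if test_positive:
            counts["TP" if has_metastasis else "FP"] += 1
        else:
            counts["FN" if has_metastasis else "TN"] += 1
    return dict(counts)

# Hypothetical readings, not taken from any included study.
readings = [("positive", True), ("equivocal", True), ("equivocal", False),
            ("negative", False), ("negative", True)]
print(tally_2x2(readings))
print(tally_2x2(readings, equivocal_as_positive=False))
```

Counting equivocal findings as positive can only move readings from the negative to the positive column, so it cannot lower sensitivity and cannot raise specificity relative to the alternative rule.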

Table 1 Summary characteristics of the selected studies

Fig. 2 The risk of bias and applicability concerns: review of authors’ judgments about each domain, presented as percentages across included studies

Diagnostic accuracy of 18F-NaF-PET/CT in the detection of bone metastases

Patient-level data

Twelve studies including 507 patients provided per-patient information [6, 7, 9,10,11,12,13, 17, 18, 33,34,35]. The forest plots of sensitivity and specificity for 18F-NaF-PET/CT on a per-patient basis are illustrated in Fig. 3. The pooled sensitivity, specificity and DOR were 0.98 (95% CI 0.95–0.99), 0.90 (95% CI 0.86–0.93) and 123.2 (95% CI 53.7–282.6), respectively. The pooled PLR and NLR estimates were 6.64 (95% CI 4.23–10.43) and 0.07 (95% CI 0.04–0.13), respectively.

Fig. 3 Forest plots of per-patient basis sensitivity (a), specificity (b) and summary receiver operating characteristic curve (c) of 18F-NaF-PET/CT in the detection of bone metastases across the included studies

Heterogeneity among the studies was low for sensitivity (I2 = 4%) and low to moderate for specificity (I2 = 44.8%). The SROC curve analysis yielded an excellent trade-off between sensitivity and specificity, with an AUC of 0.97 (SE = 0.01) and a Q* index of 0.91 (Fig. 3c).

Lesion-level data

Seven studies provided lesion-based accuracy information for 1812 lesions identified on 18F-NaF-PET/CT [6, 11,12,13,14, 35, 36]. Figure 4 shows the paired forest plot of sensitivity and specificity for 18F-NaF-PET/CT on a per-lesion basis. The pooled per-lesion analysis revealed a sensitivity of 0.97 (95% CI 0.95–0.98), a specificity of 0.84 (95% CI 0.81–0.87) and a DOR of 206.78 (95% CI 35.19–1215.2). A likelihood ratio synthesis yielded an overall PLR of 7.35 (95% CI 2.86–18.91) and NLR of 0.05 (95% CI 0.02–0.14). The AUC was 0.97 (SE = 0.025) and the Q* index was 0.93, indicating excellent diagnostic accuracy. Heterogeneity between the studies in the lesion-level analysis was high (I2 > 75%) for both sensitivity (I2 = 89.7%) and specificity (I2 = 95.9%).

Fig. 4 Forest plots of per-lesion basis sensitivity (a), specificity (b) and summary receiver operating characteristic curve (c) of 18F-NaF-PET/CT in the detection of bone metastases across the included studies

Comparative effectiveness of 18F-NaF-PET/CT

Details of the comparative performance of 18F-NaF-PET/CT versus 99mTc-BS, 99mTc-SPECT and WB-DWI-MRI are presented in Table 2.

Table 2 Comparative performance of 18F-NaF-PET/CT with 99mTc BS, 99mTc SPECT and WB-DWI MRI

18F-NaF-PET/CT versus 99mTc-bone scintigraphy

Six studies directly compared the performance of 18F-NaF-PET/CT and planar 99mTc-BS [6, 7, 9, 13, 14, 34]. On a per-patient basis, 18F-NaF-PET/CT showed higher sensitivity (0.99 versus 0.83) and specificity (0.86 versus 0.62) than 99mTc-BS. Overall, 18F-NaF-PET/CT outperformed 99mTc-BS in both the per-patient analysis (AUC 0.990 versus 0.842, P < 0.001, n = 148) and the per-lesion analysis (AUC 0.998 versus 0.771, P < 0.001, n = 744).

18F-NaF-PET/CT versus 99mTc-SPECT (± CT)

The direct comparison of 18F-NaF-PET/CT and 99mTc-SPECT was reported in four studies [6, 11, 13, 34], of which one study used combined 99mTc-SPECT/CT [34].

Compared to 99mTc-SPECT, 18F-NaF-PET/CT showed higher sensitivity, specificity, and superior diagnostic performance in both per-patient and per-lesion analysis (patient level, n = 117: AUC of 0.996 versus 0.896, P < 0.001; lesion level, n = 268 lesions: AUC of 0.998 versus 0.795, P < 0.001).

18F-NaF-PET/CT versus WB-MRI with DWI

Four studies directly compared the performance of 18F-NaF-PET/CT and WB-MRI [6, 10, 17, 18]. 18F-NaF-PET/CT appeared to have higher sensitivity (0.95 versus 0.83) and comparable specificity (0.90 versus 0.90), with no statistically significant difference in the diagnostic accuracy (AUC 0.974 versus 0.947, P = 0.18).

18F-NaF-PET/CT versus 68Ga-PSMA-PET/CT and 18F-FDG-PET/CT

Evidence regarding the comparative effectiveness of 18F-NaF-PET/CT versus 68Ga-PSMA-PET/CT and 18F-FDG-PET/CT in prostate cancer patients is sparse [7,8,9, 17, 18]. Studies reporting a direct comparison of 18F-NaF-PET/CT with 68Ga-PSMA-targeted-PET/CT (2 studies, n = 123 patients) and 18F-FDG-PET/CT (2 studies, n = 67 patients) are summarized in Table 3.

Table 3 Summary of the studies comparing the performance of 18F-NaF-PET/CT, 68Ga-PSMA-PET/CT and 18F-FDG-PET/CT

Analysis of the available literature shows no significant difference in the performance of 18F-NaF-PET/CT and 68Ga-PSMA-PET/CT in the detection of bone metastases, with a pooled sensitivity of 0.93 versus 0.93 and specificity of 0.92 versus 0.99, respectively. Compared to 18F-FDG-PET/CT, 18F-NaF-PET/CT had significantly higher sensitivity (1.00 versus 0.68) in the detection of bone metastases. Due to the limited number of studies, the AUC was not estimated.

Risk of publication bias

Figure 5 demonstrates the funnel plot of the included studies in patient-based analysis. The asymmetric funnel plot indicates possible publication bias (Egger’s regression intercept of DOR pooling, 3.04, 95% CI 0.76–5.32; two-tailed P = 0.01).

Fig. 5 Funnel plot of the included studies on the performance of 18F-NaF-PET/CT

Discussion

This study is the first meta-analysis assessing the diagnostic accuracy of 18F-NaF-PET/CT in staging and restaging of prostate cancer patients with a high pre-test probability of bone metastases, in comparison with other imaging techniques. Our results showed that 18F-NaF-PET/CT has excellent diagnostic performance in the detection of bone metastases, with a pooled sensitivity, specificity, and AUC of 0.98, 0.90 and 0.97, respectively.

The performance of 18F-NaF-PET/CT for bone imaging of oncologic patients has been previously reported in two meta-analyses [20, 21], the latest limited to studies published before August 2013 [20]. Shen et al. included a heterogeneous group of patients with breast, prostate, lung, thyroid, head and neck, hepatocellular and urinary bladder cancer, and showed a pooled sensitivity, specificity, and AUC of 0.92, 0.93 and 0.985, respectively, on a per-patient basis [20].

In concordance with prior studies [20, 21], our analysis supports 18F-NaF-PET/CT as an excellent alternative to conventional 99mTc-BS or SPECT imaging for bone imaging of high-risk prostate cancer patients. We found that the performance of 18F-NaF-PET/CT is superior to that of 99mTc-BS and 99mTc-SPECT in both per-patient and per-lesion analysis. 99mTc-phosphonates and 18F-NaF are bone-specific radiotracers that can show areas of altered osteogenic activity [5]. Compared with 99mTc-phosphonate agents, the higher bone uptake and faster blood clearance of 18F-NaF, combined with the superior spatial resolution of PET, allow a more accurate delineation of bone metastases [4, 5].

Whole-body DWI is a new technique in the staging of patients with solid tumors and can provide metrics of the molecular and vascular characteristics of tumors [37]. Although a number of studies have suggested the usefulness of WB-MRI including DWI in the evaluation of bone and visceral metastases in prostate cancer, the use of WB-DWI-MRI in staging of prostate cancer is still debated, as addressed by the ESUR guideline [6, 10, 17, 18, 38]. This is mainly due to technical challenges in acquisition and image quality and the absence of standardized interpretation criteria [18, 38]. In our analysis, we found no significant difference in the overall performance of 18F-NaF-PET/CT and WB-DWI-MRI, though 18F-NaF-PET/CT appears to have higher sensitivity.

18F-FDG is the most commonly used PET imaging agent in oncology. The sensitivity of 18F-FDG is limited in prostate cancer due to the low glycolytic rate of most skeletal metastases from prostate cancer [3]. To date, few studies have compared the performance of 18F-FDG-PET/CT versus 18F-NaF-PET/CT in patients with prostate cancer [7,8,9]. These studies suggested lower sensitivity but higher specificity for 18F-FDG-PET/CT in the detection of osseous metastases. A number of pilot studies have suggested that combined 18F-FDG/NaF-PET/CT imaging can improve the specificity of 18F-NaF for the evaluation of disease extent in patients with prostate cancer [39]. Yet, the implications of these findings need further investigation in larger cohorts of patients.

With the rapidly expanding clinical adoption of PSMA-targeted PET imaging, a number of recent studies have compared the utility and performance of PSMA-targeted PET/CT and 18F-NaF-PET/CT in the detection of bone metastases in prostate cancer [17, 18, 40, 41]. These studies showed excellent and comparable diagnostic performance for 68Ga-PSMA-targeted-PET/CT and 18F-NaF-PET/CT in the detection of bone metastases. Two recent studies suggested that 18F-NaF-PET/CT detects a higher number of pathologic bone lesions, particularly in patients with metastatic castration-sensitive disease [40, 41]. However, PSMA-targeted-PET/CT has several advantages over 18F-NaF imaging, including the ability to identify both bone and visceral/lymph node metastases and to direct PSMA-targeted radionuclide therapy [18].

Currently, the clinical use of 18F-NaF-PET/CT in the United States is restricted to larger medical centers, most commonly due to lack of availability and reimbursement challenges with the Centers for Medicare and Medicaid Services (CMS) [17]. Recent studies by the National Oncologic PET Registry (NOPR) showed that 18F-NaF-PET/CT changed the intended management in approximately 44–53% of prostate cancer patients [24, 42]. The effect was particularly high in patients suspected of having progressive bone metastases [24, 42]. Understanding the disease-specific performance of 18F-NaF-PET/CT and proper patient selection seem to be key to the appropriate utilization of 18F-NaF-PET/CT imaging and the associated cost reduction. Future prospective studies, along with analysis of cost and clinical availability, are needed to fully determine the cost-effectiveness of 18F-NaF-PET/CT compared to other emerging imaging modalities, including WB-DWI-MRI and PSMA-targeted PET/CT, in selected high-risk prostate cancer patients.

The main limitation of this study is the lack of a gold standard, as histopathology was not practically available in all studies. We considered histopathology and/or clinical/imaging follow-up as the reference standard, which might be a source of heterogeneity. Second, the results of the subgroup analysis should be interpreted with caution. Although the included studies had fairly similar methodology, the small number of studies in each subgroup limits our conclusions.

Conclusion

18F-NaF-PET/CT has excellent diagnostic performance in the detection of bone metastases in staging and restaging of high-risk prostate cancer patients. The performance of 18F-NaF-PET/CT is superior to 99mTc bone scintigraphy and SPECT, and comparable to WB-DWI-MRI.