Introduction

Prostate cancer is the third most common malignancy after lung and breast malignancies. In 2018, there were almost 1.3 million new cases of prostate cancer and 359,000 associated deaths around the world, and prostate cancer was ranked as the second most frequent cancer and the fifth leading cause of cancer-related deaths in men [1]. Bone metastases are very common in prostate cancer [2], and about 80% of patients with prostate cancer die from this condition. Early detection and accurate diagnosis of distant metastasis, most commonly in the axial bone [3], can improve the 5-year survival rate of patients.

Bone scintigraphy (BS) is the most widely used method for the early evaluation of bone metastasis in prostate cancer [4]. BS is highly sensitive and inexpensive, but its specificity is low: it is difficult to distinguish among bone tumors, trauma, degenerative changes, and infection. Single photon emission computed tomography (SPECT) detects bone metastases more accurately than BS because it captures three-dimensional (3D) positional information, and it serves as a supplementary tool for evaluating unclear lesions [5].

Equivocal uptake of the tracer in SPECT often requires additional examinations, of which magnetic resonance imaging (MRI) is the most commonly used. Over the past 30 years, MRI has been used to detect bone metastases and has shown promising results. Diffusion-weighted imaging, axial skeleton MRI, whole-body MRI, and even routine prostate MRI are useful for determining bone metastasis in patients with prostate cancer [6,7,8]. In the meta-analysis of Woo et al., MRI showed excellent sensitivity and specificity in detecting bone metastasis in patients with prostate cancer [9].

Positron emission tomography/computed tomography (PET/CT) has been widely used in the diagnosis, staging, restaging, and therapeutic evaluation of several malignancies in the last decade. PET provides functional information about tumor lesions at high resolution, and CT provides the detailed and precise anatomical location of tumor lesions and related morphological changes. PET/CT is a noninvasive metabolic imaging method that can be used to evaluate the efficacy of new therapies by dynamic acquisition and quantitative analysis [10]; however, its cost limits its clinical use.

18F-sodium fluoride (18F-NaF) has been used as a positron imaging agent for the last 40 years. Before the advent of PET/CT, its use was limited, and it was largely replaced by 99mTc-diphosphonate as a radiotracer for detecting bone metastases with SPECT. 18F-NaF-PET/CT has clear advantages in investigating bone metastases: it provides greater spatial resolution, better image quality, better sensitivity, and better specificity. Compared with BS, it can more accurately assess treatment response and detect occult bone metastases at lower prostate-specific antigen (PSA) levels [3].

Choline is the raw material for phospholipids, which are an important component of cell membranes. A large amount of choline is needed for cell membrane biosynthesis in tumor tissue. In addition, up-regulation of choline kinase activity in tumor cells leads to a further increase in choline demand. Both of these conditions increase the uptake of choline in tumor cells [11]. 18F and 11C are the most commonly used radionuclides for labeling choline. Although the half-lives of 11C-choline and 18F-choline differ (20 min for 11C-choline vs. 110 min for 18F-choline), the imaging method is the same [10]. Some studies have reported the capability of 11C-choline PET and 18F-choline PET to detect bone metastases in patients with prostate cancer [12,13,14].

Prostate-specific membrane antigen (PSMA), a type II transmembrane glycoprotein, is expressed in prostate epithelial cells. The expression of PSMA in prostate cancer cells is 100–1000 times that in normal cells, and PSMA is therefore a potential target for prostate cancer-specific imaging. In recent years, the use of 68Ga-PSMA-PET/CT has been widely reported and has become a focus of research [15, 16]. The aim of this study was to compare the diagnostic value of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS in the diagnosis of bone metastases of prostate cancer, and to provide better evidence-based advice for doctors in this field.

Materials and methods

Literature search and selection criteria

We searched PubMed and Embase for articles published between January 1990 and September 2018 using the following keywords: (prostate cancer OR prostate carcinoma) AND (bony metastases OR skeletal metastases OR osseous metastases OR bone metastases) AND (bone scan OR bone scintigraphy OR SPECT OR single-photon emission computed tomography OR MRI OR magnetic resonance imaging OR PET/CT OR positron emission tomography/computed tomography). The search was limited to English-language articles. The search and article selection were performed independently by two experienced nuclear medicine doctors, with no discrepancies between them.

The main topic of the articles included for analysis was the diagnostic value of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, or BS in detecting bone metastases from prostate cancer. Prostate cancer was confirmed by histopathology or biopsy. Bone metastases from prostate cancer were confirmed by histopathology or biopsy or imaging findings, and clinical follow-up for at least 6 months. The types of research articles were prospective studies, retrospective cohorts, or clinical comparative series. The results compiled included sensitivity and specificity, or the numbers of true positives, true negatives, false positives, and false negatives. Abstracts, case reports, reviews, notes, letters, comments, animal experiments, studies of other unrelated imaging modalities, articles with fewer than 10 cases, and original articles with incomplete data were excluded. When the same data were used in more than one article, two investigators evaluated them to make the most suitable choice for our analysis.

Quality assessment

Articles were assessed for quality with the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. The QUADAS-2 tool is divided into four sections covering risk of bias: patient selection, the index test, the reference standard, and flow and timing. The applicability of patient selection, the index test, and the reference standard is also evaluated. Each relevant question within a section is answered “yes,” “no,” or “unclear,” and the risk of bias is judged as “low,” “high,” or “uncertain.” If the answers to all the questions in a section were “yes,” the risk of bias was assessed as low; if any answer was “no,” the risk of bias was assessed as high; if the answers were a mix of “unclear” and “yes,” or were all “unclear,” the risk of bias was assessed as uncertain. Each study was scored independently by two evaluators without any discrepancies. The applicability of patient selection was assessed according to the match between the patients’ backgrounds and the purpose of the evaluation. The applicability of the index test was assessed according to the match between the implementation and interpretation of the test and the purpose of the evaluation. The applicability of the reference standard was also assessed.
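The decision rule described above for a single QUADAS-2 section can be written down compactly. The following is a minimal sketch in Python; the function name `risk_of_bias` and the plain-string answer encoding are illustrative assumptions, not part of the QUADAS-2 tool or of the authors' workflow.

```python
from typing import Iterable


def risk_of_bias(answers: Iterable[str]) -> str:
    """Map the answers for one QUADAS-2 section to a risk-of-bias level.

    Hypothetical helper: encodes only the rule stated in the text
    (all "yes" -> low; any "no" -> high; otherwise uncertain).
    """
    normalized = [a.strip().lower() for a in answers]
    if not normalized:
        raise ValueError("at least one answer is required")
    unknown = set(normalized) - {"yes", "no", "unclear"}
    if unknown:
        raise ValueError(f"unexpected answers: {sorted(unknown)}")
    if "no" in normalized:
        return "high"          # a single "no" is enough
    if all(a == "yes" for a in normalized):
        return "low"           # every question answered "yes"
    return "uncertain"         # mix of "yes"/"unclear", or all "unclear"
```

For example, `risk_of_bias(["yes", "unclear"])` returns `"uncertain"`, while adding a single `"no"` to any list of answers returns `"high"`.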

Data extraction and statistical analysis

After the search, the articles were imported into EndNote X7 (Thomson ResearchSoft, Stamford, Connecticut) and duplicates were removed with the software’s duplicate-checking function. After reading the titles and abstracts, two investigators independently assessed the studies according to the inclusion and exclusion criteria. For uncertain articles, a decision was made by reading the full text. While reading the full text, two evaluators independently extracted relevant information such as the author, publication date, type of study, consecutive enrollment, number of patients, number of lesions, age, PSA level, case characteristics, reference standard, and sensitivity and specificity or the numbers of true positives, true negatives, false positives, and false negatives. Furthermore, we excluded articles not suitable for this analysis, such as those involving 18F-PSMA, 99mTc-PSMA, or extraosseous metastasis. Using a bivariate random-effects model, the extracted data were analyzed with the midas command in Stata 15.1 (StataCorp, College Station, Texas) to calculate the summary sensitivity, specificity, and diagnostic odds ratio (DOR) with 95% confidence intervals (CIs), and to draw the summary receiver operating characteristic (SROC) curve and calculate the area under the curve (AUC) and Q* value with 95% CIs. We conducted Z tests to determine whether the sensitivity, specificity, AUC, and DOR differed significantly among imaging modalities. I2 statistics were used to assess the heterogeneity of the data. P < 0.05 was considered statistically significant.
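To make the extracted quantities concrete, the sketch below computes sensitivity, specificity, and DOR for a single study from its 2×2 counts. It illustrates the per-study definitions only; it is not the bivariate random-effects pooling performed by midas, and the class name `TwoByTwo` and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2x2 diagnostic table for one study (hypothetical container)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def sensitivity(self) -> float:
        # Proportion of patients with metastases that the test detects.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of patients without metastases that the test clears.
        return self.tn / (self.tn + self.fp)

    def dor(self) -> float:
        # Diagnostic odds ratio: (TP/FN) / (FP/TN).
        if self.fp == 0 or self.fn == 0:
            raise ZeroDivisionError("DOR is undefined when FP or FN is zero")
        return (self.tp * self.tn) / (self.fp * self.fn)


# Illustrative counts only; these are not data from any included study.
study = TwoByTwo(tp=45, fp=2, fn=5, tn=48)
print(study.sensitivity(), study.specificity(), study.dor())
```

With these made-up counts the sensitivity is 45/50, the specificity is 48/50, and the DOR is (45 × 48)/(2 × 5) = 216.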

Results

Literature search

According to the retrieval strategy, 3,861 articles (1,550 in PubMed and 2,311 in Embase) were identified. In total, 695 duplicate articles were excluded, and 2,014 articles comprising conference abstracts, reviews, letters, and editorials were also excluded. A further 1,067 irrelevant articles were excluded after reading the titles and abstracts, leaving 85 articles. Full-text reviews were performed, and some articles failed to meet the inclusion criteria. Finally, 24 articles were included in the study (see Fig. 1 for the search strategy).
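The selection counts above can be checked with simple arithmetic. The short Python sketch below uses only the numbers given in the text; the variable names are illustrative, and the number of full-text exclusions is derived here (85 − 24) rather than reported by the authors.

```python
# Records identified per database, as reported in the text.
identified = {"PubMed": 1550, "Embase": 2311}

# Exclusions before full-text review, as reported in the text.
excluded_before_full_text = {
    "duplicates": 695,
    "abstracts, reviews, letters, editorials": 2014,
    "irrelevant by title/abstract": 1067,
}

total_identified = sum(identified.values())
full_text_reviewed = total_identified - sum(excluded_before_full_text.values())

included = 24
# Derived value (not stated in the text): articles dropped at full-text review.
excluded_at_full_text = full_text_reviewed - included

print(total_identified, full_text_reviewed, excluded_at_full_text)
```

The totals reconcile: the two databases sum to 3,861 records, and the three exclusion steps leave the 85 articles that went to full-text review.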

Fig. 1 Summary of the study selection process

Characteristics of included studies

Twenty-four articles [4,5,6, 13, 14, 17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35] were published between 2006 and 2018, including 12 prospective studies [4,5,6, 18, 22, 27, 28, 30,31,32,33, 35], 9 retrospective studies [13, 14, 19,20,21, 23,24,25, 29], and 3 clinical controlled studies [17, 26, 34]. Twelve studies used consecutive or random samples, 5 used nonconsecutive or nonrandom samples, and the sampling in the remaining 7 was unclear. The studies were conducted in Germany, Britain, Italy, Austria, the United States, Japan, South Africa, Switzerland, Denmark, the Netherlands, Belgium, and Israel. The number of cases in each study ranged from 16 to 194. PSMA-PET/CT was performed in 6 studies, choline-PET/CT in 11, NaF-PET/CT in 7, MRI in 9, and BS in 14. When two or more examinations were involved, the interval between examinations was 1 day to 4 months, and the majority took place within 1 month. Three studies reported outcomes for primary staging, nine reported outcomes for biochemical recurrence or disease progression staging (secondary staging), three reported outcomes for PSA and/or Gleason score elevation, and eight included mixed groups. The characteristics of the included studies are described in Table 1.

Table 1 Study characteristics

Quality assessment

In the patient selection domain, some studies did not clearly report the sampling structure. Equivocal diagnosis did not exist or comprised less than 20% of the cases in some studies, so patient selection carries a risk of bias; however, the applicability of patient selection was acceptable in that the patients in the included studies were matched with the research subjects. All the studies clearly reported the methodology for the index test. In most studies, the readers were blinded to the patients’ clinical and other imaging findings. Only a few studies did not use blinding or did not mention it, so we consider the risk of bias to be low. The applicability of the index test in most studies was acceptable. There were four articles with poor applicability that did not conduct a meta-analysis, but instead calculated per-lesion sensitivity. In the absence of a histologic gold standard, a comprehensive diagnosis based on all available current or follow-up clinical data and imaging was taken as the reference standard in 22 articles. A CT or follow-up PET scan was taken as the reference standard in one study, and MRI was taken as the reference standard in another. Because the comprehensive diagnosis can accurately assess the state of the target disease, the applicability of the reference standard is acceptable. There was also a risk of bias in terms of flow and timing because the patients in most studies did not receive the same reference standard. Summary findings for the QUADAS-2 appraisal are illustrated in Fig. 2.

Fig. 2 a Appraisal of the quality of the included studies according to the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. b Summary of QUADAS-2 risk of bias and applicability concerns

The influence of examination method on patient management

As shown in Table 2 and Fig. 3, the per-patient pooled sensitivities of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS were 0.97 (95% CI: 0.89–0.99), 0.87 (95% CI: 0.80–0.92), 0.96 (95% CI: 0.87–0.99), 0.91 (95% CI: 0.69–0.98), and 0.86 (95% CI: 0.76–0.92), respectively. The pooled specificities of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS were 1.00 (95% CI: 0.00–1.00), 0.99 (95% CI: 0.96–1.00), 0.97 (95% CI: 0.90–0.99), 0.96 (95% CI: 0.92–0.98), and 0.95 (95% CI: 0.87–0.98), respectively. The pooled DOR values of choline-PET/CT, NaF-PET/CT, MRI, and BS were 504.16 (95% CI: 143.84–1767.07), 673.67 (95% CI: 95.11–4771.66), 242.63 (95% CI: 36.44–1615.51), and 114.44 (95% CI: 35.71–366.73), respectively. The pooled DOR of PSMA-PET/CT could not be calculated because the specificity was 1. The AUC values of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS were 1.00 (95% CI: 0.99–1.00), 0.99 (95% CI: 0.98–1.00), 0.99 (95% CI: 0.98–1.00), 0.98 (95% CI: 0.96–0.99), and 0.95 (95% CI: 0.93–0.97), respectively. Due to insufficient data, a full meta-analysis could not be performed on a per-lesion basis, but the per-lesion pooled sensitivities of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS could be calculated; these were 0.88, 0.80, 0.97, 0.81, and 0.68, respectively.
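The reason the DOR cannot be computed when the specificity equals 1 follows directly from its definition. The short derivation below uses the standard definition of the diagnostic odds ratio in terms of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN); it is a general identity, not a reproduction of the output of the bivariate model.

```latex
\[
\mathrm{DOR}
  = \frac{TP/FN}{FP/TN}
  = \frac{\mathrm{sens}/(1-\mathrm{sens})}{(1-\mathrm{spec})/\mathrm{spec}},
\qquad
\mathrm{sens} = \frac{TP}{TP+FN},
\quad
\mathrm{spec} = \frac{TN}{TN+FP}.
\]
\[
\mathrm{spec} = 1
\;\Longleftrightarrow\; FP = 0
\;\Longrightarrow\; \frac{1-\mathrm{spec}}{\mathrm{spec}} = 0,
\]
so the denominator of the DOR vanishes and the ratio is undefined (it diverges as $\mathrm{spec} \to 1$).
```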

Table 2 Diagnostic performance of 68Ga-PSMA PET/CT, choline PET/CT, 18F-NaF PET/CT, MRI, and BS on a per-patient and per-lesion basis

Fig. 3 The SROC curves for 68Ga-PSMA PET/CT, choline PET/CT, 18F-NaF PET/CT, MRI, and BS on a per-patient basis

Publication bias and heterogeneity exploration

The numbers of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, and MRI studies were too small to evaluate publication bias; for BS, there was no obvious publication bias (Deeks’ funnel plot, P = 0.1). As shown in Table 3, the per-patient pooled sensitivity I2 values of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS were 67.96, 0, 43.73, 82.08, and 70.58, respectively. The pooled specificity I2 values of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS were 0, 46.64, 47.65, 70.72, and 86.79, respectively. Sensitivity values for PSMA-PET/CT, MRI, and BS were highly heterogeneous, and specificity values for MRI and BS were highly heterogeneous. The remaining data were considered homogeneous. By eliminating articles one by one, we identified individual studies that may contribute to heterogeneity. Zacho et al.’s results [22] may cause the heterogeneity of the PSMA-PET/CT analysis, while the Zacho et al., Lecouvet et al., and Dyrberg et al. studies [6, 22, 35] may lead to the heterogeneity of the MRI analysis. The Fuccio et al. and Sapir et al. studies [4, 13] may lead to the heterogeneity of the BS analysis.
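For readers unfamiliar with the I2 statistic, the sketch below shows the conventional calculation from Cochran's Q using inverse-variance weights. It is an illustration of the general formula, I2 = max(0, (Q − df)/Q) × 100, under the assumption of per-study effect estimates with known variances; the function names are hypothetical, and the values midas reports for sensitivity and specificity may be computed somewhat differently.

```python
from typing import Sequence


def cochran_q(effects: Sequence[float], variances: Sequence[float]) -> float:
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty and equal length")
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q: float, n_studies: int) -> float:
    """I2 as a percentage; truncated at zero when Q is below its df."""
    df = n_studies - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Illustrative inputs only (not data from the included studies).
effects = [0.0, 2.0]
variances = [1.0, 1.0]
q = cochran_q(effects, variances)
print(q, i_squared(q, len(effects)))
```

An I2 of 0 means the observed spread between studies is no larger than expected from sampling error alone, which is how values such as the 0 reported for choline-PET/CT sensitivity should be read.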

Table 3 Assessment of heterogeneity and threshold effect of included articles

Discussion

Prostate cancer is one of the most common malignant tumors. Because bone metastasis is very common, early identification of metastases is of great significance to patients. PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS can all be used to detect bone metastasis; however, their reported sensitivities and specificities vary across the literature, and which examination method is best remains controversial. The purpose of this meta-analysis was to compare the sensitivity and specificity of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS, which may provide some guidance for clinical application. Our results showed that PSMA-PET/CT and NaF-PET/CT had the highest pooled sensitivities on a per-patient basis, 0.97 and 0.96, respectively, with no significant difference between them (P > 0.05). The pooled sensitivities of choline-PET/CT, MRI, and BS were 0.87, 0.91, and 0.86, respectively. There were no significant differences among these three (P > 0.05), but they differed significantly from PSMA-PET/CT and NaF-PET/CT (P < 0.05). The pooled specificities of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS were 1.00, 0.99, 0.97, 0.96, and 0.95, respectively. PSMA-PET/CT was significantly better than BS (P < 0.05). The pooled DOR values of choline-PET/CT, NaF-PET/CT, MRI, and BS were 504.16, 673.67, 242.63, and 114.44, respectively, and there were significant differences between them (P < 0.05). Because the pooled specificity of PSMA-PET/CT was 1, its DOR could not be calculated. The AUC values of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS were 1.00, 0.99, 0.99, 0.98, and 0.95, respectively. The AUC of PSMA-PET/CT was significantly higher than that of BS (P < 0.05). Due to insufficient data, only the pooled sensitivity could be calculated on a per-lesion basis. The per-lesion pooled sensitivities of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, MRI, and BS were 0.88, 0.80, 0.97, 0.81, and 0.68, respectively. The per-lesion pooled sensitivity of NaF-PET/CT was the highest, and there was no significant difference between choline-PET/CT and MRI (P > 0.05). In this meta-analysis, the numbers of PSMA-PET/CT, choline-PET/CT, NaF-PET/CT, and MRI studies were too small to assess publication bias. The P value for BS in the Deeks’ funnel plot was 0.1, so there was no obvious publication bias.
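The pairwise P values quoted above come from Z tests between pooled estimates. The sketch below shows one common form of such a test: the difference between two independent estimates divided by the standard error of the difference, with a two-sided P value from the standard normal distribution. It is an assumption that the comparison is made on the logit scale with independent estimates; the text does not specify the scale, and the function names are hypothetical.

```python
import math
from typing import Tuple


def logit(p: float) -> float:
    """Log-odds transform, often used so proportions are roughly normal."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def z_test(est_a: float, se_a: float, est_b: float, se_b: float) -> Tuple[float, float]:
    """Two-sided Z test for the difference of two independent estimates.

    Both estimates must be on the same scale (for example, both logit).
    Returns (z, p_value).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    if se_diff == 0.0:
        raise ValueError("standard error of the difference is zero")
    z = (est_a - est_b) / se_diff
    # Two-sided P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Illustrative inputs only (not estimates from this meta-analysis).
z, p = z_test(est_a=1.96, se_a=0.6, est_b=0.0, se_b=0.8)
print(round(z, 2), round(p, 3))
```

A difference is called significant at the 0.05 level when the absolute value of z exceeds about 1.96, which corresponds to the P < 0.05 threshold used throughout this analysis.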

As shown in Table 3, the sensitivity values for PSMA-PET/CT, MRI, and BS and the specificity values for MRI and BS were highly heterogeneous, so we used a random-effects model. By eliminating the articles one by one, we identified that Zacho et al.’s study [22] may lead to heterogeneity of the PSMA-PET/CT analysis, that the Zacho et al., Lecouvet et al., and Dyrberg et al. studies [6, 22, 35] may lead to heterogeneity of the MRI analysis, and that Fuccio et al.’s and Sapir et al.’s studies [4, 13] may lead to heterogeneity of the BS analysis. Different research designs and different threshold settings can also lead to heterogeneity.

Bone scintigraphy is the most widely used method in clinical practice due to its low price. It can detect bone metastases with good sensitivity and can examine the whole skeleton; however, false positives are common because it is difficult to determine whether lesions are benign or malignant [36]. SPECT/CT is a good supplement to BS, but there were too few articles to include it in our meta-analysis. Lecouvet et al. showed that MRI detected bone metastases more sensitively than BS [37], findings similar to those of Shen et al. [10]. Our results showed that the per-patient sensitivity and specificity of MRI were higher than those of BS, but the differences were not statistically significant. The per-lesion sensitivity of MRI was significantly higher than that of BS. The inconsistency of these results may be attributed to the higher sensitivity and specificity of MRI in the spine and the higher sensitivity of BS in the peripheral bones, especially the ribs [29, 33]. There are many different imaging agents for PET/CT. Each imaging agent can be used for different tumors, and they differ in their sensitivities and specificities.

This is the first meta-analysis comparing PSMA-PET/CT, NaF-PET/CT, and choline-PET/CT for the detection of bone metastasis of prostate cancer. Our results showed that PSMA-PET/CT had the highest per-patient sensitivity and specificity, and that NaF-PET/CT also had good sensitivity and specificity. On a per-lesion basis, the sensitivity of NaF-PET/CT was superior to that of PSMA-PET/CT. Both the per-patient and per-lesion sensitivities of choline-PET/CT were significantly worse than those of PSMA-PET/CT and NaF-PET/CT. The European Association of Nuclear Medicine recommended NaF-PET/CT in its 2015 guidelines for bone imaging [39], which shows the importance of NaF-PET/CT in the bone metastasis of cancer. Harmon et al. showed that NaF-PET/CT is an excellent imaging method for detecting bone metastasis of prostate cancer [38]. Some studies have shown that choline-PET/CT is a sensitive imaging method for detecting local lesions, recurrence, and metastasis of prostate cancer [40,41,42]; however, there are few studies comparing NaF-PET/CT, PSMA-PET/CT, and choline-PET/CT.

Two articles in our meta-analysis compared NaF-PET/CT with PSMA-PET/CT, and both showed no significant difference between the two methods, although both were superior to MRI [22, 35]. Dyrberg et al. showed that NaF-PET/CT and choline-PET/CT both had high sensitivity in detecting bone metastasis of prostate cancer, but that NaF-PET/CT was more sensitive [43]. Oromieh et al.’s studies showed that PSMA-PET/CT and PSMA-PET/MRI play an important role in the detection of prostate cancer recurrence, and compared PSMA-PET/CT with choline-PET/CT in 37 cases of biochemical recurrence of prostate cancer. They showed that PSMA-PET/CT could detect bone metastases better than choline-PET/CT, especially in cases where PSA levels are low [44,45,46,47,48].

In general, the results of our meta-analysis are similar to those of previous studies. One study showed that PSMA-PET is more sensitive and specific than BS in detecting bone metastasis of prostate cancer, and would have a higher sensitivity and specificity if PET imaging were coupled with CT [49]. The studies of Iagaru et al. showed that the accuracy of NaF-PET/CT in detecting bone metastases was higher than that of fluorodeoxyglucose-PET/CT and BS [50, 51]. Our results show that NaF-PET/CT is more sensitive than MRI in detecting bone metastasis; however, Jambor et al. showed no significant difference between them, while both were more sensitive than BS [52]. Shen et al. found that MRI was better than choline-PET/CT and BS on a per-patient basis for detecting bone metastases of prostate cancer, and that choline-PET/CT had a higher per-lesion DOR and Q* than BS [10]. However, PET/CT provides not only the anatomical location and morphological characteristics of tumors, but also functional information on tumors. Therefore, PSMA-PET/CT and NaF-PET/CT are also two excellent imaging methods for the detection of bone metastases of prostate cancer.

The main shortcoming of this meta-analysis is that the number of articles included was too small to calculate the per-lesion specificity, AUC, and DOR. It was also not large enough for subgroup analysis, so larger, better-quality datasets are needed to confirm the advantages and disadvantages of each modality.

Our meta-analysis showed that PSMA-PET/CT and NaF-PET/CT had higher diagnostic value for bone metastasis of prostate cancer than choline-PET/CT, MRI, and BS. BS is widely used in hospitals, so it may be the best choice in routine examination of prostate cancer. PSMA-PET/CT and NaF-PET/CT can be selected for further examination if needed.