Introduction

Bladder cancer (BC) is the most frequent tumor of the urinary tract. In 2017, an estimated 60,490 men and 18,540 women were expected to be newly diagnosed with BC, and 12,240 men and 4630 women were expected to die of the disease [1]. At initial diagnosis, about 75% of patients present with superficial disease, whereas 25% have muscle-invasive cancer. Radical cystectomy (RC) with bilateral pelvic lymph node dissection (PLND) is the most common treatment for muscle-invasive or refractory superficial, high-grade transitional cell carcinoma [2].

The incidence of lymph node (LN) metastases in RC specimens depends on the histologic T stage of the BC. Patients with muscle-invasive BC are known to have a 10–30% risk of metastatic LNs, rising to as much as 50% in patients with tumors extending into the perivesical fatty tissue [3, 4]. In patients with LN involvement, 5-year recurrence-free survival rates drop to 35% regardless of T stage [5, 6]. A preoperative imaging method that accurately demonstrates the extent of nodal involvement, and could therefore guide the extent of the surgical dissection, would be desirable for staging and may benefit survival [7, 8]. Conversely, surgical treatment with potentially serious complications could be avoided in patients in whom accurate preoperative imaging discloses a high nodal tumor burden or metastatic disease [9]. Therefore, identifying LN or distant metastases preoperatively could be crucial for treatment planning.

Anatomical imaging methods such as CT and MRI have been used to evaluate LN status. These modalities rely on morphologic information, and the diagnosis is typically based on LN size alone. Contrast-enhanced CT is the most commonly used imaging technique for the preoperative staging of BC. It correctly detects extravesical extension in 79–89.7% of cases, and its reported accuracy for detecting LN metastases ranges from 70 to 97%, with a false-negative rate of up to 40% [10]. MRI has not proven superior to CT, with accuracies ranging from 73 to 98% [10]. Neither technique has proven accurate enough for the preoperative evaluation of LN involvement in BC [3, 10].

F-18 fluorodeoxyglucose (FDG) positron emission tomography (PET) or positron emission tomography/computed tomography (PET/CT) has been reported to be a useful functional imaging modality for tumor staging in different cancers [11, 12]. Although F-18 FDG PET and PET/CT are now commonly employed in cancer imaging, their use in BC staging is limited by high excreted urinary activity in the bladder and ureters; the reported sensitivity for LN staging is 50–70% [13,14,15]. A recent study, however, showed that F-18 FDG PET/CT using the SUVmax of LNs is a useful tool for preoperative evaluation of pelvic LN metastases from invasive bladder cancer and contributes to the selection of patients for personalized treatment [16]. Vind-Kezunovic et al. showed that, with a threshold of SUVmax > 2, F-18 FDG PET/CT had a sensitivity of 79.4% and a specificity of 66.5%. With a threshold of SUVmax > 4, the sensitivity was 61.8% and the specificity was 84.5% [16]. Previous studies reported specificities ranging from 60 to 100% [13,14,15,16].

The detection of the primary BC and of local recurrence is limited by the presence of excreted FDG in the urinary tract, which often masks the urinary bladder lesion and the adjacent LNs [14, 15]. C-11-labeled choline and acetate PET/CT have been used for preoperative staging in BC patients [17, 18]. Compared with F-18 FDG, these PET tracers offer the advantage of minimal urinary excretion, with good sensitivity and specificity in the evaluation of LN metastasis [17, 18].

The purpose of our study was to meta-analyze the published data on the diagnostic accuracy of PET tracers beyond F-18 FDG for LN staging in BC patients, to provide more evidence-based data, and to guide further studies on the evaluation of LN status.

Methods

Data sources and search strategy

We conducted electronic English-language literature searches of MEDLINE via PubMed, Embase and Cochrane Library database from the earliest available date of indexing through June 30, 2017. We also hand-searched the reference lists of identified publications for additional studies. We used a search algorithm based on a combination of terms: (1) “PET” OR “positron emission tomography” OR “positron emission tomography/computed tomography” OR “PET/CT” OR “positron emission tomography-computed tomography” OR “PET-CT” and (2) “bladder neoplasms” OR “bladder cancer” and (3) “staging” OR “lymph node”.

Study selection

The inclusion criteria for relevant studies were as follows: whole-body C-11-labeled PET/CT had been used to stage LN status in BC patients; sufficient data to reassess sensitivity and specificity of PET/CT in evaluating LN status or absolute numbers of true-positive, true-negative, false-positive, and false-negative data had been presented; and no data overlap.

Studies were excluded if fewer than ten patients had been included. In addition, duplicate publications were excluded, as were publications such as review articles, case reports, conference papers, and letters, which do not contain the original data. Two researchers independently reviewed titles and abstracts of the retrieved articles, applying the above-mentioned selection criteria. Articles were rejected if clearly ineligible. The same two researchers then independently evaluated the full-text version of the included articles to determine their eligibility for inclusion.

Data extraction and quality assessment

Basic study information (authors, year of publication, and country of origin), study design (prospective or retrospective), patient characteristics, and technical aspects were collected. Each study was analyzed to retrieve the number of true-positive (TP), true-negative (TN), false-positive (FP), and false-negative (FN) findings of PET tracers beyond F-18 FDG for LN staging in BC patients, according to the reference standard. Only studies providing such complete information were finally included in the meta-analysis. The quality of the included studies was assessed using a 15-item modified Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2) [19]. Two reviewers independently assessed each potentially eligible study and assigned it a quality rating of “good,” “fair,” or “poor”. Quality assessment was based on the following criteria: study design and the presence of bias, including selection, performance, recording, and reporting bias. Studies with a high risk of bias were rated as poor quality, those with a moderate risk (not affecting the results) as fair quality, and those with minimal risk as good quality. Disagreements between the two reviewers were resolved by discussion and consensus.

Data synthesis and analysis

All data from each eligible study were extracted. Categorical variables are presented as frequencies or percentages, and continuous variables are presented as mean values unless stated otherwise. Measures of diagnostic performance, including sensitivity, specificity, and diagnostic odds ratios (DORs), are reported as point estimates with 95% confidence intervals (CIs). The DOR is the ratio of the odds of a positive test in the diseased state to the odds of a positive test in the non-diseased state, with higher values indicating better discriminatory test performance [20]. Between-study statistical heterogeneity was assessed using I² and the Cochran Q test on the basis of the random-effects analysis [21]. Publication bias was examined using the effective sample size funnel plot and the associated regression test of asymmetry described by Deeks et al. [22]. We used the bivariate random-effects model for analysis and pooling of the diagnostic performance measures across studies, as well as for comparisons between different index tests [23, 24]. The bivariate model estimates pairs of logit-transformed sensitivity and specificity from studies, incorporating the correlation that might exist between sensitivity and specificity. We also used the model to create hierarchical summary receiver operating characteristic curves and to estimate the area under the curve [25]. When statistical heterogeneity was substantial, we performed meta-regression to identify potential sources of bias [26]. Pooled estimates were also calculated for subgroups of studies that were defined according to specific study designs. Two-sided p ≤ 0.05 was considered statistically significant. Statistical analyses were performed with a commercial software program (STATA, version 13.1; StataCorp LP).
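The per-study measures described above can be illustrated with a minimal sketch. It is not the authors' Stata code: the class and function names, the 0.5 continuity correction for zero cells, and the Wald interval on the log scale are assumptions chosen for illustration, and the example counts are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Per-patient counts for one study against the reference standard."""
    tp: int
    fp: int
    fn: int
    tn: int


def diagnostic_measures(t: TwoByTwo, correction: float = 0.5) -> dict:
    """Sensitivity, specificity, and DOR with a Wald 95% CI on the log scale."""
    # Sensitivity and specificity use the raw counts.
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    # Assumption: add a continuity correction to every cell only when some
    # cell is zero, so that the DOR and its standard error stay finite.
    c = correction if 0 in (t.tp, t.fp, t.fn, t.tn) else 0.0
    tp, fp, fn, tn = t.tp + c, t.fp + c, t.fn + c, t.tn + c
    # DOR = odds of a positive test in diseased / odds in non-diseased.
    dor = (tp / fn) / (fp / tn)
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lower = math.exp(math.log(dor) - 1.96 * se_log_dor)
    upper = math.exp(math.log(dor) + 1.96 * se_log_dor)
    return {"sensitivity": sens, "specificity": spec,
            "dor": dor, "dor_ci": (lower, upper)}


# Hypothetical counts, for illustration only (not from any included study).
example = TwoByTwo(tp=8, fp=2, fn=4, tn=16)
measures = diagnostic_measures(example)
print(measures)
```

Pooling across studies would still require the bivariate random-effects model; this sketch covers only the single-study arithmetic.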

Results

Literature search and selection of studies

After the comprehensive computerized search was performed and reference lists were extensively cross-checked, our search yielded 313 records, of which 117 duplicate records were excluded after reviewing the titles and abstracts. In addition, 111 non-relevant abstracts, 23 case reports, and 51 review articles were excluded. The remaining 11 full-text articles were assessed for eligibility, and 1 article was excluded because of insufficient data for calculating the sensitivity and specificity of PET tracers beyond F-18 FDG for LN staging in BC patients. Finally, ten studies were eligible for the systematic review and meta-analysis; no additional studies were found by screening the references of these articles [18, 27,28,29,30,31,32,33,34,35]. The characteristics of the included studies are presented in Table 1. The detailed procedure of study selection in the current meta-analysis is shown in Fig. 1.
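As a quick consistency check, the selection counts reported above can be tallied in a few lines. The variable names are illustrative; the numbers are those given in the text.

```python
# Record counts as reported in the study selection paragraph.
identified = 313
excluded_at_screening = {
    "duplicates": 117,
    "non-relevant abstracts": 111,
    "case reports": 23,
    "review articles": 51,
}
excluded_at_full_text = {"insufficient data": 1}

full_text_assessed = identified - sum(excluded_at_screening.values())
included = full_text_assessed - sum(excluded_at_full_text.values())

print(full_text_assessed, included)  # prints "11 10"
```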

Table 1 Characteristics of the included studies

Fig. 1 Flow chart of the search for eligible studies on the diagnostic performance of PET tracers beyond FDG PET/CT for LN staging in BC patients

Study description, quality, publication bias

All analyses were based on per-patient data, as all included studies reported patient-based analyses. The included studies comprised a total of 282 patients, aged 41 to 85 years. Of the ten studies, five enrolled patients retrospectively [18, 27, 29, 32, 34] and the remaining five [28, 30, 31, 33, 35] enrolled patients prospectively. Eight studies [18, 27,28,29, 31,32,33,34] used PET/CT as the imaging device and two studies [30, 31] used PET. Eight studies [27,28,29,30,31,32,33, 35] used C-11 choline as the PET tracer and two studies [18, 34] used C-11 acetate. The principal characteristics of the ten studies included in the meta-analysis are summarized in Table 1. To assess possible publication bias, Deeks’s funnel plot asymmetry test was performed. The slope was non-significant (p = 0.64), indicating no significant publication bias (Fig. 2).
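The regression behind Deeks’s test can be sketched as follows: the log DOR of each study is regressed on the inverse square root of its effective sample size (ESS), weighted by ESS, and the slope is tested against zero. This is a simplified illustration rather than the Stata routine used in the analysis; the function names are assumptions and the study counts are hypothetical, so it does not reproduce p = 0.64.

```python
import math


def effective_sample_size(n_diseased: int, n_non_diseased: int) -> float:
    """ESS = 4 * n1 * n2 / (n1 + n2); equals the total when groups are balanced."""
    return 4 * n_diseased * n_non_diseased / (n_diseased + n_non_diseased)


def wls(x, y, w):
    """Weighted least-squares line y = a + b*x; returns (a, b, se_b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    rss = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    se_b = math.sqrt(rss / (len(x) - 2) / sxx)
    return a, b, se_b


# Hypothetical (TP, FP, FN, TN) counts; not taken from the included studies.
# No cell is zero here, so no continuity correction is applied.
studies = [(8, 2, 4, 16), (5, 1, 3, 11), (12, 4, 6, 30), (3, 2, 2, 9), (9, 1, 5, 25)]

ess = [effective_sample_size(tp + fn, fp + tn) for tp, fp, fn, tn in studies]
x = [1 / math.sqrt(e) for e in ess]
y = [math.log((tp * tn) / (fp * fn)) for tp, fp, fn, tn in studies]

intercept, slope, se_slope = wls(x, y, ess)
# The p value comes from comparing t_stat with a t distribution
# on len(studies) - 2 degrees of freedom.
t_stat = slope / se_slope
print(f"slope={slope:.2f}, t={t_stat:.2f}")
```

A slope that does not differ significantly from zero is read as no evidence of small-study effects.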

Fig. 2 Results of Deeks’s funnel plot asymmetry test for publication bias. The non-significant slope indicates that no significant bias was found. ESS, effective sample size

Methodological quality assessment

Figure 3 summarizes the risk of bias and applicability concerns. Overall, the quality of the studies was deemed satisfactory.

Fig. 3 Risk of bias and applicability concerns summary

Diagnostic accuracy of PET tracers beyond F-18 FDG

The diagnostic performance of PET tracers beyond F-18 FDG for LN staging in BC patients in the ten included studies is presented in Table 2. The pooled sensitivity of PET/CT imaging was 0.66 (95% CI 0.54–0.75), without significant heterogeneity (χ² = 12.41, p = 0.19), and the pooled specificity was 0.89 (95% CI 0.76–0.95), with significant heterogeneity (χ² = 29.1, p < 0.01). Likelihood ratio (LR) syntheses gave an overall positive likelihood ratio (LR+) of 5.8 (95% CI 2.7–12.7) and a negative likelihood ratio (LR−) of 0.39 (95% CI 0.28–0.53). The pooled DOR was 15 (95% CI 6–38). Forest plots of the sensitivity and specificity of PET/CT imaging for LN staging in BC patients are shown in Fig. 4. Figure 5 shows the hierarchical summary receiver operating characteristic (HSROC) curve; the area under the curve was 0.73 (95% CI 0.69–0.77), indicating moderate diagnostic accuracy.
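The degree of heterogeneity can be expressed as I² computed from the reported χ² statistics. The sketch below assumes that each χ² value is Cochran's Q with degrees of freedom equal to the number of studies minus one (nine here); the source does not state the degrees of freedom, and the function name is illustrative.

```python
def i_squared(q: float, n_studies: int) -> float:
    """Higgins I^2 (percent) from Cochran's Q with n_studies - 1 degrees of freedom."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100


# Chi-square statistics reported above for the ten included studies.
print(round(i_squared(12.41, 10), 1))  # sensitivity: 27.5
print(round(i_squared(29.1, 10), 1))   # specificity: 69.1
```

Under that assumption, roughly 27% of the variability in sensitivity and 69% of the variability in specificity would be attributable to between-study differences rather than chance.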

Table 2 Diagnostic performance of PET tracers beyond FDG PET/CT for LN staging in BC patients

Fig. 4 Forest plot of pooled sensitivity and specificity of PET tracers beyond FDG PET/CT for LN staging in BC patients. The summary sensitivity and specificity were 0.66 (95% confidence interval [CI] 0.54–0.75) and 0.89 (95% CI 0.76–0.95), respectively

Fig. 5 Hierarchical summary receiver operating characteristic (HSROC) curve of PET tracers beyond FDG for LN staging in BC patients

Heterogeneity evaluation and meta-regression analysis

Between-study heterogeneity was present for specificity among studies of PET tracers beyond FDG for LN staging. A meta-regression analysis was performed to explore potential sources of heterogeneity across the included studies (Table 3). In both univariate and multivariate meta-regression analysis, the study design (prospective vs retrospective) was identified as a source of heterogeneity.

Table 3 Meta-regression analysis for identifying potential sources of heterogeneity in the diagnostic performance of PET tracers beyond FDG for the detection of LN metastasis in BC patients

Subgroup analysis

We conducted a subgroup analysis according to study design (prospective vs retrospective). In the prospective study group, the pooled sensitivity was 0.77 (95% CI 0.27–0.97) and the pooled specificity was 0.74 (95% CI 0.63–0.82). The pooled DOR was 9 (95% CI 1–84), the LR+ was 2.9 (95% CI 1.6–5.5), and the LR− was 0.31 (95% CI 0.06–1.62). In the retrospective study group, the pooled sensitivity was 0.63 (95% CI 0.49–0.75) and the pooled specificity was 0.96 (95% CI 0.83–0.99). The pooled DOR was 41 (95% CI 8–205), the LR+ was 15.8 (95% CI 3.5–70.8), and the LR− was 0.38 (95% CI 0.27–0.55). Figure 6 compares the HSROC curves of the prospective and retrospective study groups for PET tracers beyond FDG in the diagnosis of LN metastasis in BC patients.
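The likelihood ratios and DOR are linked to sensitivity and specificity by simple identities, which gives a rough check on the pooled figures. The sketch below applies those identities to the rounded pooled estimates reported for all studies and for the two subgroups. The function name is illustrative. Small differences from the reported values are expected, because the inputs are rounded and the bivariate model estimates these quantities on the logit scale.

```python
def likelihood_ratios(sens: float, spec: float):
    """Return (LR+, LR-, DOR) implied by a sensitivity/specificity pair."""
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return lr_pos, lr_neg, lr_pos / lr_neg


# Rounded pooled (sensitivity, specificity) pairs as reported in the text.
pooled = {
    "all studies": (0.66, 0.89),
    "prospective": (0.77, 0.74),
    "retrospective": (0.63, 0.96),
}

for name, (sens, spec) in pooled.items():
    lr_pos, lr_neg, dor = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ {lr_pos:.1f}, LR- {lr_neg:.2f}, DOR {dor:.0f}")
```

The implied values fall close to the reported LR+, LR−, and DOR in each group, which suggests the summary statistics are internally consistent.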

Fig. 6 Hierarchical summary receiver operating characteristic (HSROC) curves for LN staging in BC patients in the subgroup analysis according to study design (left: prospective study design; right: retrospective study design)

Discussion

The accurate assessment of LN involvement is critical for both prognosis and treatment planning in BC patients [36]. Although PLND is the most accurate staging procedure for LN status assessment in many urological cancers, a noninvasive method for diagnosing LN involvement would be useful. Recent progress in diagnostic imaging by CT and MRI with contrast enhancement has allowed LN staging of BC, but the results have been generally disappointing. F-18 FDG PET/CT is known to be useful for the preoperative staging of various cancers. However, the most widely used PET tracer, F-18 FDG, is poorly suited to the imaging of BC patients because its high urinary excretion may hamper the correct visualization of the bladder wall and LNs [37]. Previous studies reported a sensitivity of 50–80% for LN staging [13,14,15,16]. Despite this relatively low sensitivity, previous studies reported relatively high specificity of F-18 FDG PET/CT for preoperative LN staging of BC patients [13,14,15,16].

Several promising new PET/CT tracers have been proposed as potential molecular imaging tools for various urological cancers [38, 39]. C-11 choline has been considered a potential tracer for the detection of BC, with the advantage of minimal urinary excretion compared with F-18 FDG [40]. C-11-labelled PET/CT tracers have also shown good sensitivity and specificity in the evaluation of LN metastasis of BC [33]. C-11 acetate has been used for cardiac and urinary cancer imaging [41,42,43]. In the cancer cell, acetate is converted into fatty acids, which are essential components for phospholipid and membrane synthesis [44]. The mechanism of C-11 acetate uptake in BC is similar to that in prostate cancer, where studies have shown the important role of the enzymes fatty acid synthase and acetyl-CoA carboxylase [23] in cancer development and progression [45,46,47].

Several previous studies investigated the usefulness of C-11 choline and C-11 acetate PET/CT for LN staging in BC patients. Picchio et al. investigated the utility of C-11 choline PET and contrast CT in the presurgical staging of 27 patients and reported 62% sensitivity and 10% specificity for LN metastases in a per-patient analysis [33]. De Jong et al. demonstrated good performance for C-11 choline PET/CT in 18 patients with BC, with a sensitivity of 67%, specificity of 100%, and accuracy of 94% [30]. Brunocilla et al. showed a sensitivity of 42%, specificity of 84%, and accuracy of 73% in the LN staging of 26 patients with BC on patient-based analysis [27]. However, Maurer et al. demonstrated that C-11 choline PET/CT did not improve the diagnostic information in preoperative LN staging before RC compared with contrast-enhanced CT [32]. Schöder et al. explored the utility of C-11 acetate in the locoregional staging of bladder cancer prior to RC and PLND and showed 100% sensitivity and 87% specificity for correctly identifying metastatic LNs [34]. Vargas et al. evaluated the diagnostic performance of magnetic resonance imaging (MRI), C-11 acetate PET/CT, and contrast-enhanced CT for BC staging, using pathologic review of RC and pelvic LN specimens as the reference standard [18]. They showed 100% sensitivity and 71% specificity of C-11 acetate PET/CT for determining N stage [18].

Recent technical improvements in modern PET/CT scanners could improve the detection of LN metastasis in patients with BC, regardless of the radiopharmaceutical used. A recent study reported that SUVmax > 4-based analysis showed a sensitivity of 61.8% and a specificity of 84.5% in the detection of LN metastasis using F-18 FDG PET/CT [16]. This improved diagnostic ability of PET/CT for LN staging of BC patients could be achieved through standardization of PET/CT scanners, because standardization is essential in the clinical setting and in the evaluation of all PET/CT scans, and has been shown to reduce multi-institutional variation to 10% deviation [48, 49].

Heterogeneity between studies may represent a potential source of bias. The included studies were statistically heterogeneous in their estimates of specificity. This heterogeneity is likely to arise from methodological differences between studies (Table 1). Baseline differences among the patients in the included studies (Table 1) may also have contributed to the observed heterogeneity of the results. According to the multivariate meta-regression analysis of the current study, the study design (prospective vs retrospective) was a source of the heterogeneity. Furthermore, the small sample size and potential bias are limitations of the current review. To minimize bias in the selection of studies and in the data extraction, reviewers who were blinded to the journal, author, institution, and date of publication independently selected articles based on the inclusion criteria, and scores were assigned to study design characteristics and examination results using a standardized form based on the QUADAS-2 tool. Publication bias is also a major concern in all meta-analyses, as studies reporting significant findings are more likely to be published than those reporting non-significant results. We assessed publication bias in our analysis with Deeks’s funnel plot, which showed no significant asymmetry (p = 0.64).

Conclusion

C-11 choline and C-11 acetate PET/CT show a low sensitivity and moderate specificity for the detection of metastatic LNs in patients with BC. Moreover, heterogeneity among the studies should be considered a limitation. Further large multicenter studies are needed to substantiate the diagnostic accuracy of C-11 choline and C-11 acetate PET/CT for this purpose.