Introduction

Colorectal cancer (CRC) is the third leading cause of cancer-related death in the US and ranks third in cancer prevalence among both men and women. An estimated 142,820 new cases and 50,830 deaths from CRC are expected in 2013 alone [1]. While important efforts in the prevention and early detection of CRC are ongoing, approximately one-fifth of patients diagnosed with CRC will have evidence of distant spread at diagnosis [1]. About 30–40 % of CRC patients will experience recurrence and develop metastases after surgery, and 80 % of these events occur within 2 years [2, 3]. The overall survival of patients with CRC is relatively poor, especially for those with local recurrence and/or metastases. Given the high incidence and mortality of CRC, substantial efforts are warranted to understand, detect, and control the disease. Positron emission tomography (PET), using the radio-labeled glucose analog 2-[18F]-fluoro-2-deoxy-d-glucose (18F-FDG), exploits the metabolic characteristics of malignant tissue to identify tumor foci. The recently developed integrated positron emission tomography/computed tomography (PET/CT), which combines a full-ring detector clinical PET scanner and a multidetector computed tomography (MDCT) scanner, acquires both metabolic and anatomic imaging data with a single device during a single diagnostic session. This provides precise anatomic localization of suspicious areas of increased FDG uptake [4]. 18F-FDG PET/CT imaging has proven valuable for staging, restaging, planning, and monitoring therapies in patients with various cancers, including CRC. However, the quality of the current evidence and the diagnostic value of 18F-FDG PET/CT for local recurrence of CRC have yet to be systematically evaluated. Against this background, we conducted the first systematic review and meta-analysis to assess the diagnostic accuracy of 18F-FDG PET/CT in the detection of local recurrence in patients with CRC.

Materials and Methods

Literature Search Strategy

Two investigators performed a systematic literature search of the PubMed/MEDLINE database to identify relevant studies (last update October 11, 2014), with the following key words: (a) colorectal neoplasm, CRC, colorectal carcinoma; (b) positron emission tomography/computed tomography, PET/CT, PET-CT. Keywords, including Medical Subject Heading (MeSH) terms, were combined using the Boolean operators ‘AND’ and ‘OR’, viz. (Diagnosis/Broad [filter]) AND ((“Colorectal Neoplasms” [MeSH] OR CRC [Text Word] OR “Rectal neoplasms” [MeSH Terms] OR rectal cancer [Text Word] OR Colorectal OR Rectal) AND (“PET” OR “FDG” OR “positron emission tomography/computed tomography” OR “positron emission tomography”) AND (“Recurrence” [MeSH] OR “Neoplasm Recurrence, Local” [MeSH])). There was no language restriction on the initial search. The Cochrane Database of Systematic Reviews was also searched electronically to identify additional potentially relevant articles. The reference lists of all retrieved articles were screened to identify further relevant studies. Authors of eligible studies were contacted for additional data when key information was missing. Two reviewers independently judged study eligibility. Any disagreement was resolved by consensus.

Selection of Studies

The selection criteria for inclusion in the systematic review and meta-analysis were as follows: (a) 18F-FDG PET/CT was used to detect local recurrent CRC; (b) histopathologic analysis and/or clinical and imaging follow-up was used as the reference standard; (c) sufficient data were reported to construct a 2 × 2 table of true-positive, false-negative, false-positive, and true-negative values; (d) the analysis was based on patient-level statistics; (e) when data or subsets of data were presented in more than one article, the article with the most details or the most recent article was chosen; (f) the study included at least 15 patients; and (g) the included patients were assessed for local recurrence rather than distant metastasis. Abstracts presented at congresses, unpublished data, case reports, meta-analyses, reviews, letters, editorials, and comments were excluded. Duplicate studies with overlapping patient populations and studies that evaluated fewer than 10 patients were also excluded.

Data Extraction

Two reviewers extracted relevant data from each selected article, including study characteristics and test results, using a standardized data extraction sheet that was verified independently by a third reviewer. Any discrepancy was resolved by consensus. Data extraction was done separately for the primary site and neck nodes wherever possible. The study characteristics documented were (a) first author; (b) journal and year of publication; (c) study origin; (d) number of patients; (e) number of true-positive, false-positive, true-negative, and false-negative findings of 18F-FDG PET/CT according to the reference standard (histological examination or clinical/imaging follow-up); and (f) study quality.

Quality Assessment of Included Studies

Quality assessment of included studies was performed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool [5], which was developed systematically, validated using a Delphi procedure, and structured as a list of 14 items. The items are phrased as questions pertaining to validity (patient spectrum, reference standard, and test execution), evaluation of bias and variability, and quality of reporting (withdrawals and indeterminate results), and each is answered yes, no, or unclear. All included studies were scored on all 14 items to provide an overall score. For the purpose of this analysis, ‘yes’ was scored as 1, while ‘no’ and ‘unclear’ were both scored as 0. Two reviewers performed the quality assessment jointly, and their assessments were randomly verified by a third reviewer. Any discrepancy was resolved by consensus. The overall meta-analysis was done independently of quality assessment, as the QUADAS tool was not designed for weighting data in a meta-analysis. Nevertheless, the impact of study quality on diagnostic performance was explored in the meta-regression analysis using the median QUADAS score as a covariate.
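The scoring rule described above can be illustrated with a minimal Python sketch. The study names and answers below are hypothetical placeholders, not data from the included studies, and splitting studies at the median score is only one assumed way of turning the median QUADAS score into a covariate.

```python
from statistics import median

QUADAS_ITEMS = 14  # number of questions in the QUADAS checklist


def quadas_score(answers):
    """Overall QUADAS score: 'yes' counts 1, 'no' and 'unclear' count 0."""
    normalized = [a.strip().lower() for a in answers]
    if len(normalized) != QUADAS_ITEMS:
        raise ValueError("expected one answer per QUADAS item (14 in total)")
    if any(a not in {"yes", "no", "unclear"} for a in normalized):
        raise ValueError("answers must be 'yes', 'no', or 'unclear'")
    return sum(1 for a in normalized if a == "yes")


# Hypothetical answers for three hypothetical studies (illustration only).
example_studies = {
    "study_A": ["yes"] * 12 + ["unclear", "no"],
    "study_B": ["yes"] * 10 + ["no"] * 4,
    "study_C": ["yes"] * 14,
}

scores = {name: quadas_score(ans) for name, ans in example_studies.items()}
median_score = median(scores.values())

# Assumed dichotomization: studies at or above the median are flagged as
# higher quality, giving a binary covariate for a meta-regression.
high_quality = {name: s >= median_score for name, s in scores.items()}
```

Scoring ‘unclear’ as 0 is conservative: a study is credited only for items that are explicitly satisfied in the report.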

Statistical Analysis

Data from individual studies were summarized in a 2 × 2 table classifying patients as true-positives (TP), true-negatives (TN), false-positives (FP), and false-negatives (FN). The following indexes of test accuracy, together with 95 % confidence intervals (95 % CIs), were calculated for each study: sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR). Asymmetric summary receiver operating characteristic (SROC) curves were fitted using weighted regression or the inverse variance method (Moses’ model), and their area under the curve (AUC) and Q* index were calculated. The AUC summarizes diagnostic performance as a single number, while the Q* index is the point on the curve where sensitivity and specificity are equal. Heterogeneity among studies was tested using the χ² statistic. When significant heterogeneity was observed (p < 0.05), a random-effects model was applied. All statistical analyses were performed using Meta-Disc version 1.4 (XI Cochrane Colloquium, Barcelona, Spain) and Stata (version 12, Stata Corporation, College Station, TX, USA).
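As an illustration of the per-study indexes listed above, the following Python sketch computes them from a single 2 × 2 table. It is not the Meta-Disc or Stata implementation: the function name, the 0.5 continuity correction for tables containing a zero cell, and the log-scale (Wald-type) confidence intervals for the ratio measures are assumptions made for the example, and the counts are hypothetical.

```python
import math


def diagnostic_indices(tp, fp, fn, tn, z=1.96):
    """Accuracy indexes for one 2 x 2 table, with log-scale 95 % CIs for ratios."""
    # Assumed convention: add 0.5 to every cell if any cell is zero, so that
    # the ratios and their standard errors remain defined.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))

    diseased = tp + fn
    healthy = fp + tn
    sensitivity = tp / diseased
    specificity = tn / healthy
    plr = sensitivity / (fp / healthy)      # sensitivity / (1 - specificity)
    nlr = (fn / diseased) / specificity     # (1 - sensitivity) / specificity
    dor = (tp * tn) / (fp * fn)

    # Standard errors of the natural logarithm of each ratio.
    se_log_plr = math.sqrt(1 / tp - 1 / diseased + 1 / fp - 1 / healthy)
    se_log_nlr = math.sqrt(1 / fn - 1 / diseased + 1 / tn - 1 / healthy)
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)

    def log_ci(estimate, se):
        return (math.exp(math.log(estimate) - z * se),
                math.exp(math.log(estimate) + z * se))

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr, "plr_ci": log_ci(plr, se_log_plr),
        "nlr": nlr, "nlr_ci": log_ci(nlr, se_log_nlr),
        "dor": dor, "dor_ci": log_ci(dor, se_log_dor),
    }


# Hypothetical counts for one study (not taken from the included studies).
result = diagnostic_indices(tp=45, fp=3, fn=5, tn=47)
```

Confidence intervals for the ratio measures are built on the log scale because the sampling distributions of ratios are skewed; the interval is symmetric for the logarithm and is then back-transformed.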

Results

Characteristics and Quality Assessment of Included Studies

The initial systematic literature search using appropriate keywords identified 349 potentially relevant articles. This list was reduced to 109 studies after removing duplicates and screening titles and abstracts. After reading the full text of these articles, we excluded 82 of the 109 because (a) the aim of the article was not to assess the accuracy of 18F-FDG PET/CT for local recurrence in CRC patients (n = 37); (b) the researchers did not use histopathologic analysis and/or clinical and imaging follow-up as the reference standard (n = 7); or (c) the researchers did not report data that could be used to construct or calculate true-positive, false-positive, true-negative, and/or false-negative results (n = 38). Finally, 26 studies fulfilled all inclusion criteria and were selected for data extraction and data analysis [2, 6–30]. The characteristics of the 26 eligible studies are presented in Table 1. We assessed the study quality of all the 27 articles using the QUADAS tool (Table 2).

Table 1 Characteristics of studies included in this meta-analysis
Table 2 Results of quality assessment for all included eligible studies

Meta-analysis Results

The pooled sensitivity was 0.94 (95 % confidence interval [CI] 0.92–0.96). The pooled specificity was 0.94 (95 % CI 0.93–0.95). Forest plots for the sensitivity and specificity in the detection of local recurrent CRC are shown in Figs. 1 and 2, respectively. The positive and negative likelihood ratios in the diagnosis of local recurrent CRC were 14.39 (95 % CI 7.37–28.09) and 0.08 (95 % CI 0.06–0.12), respectively. Forest plots for the positive and negative likelihood ratios in the detection of local recurrent CRC are shown in Figs. 3 and 4, respectively. The pooled DOR was 208.67 (95 % CI 109.56–397.44). The forest plot for the DOR in the detection of local recurrent CRC is shown in Fig. 5. The Q* index for the studies in our meta-analysis was 0.9329. The area under the curve (AUC) was 0.9776, indicating high overall accuracy. The SROC curve and the Q* index for the detection of local recurrent CRC are shown in Fig. 6. Funnel plot asymmetry was assessed using Egger’s and Begg’s tests, which showed no evidence of publication bias among the included studies.

Fig. 1 Forest plot of sensitivity of 18F-FDG PET/CT for detecting local recurrence of colorectal cancer

Fig. 2 Forest plot of specificity of 18F-FDG PET/CT for detecting local recurrence of colorectal cancer

Fig. 3 Forest plot of the positive likelihood ratio in the detection of local recurrence of colorectal cancer

Fig. 4 Forest plot of the negative likelihood ratio in the detection of local recurrence of colorectal cancer

Fig. 5 Forest plot of the diagnostic odds ratio of 18F-FDG PET/CT for detecting local recurrence of colorectal cancer

Fig. 6 Summary receiver operating characteristic (SROC) curve of 18F-FDG PET/CT for detecting local recurrence of colorectal cancer

Discussion

To our knowledge, this systematic review and meta-analysis is the first to evaluate the diagnostic performance of 18F-FDG PET/CT in detecting local recurrent CRC.

Several studies have reported the diagnostic performance of 18F-FDG PET/CT in the diagnosis of local recurrent CRC. However, a major problem with these studies is that many have limited power because they analyzed relatively small numbers of patients. Meta-analysis is a powerful tool for summarizing the results of different studies by producing a single estimate of the major effect with enhanced precision. It can overcome the small sample sizes and inadequate statistical power of individual studies and provide more reliable results than any single study [31]. Combining data from many studies also has the advantage of reducing random error [32]. We therefore pooled all relevant published studies to derive more robust estimates of the accuracy of 18F-FDG PET/CT in detecting local recurrent CRC.

The DOR is a single indicator of test accuracy that combines sensitivity and specificity into one number [33]. It is the ratio of the odds of a positive test in a patient with disease to the odds of a positive test in a patient without disease. Its value ranges from 0 to infinity, with higher values indicating better discriminatory test performance (i.e., higher accuracy). A value of 1.0 indicates that the test does not discriminate between patients with the disorder and those without it. In this meta-analysis, the pooled DOR for 18F-FDG PET/CT was 208.67 (95 % CI 109.56–397.44), indicating a high level of accuracy for 18F-FDG PET/CT in detecting local recurrent CRC.
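For a single 2 × 2 table, the definition above can be written out as follows, using the TP, FP, FN, and TN counts defined in the Statistical Analysis section. The identity with PLR and NLR holds exactly within one table; pooled estimates are obtained by pooling each index separately, so the pooled DOR need not equal the ratio of the pooled PLR to the pooled NLR.

```latex
\mathrm{DOR}
  = \frac{\mathrm{TP}/\mathrm{FN}}{\mathrm{FP}/\mathrm{TN}}
  = \frac{\mathrm{TP}\cdot\mathrm{TN}}{\mathrm{FP}\cdot\mathrm{FN}}
  = \frac{\text{sensitivity}/(1-\text{sensitivity})}{(1-\text{specificity})/\text{specificity}}
  = \frac{\mathrm{PLR}}{\mathrm{NLR}}
```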

As a global measure of test efficacy across all studies, we determined the Q* index, defined as the point of intersection of the SROC curve with the diagonal line extending from the upper left corner to the lower right corner of the ROC space, that is, the point where sensitivity equals specificity. The Q* index corresponds to the highest joint value of sensitivity and specificity for the diagnostic test. This point does not indicate the only or even the best combination of sensitivity and specificity for a particular clinical setting, but it does provide an overall measure of the discriminatory power of the diagnostic test. The Q* index estimate for 18F-FDG PET/CT was 0.9329, indicating a high level of accuracy for 18F-FDG PET/CT in detecting local recurrence in patients with CRC.
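The relationship between the SROC curve, its AUC, and the Q* index can be sketched in Python under Moses’ model, in which D = logit(TPR) − logit(FPR) is regressed on S = logit(TPR) + logit(FPR). This is a simplified illustration rather than the Meta-Disc procedure: it uses unweighted least squares instead of weighted regression, adds 0.5 to every cell of every table, and integrates the curve numerically; the function names and the four 2 × 2 tables are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_moses(tables):
    """Fit D = a + b*S by ordinary (unweighted) least squares."""
    d_vals, s_vals = [], []
    for tp, fp, fn, tn in tables:
        # Assumed 0.5 continuity correction applied to every cell.
        tp, fp, fn, tn = (cell + 0.5 for cell in (tp, fp, fn, tn))
        logit_tpr = math.log(tp / fn)   # logit of sensitivity
        logit_fpr = math.log(fp / tn)   # logit of (1 - specificity)
        d_vals.append(logit_tpr - logit_fpr)
        s_vals.append(logit_tpr + logit_fpr)
    n = len(d_vals)
    mean_s, mean_d = sum(s_vals) / n, sum(d_vals) / n
    sxx = sum((s - mean_s) ** 2 for s in s_vals)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = mean_d - b * mean_s
    return a, b


def sroc_tpr(fpr, a, b):
    """Sensitivity on the SROC curve at a given false-positive rate."""
    # Rearranging D = a + b*S: logit(TPR) = (a + (1 + b) * logit(FPR)) / (1 - b)
    return inv_logit((a + (1.0 + b) * logit(fpr)) / (1.0 - b))


def q_star(a):
    """Point where sensitivity equals specificity (S = 0, so D = a)."""
    return inv_logit(a / 2.0)


def sroc_auc(a, b, steps=2000):
    """Trapezoidal AUC; the (0, 0) and (1, 1) endpoints assume |b| < 1."""
    xs = [i / steps for i in range(1, steps)]
    points = [(0.0, 0.0)] + [(x, sroc_tpr(x, a, b)) for x in xs] + [(1.0, 1.0)]
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical (TP, FP, FN, TN) tables, not data from the included studies.
hypothetical_tables = [(45, 3, 5, 47), (30, 2, 2, 40), (18, 4, 3, 25), (60, 5, 4, 70)]
a, b = fit_moses(hypothetical_tables)
q = q_star(a)
auc = sroc_auc(a, b)
```

Because Q* depends only on the intercept a, it is unaffected by the asymmetry term b, whereas the AUC depends on both.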

Since the SROC curve is not easy to interpret and use in clinical practice, and likelihood ratios are considered more clinically meaningful, both PLR and NLR were calculated as measures of diagnostic accuracy. Likelihood ratios combine sensitivity and specificity in their calculation, and a PLR >10 and an NLR <0.1 are considered convincing evidence to rule in or rule out disease, respectively. In this meta-analysis, the PLR was 14.39 and the NLR was 0.08. These data suggest that 18F-FDG PET/CT can be used for detecting local recurrence in patients with CRC.
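The clinical meaning of these likelihood ratios can be illustrated by converting a pre-test probability into a post-test probability (post-test odds = pre-test odds × likelihood ratio). The sketch below uses the pooled PLR and NLR reported above; the 30 % pre-test probability is a hypothetical value chosen only for illustration, and the function name is not from any library.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


POOLED_PLR = 14.39  # pooled positive likelihood ratio from this meta-analysis
POOLED_NLR = 0.08   # pooled negative likelihood ratio from this meta-analysis

# Hypothetical pre-test probability of local recurrence (assumption).
pre_test = 0.30

prob_after_positive = post_test_probability(pre_test, POOLED_PLR)
prob_after_negative = post_test_probability(pre_test, POOLED_NLR)
```

Under this assumed pre-test probability, a positive scan raises the probability of local recurrence to roughly 86 %, and a negative scan lowers it to roughly 3 %.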

Our meta-analysis had several limitations. First, the exclusion of conference abstracts and letters to the editor may have led to publication bias. Second, wide variation in patient populations, imaging techniques, study design, and quality among the selected studies may have affected the estimates of the diagnostic accuracy of 18F-FDG PET/CT. Third, there was no single reference standard strategy for histopathologic analysis. Fourth, all included studies were retrospective in design, which carries potential limitations. For example, it could not be excluded that imaging observers knew the results of other examinations before interpreting the PET or CT images. These factors may also have affected the estimated accuracy of 18F-FDG PET/CT in detecting local recurrence in patients with CRC.

Conclusion

The present meta-analysis suggests that 18F-FDG PET/CT has good diagnostic performance in detecting local recurrence in patients with CRC.