Introduction

Tuberculosis (TB) remains a major infectious disease worldwide, and its persistently high morbidity poses a serious public health hazard. In 2013, approximately 9.0 million people developed TB and 1.5 million died, 360,000 of whom were HIV positive. The proportion of TB patients cured each year has remained almost constant, with only slight declines; for instance, an estimated 37 million people were cured between 2000 and 2013, which has been ascribed to advanced and effective diagnostic methods combined with treatment (WHO) [1]. Meanwhile, the early diagnosis of tuberculosis remains a difficult problem. The tuberculin skin test (TST) is currently one of the most widely used methods in many countries because of its low cost and convenience. However, this recommended diagnostic approach has several deficiencies, such as low specificity and cross-reactions with bacille Calmette-Guérin (BCG) vaccination and with infection by nontuberculous mycobacteria (NTM) [2]. Ample evidence suggests that interferon (IFN)-γ release assays (IGRAs) may be an alternative for the diagnosis of TB. IGRAs use proteins that are more specific to Mycobacterium tuberculosis than the purified protein derivative (PPD); these proteins are encoded by genes located in the region of difference 1 (RD1) of the M. tuberculosis genome. These genes are not found in BCG substrains or in most environmental mycobacteria (apart from Mycobacterium kansasii, Mycobacterium szulgai, Mycobacterium marinum, and Mycobacterium flavescens) [3]. Several studies have described the accuracy of IGRAs, but most of them examined only a single setting, such as tuberculosis in children [4], active tuberculosis [5], or latent tuberculosis [6], which has limited the comprehensive application of IGRAs.

Therefore, to assess the diagnostic value of IGRAs, we performed this meta-analysis to investigate their accuracy and to determine whether IGRAs have the potential to replace conventional diagnostic approaches.

Methods

Search Strategy

A systematic search of the PubMed, Embase, Cochrane Library, and Web of Science databases was performed for studies published up to May 2015. The following terms were used to search for relevant investigations: “Tuberculosis/diagnosis” and “T-SPOT” or “QFT-IT” and “specificity”.

Inclusion and Exclusion Criteria

Studies retrieved from the databases were first screened by title and abstract, and the full texts were then reviewed for eligibility. Studies were included if the TB patients had been diagnosed etiologically or if sufficient information, such as the numbers of true and false positives and negatives, had been provided to construct the analysis.

Studies were excluded for the following reasons: they were case reports, editorials, or animal studies; they were systematic reviews or meta-analyses; sensitivity and specificity were not reported or could not be calculated; or the full text was not available or was not published in English. Two investigators searched the available references independently and reached consensus on each eligible study.
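As an illustration of why true and false positives and negatives are required, the following minimal sketch shows how sensitivity and specificity follow from a single study's 2 × 2 table. The class name and the example counts are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of one index test against the reference diagnosis (hypothetical helper)."""
    tp: int  # diseased, test positive
    fp: int  # non-diseased, test positive
    fn: int  # diseased, test negative
    tn: int  # non-diseased, test negative

    def sensitivity(self) -> float:
        # Proportion of diseased participants correctly identified.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of non-diseased participants correctly identified.
        return self.tn / (self.tn + self.fp)


# Illustrative counts only.
example = TwoByTwo(tp=84, fp=30, fn=16, tn=70)
print(round(example.sensitivity(), 3), round(example.specificity(), 3))
```

A study that reports none of these four counts, and no sensitivity or specificity from which they can be derived, cannot contribute to the pooled estimates.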

Data Extraction

The following data were collected: first author, year of publication, numbers of cases and controls, country of origin, participants’ characteristics, age, percentage of males, percentage BCG vaccinated, percentage HIV positive, TST reagent, TST cutoff, percentage with TST, T-SPOT, and QFT-IT results, and percentage of indeterminate results.

Statistical Analysis

To decrease heterogeneity, data were extracted separately for T-SPOT, QFT-IT, and TST. Indeterminate results were excluded. We used the I² statistic and the P value to describe heterogeneity, and the Spearman correlation coefficient to assess the threshold effect. The pooled sensitivity and specificity of each assay and their 95 % confidence intervals (CI) were calculated using a random-effects model [7]. The meta-analysis was performed using Meta-DiSc and Review Manager version 5.3.
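The pooling step can be sketched as follows. This is a minimal illustration that assumes DerSimonian-Laird random-effects pooling of proportions on the logit scale with a 0.5 continuity correction; the estimators implemented in Meta-DiSc and Review Manager may differ, and the function name and example counts are hypothetical.

```python
import math


def pool_random_effects(events, totals):
    """Pool proportions (e.g. TP out of all diseased) across studies.

    Assumed method: DerSimonian-Laird on the logit scale.
    Returns (pooled, ci_low, ci_high, i_squared_percent).
    """
    effects, variances = [], []
    for e, n in zip(events, totals):
        a = e + 0.5        # continuity-corrected events
        b = n - e + 0.5    # continuity-corrected non-events
        effects.append(math.log(a / b))
        variances.append(1 / a + 1 / b)

    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the I² statistic (percentage of variation due to heterogeneity).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Between-study variance (tau²), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_re = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se), i_squared


# Hypothetical true-positive counts and numbers of diseased participants per study.
pooled, low, high, i2 = pool_random_effects([45, 80, 62], [50, 100, 90])
print(f"pooled sensitivity {pooled:.3f} (95% CI {low:.3f}-{high:.3f}), I2 = {i2:.1f}%")
```

The same function applies to specificity when true negatives and the numbers of non-diseased participants are supplied instead.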

Quality Assessment

The risk of bias table, which consists of seven domains covering (1) random sequence generation, (2) allocation concealment, (3) blinding of participants and personnel, (4) blinding of outcome assessment, (5) incomplete outcome data, (6) selective reporting, and (7) other bias, was used to assess the risk of bias of the included studies across five aspects: selection bias, performance bias, detection bias, attrition bias, and reporting bias. The risk of bias was judged as “low,” “high,” or “unclear” according to the answers to the signaling questions. The “unclear” category was used only when insufficient data were reported [8].

Results

Characteristics of the Studies

The search and selection process is described in Fig. 1. A total of 961 studies were retrieved, of which 393 were duplicates. Of the remaining 568 studies, 529 did not conduct diagnostic tests or did not report the associated sensitivity or specificity, and 21 were meta-analyses or systematic reviews. After a further 2 studies not published in English were excluded, 16 studies were available as full texts for the final analysis [2, 3, 9–22]. Among these 16 studies, 4 compared QFT-IT with TST and included 1855 participants, 7 compared T-SPOT with TST and included 1731 participants, and the other 5 evaluated TST, T-SPOT, and QFT-IT. In total, 3586 participants took part in the present analysis (Tables 1, 2). The risk of bias is shown in Fig. 2; incomplete outcome data were rated at high risk of bias, and other bias at relatively low risk.
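The selection counts can be checked with simple arithmetic. The sketch below only restates the numbers reported in this section; the variable names are ours, and the label of the third study group reflects our reading of the text.

```python
# Counts as reported in the study selection paragraph.
identified = 961
duplicates = 393
screened = identified - duplicates

exclusions = {
    "no diagnostic test or no sensitivity/specificity": 529,
    "meta-analyses and systematic reviews": 21,
    "not published in English": 2,
}
included = screened - sum(exclusions.values())

# Study groups by the tests they evaluated (third label is our interpretation).
study_groups = {"QFT-IT and TST": 4, "T-SPOT and TST": 7, "TST, T-SPOT, and QFT-IT": 5}

print(screened, included, sum(study_groups.values()))
```

Both the exclusion flow and the breakdown by study group arrive at the same 16 included studies.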

Fig. 1 Flow chart of the literature search and selection strategy

Table 1 Studies in which TST was compared with QFT-GIT (N = 9)
Table 2 Studies in which TST was compared with T-SPOT (N = 12)
Fig. 2 Risk of bias graph for the studies included in the meta-analysis

Sensitivity and Specificity of Interferon-γ Release Assays

We identified 9 studies that evaluated QFT-IT; the pooled sensitivity and specificity were 0.840 (95 % CI 0.814–0.864) and 0.658 (95 % CI 0.621–0.693), respectively (Fig. 3). The positive likelihood ratio, negative likelihood ratio, and pooled diagnostic odds ratio (DOR) of QFT-IT for the diagnosis of TB were 3.652 (95 % CI 2.180–6.117), 0.212 (95 % CI 0.109–0.414), and 10.397 (95 % CI 5.527–19.560), respectively (Fig. 4). For T-SPOT, the pooled sensitivity and specificity were 0.842 (95 % CI 0.811–0.870) and 0.745 (95 % CI 0.715–0.775) (Fig. 5), and the positive likelihood ratio, negative likelihood ratio, and pooled DOR were 2.196 (95 % CI 1.727–2.794), 0.246 (95 % CI 0.161–0.377), and 19.205 (95 % CI 7.049–52.326), respectively (Fig. 6).
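For a single study, the likelihood ratios and the DOR follow directly from sensitivity and specificity, as the minimal sketch below shows. The function name and input values are hypothetical. The pooled values reported above were pooled across studies, so they need not equal the values obtained by inserting the pooled sensitivity and specificity into these formulas.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-, DOR) for one sensitivity/specificity pair."""
    lr_positive = sensitivity / (1 - specificity)   # how much a positive result raises the odds of disease
    lr_negative = (1 - sensitivity) / specificity   # how much a negative result lowers the odds of disease
    dor = lr_positive / lr_negative                 # diagnostic odds ratio
    return lr_positive, lr_negative, dor


# Illustrative values only, not taken from the included studies.
lr_pos, lr_neg, dor = likelihood_ratios(0.80, 0.90)
print(round(lr_pos, 2), round(lr_neg, 3), round(dor, 1))
```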

Fig. 3 Sensitivity and specificity of the QFT-IT

Fig. 4 DOR and SROC curve of the QFT-IT

Fig. 5 Sensitivity and specificity of the T-SPOT

Fig. 6 DOR and SROC curve of the T-SPOT

Sensitivity and Specificity of the Tuberculin Skin Test

Sixteen studies evaluated the tuberculin skin test. The pooled estimates of sensitivity and specificity were 0.665 (95 % CI 0.635–0.693) and 0.633 (95 % CI 0.605–0.661), respectively (Fig. 7). In addition, the positive likelihood ratio, negative likelihood ratio, and pooled DOR of the TST were 1.825 (95 % CI 1.351–2.464), 0.556 (95 % CI 0.385–0.804), and 3.810 (95 % CI 1.837–7.902), respectively (Fig. 8).

Fig. 7 Sensitivity and specificity of the TST

Fig. 8 DOR and SROC curve of the TST

SROC Curve of Interferon-γ Release Assays and the Tuberculin Skin Test

The area under the curve (AUC) was used to measure the accuracy of the tests. The SROC curves of the interferon-γ release assays and the tuberculin skin test are shown in Figs. 4, 6, and 8. The AUCs of QFT-IT and T-SPOT were 0.8818 and 0.9006, respectively, both higher than the AUC of the TST (0.7301). From the SROC curves, we hypothesized that all three methods might have a threshold effect. The analyses of the diagnostic threshold of QFT-IT, T-SPOT, and TST are presented in Figs. 6, 7, and 8. Their Spearman correlation coefficients were −0.375, −0.227, and −0.063, respectively, which we interpreted as supporting the hypothesis that all three methods had a threshold effect.
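A Spearman coefficient of this kind can be computed as sketched below. The sketch assumes the coefficient is taken between per-study sensitivity and specificity, where a negative value means that studies with higher sensitivity tend to have lower specificity; the helper names and the data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    # Spearman's rho is the Pearson correlation of the ranks.
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-study estimates, for illustration only.
sensitivities = [0.62, 0.71, 0.85, 0.90, 0.78]
specificities = [0.93, 0.88, 0.70, 0.74, 0.81]
print(round(spearman(sensitivities, specificities), 3))
```

Coefficients close to zero indicate little monotonic association between sensitivity and specificity across studies.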

Discussion

This meta-analysis aimed to assess the accuracy of IGRAs, which have been regarded as an accurate diagnostic method for latent and active TB because of their particular advantages. The summary results, derived from 16 studies, indicated that both the sensitivity and the specificity of IGRAs were higher than those of the conventional TST. Moreover, the DOR of the TST was markedly lower than that of the IGRAs, which means that the IGRAs had distinctly higher diagnostic value than the TST. In addition, the reported sensitivity of the TST ranged from 0.258 to 1.000 and its specificity from 0.336 to 0.919, while the sensitivity of the T-SPOT ranged from 0.557 to 0.929 and its specificity from 0.493 to 1.000. Hence, these results support the better accuracy of IGRAs, although the estimates varied considerably across studies.

Through sub-analysis, we found that young age, latent tuberculosis, immunodeficiency, and concurrent HIV infection may account for the instability of the data [23–26].

Because BCG mass vaccination has been discontinued in countries with a low incidence of TB, NTM infection has increased [27, 28], and the TST fails to distinguish NTM infection from TB. IGRAs, however, are less affected by the cross-reactivity induced by nontuberculous mycobacteria and bacille Calmette–Guérin vaccination, which demonstrates their superior capability of reducing over-diagnosis of TB and guiding clinical management.

About one-third of the world’s population, an estimated 2 billion people, has been infected with M. tuberculosis, and people with latent tuberculosis infection (LTBI) have a 10 % lifetime risk of developing active TB (ATB) [29]. There is therefore an urgent need to develop a gold standard for the early diagnosis of LTBI that can separate LTBI from ATB. Although an increasing number of studies have demonstrated that IGRAs improve the diagnosis of LTBI because of their better specificity [20], the false-positive rate remains high, which may lead to many unnecessary treatments and, in turn, to drug resistance. More research is therefore needed to evaluate the diagnostic value of IGRAs.

The risk of tuberculosis (TB) in patients with an immunocompromised medical condition is greater than that in the general population [30]. Several studies conducted in South Korean populations have shown that IGRAs had a higher diagnostic sensitivity in active TB patients who were immunosuppressed [31–33]. The sensitivity of the T-SPOT was 0.720 (95 % CI 0.542–0.862), which was far higher than that of the TST (0.423), while the specificity of the T-SPOT was 0.423 and that of the TST was 0.918. The low specificity of the T-SPOT limits its ability to rule in TB disease, and the low sensitivity of the TST limits its ability to rule out TB disease. Therefore, in immunocompromised TB patients, IGRAs might replace T-SPOT or TST for a definite diagnosis.

In addition, it is difficult to detect tuberculosis infection in HIV-infected patients, since the decreased numbers of CD4+ and CD8+ cells in their immune system bring about immune escape [34]. In this study, the sensitivity of the T-SPOT was 0.413 (95 % CI 0.354–0.488), which might be inaccurate because of different microenvironments in patients with other latent diseases or because of the diverse treatments received. This value was also far lower than the pooled sensitivity of 72 % reported for low- and middle-income countries in a meta-analysis [35]. Another limitation of IGRA testing among HIV-infected patients is the rate of indeterminate results. Previous reports from the UK described indeterminate T-SPOT results in 2–7.4 % of HIV-infected patients [36–38]; however, this was not observed in the present study. As for the TST, its sensitivity of 12.9 % means that it failed as a diagnostic method for TB patients infected with HIV.

In conclusion, IGRAs were superior to the TST as a diagnostic approach for tuberculosis because their higher sensitivity and specificity help to identify TB patients. Their higher DOR and accuracy also make them a valid alternative to the TST. However, neither the IGRAs nor the TST showed ideal stability, which restricts their use. Furthermore, the cost and trauma of both should be considered. We therefore conclude that, in the absence of a more precise diagnostic method, IGRAs could be a priority option for detecting TB patients.