Introduction

A solitary pulmonary nodule (SPN) is a focal, round, high-density solid opacity in the lung, less than 3 cm in diameter, that is frequently first detected on chest radiography or computed tomography (CT) images. Although about 20–30% of people have SPNs, 90% of these nodules will not develop into malignancies. Oncologists therefore face the dilemma of choosing between biopsy and observation, and need a non-invasive approach to distinguish malignant from benign nodules. The optimal technique would both identify malignancy early and avoid unnecessary surgery.

Currently, the guidelines [1, 2] recommend using CT to evaluate SPNs. However, the radiation exposure and low diagnostic specificity of CT for SPNs cannot be ignored. In addition, dynamic magnetic resonance imaging (MRI), fluorine-18 fluorodeoxyglucose (18F-FDG) positron emission tomography (PET), and technetium-99m (99mTc) depreotide single photon emission computed tomography (SPECT) have recently been applied to the diagnosis of SPNs in routine clinical practice. Nevertheless, a standard non-invasive clinical strategy for SPNs remains to be established. It is therefore imperative to determine the diagnostic accuracy of the four imaging modalities in distinguishing malignant from benign SPNs on the basis of large-scale data.

In 2008, Paul Cronin et al. compared dynamic CT, dynamic MRI, 18F-FDG PET, and 99mTc-depreotide SPECT for SPNs [3]. However, that analysis lacked some crucial parameters, such as the likelihood ratio (LR), diagnostic score, and diagnostic odds ratio (DOR). Moreover, during the ensuing decade, numerous articles on the diagnosis of malignant SPNs with the four imaging modalities were published. Therefore, we conducted this study on a large scale (73 cohorts incorporating 7956 individuals) and with more parameters to identify the most competent approach for differentiating malignant from benign SPNs.

Materials and methods

Literature search

Studies were identified by a systematic electronic search of the published literature for abstracts of relevant studies. MEDLINE, PubMed, EMBASE, and the Cochrane Library were searched, and the search was updated to Nov 26, 2019. The following basic search terms were used: “computed tomography”, “CT”, “dynamic computed tomography”, “dynamic CT”, “dynamic contrast-enhanced computed tomography”, “DCE CT”, “magnetic resonance imaging”, “MRI”, “dynamic magnetic resonance imaging”, “dynamic MRI”, “dynamic contrast-enhanced magnetic resonance imaging”, “DCE MRI”, “positron emission tomography”, “PET”, “fluoro-2-deoxy-D-glucose positron emission tomography”, “FDG PET”, “single photon emission computed tomography”, “SPECT”, “99mTc-depreotide single photon emission computed tomography”, “99mTc-depreotide SPECT”, “solitary pulmonary nodule”, “SPN”, “solitary lung nodule”, “diagnosis”, “evaluation”, “diagnostic test”, “prediction”. Full-text articles were reviewed if the abstracts did not provide sufficient information. Moreover, the reference lists of related articles were scrutinized for additional studies. Reviews, case reports, letters to the editor, editorial comments, and conference abstracts were excluded. The search was carried out without any language restriction.

Selection of studies

Initially, two researchers independently screened titles and abstracts and then scrutinized the full-text articles to identify relevant studies. Finally, we assessed the eligibility and methodologic quality of the trials and summarized the diagnostic accuracy findings.

Inclusion criteria

Inclusion criteria were as follows: (1) the parameters of dynamic CT, dynamic MRI, 18F-FDG PET, or 99mTc-depreotide SPECT in the evaluation of SPNs were available; (2) imaging results were compared with histologic sample (percutaneous or surgical biopsy or surgical resection) findings for more than half of the patients; (3) detailed raw data (i.e., true-positive, true-negative, false-positive, and false-negative findings) were available; (4) the sample size was ≥ 10 patients.

Data extraction

Two investigators independently extracted data from all the included trials. All of the eligible studies contained the following information: the name of the first author, continent of the research, year of publication, study design, and number of patients. Each investigator independently collected the true-positive, true-negative, false-positive, and false-negative counts of the imaging results.

Statistical analyses

Test performance metrics

Sensitivity, specificity, LRs, diagnostic score, and DOR with 95% confidence intervals (CIs) were recalculated from the contingency table of true-positive, true-negative, false-positive, and false-negative results.
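As a minimal sketch of how these per-study metrics follow from one 2 × 2 table, the Python function below computes the point estimates and a Wald-type 95% CI for the DOR. The function name, the 0.5 continuity correction for zero cells, and the definition of the diagnostic score as the natural logarithm of the DOR are assumptions made for illustration (the last is consistent with the pooled values reported in the Results); this is not the code used in the study.

```python
import math

def diagnostic_metrics(tp, fp, fn, tn, z=1.96):
    """Point estimates from a single 2x2 table (hypothetical helper)."""
    if min(tp, fp, fn, tn) == 0:
        # Assumed continuity correction: add 0.5 to every cell when any cell is zero
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity)      # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity      # negative likelihood ratio
    dor = (tp * tn) / (fp * fn)                # diagnostic odds ratio (= PLR / NLR)
    score = math.log(dor)                      # diagnostic score, assumed to be ln(DOR)
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)  # standard error of ln(DOR)
    dor_ci = (math.exp(score - z * se), math.exp(score + z * se))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr,
        "nlr": nlr,
        "dor": dor,
        "diagnostic_score": score,
        "dor_ci": dor_ci,
    }

# Hypothetical study: 90 true positives, 20 false positives,
# 10 false negatives, 80 true negatives
metrics = diagnostic_metrics(90, 20, 10, 80)
```

For this hypothetical table the sensitivity is 0.90, the specificity 0.80, the PLR 4.5, the NLR 0.125, and the DOR 36 (diagnostic score about 3.58).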

Meta-analysis model

Parameters were calculated by using a bivariate mixed-effects regression model. The standard output of the bivariate model includes the mean logit sensitivity and specificity with their standard errors and 95% CIs, and estimates of the between-study variability in logit sensitivity and specificity and of the covariance between them. Summary sensitivity, specificity, the corresponding positive likelihood ratios (PLRs), negative likelihood ratios (NLRs), diagnostic score, and DOR are derived as functions of the estimated model parameters. The LRs indicate by how much a given test result would raise or lower the probability of having disease. The value of a DOR ranges from 0 to infinity, with higher values indicating better discriminatory test performance. The DOR is a single summary measure, with the caveat that the same odds ratio may be obtained with different combinations of sensitivity and specificity. The area under the summary receiver-operating characteristic (SROC) curve serves as a global measure of test performance. The following guidelines were used to interpret intermediate SROC values: low (0.5 ≤ AUC ≤ 0.7), moderate (0.7 ≤ AUC ≤ 0.9), and high (0.9 ≤ AUC ≤ 1) accuracy. In this study, all estimations were performed by using the MIDAS (bivariate mixed-effects regression model) module in Stata 12.0 software.
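To illustrate how the summary measures can be derived from the model parameters, the sketch below back-transforms mean logit sensitivity and specificity and computes the PLR, NLR, DOR, and diagnostic score from them. It also shows how an LR shifts a pre-test probability through the odds form of Bayes' theorem. The function names and input values are hypothetical, and only point estimates are shown; the CIs reported by the Stata module are not reproduced here.

```python
import math

def inv_logit(x):
    """Back-transform a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def summary_measures(mean_logit_sens, mean_logit_spec):
    """Summary point estimates as functions of the mean logits (hypothetical helper)."""
    sens = inv_logit(mean_logit_sens)
    spec = inv_logit(mean_logit_spec)
    plr = sens / (1 - spec)
    nlr = (1 - sens) / spec
    dor = plr / nlr
    return {
        "sensitivity": sens,
        "specificity": spec,
        "plr": plr,
        "nlr": nlr,
        "dor": dor,
        # ln(DOR) equals logit(sensitivity) + logit(specificity)
        "diagnostic_score": math.log(dor),
    }

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability with an LR."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Hypothetical mean logits, chosen only for illustration
summary = summary_measures(2.0, 1.0)

# Hypothetical pre-test probability of 0.20 combined with an LR of 6.0
probability = post_test_probability(0.20, 6.0)
```

With a hypothetical pre-test probability of 0.20, a positive result from a test with a PLR of 6.0 raises the probability of disease to 0.60, whereas an LR of 1 leaves it unchanged.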

Assessment of quality and heterogeneity

Two reviewers independently assessed the quality of each study based on the prospectively developed criteria that were modified from well-accepted methodologic standards for evaluating quality in diagnostic test research; disagreements were resolved through discussion and consensus. The following nine criteria were evaluated, and a grade of 1 was given for each criterion that was fulfilled: prospective study design (prodesign), sample size of 30 or more subjects (ssize30), the uniform pathological biopsy reference standard test (fulverif), sufficient description of the reference standard (refdescr), adequate description of the validated test (testdescr), sufficient clinical description of subjects (subjdescr), adequate reporting of results (report), broad population (brdspect), and blinded interpretation of test results (blinded).
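The scoring scheme above amounts to a simple checklist sum. The sketch below, which assumes each study is recorded as a dictionary keyed by the nine criterion labels, computes the 0 to 9 quality score of a study and the percentage of studies meeting each criterion. The function names and the example records are hypothetical.

```python
CRITERIA = (
    "prodesign", "ssize30", "fulverif", "refdescr", "testdescr",
    "subjdescr", "report", "brdspect", "blinded",
)

def quality_score(study):
    """Sum of fulfilled criteria (one point each, range 0 to 9)."""
    missing = [c for c in CRITERIA if c not in study]
    if missing:
        raise KeyError(f"criteria not assessed: {missing}")
    return sum(1 for c in CRITERIA if study[c])

def percent_meeting(studies):
    """Percentage of studies that fulfil each criterion."""
    return {
        c: 100.0 * sum(1 for s in studies if s[c]) / len(studies)
        for c in CRITERIA
    }

# Two hypothetical studies: one meets every criterion, the other is
# retrospective and was not interpreted blindly
study_a = {c: True for c in CRITERIA}
study_b = dict(study_a, prodesign=False, blinded=False)

scores = [quality_score(study_a), quality_score(study_b)]  # [9, 7]
percentages = percent_meeting([study_a, study_b])
```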

Heterogeneity among the studies was estimated graphically by Galbraith (radial) plot and statistically by I². A value of 0% indicates no observed heterogeneity, and values ≥ 50% may be considered substantial heterogeneity. An advantage of I² is that it is independent of the number of studies in the meta-analysis.
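For readers unfamiliar with the statistic, the sketch below computes I² in its common univariate form, from Cochran's Q with inverse-variance weights. This is an illustrative assumption; the module used in this study may derive I² differently for the bivariate model. The function name and the example inputs are hypothetical.

```python
def i_squared(effects, variances):
    """I^2 (%) from Cochran's Q using inverse-variance weights."""
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    # Proportion of total variation attributable to between-study heterogeneity
    return max(0.0, (q - df) / q) * 100.0

# Hypothetical ln(DOR) estimates with unit variances
print(i_squared([0.0, 2.0], [1.0, 1.0]))  # Q = 2, df = 1 -> 50.0
```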

Publication bias

Publication bias arises when the published studies represent only part of the research on a specific topic. It was assessed visually using a funnel (scatter) plot, which takes a symmetrical funnel shape when publication bias is absent [4], and statistically by a regression test, in which P < 0.05 for the slope coefficient indicates significant asymmetry.
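The Results report Deeks' funnel plot asymmetry test. A minimal sketch of that regression, under stated assumptions, is given below: the ln(DOR) of each study is regressed on the inverse square root of its effective sample size (ESS), weighted by ESS. Adding 0.5 to every cell before computing the DOR and the function names are assumptions for illustration. Only the slope, its standard error, and the t statistic are returned; a two-sided P value would additionally require the Student t distribution with n − 2 degrees of freedom.

```python
import math

def weighted_line_fit(xs, ys, ws):
    """Weighted least-squares line; returns (slope, intercept, se_slope)."""
    n = len(xs)
    if n < 3:
        raise ValueError("at least three studies are needed")
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        raise ValueError("all studies have the same effective sample size")
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    rss = sum(w * (y - intercept - slope * x) ** 2 for w, x, y in zip(ws, xs, ys))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, se_slope

def deeks_asymmetry(tables):
    """tables: iterable of (tp, fp, fn, tn) counts, one tuple per study."""
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in tables:
        a, b, c, d = (v + 0.5 for v in (tp, fp, fn, tn))  # assumed correction
        n_diseased, n_healthy = tp + fn, fp + tn
        ess = 4.0 * n_diseased * n_healthy / (n_diseased + n_healthy)
        xs.append(1.0 / math.sqrt(ess))
        ys.append(math.log((a * d) / (b * c)))
        ws.append(ess)
    slope, intercept, se = weighted_line_fit(xs, ys, ws)
    t_stat = slope / se if se > 0 else math.inf
    return {"slope": slope, "intercept": intercept, "se": se,
            "t": t_stat, "df": len(xs) - 2}

# Hypothetical 2x2 tables from four studies
result = deeks_asymmetry([(45, 5, 5, 45), (90, 20, 10, 80),
                          (30, 10, 10, 50), (18, 4, 2, 16)])
```

A slope close to zero relative to its standard error is compatible with a symmetrical funnel plot.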

Results

Study identification

Initially, the search yielded 10,458 potential literature citations. Subsequently, 8572 were excluded as irrelevant studies, non-clinical trials, reviews, letters, or case reports, and 1391 as duplicates. After the related terms in the titles and abstracts were verified, 392 irrelevant studies were removed, and 93 studies with unsuitable designs were eliminated after full-text analysis. Eventually, 73 published trials met the inclusion criteria (see the supplemental data for details) (Fig. 1). Seventeen studies assessed SPNs with 2 imaging modalities, and 3 studies with 3 imaging modalities.

Fig. 1 Flow diagram of search strategies

Study characteristics

The 73 studies, incorporating 7956 patients, were published between 1990 and 2019. Among them, 49 trials were prospective; 25 trials were performed in America, 16 trials in Europe, and 28 trials in Asia. The average age of patients was 62.5 years, and the average number of nodules per study was 108. The final diagnosis for all subjects was confirmed pathologically in 37 studies, while the final diagnosis was made either pathologically or clinically in 36 studies (Table 1).

Table 1 Information of included studies

Diagnostic parameters and summary assessment

For 31 dynamic CT studies, the pooled sensitivity was 0.92, 95% CI (0.89–0.95) (Fig. 2 CT1); the pooled specificity 0.64, 95% CI (0.54–0.74) (Fig. 2 CT1); the pooled PLR 2.6, 95% CI (2.0–3.4) (Fig. 2 CT2); the pooled NLR 0.12, 95% CI (0.08–0.17) (Fig. 2 CT2); the pooled diagnostic score 3.10, 95% CI (2.62–3.59) (Fig. 2 CT3); and the pooled DOR 22, 95% CI (14–36) (Fig. 2 CT3). The area under the SROC curve was 0.91, 95% CI (0.88–0.93) (Fig. 3a).

Fig. 2 The diagnostic parameters with corresponding 95% CIs for SPNs. CT1: sensitivity and specificity of dynamic CT; CT2: PLR and NLR of dynamic CT; CT3: diagnostic score and DOR of dynamic CT; MR1: sensitivity and specificity of dynamic MRI; MR2: PLR and NLR of dynamic MRI; MR3: diagnostic score and DOR of dynamic MRI; PET1: sensitivity and specificity of 18F-FDG PET; PET2: PLR and NLR of 18F-FDG PET; PET3: diagnostic score and DOR of 18F-FDG PET; SPECT1: sensitivity and specificity of 99mTc-depreotide SPECT; SPECT2: PLR and NLR of 99mTc-depreotide SPECT; SPECT3: diagnostic score and DOR of 99mTc-depreotide SPECT

Fig. 3 Areas under SROC curves. a For dynamic CT; b for dynamic MRI; c for 18F-FDG PET; d for 99mTc-depreotide SPECT

With regard to 14 dynamic MRI studies, the pooled sensitivity was 0.92, 95% CI (0.86–0.95) (Fig. 2 MR1); the pooled specificity 0.85, 95% CI (0.77–0.90) (Fig. 2 MR1); the pooled PLR 6.0, 95% CI (3.9–9.2) (Fig. 2 MR2); the pooled NLR 0.10, 95% CI (0.06–0.17) (Fig. 2 MR2); the pooled diagnostic score 4.12, 95% CI (3.41–4.82) (Fig. 2 MR3); and the pooled DOR 61, 95% CI (30–124) (Fig. 2 MR3). The area under the SROC curve was 0.94, 95% CI (0.92–0.96) (Fig. 3b).

Concerning 41 18F-FDG PET studies, the pooled sensitivity was 0.90, 95% CI (0.86–0.93) (Fig. 2 PET1); the pooled specificity 0.73, 95% CI (0.65–0.79) (Fig. 2 PET1); the pooled PLR 3.3, 95% CI (2.6–4.2) (Fig. 2 PET2); the pooled NLR 0.14, 95% CI (0.10–0.19) (Fig. 2 PET2); the pooled diagnostic score 3.16, 95% CI (2.69–3.64) (Fig. 2 PET3); and the pooled DOR 24, 95% CI (15–38) (Fig. 2 PET3). The area under the SROC curve was 0.90, 95% CI (0.87–0.92) (Fig. 3c).

Regarding 10 99mTc-depreotide SPECT studies, the pooled sensitivity was 0.93, 95% CI (0.88–0.96) (Fig. 2 SPECT1); the pooled specificity 0.70, 95% CI (0.56–0.81) (Fig. 2 SPECT1); the pooled PLR 3.1, 95% CI (2.0–4.8) (Fig. 2 SPECT2); the pooled NLR 0.10, 95% CI (0.06–0.17) (Fig. 2 SPECT2); the pooled diagnostic score 3.43, 95% CI (2.63–4.22) (Fig. 2 SPECT3); and the pooled DOR 31, 95% CI (14–68) (Fig. 2 SPECT3). The area under the SROC curve was 0.93, 95% CI (0.91–0.95) (Fig. 3d).

In summary, all four imaging modalities were promising in distinguishing malignant from benign SPNs, and the differences among them were not significant.

Study quality scores

The study quality scores of the 31 dynamic CT trials (Fig. 4a), 14 dynamic MRI trials (Fig. 4b), and 10 99mTc-depreotide SPECT trials (Fig. 4d) ranged from 5 to 9, while those of the 41 18F-FDG PET trials ranged from 4 to 9 (Fig. 4c).

Fig. 4 Graphs illustrate the quality criteria of studies assessed in this study: a for dynamic CT; b for dynamic MRI; c for 18F-FDG PET; d for 99mTc-depreotide SPECT (the percentages of trials that met the given criteria)

Study heterogeneity and publication bias

Concerning heterogeneity, the I² was 99% for dynamic CT (Fig. 5 CT1), 93% for dynamic MRI (Fig. 5 MR1), 99% for 18F-FDG PET (Fig. 5 PET1), and 66% for 99mTc-depreotide SPECT (Fig. 5 SPECT1).

Fig. 5 The heterogeneity (Galbraith Graphs and univariable meta-regression) and asymmetry test (Deeks’ Funnel Plot) of studies. CT1: Galbraith Graph for dynamic CT; CT2: univariable meta-regression for dynamic CT; CT3: Deeks’ Funnel Plot for dynamic CT; MR1: Galbraith Graph for dynamic MRI; MR2: univariable meta-regression for dynamic MRI; MR3: Deeks’ Funnel Plot for dynamic MRI; PET1: Galbraith Graph for 18F-FDG PET; PET2: univariable meta-regression for 18F-FDG PET; PET3: Deeks’ Funnel Plot for 18F-FDG PET; SPECT1: Galbraith Graph for 99mTc-depreotide SPECT; SPECT2: univariable meta-regression for 99mTc-depreotide SPECT; SPECT3: Deeks’ Funnel Plot for 99mTc-depreotide SPECT

Univariable meta-regression was performed to explore the sources of heterogeneity for dynamic CT (Fig. 5 CT2), dynamic MRI (Fig. 5 MR2), 18F-FDG PET (Fig. 5 PET2), and 99mTc-depreotide SPECT (Fig. 5 SPECT2).

The P values of Deeks’ Funnel Plot Asymmetry Test for dynamic CT (Fig. 5 CT3), dynamic MRI (Fig. 5 MR3), 18F-FDG PET (Fig. 5 PET3), and 99mTc-depreotide SPECT (Fig. 5 SPECT3) were 0.65, 0.21, 0.74, and 0.61, respectively. These results suggest that the degree of publication bias was trivial.

Discussion

Worldwide, lung cancer ranks first among malignancies. Early diagnosis and treatment play a pivotal role in achieving a favorable prognosis. Notably, the initial identifiable manifestation of lung cancer is predominantly an SPN, which is found on about 1 in 500 regular chest radiographs. However, approximately half of indeterminate SPNs are confirmed as benign lesions through transbronchial/transthoracic biopsy or surgery [5,6,7]. Moreover, invasive procedures carry high costs and a risk of complications and mortality [8, 9]. Hence, a non-invasive diagnostic approach for SPNs is imperative. Furthermore, a preferred technique should combine accuracy, reliability, availability, and cost-effectiveness [10]. We therefore compared the accuracy of the four imaging modalities for SPNs in an attempt to find an optimal non-invasive diagnostic approach.

First, dynamic MRI, which involves no ionizing radiation and is widely applicable, has an advantage in distinguishing malignant from benign SPNs, especially for pulmonary lesions with a diameter > 5 mm [11]. In addition, when compared with the other three imaging modalities, dynamic MRI showed comparable diagnostic accuracy for SPNs in our study. Conventionally, MRI was prevented from becoming a routine imaging method for SPNs by known artifacts that result from tissue–air transitions and by relatively low spatial resolution. However, several advances in MRI technique (e.g., DWI [12,13,14], the 3D GRE VIBE sequence [11, 15], and ultrafast imaging techniques [16,17,18]), together with the use of kinetic and morphologic parameters, have improved image quality in lung MRI and the visualization of SPNs [19], which makes dynamic MRI a promising alternative standard approach for SPNs.

Secondly, dynamic CT is distinctive in evaluating tumor vascularity and is routinely applied to distinguish malignant from benign SPNs. Moreover, recent technological advances in the form of multidetector-row CT (MDCT) contribute to the accurate evaluation of hemodynamics [20,21,22]. Additionally, the sensitivity of dynamic CT for SPNs has been further improved by combining net enhancement with wash-out patterns in the delayed dynamic phase. However, a drawback is that the specificity of dynamic CT is relatively low, which was also confirmed in our study. The reason is that some benign nodules also enhance on dynamic contrast-enhanced CT. Another drawback is the radiation exposure of CT. Although low-dose CT for screening SPNs may reduce radiation to some degree, it comes at the cost of lower resolution.

Thirdly, 18F-FDG PET, a non-invasive functional imaging technique, has proved valuable for SPNs by measuring metabolic activity via the standardized uptake value (SUV) [23, 24]. According to reports from the United States, 18F-FDG PET can spare unnecessary surgery in approximately 15% of individuals [25]. However, a variety of factors can affect the SUV [26]: the body size of the patient, the blood glucose concentration, the time from injection to imaging, and the nodule volume. Furthermore, this technique has the drawbacks of high cost and radiation. Accordingly, as shown in our results, 18F-FDG PET possessed no advantage over dynamic MRI and CT in the identification of SPNs.

Finally, 99mTc-depreotide SPECT emerged with the introduction of receptor scintigraphy and is widely used in clinical practice. Because somatostatin receptors are overexpressed on tumor cells [27, 28], 99mTc-depreotide SPECT, which uses a 99mTc-radiolabeled somatostatin analog, has been validated for the diagnosis of malignant SPNs. In our results, the diagnostic efficacy of 99mTc-depreotide SPECT for SPNs resembled that of 18F-FDG PET. However, it also has the disadvantage of radiation.

In summary, dynamic MRI, dynamic CT, 18F-FDG PET, and 99mTc-depreotide SPECT are all promising non-invasive approaches for distinguishing malignant from benign SPNs. Dynamic MRI is the only one of the four imaging modalities that involves no radiation. Additionally, from the viewpoint of cost-effectiveness and convenience, dynamic MRI is superior to 18F-FDG PET or 99mTc-depreotide SPECT. Thus, dynamic MRI may serve as the preferred imaging modality for SPNs. With the development and accumulation of medical big data, more large-scale multicenter studies on the diagnosis of SPNs are recommended. Meanwhile, artificial intelligence (AI) in imaging is emerging. An important agenda for future research on SPNs will involve radiomics based on AI and medical big data.

Limitations

This research had two limitations: one was the heterogeneity among the included trials; the other was that the information needed for subgroup analysis by SPN size was unavailable.

Conclusion

Dynamic MRI, dynamic CT, 18F-FDG PET, and 99mTc-depreotide SPECT were all favorable non-invasive approaches for distinguishing malignant from benign SPNs. Moreover, from the viewpoint of cost-effectiveness and avoidance of radiation, dynamic MRI is recommended for SPNs.