Introduction

Ultrasound-guided fine-needle aspiration (FNA) is widely accepted as the primary diagnostic tool for the evaluation of thyroid nodules. FNA is a safe, simple, and reliable method for evaluating the thyroid nodule and has reduced the number of unnecessary thyroid surgeries. Although FNA has high diagnostic accuracy for thyroid nodules, it has several limitations: the non-diagnostic result rate ranges from 10% to 36% [1,2,3,4] and the indeterminate result rate ranges from 3% to 18% [2, 3, 5]. Furthermore, the false-negative rate of thyroid nodules with a benign FNA result reaches 13.6–56.6% when the thyroid nodules have suspicious features on ultrasound [6]. Poor specimen quality due to insufficient cellularity and preservation is considered to be the cause of misdiagnoses using FNA [7].

Core needle biopsy (CNB) has been suggested as an additional diagnostic tool to FNA. Many studies have demonstrated the successful use of CNB for thyroid nodules after initial non-diagnostic or indeterminate results on FNA [8,9,10,11,12,13,14,15]. Recent systematic reviews and meta-analyses have examined the value of CNB as a complementary diagnostic tool in thyroid nodules with initially non-diagnostic or indeterminate results on FNA [16, 17]. In addition to the complementary role of CNB, several studies have published data on the use of CNB as a primary diagnostic tool for initially detected thyroid nodules [18,19,20,21]. However, some physicians still doubt the value of CNB for the evaluation of thyroid nodules due to insufficient evidence. No systematic review with meta-analysis has assessed the role of CNB in the evaluation of initially detected thyroid nodules. Accordingly, the aim of this study was to systematically review the published literature and evaluate the prevalence of non-diagnostic results and inconclusive results and the diagnostic performance of CNB for initially detected thyroid nodules.

Materials and methods

Literature search strategy

A computerised search of the MEDLINE and Embase databases was performed to identify relevant original articles on the use of CNB as a first-line diagnostic tool for initially detected thyroid nodules until 17 August 2017. We used the following search terms: ((thyroid)) AND ((core-needle biopsy) OR (core needle biopsy) OR (CNB)). Only studies published in English were included. The bibliographies of the selected articles were screened to identify other relevant articles. Endnote version X8 (Thomson Reuters, New York, NY, USA) was used to manage the literature.

Inclusion criteria

Studies investigating the use of CNB as a first-line diagnostic tool for initially-detected thyroid nodules were eligible for inclusion. Studies or subsets of studies satisfying all of the following criteria were included:

(a) Population: Patients with thyroid nodules who underwent CNB as a first-line diagnostic tool without a previous FNA procedure.

(b) Reference standard: Because the diagnostic criteria for CNB of thyroid nodules have not been standardised, the histological results of CNB were categorised into the six categories of the Bethesda System [8, 11, 22]: non-diagnostic, benign, atypia of undetermined significance (AUS) or follicular lesion of undetermined significance (FLUS), follicular neoplasm or suspicious for follicular neoplasm, suspicious for malignancy, and malignant [21]. Non-diagnostic results included the absence of any identifiable follicular thyroid tissue [23]; the presence of follicular cells [24] that were regarded as normal thyroid tissue only [25]; and tissue containing only a few follicular cells, insufficient for diagnosis [3]. Inconclusive results were defined as Bethesda category 1 and category 3. Malignancy on CNB was defined as Bethesda category 6 (malignant). Malignant nodules were diagnosed after surgery or after biopsy. Benign nodules were diagnosed after surgery, after at least two sets of benign findings on FNA and/or CNB, or after benign findings on FNA or CNB with a stable nodule size after 1 year.

(c) Study designs: All observational studies (retrospective or prospective).

(d) Outcomes: Results reported in sufficient detail to evaluate the prevalence of non-diagnostic results, the prevalence of inconclusive results and the diagnostic performance of CNB.

Exclusion criteria

The exclusion criteria were as follows: (a) case reports and case series with a sample size smaller than ten patients; (b) review articles, editorials, letters, comments and conference proceedings; (c) studies that did not focus on the use of CNB as the first-line diagnostic tool for thyroid nodules; (d) studies with insufficient data to be included in a meta-analysis of rates of non-diagnostic results and diagnostic performance; and (e) studies with overlapping patients and data. Two reviewers (S.R.C. and C.H.S.) independently selected literature reports using a standardised form.

Data extraction

One reviewer (S.R.C.) extracted data from the studies, and the second reviewer (C.H.S.) double-checked the accuracy of the extracted data; any uncertainty was resolved through discussion. The following data were extracted from each of the selected studies onto standardised data forms: (a) study characteristics: authors, year of publication, hospital or medical school, years of patient recruitment, sample size and study design; (b) demographic and clinical characteristics of patients: mean age, nodule size and patient reference standards; (c) rates of non-diagnostic and inconclusive results of CNB; (d) diagnostic performance of CNB for the diagnosis of malignancy; and (e) complications.

Quality assessment

The methodological quality of the included studies was assessed independently by two reviewers (S.R.C. and C.H.S.) using tailored questionnaires devised according to Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) criteria [23]. Disagreements were very minor and were resolved by consensus.

Data synthesis and analyses

The pooled proportions for non-diagnostic and inconclusive results of CNB for initially detected thyroid nodules were adopted as the main indices for this meta-analysis. The meta-analytic pooling was conducted by the inverse variance method for calculating weights. The overall proportion was obtained using fixed and random effects meta-analysis of single proportions and logit transformation of proportions [26, 27]. We also obtained the confidence interval (CI) using the Clopper-Pearson interval for individual studies and used a continuity correction of 0.5 in studies with zero cell frequencies. Heterogeneity among the studies was determined by using the following methods: the Cochran Q-test for pooled estimates (p < .05 indicating significant heterogeneity) and the Higgins inconsistency index (I²) test (0–40%, may not be important; 30–60%, may represent moderate heterogeneity; 50–90%, may represent substantial heterogeneity; 75–100%, may represent considerable heterogeneity) [28, 29]. Publication bias was visually assessed by funnel plots, and statistical significance was evaluated by Egger’s test [30]. Thereafter, publication bias-adjusted pooled estimates (that is, adjusted pooled proportions) were obtained by using the trim-and-fill method [31]. If the original unadjusted pooled proportions and the trim-and-fill-adjusted pooled proportions were in agreement, the results were regarded as robust for publication bias. We also carried out multiple subgroup analyses according to the design of the study (retrospective or prospective), origin of the study (Asia vs. outside Asia) and who performed the CNB (radiologist or not).
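The pooling steps described above can be sketched in a few lines. This is a minimal illustration, not the authors' code (the analysis itself was run with the ‘meta’ package in R): the function name, the input counts and the choice of the DerSimonian-Laird estimator for the between-study variance are assumptions made for the example, since the text does not state which estimator was used.

```python
import math


def logit_pool(events, totals, cc=0.5):
    """Pool single proportions on the logit scale with inverse-variance weights.

    events[i] and totals[i] are hypothetical per-study counts.
    The continuity correction `cc` is added only to studies with a zero cell.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e, n - e
        if a == 0 or b == 0:           # zero cell: apply continuity correction
            a, b = a + cc, b + cc
        y.append(math.log(a / b))      # logit of the study proportion
        v.append(1.0 / a + 1.0 / b)    # approximate variance of the logit

    # Fixed-effect estimate: weights are the inverse variances.
    w = [1.0 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

    # Cochran Q and Higgins I^2 (as a percentage).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Between-study variance, DerSimonian-Laird (assumed estimator).
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects estimate: weights include tau^2.
    w_re = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Back-transform from the logit scale to proportions.
    return {
        "fixed": expit(fixed),
        "random": expit(pooled),
        "random_ci": (expit(pooled - 1.96 * se), expit(pooled + 1.96 * se)),
        "Q": q,
        "df": df,
        "I2": i2,
        "tau2": tau2,
    }


if __name__ == "__main__":
    # Hypothetical counts of non-diagnostic results, for illustration only.
    print(logit_pool(events=[3, 0, 12, 7], totals=[120, 45, 310, 150]))
```

Working on the logit scale keeps the back-transformed confidence limits inside the 0–1 range, which matters for proportions as small as the non-diagnostic rates pooled here.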

The secondary index of this study was the diagnostic performance of CNB in diagnosing malignancy. A threshold effect was visually assessed using coupled forest plots of sensitivity and specificity. The Spearman correlation coefficient between the sensitivity and the false-positive rate was also obtained, and a coefficient > 0.6 was considered to indicate a considerable threshold effect [32]. Hierarchical logistic regression modeling (bivariate modeling and hierarchical summary receiver operating characteristic [HSROC] modeling) was used to calculate pooled summary estimates of sensitivity and specificity [33,34,35]. An HSROC curve with a 95% confidence region and prediction region was plotted. Publication bias was evaluated using a Deeks’ funnel plot and Deeks’ asymmetry test [36].
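The threshold-effect check described above amounts to a rank correlation between per-study sensitivity and false-positive rate. The following is a rough sketch under stated assumptions: the function names and the 2×2 counts are hypothetical, and returning NaN when one of the two quantities is constant across studies is a choice made for this example.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0.0 or sy == 0.0:
        return float("nan")  # undefined when one variable does not vary
    return cov / (sx * sy)


def threshold_effect(tables, cutoff=0.6):
    """tables: list of (TP, FP, FN, TN) counts, one tuple per study."""
    sens = [tp / (tp + fn) for tp, fp, fn, tn in tables]
    fpr = [fp / (fp + tn) for tp, fp, fn, tn in tables]
    rho = spearman(sens, fpr)
    return rho, rho > cutoff


if __name__ == "__main__":
    # Hypothetical 2x2 tables, for illustration only.
    print(threshold_effect([(80, 1, 20, 99), (85, 5, 15, 95), (90, 10, 10, 90)]))
```

A strongly positive coefficient means that studies with higher sensitivity also have more false positives, which is the pattern expected when studies differ mainly in their positivity threshold.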

All statistical analyses were performed by one reviewer (C.H.S., with 4 years of experience in performing systematic reviews and meta-analyses) using the ‘meta’ package in R v. 3.4.1 (R Foundation for Statistical Computing, Vienna, Austria) and the ‘metandi’ and ‘midas’ modules in Stata 10.0 (StataCorp, College Station, TX, USA).

Results

Literature search

Our study selection process is illustrated in Fig. 1. The literature search of the Ovid-MEDLINE and Embase databases generated 639 initial articles; after duplicates were removed, 398 articles were screened for eligibility. Of the remaining articles, we excluded 379 after reviewing the titles and abstracts: 167 articles that were not in the field of interest; 126 conference abstracts; 47 review articles; 30 case reports; and nine editorial letters. The full texts of the remaining 19 articles were reviewed; an additional search of the bibliographies of these articles identified no further eligible studies. Of these 19 articles, we further excluded six after full text review: four articles that were not in the field of interest [37,38,39,40], one article with partially overlapping patient cohorts [41] and one article with insufficient data [42]. Finally, 13 eligible studies, which included a total sample size of 9,166 patients with 13,585 nodules, were included in this meta-analysis [18,19,20,21, 43,44,45,46,47,48,49,50,51].
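The counts in the selection process can be cross-checked with a short tally. The figures are those stated in the paragraph above; the variable names are illustrative only.

```python
# Counts taken from the study selection paragraph.
initial_records = 639
screened = 398

excluded_on_title_abstract = {
    "not in the field of interest": 167,
    "conference abstracts": 126,
    "review articles": 47,
    "case reports": 30,
    "editorial letters": 9,
}
excluded_on_full_text = {
    "not in the field of interest": 4,
    "partially overlapping cohorts": 1,
    "insufficient data": 1,
}

# Each stage is the previous stage minus its exclusions.
duplicates_removed = initial_records - screened
full_text_reviewed = screened - sum(excluded_on_title_abstract.values())
included = full_text_reviewed - sum(excluded_on_full_text.values())

print(duplicates_removed, full_text_reviewed, included)
```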

Fig. 1 Study flow diagram

Characteristics of the included studies

The characteristics of the 13 included studies are detailed in Table 1. The 13 original articles included 11 retrospective studies [18,19,20,21, 43, 45, 46, 48,49,50,51] and two prospective studies [44, 47]. All included studies had clear descriptions of the CNB technique and equipment. Six studies were from Asian countries (South Korea, 5; Japan, 1) and seven were from countries outside Asia (USA, 4; UK, 1; Spain, 1; Denmark, 1). The mean patient age was 51.3 years (range, 13–92 years) and the mean nodule size was 2 cm (range, 0.2–13 cm). CNB was performed by a radiologist in 11 studies and by a surgeon in two studies. Nodule composition was described in six studies; most nodules were solid [20, 21, 47, 49,50,51].

Table 1 Characteristics of the included studies

Quality assessment

The methodological quality of the included studies, assessed using the QUADAS-2 criteria, was moderate overall, and all of the studies satisfied at least four of the seven items (Supplementary Fig. 1). Eight studies had a high risk of bias in the reference standard due to a poor description or inappropriate definition in the pathological report of the CNB [18, 19, 43,44,45,46,47,48].

Pooled proportions of non-diagnostic results and inconclusive results of CNB

The prevalence of non-diagnostic results and inconclusive results of CNB was described in 12 [18,19,20,21, 44,45,46,47,48,49,50,51] and five [20, 21, 49,50,51] studies, respectively. The pooled proportions of non-diagnostic results and inconclusive results on CNB are summarised in Table 2, and the corresponding forest plots are shown in Fig. 2. The pooled proportion of non-diagnostic results was 3.5% (95% CI 2.4–5.1), and the pooled proportion of inconclusive results was 13.8% (95% CI 9.1–20.3). Considerable heterogeneity was observed among the studies for both pooled proportions (I² = 92.9% and 97%, respectively). The funnel plot and Egger’s test showed no significant publication bias for the pooled proportion of non-diagnostic results on CNB (p = .1945).

Table 2 Summary of the meta-analytic pooled proportions of non-diagnostic results and inconclusive results after core needle biopsy (CNB)

Fig. 2 Forest plots of the non-diagnostic result (a) and inconclusive result (b) of core needle biopsy (CNB)

Multiple subgroup analyses

A summary of the multiple subgroup analyses of the non-diagnostic results of CNB is presented in Table 3. The pooled proportion of non-diagnostic results of CNB was significantly higher in retrospective studies than in prospective studies (p = .01). There was no significant difference in the pooled proportion of non-diagnostic results of CNB between studies originating within Asia and outside Asia (p = .347). The pooled proportion of non-diagnostic results for CNB performed by a radiologist was 3.5%.

Table 3 Comparison of the pooled non-diagnostic results after core needle biopsy (CNB) in each subgroup

Diagnostic performance of CNB for malignancy

The diagnostic performance of CNB for malignancy was described in nine studies involving 5,010 nodules [19, 20, 43, 45, 46, 48,49,50,51]. Malignancy on CNB was defined as Bethesda category 6 (malignant). The coupled forest plots of the sensitivity and specificity of CNB for malignancy diagnosis are shown in Fig. 3. CNB demonstrated a summary sensitivity of 80% (95% CI 75–85) and a summary specificity of 100% (95% CI 93–100). The HSROC curve is shown in Fig. 4; the area under the curve was 0.93 (95% CI 0.91–0.95). Deeks’ funnel plot asymmetry test was not statistically significant (p = .07 for the slope coefficient), suggesting symmetry in the data and a low likelihood of publication bias (Supplementary Fig. 2).

Fig. 3 Forest plots of the sensitivity and specificity of core needle biopsy (CNB) for diagnosing thyroid malignancy. Horizontal lines indicate 95% CIs of the individual studies

Fig. 4 Hierarchical summary receiver operating characteristic (HSROC) curve of the diagnostic accuracy of core needle biopsy (CNB) for diagnosing thyroid malignancy

Complications

Twelve of the 13 studies reported complications of CNB [18,19,20, 43,44,45,46,47,48,49,50,51]. In these 12 studies, only two major complications occurred in 6,979 patients [19, 48]. One patient had bleeding after CNB and was admitted to the hospital for overnight observation, without any intervention [48]. The other patient had recurrent laryngeal nerve damage after CNB [19]. The damage occurred due to direct puncture of the nerve during a lateral approach of the needle and caused permanent dysphonia. No procedure-related deaths or need for intervention were reported. All other reported complications were minor and included haematoma (n = 73), soft tissue infection (n = 1) and parenchymal oedema (n = 9) [18,19,20, 43,44,45,46,47,48,49,50,51].
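The crude major complication rate follows directly from these counts: 2 of 6,979 patients, or about 0.029%. As an illustration only, and not a result reported by the study, the exact Clopper-Pearson interval named in the methods could be applied to such a count as sketched below. The function names are hypothetical, and the bounds are found by plain bisection on the binomial distribution rather than with a statistics library.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(1.0, total)


def bisect_root(f, lo, hi, iters=200):
    """Root of a monotone function f on [lo, hi] with a sign change."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    if k == 0:
        lower = 0.0
    else:
        lower = bisect_root(
            lambda p: (1.0 - binom_cdf(k - 1, n, p)) - alpha / 2.0, 0.0, 1.0
        )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    if k == n:
        upper = 1.0
    else:
        upper = bisect_root(lambda p: binom_cdf(k, n, p) - alpha / 2.0, 0.0, 1.0)
    return lower, upper


if __name__ == "__main__":
    low, high = clopper_pearson(2, 6979)
    print(f"rate = {2 / 6979:.5%}, 95% CI {low:.5%} to {high:.5%}")
```

An exact interval is preferable to a normal approximation here because the event count is very small relative to the number of patients.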

Discussion

Our meta-analysis revealed that CNB is an acceptable diagnostic tool for initially detected thyroid nodules. We found pooled proportions of 3.5% (95% CI 2.4–5.1) for non-diagnostic results and 13.8% (95% CI 9.1–20.3) for inconclusive results. With regard to the diagnostic performance for malignancy, CNB showed a summary sensitivity of 80% (95% CI 75–85) and specificity of 100% (95% CI 93–100). There were only two major complications associated with CNB among 6,979 patients. These results suggest that CNB can be used as a primary diagnostic tool for initially detected thyroid nodules as well as a diagnostic tool subsequent to FNA.

Over the last few decades, many original articles have described the role of CNB for thyroid nodules. Many studies have investigated the role of CNB in thyroid nodules with previous non-diagnostic or inconclusive results on FNA [8,9,10,11,12,13,14,15, 52,53,54]. Two systematic reviews and meta-analyses have examined the value of CNB as a complementary diagnostic tool in thyroid nodules with initially non-diagnostic or indeterminate results on FNA [16, 17]. For nodules with an initially indeterminate result on FNA, CNB showed a low non-diagnostic result rate of 1.8% (95% CI 0.4–3.2) and high specificity (100%) for the diagnosis of malignancy [16]. For nodules with an initially non-diagnostic result on FNA, CNB also showed a lower non-diagnostic result rate (6.4% vs. 36.5%) and higher diagnostic accuracy (sensitivity, 89.8% vs. 60.6%) than repeat FNA [17]. In addition, current guidelines recommend that CNB be considered when the cytological results of FNA are repeatedly inadequate or inconclusive [55,56,57]. However, because the role of CNB in indeterminate lesions is still unsettled, routine use of CNB is not currently recommended by guidelines [56].

Several recent studies have reported the potential of CNB as a primary diagnostic tool for patients with thyroid nodules [18,19,20, 51]. However, CNB is not routinely used in thyroid biopsy. This is partly due to limited evidence of the efficacy of CNB and concerns about safety [56, 58, 59]. Our meta-analysis revealed that the non-diagnostic result rate was 3.5% (95% CI 2.4–5.1), which lies towards the lower end of the 2–16% range reported for FNA [58, 60, 61]. The sensitivity for diagnosis of malignancy was 80%, which is slightly higher than the reported sensitivity of FNA of 74% [20]. The advantages of CNB may be explained by its ability to sample large amounts of tissue, assess histological architecture (rather than cytology), and function with a low rate of operator dependence, if targeting of thyroid nodules is successful [8]. In addition, our data support the safety of the CNB procedure, with only two reported major complications in 6,979 patients who underwent CNB. One patient had bleeding after CNB and was admitted to the hospital for overnight observation, without the need for an intervention [48]. The other patient had recurrent laryngeal nerve damage after CNB [19], caused by direct puncture of the nerve during a lateral approach of the needle. To prevent this complication, the transisthmic approach is recommended [62]. By using a transisthmic approach, the operator can prevent direct injury to the recurrent laryngeal nerve and can carefully measure the safe distance from the needle tip before the stylet is fired.

This meta-analysis also showed considerable heterogeneity in the pooled proportions for CNB, which limits the general applicability of the summary estimates. To explore the heterogeneity, we performed subgroup analyses according to the operator, study design and origin of the study. In subgroup analysis, the non-diagnostic result rate of retrospective studies was significantly higher than that of prospective studies, which seems to be due to the fact that the two prospective studies were published in 1994 and 2001, before the development of modern techniques and devices of CNB. There was no significant difference in the pooled non-diagnostic results of CNB between studies originating within Asia and outside Asia. Thus, with modern techniques and devices, CNB may be a more effective diagnostic tool for thyroid nodules, regardless of the region.

This study had some limitations. First, the diagnostic categories of CNB specimens have not been standardised. Eight studies had a high risk of bias in the reference standard due to a poor description or inappropriate definition in the pathological report of the CNB. However, the pathological criteria for non-diagnostic results and malignancy of CNB specimens were identical in almost all studies. Second, 11 of the 13 included studies were retrospective. However, we used validated systematic review methods and reported our data according to standard reporting guidelines, including the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [63] and the guidelines of the Handbook for Diagnostic Test Accuracy Reviews published by the Cochrane Collaboration [64]. Third, our meta-analysis showed considerable heterogeneity in the pooled proportions. Other variables that might affect the CNB result, such as operator experience, nodule characteristics and number of needle passes, were not evaluated by subgroup analysis because it was difficult to extract accurate data for our meta-analysis. Finally, we only included articles in English, which could have resulted in an overestimation or underestimation of the results. In addition, we excluded grey literature, such as letters, case reports, conference abstracts and unpublished data, which may have caused a publication bias. However, it was difficult to extract accurate data for the meta-analysis from these types of publications.

In conclusion, our systematic review and meta-analysis indicated that CNB has a low non-diagnostic result rate, high diagnostic accuracy and a low major complication rate for initially detected thyroid nodules. CNB may therefore be a feasible diagnostic tool for patients with initially detected thyroid nodules.