Abstract
Objectives
Ultrasound (US)-guided fine needle aspiration cytology (FNAC) and thyroglobulin measurement (FNA-Tg) are two common methods for confirming lymph node metastases (LNM) in patients with differentiated thyroid carcinoma (DTC). This meta-analysis compared the diagnostic performance of FNAC alone, FNA-Tg alone, and the two in combination.
Methods
Eligible articles were selected according to predefined criteria, and their quality was evaluated as per the QUADAS-2 checklist. We calculated pooled sensitivity (Se), specificity (Sp), positive/negative likelihood ratio, and diagnostic odds ratio (DOR), and plotted the summary receiver operating characteristic (SROC) curve using the Meta-DiSc1.4 software.
Results
Twenty-one studies pooling 1662 malignant and 1279 benign LNs from 2712 patients with DTC were included. The results showed that FNAC was more specific (pooled Sp, 0.98) while FNA-Tg was more sensitive (pooled Se, 0.94). FNAC and FNAC+FNA-Tg performed better postoperatively than preoperatively, whereas FNA-Tg performed better preoperatively. The combination of FNAC and FNA-Tg achieved a better diagnostic performance than either test alone (DOR 446.00, area under the curve [AUC] 0.9862), both preoperatively (DOR 378.14, AUC 0.9879) and postoperatively (DOR 788.72, AUC 0.9930). In addition, the combination of FNAC and FNA-Tg/serum-Tg ratio obtained a higher Sp (0.98) than the combination of FNAC and FNA-Tg.
Conclusion
The addition of FNA-Tg, especially the FNA-Tg/serum-Tg ratio, to FNAC could improve the diagnosis of LNM in both preoperative and postoperative patients with DTC. Since a test or test combination can perform differently depending on the clinical situation, the best-fitting test should be chosen accordingly.
Key Points
• FNAC is more specific than FNA-Tg while FNA-Tg is more sensitive than FNAC.
• The combination of FNAC and FNA-Tg could achieve a better diagnostic performance than either alone, both preoperatively and postoperatively.
• The combination of FNAC and FNA-Tg/serum-Tg ratio could reach a higher Sp than the combination of FNAC and FNA-Tg.
Introduction
Although its etiology is still unclear [1], thyroid carcinoma accounts for 95% of endocrine malignancies and is the sixth most common cancer in women in the USA, where more than 52,000 new cases in men and women were estimated for 2019 [2]. Differentiated thyroid cancer (DTC), which includes papillary thyroid carcinoma (PTC) and follicular thyroid cancer, is the most frequent subtype of thyroid cancer. Like other primary malignancies of the head and neck, PTC presents a consistent pattern of metastases to regional lymph nodes (LNs), occurring in 20 to 50% of cases [3]. Although most patients with DTC have a favorable long-term survival, 5–20% of patients will develop recurrence in LNs during postoperative follow-up, and quite a few patients even present with LN metastasis (LNM) at initial diagnosis [4].
A LN burden (ratio of positive LNs to total removed LNs) > 17% in the lateral neck is predictive for recurrence in patients of all ages, and patients with ≥ 2 metastatic LNs might benefit from total thyroidectomy and radioactive iodine therapy [5]. The 2015 guidelines of the American Thyroid Association for the diagnosis and management of DTC recommend that the central and lateral cervical LNs should be examined by routine US before surgery. LNs that are suspicious for DTC metastases on US should undergo US-guided fine needle aspiration cytology (FNAC) [6]. However, several factors may decrease the diagnostic efficacy of FNAC, like small LN size, lymphocyte infiltration, necrosis, or lack of epithelial component in cyst aspirates [7]. In that case, measurement of thyroglobulin (Tg) levels in FNA material has provided additional clues.
The reported diagnostic accuracy of FNA-Tg varies because of relatively small sample sizes, different threshold values, and distinct clinical settings (before or after thyroidectomy) [8, 9]. In addition, those studies used different detection methods for FNA-Tg measurement, such as immunoradiometric assay (IRMA), immunochemiluminometric assay (ICMA), electrochemiluminescence immunoassay (ECLIA), immunometric assay (IMA), and immunofluorometric assay (IFMA), which might affect the accuracy of analysis since different assays have different analytical sensitivity (Se), functional Se, and inter-assay variability [10, 11]. Therefore, the diagnostic performance of FNA-Tg and of its combination with FNAC in identifying LNM needs to be determined under different conditions, which prompted this meta-analysis of diagnostic test accuracy.
In the era of evidence-based medicine, decision-makers need high-quality data to support the decision to use a diagnostic test in a specific clinical situation and to choose the most relevant one. Meta-analysis of diagnostic test accuracy studies is a useful method to increase the level of validity by combining data from multiple studies. Statistics dedicated to meta-analyses of diagnostic test accuracy provide either summary points of diagnostic accuracy (e.g., summary sensitivity (Se), specificity (Sp), positive and negative likelihood ratios, and diagnostic odds ratio (DOR)) or summary lines (i.e., summary receiver operating characteristic [SROC] curves). The DOR has the advantage of expressing accuracy as a single indicator, which is closely linked to both Se and Sp and expresses the strength of the association between test result and disease [12]. The current study had two aims: (1) to compare the diagnostic efficacy of FNAC, FNA-Tg, and their combination at preoperative and postoperative stages; (2) to explore the appropriate cut-off of FNA-Tg in diagnosing LNM of DTC.
Methods
Search strategy
Based on the recommendation in the Cochrane handbook for systematic reviews of diagnostic test accuracy (handbook.cochrane.org) [13], two independent investigators predefined the retrieval strategy and conducted a systematic search of bibliographic databases (PubMed, Web of Science, the Cochrane Library, Google Scholar, and CNKI) using different combinations of the search terms “FNAC”, “FNA-Tg/Tg”, “lymph node/lymph node metastases”, and “thyroid cancer/thyroid carcinoma”, for articles published up to October 2019. References cited in each identified article were also searched manually for further eligible studies. We removed duplicated and unrelated articles by reading titles and abstracts, and then excluded articles not meeting the inclusion criteria by reviewing the full text. We did not set any language restrictions on our search; any disagreements during article searching and data extraction were resolved through team discussion. We contacted the authors for raw data if the data provided in an article were not sufficient. To avoid including duplicated samples, authors of papers that appeared to be relevant were contacted via email if available. If overlapping data by the same first author were found, the article with the largest number of subjects was included.
Inclusion criteria
All the selected studies had to meet the following criteria: (1) malignant LNs were confirmed through pathological examination after surgical resection (LNs that were not resected could be confirmed as negative if they remained negative after 2 years of follow-up); (2) detection results of FNAC+FNA-Tg were presented; (3) the numbers of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) were provided or could be calculated from sensitivity (Se), specificity (Sp), accuracy, and so on.
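Criterion (3) implies that a 2×2 table can be rebuilt when a study reports only Se, Sp, and the numbers of malignant and benign LNs. The following is a minimal sketch of that back-calculation; the function name and the example numbers are hypothetical and are not taken from any included study, and rounding to whole LNs is an assumption.

```python
def reconstruct_2x2(se, sp, n_malignant, n_benign):
    """Rebuild TP, FP, FN, TN from reported Se/Sp and group sizes.

    Assumption: counts are rounded to the nearest whole lymph node.
    """
    if not (0.0 <= se <= 1.0 and 0.0 <= sp <= 1.0):
        raise ValueError("se and sp must lie in [0, 1]")
    tp = round(se * n_malignant)   # malignant LNs the test called positive
    fn = n_malignant - tp          # malignant LNs the test missed
    tn = round(sp * n_benign)      # benign LNs the test called negative
    fp = n_benign - tn             # benign LNs the test called positive
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical study: 100 malignant and 50 benign LNs, Se 0.90, Sp 0.80
table = reconstruct_2x2(se=0.90, sp=0.80, n_malignant=100, n_benign=50)
print(table)
```

Because reported Se and Sp are usually rounded, counts recovered this way may differ by one or two LNs from the original data, so raw counts are preferable whenever a study provides them.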
Data extraction and quality assessment
Two reviewers independently extracted the following data: the first author’s name, year of publication, study design, sample size of cases and controls, pathological type of disease, preoperative or postoperative diagnosis of LN metastases, cut-offs of FNA-Tg, Se, Sp, TP, FP, TN, and FN of FNAC, FNA-Tg, and FNAC+FNA-Tg. The included articles were evaluated item-by-item according to the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) checklist [14]. Risks of bias in regard to four key domains including patient selection, index test, reference standard, and the flow and timing were evaluated, and the first three domains were applied to assess applicability concerns. If a study is judged as “low risk” on all domains, then it should be considered as “low risk of bias” or “low concern regarding applicability.” If a study is classified as “high risk” or “unclear” in one or more domains, then it should be judged as “at risk of bias” or having “concerns regarding applicability.” Disagreements generated in this process were resolved by consensus.
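The overall judgement rule described above can be written down compactly. This is only an illustrative sketch of the rule as stated in the text; the domain keys, rating strings, and function name are hypothetical labels, not part of the QUADAS-2 tool itself.

```python
BIAS_DOMAINS = (
    "patient_selection",
    "index_test",
    "reference_standard",
    "flow_and_timing",
)
# Applicability is assessed on the first three domains only.
APPLICABILITY_DOMAINS = BIAS_DOMAINS[:3]


def overall_judgement(ratings, domains):
    """Return 'low' only if every listed domain is rated 'low'.

    ratings maps a domain name to 'low', 'high', or 'unclear'.
    Any 'high' or 'unclear' rating yields 'at risk'.
    """
    missing = [d for d in domains if d not in ratings]
    if missing:
        raise KeyError(f"missing ratings for: {missing}")
    return "low" if all(ratings[d] == "low" for d in domains) else "at risk"


# Hypothetical study with an unclear flow-and-timing rating
example = {
    "patient_selection": "low",
    "index_test": "low",
    "reference_standard": "low",
    "flow_and_timing": "unclear",
}
bias = overall_judgement(example, BIAS_DOMAINS)
applicability = overall_judgement(example, APPLICABILITY_DOMAINS)
print(bias, applicability)
```

In this hypothetical case the study is "at risk" of bias but of "low" concern regarding applicability, because flow and timing is not one of the applicability domains.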
Statistical analysis
The threshold effect was evaluated by assessing the shape of the ROC plane plot and Spearman’s correlation coefficient between the logit of Se and the logit of 1-Sp; a p value for Spearman’s correlation coefficient (pa) < 0.05 indicated the existence of a threshold effect. The inconsistency index I-square (I2) and Cochran’s Q were calculated to check heterogeneity caused by non-threshold factors, and I2 > 50% was taken to indicate substantial heterogeneity. The positive likelihood ratio (PLR, Se/(1-Sp)) is the ratio of the true positive rate to the false positive rate, and the negative likelihood ratio (NLR, (1-Se)/Sp) is the ratio of the false negative rate to the true negative rate; they act as relatively independent estimators of how much a test result will change the odds of having a disease [15]. The SROC curve with its corresponding area under the curve (AUC) and Q value is a suitable way to evaluate the stability and accuracy of a test. The Q value is the intersection point of the SROC curve with the diagonal from top left to bottom right of the ROC plane, that is, the point on the curve where Se equals Sp. A p value for the difference between the b value and zero (pb) < 0.05 indicated that an asymmetrical SROC should be plotted; otherwise, a symmetrical SROC was applied. The DOR, which reflects the strength of the relationship between the diagnostic test result and the disease, was computed using the Moses-Shapiro-Littenberg model for symmetrical or asymmetrical SROC. All statistical outcomes and plots were produced using Meta-DiSc 1.4 software.
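For a single study, the accuracy measures named above follow directly from the 2×2 table. The sketch below shows the standard formulas; the function name and the example counts are hypothetical, and adding 0.5 to every cell when a cell is zero is a common convention assumed here rather than a setting confirmed by the text.

```python
def accuracy_measures(tp, fp, fn, tn, correction=0.5):
    """Compute Se, Sp, PLR, NLR, and DOR from one 2x2 table.

    Assumption: if any cell is zero, `correction` is added to all four
    cells so that the ratios stay finite.
    """
    if min(tp, fp, fn, tn) == 0:
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    se = tp / (tp + fn)        # true positive rate
    sp = tn / (tn + fp)        # true negative rate
    plr = se / (1 - sp)        # TP rate divided by FP rate
    nlr = (1 - se) / sp        # FN rate divided by TN rate
    dor = plr / nlr            # equals (tp * tn) / (fp * fn)
    return {"Se": se, "Sp": sp, "PLR": plr, "NLR": nlr, "DOR": dor}


# Hypothetical table: 90 TP, 10 FP, 10 FN, 40 TN
measures = accuracy_measures(tp=90, fp=10, fn=10, tn=40)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Note that a pooled DOR from a meta-analysis is generally not the same as the DOR computed from the pooled Se and Sp, because pooling is done across study-level estimates.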
Results
Search results and characteristics of included studies
The flow chart summarizing the study screening process is shown in Fig. 1. The initial search yielded a total of 407 records. Of the 350 records left after removal of duplicates, 281 unrelated ones were excluded on the basis of titles and abstracts. Next, the full texts of the remaining 69 articles were carefully read and screened according to our predefined inclusion criteria, and 48 articles were discarded due to the absence of data to calculate diagnostic parameters (n = 22), no FNAC+FNA-Tg combined diagnostic data (n = 10), insufficient sample size (n = 9), lack of pathological diagnosis (n = 5), and no mention of the FNA-Tg cut-off value (n = 2). Finally, 17 English [16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32] and 4 Chinese [33,34,35,36] articles including 1662 malignant and 1279 benign LNs from 2712 patients with DTC were included in our meta-analysis.
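The screening counts reported above can be checked with simple arithmetic. The short sketch below only restates the numbers given in the text; the variable names and dictionary keys are illustrative labels.

```python
identified = 407
after_duplicates = 350
excluded_on_title_abstract = 281

# Reasons for exclusion at the full-text stage, as reported in the text
full_text_exclusions = {
    "no data to calculate diagnostic parameters": 22,
    "no FNAC+FNA-Tg combined data": 10,
    "insufficient sample size": 9,
    "lack of pathological diagnosis": 5,
    "no FNA-Tg cut-off value": 2,
}

duplicates_removed = identified - after_duplicates
full_text_reviewed = after_duplicates - excluded_on_title_abstract
excluded_full_text = sum(full_text_exclusions.values())
included = full_text_reviewed - excluded_full_text

print(duplicates_removed, full_text_reviewed, excluded_full_text, included)
```

The result agrees with the 69 full-text articles, 48 exclusions, and 21 included studies (17 English and 4 Chinese) stated in the text.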
Among them, 8 studies examined LNs from preoperative patients, 7 studies included LNs from postoperative patients, 2 studies involved both preoperative and postoperative patients, and 4 studies did not clarify whether the patients had undergone thyroidectomy. Different detection methods for FNA-Tg were used: 8, 5, 4, 1, and 1 studies used IRMA, ICMA, ECLIA, IMA, and IFMA, respectively, and the remaining 2 studies did not describe their methods. Only about half of the included studies reported the analytical Se and functional Se of the detection methods for FNA-Tg. Three studies used the FNA-Tg/serum-Tg ratio while the others used FNA-Tg levels as the positivity criterion for the FNA-Tg test. All the above characteristics of the included studies are summarized in Table 1.
Study quality evaluated by QUADAS-2
Based on the QUADAS-2 tool, patient selection in all the studies satisfied the corresponding signaling questions, so it was rated as low risk of bias with low applicability concerns. However, several studies might introduce high or unclear risk of bias in terms of index test and flow and timing, because they did not interpret the diagnostic results of FNAC in a blinded manner or did not pre-establish a diagnostic threshold for FNA-Tg interpretation. In addition, some of the included studies employed different FNA-Tg cut-offs or set a cut-off value after summarizing all the detection data, which might raise concerns regarding applicability of the index test (Fig. 2).
Heterogeneity evaluation
As can be seen from the ROC plane plots of the total group, no typical “shoulder arm” pattern was present for FNAC, FNA-Tg, or FNAC+FNA-Tg (Fig. 3). All the pa values were > 0.05, including those in the preoperative and postoperative subgroups, suggesting that there was no statistically significant threshold effect among the included studies (Table 2). The pb values in the preoperative and postoperative subgroups were all > 0.05, so symmetrical SROC curves were plotted, while the pb values in the total group were all < 0.05 (Table 2), so asymmetrical SROC curves were fitted for FNAC, FNA-Tg, and FNAC+FNA-Tg, respectively.
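The two quantities behind pa and pb can be illustrated as follows. This is a minimal sketch, not the Meta-DiSc implementation: it computes Spearman’s correlation between logit(Se) and logit(1-Sp), and an unweighted least-squares fit of the Moses-Littenberg line D = a + bS, where D = logit(Se) - logit(1-Sp) and S = logit(Se) + logit(1-Sp). The p values themselves are not computed, and the three studies are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def average_ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_threshold(se_list, sp_list):
    """Spearman's rho between logit(Se) and logit(1-Sp)."""
    x = [logit(se) for se in se_list]
    y = [logit(1.0 - sp) for sp in sp_list]
    return pearson(average_ranks(x), average_ranks(y))


def moses_littenberg(se_list, sp_list):
    """Unweighted least-squares fit of D = a + b*S; returns (a, b)."""
    d = [logit(se) - logit(1.0 - sp) for se, sp in zip(se_list, sp_list)]
    s = [logit(se) + logit(1.0 - sp) for se, sp in zip(se_list, sp_list)]
    ms, md = sum(s) / len(s), sum(d) / len(d)
    b = sum((si - ms) * (di - md) for si, di in zip(s, d)) / sum(
        (si - ms) ** 2 for si in s
    )
    return md - b * ms, b


def q_star_symmetric(a):
    """Point where Se equals Sp on a symmetrical (b = 0) SROC curve."""
    return 1.0 / (1.0 + math.exp(-a / 2.0))


# Hypothetical studies that all share DOR = 36 but use different thresholds
se_values = [0.90, 0.80, 0.60]
sp_values = [0.80, 0.90, 0.96]
rho = spearman_threshold(se_values, sp_values)
a, b = moses_littenberg(se_values, sp_values)
print(rho, a, b, q_star_symmetric(a))
```

In this constructed example the studies lie on one symmetrical curve, so rho is 1 (a pure threshold effect), b is essentially zero, and exp(a) recovers the common DOR. Real data such as those in Table 2 would show weaker correlation and require the corresponding significance tests.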
For the non-threshold effect, I2 and Cochran’s Q for DOR were evaluated. In the total group, statistically significant heterogeneity existed among the studies of FNAC (I2 = 62.4%, p for Cochran’s Q = 0.0001) and FNA-Tg (I2 = 63.5%, p for Cochran’s Q = 0.0001) but not among those of FNAC+FNA-Tg (I2 = 26.24%, p for Cochran’s Q = 0.1581) (Table 3). To deal with the heterogeneity and evaluate the robustness of our results, a random effects model was employed to calculate the pooled results, the data were reanalyzed after removing studies that visibly deviated from the rest, and subgroup analysis was performed to explore the sources of heterogeneity.
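Cochran’s Q and I2 for the DOR, and a random effects pooled DOR, can be sketched as below. This assumes inverse-variance weights on the log DOR and the DerSimonian-Laird estimator of between-study variance, which is a common choice but is not confirmed by the text as the exact procedure used. The tables are hypothetical, must contain no zero cells, and the p value for Q is not computed.

```python
import math


def log_dor(tp, fp, fn, tn):
    """ln(DOR) and its approximate variance for one 2x2 table."""
    return math.log((tp * tn) / (fp * fn)), 1 / tp + 1 / fp + 1 / fn + 1 / tn


def heterogeneity(tables):
    """Cochran's Q, I2 (%), tau2, and a random effects pooled DOR."""
    if len(tables) < 2:
        raise ValueError("at least two studies are required")
    effects, variances = zip(*(log_dor(*t) for t in tables))
    w = [1 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                       # DerSimonian-Laird
    w_re = [1 / (v + tau2) for v in variances]          # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return {"Q": q, "I2": i2, "tau2": tau2, "pooled_DOR": math.exp(pooled)}


# Hypothetical (TP, FP, FN, TN) tables
homogeneous = heterogeneity([(90, 10, 10, 40)] * 3)
mixed = heterogeneity([(90, 10, 10, 40), (45, 5, 5, 45), (30, 20, 20, 30)])
print(homogeneous)
print(mixed)
```

With identical studies, Q and I2 are zero and the pooled DOR equals the common study DOR; with the mixed set, I2 exceeds the 50% level used in this analysis to flag substantial heterogeneity.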
Diagnostic performances of FNAC, FNA-Tg, and FNAC+FNA-Tg in the total group
FNAC
The summary Se and Sp with 95% CI were 0.82 (0.80–0.84) and 0.98 (0.97–0.99), respectively, indicating that FNAC is quite specific for LNM but might produce false negative results to a certain degree. The DOR of FNAC in diagnosing LNM was 183.75, and the computed AUC of the SROC curve was 0.9353 (Figs. 4, 5, and 6; Table 3). We noticed that two studies (Cunha 2007 and Holmes 2014) had an extremely high Se of FNAC (Fig. 4), and another two studies (Al-Hilli 2016 and Sigstad 2007) presented an obviously low Sp of FNAC (Fig. 5). To evaluate the stability of the results, we removed each of these four studies in turn and found that the pooled Se and Sp both remained the same. When the Al-Hilli 2016 study was removed, the DOR and AUC increased to 217.26 and 0.9644; I2 for DOR decreased a little but was still > 50% (from 62.4 to 58.0%). The DOR, AUC, and I2 for DOR did not change much when any of the other three studies was removed.
FNA-Tg
The combined Se, Sp, and DOR with 95% CI of FNA-Tg in recognizing LNM were 0.93 (0.92–0.95), 0.92 (0.91–0.93), and 155.17 (86.39–278.74), respectively, and the corresponding AUC of the SROC curve was 0.9674 (Figs. 4, 5, and 6; Table 3), indicating that FNA-Tg was more sensitive but less specific than FNAC. As shown in Fig. 5, four studies (Al-Hilli 2016, Kim 2009, Shi 2015, and Yap 2014) reported a much lower Sp of FNA-Tg than the others. We removed these studies one by one and found that the Sp, DOR, and I2 for DOR changed only slightly, suggesting that the meta-analysis results for FNA-Tg were stable.
FNAC+FNA-Tg
The pooled Se and Sp with 95% CI of FNAC+FNA-Tg were 0.97 (0.96–0.98) and 0.94 (0.93–0.95), respectively. The DOR and the AUC of the SROC curve of FNAC+FNA-Tg for identifying LNM were 446.00 and 0.9862 (Figs. 4, 5, and 6; Table 3), higher than those of either FNAC or FNA-Tg alone. Four studies (Al-Hilli 2016, Kim 2009, Sigstad 2007, and Zhang 2014) obviously deviated from the rest in Sp (Fig. 5). When the Kim 2009 study was removed, we noticed an increase in DOR (from 446.00 to 513.61) and AUC (from 0.9862 to 0.9913) and a decrease in I2 for DOR (from 23.8 to 0%). The one-by-one deletion of the other three studies did not alter the Sp or the DOR greatly (Sp fluctuated from 0.94 to 0.95, DOR ranged from 459.49 to 501.01).
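The one-by-one removal used in the three analyses above is a leave-one-out sensitivity analysis. The sketch below illustrates the idea for pooled Se and Sp, assuming they are pooled by summing counts across studies; the study labels and counts are hypothetical, and the same loop could wrap any other pooled statistic.

```python
def pooled_se_sp(tables):
    """Pool Se and Sp by summing (TP, FP, FN, TN) counts across studies."""
    tp = sum(t[0] for t in tables)
    fp = sum(t[1] for t in tables)
    fn = sum(t[2] for t in tables)
    tn = sum(t[3] for t in tables)
    return tp / (tp + fn), tn / (tn + fp)


def leave_one_out(studies):
    """Recompute pooled Se and Sp with each study removed in turn.

    studies maps a study label to its (TP, FP, FN, TN) counts.
    """
    results = {}
    for left_out in studies:
        rest = [t for name, t in studies.items() if name != left_out]
        results[left_out] = pooled_se_sp(rest)
    return results


# Hypothetical studies; "C" has a visibly lower Se and Sp than the others
studies = {
    "A": (90, 10, 10, 40),
    "B": (45, 5, 5, 45),
    "C": (30, 20, 20, 30),
}
overall = pooled_se_sp(list(studies.values()))
without = leave_one_out(studies)
print("all studies:", overall)
for name, (se, sp) in without.items():
    print(f"without {name}: Se={se:.3f}, Sp={sp:.3f}")
```

A pooled estimate is considered stable when, as reported here for FNA-Tg, removing any single study changes it only slightly.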
Diagnostic performances in preoperative and postoperative subgroups
A test can perform differently in different clinical situations. As can be seen from Table 4, the diagnostic efficacy of FNAC in the postoperative subgroup (DOR 141.82, AUC 0.9560) was higher than that in the preoperative subgroup (DOR 86.34, AUC 0.9488), while FNA-Tg performed slightly better in the preoperative subgroup (DOR 94.10, AUC 0.9727) than in the postoperative subgroup (DOR 91.37, AUC 0.9510). Notably, both before and after operation, the combination of FNAC and FNA-Tg performed better than either test alone, and the best diagnostic performance was achieved by FNAC+FNA-Tg when applied postoperatively to monitor recurrence (DOR 788.72, AUC 0.9930). The heterogeneity for FNAC+FNA-Tg was much lower than that for FNAC and FNA-Tg alone, both preoperatively and postoperatively. The heterogeneity was high in the preoperative subgroup for FNAC and FNA-Tg (I2 > 50% and p for Cochran’s Q < 0.05), indicating that non-threshold effects might affect the result.
Subgroup analysis based on different cut-offs of FNA-Tg
The included studies employed different numerical values of FNA-Tg, or the FNA-Tg/serum-Tg ratio, as the cut-off in Tg measurement, which might influence the Se and Sp of each individual study, thus introducing heterogeneity and decreasing the diagnostic performance of the pooled analysis. To explore the influence of different positivity criteria on the diagnostic performance of FNA-Tg and FNAC+FNA-Tg, we divided these studies into four subgroups: 0 < cut-off < 1.0 ng/ml, cut-off = 1.0 ng/ml, cut-off > 1.0 ng/ml, and FNA-Tg/serum-Tg ratio > 1. The numbers of studies that fell into these subgroups were 7, 6, 5, and 3, respectively. Whichever cut-off was used, the combination of FNAC and FNA-Tg performed better than either alone (Table 5). The combination of FNAC and FNA-Tg/serum-Tg ratio achieved a much higher pooled Se, Sp, and DOR (Se, 0.91; Sp, 0.98; and DOR, 448.05) than the combination of FNAC and FNA-Tg. The best Se (0.99) and AUC (0.9949) of FNAC+FNA-Tg were reached when using cut-offs < 1.0 ng/ml, while the highest Sp of FNAC+FNA-Tg was obtained when using FNA-Tg/serum-Tg ratio > 1 as the cut-off (Table 5).
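The four-way grouping by positivity criterion can be expressed as a small classification function. This is an illustrative sketch only: the function name, argument names, and the example study list are hypothetical, and the cut-offs shown are not those of the included studies.

```python
from collections import Counter


def cutoff_subgroup(cutoff_ng_ml=None, uses_serum_ratio=False):
    """Assign a study to one of the four cut-off subgroups."""
    if uses_serum_ratio:
        return "FNA-Tg/serum-Tg ratio > 1"
    if cutoff_ng_ml is None or cutoff_ng_ml <= 0:
        raise ValueError("a positive FNA-Tg cut-off in ng/ml is required")
    if cutoff_ng_ml < 1.0:
        return "0 < cut-off < 1.0 ng/ml"
    if cutoff_ng_ml == 1.0:
        return "cut-off = 1.0 ng/ml"
    return "cut-off > 1.0 ng/ml"


# Hypothetical studies: (label, cut-off in ng/ml or None, ratio used?)
example_studies = [
    ("Study 1", 0.2, False),
    ("Study 2", 1.0, False),
    ("Study 3", 10.0, False),
    ("Study 4", None, True),
    ("Study 5", 0.9, False),
]
groups = Counter(
    cutoff_subgroup(cutoff, ratio) for _, cutoff, ratio in example_studies
)
print(groups)
```

Each subgroup would then be pooled separately, as in Table 5, so that studies using comparable positivity criteria are analyzed together.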
Discussion
This study summarized the diagnostic performance of US-guided FNAC, FNA-Tg, and the combination of both for nodal metastasis assessment in preoperative and postoperative patients with DTC. Integrating the results of the included studies, we found that FNAC had a better Sp while FNA-Tg had a higher Se, and that they achieved a better diagnostic performance when analyzed jointly than independently. FNAC and its combination with FNA-Tg performed better when applied to recognize recurrence postoperatively, while FNA-Tg performed better when used to confirm staging preoperatively. On the whole, the combination of FNAC and FNA-Tg/serum-Tg ratio achieved a better diagnostic performance than the combination of FNAC and FNA-Tg. However, we should be aware that different categories of cut-off values may be suitable for different clinical situations.
A complete LN dissection is the key step of initial treatment for DTC. To reduce the number of preventative bilateral neck LN dissections and to build a personalized therapeutic strategy, surgeons should be well informed of the metastatic status of the neck LNs of each patient [37]. After initial treatment consisting of surgery with or without radioactive iodine, timely recognition of recurrence in LNs allows timely intervention. FNAC is the most direct way to examine LNM before or after operation; it is highly specific because pathologists would not make a positive diagnosis until they observe definite tumor cells under the microscope. However, the Se of FNAC was only 0.82, as suggested by the pooled analysis in the current study (Table 3). Notably, the quality of the aspirated material depends largely on US guidance, which helps to avoid targets lacking cellular components, such as cystic or necrotic areas [36]. Although the interpretation of conventional cytology is somewhat dependent on the subjective experience of pathologists, it could achieve a more robust diagnostic efficacy when combined with other tests.
Since it was first reported in 1992 [38], the Se of the Tg level in needle washout specimens for the diagnosis of LNM has been well documented. Tg is produced only by thyroid follicular cells and is involved in thyroid hormone synthesis and iodine transport [7]. Like other auxiliary diagnostic examinations, such as BRAF mutation testing in cytology samples [39], the FNA-Tg test does not cause patients extra discomfort, as it is an adjunct to FNA biopsy. Its interpretation is less dependent on personal experience than cytology. FNA-Tg could enhance FNAC diagnostic accuracy by decreasing false negative results at a modest cost [16]. Our analysis also indicated that the combination of FNAC and FNA-Tg could increase the pooled Se by 0.15 compared with FNAC alone (Table 3). The DOR values of FNAC, FNA-Tg, and FNAC+FNA-Tg were 183.75, 155.17, and 446.00 in the total group, respectively, indicating that the combination of FNAC and FNA-Tg has a stronger diagnostic power in discriminating LNM of DTC.
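Why a combination raises Se while giving up some Sp can be seen from a small per-node example. The sketch below assumes the combined test is read as positive when either FNAC or FNA-Tg is positive; the text does not state the combination rule used by the included studies, so this rule is an assumption, and the 20 lymph nodes are hypothetical.

```python
def se_sp(test_positive, malignant):
    """Se and Sp of one test against the reference standard, per LN."""
    pairs = list(zip(test_positive, malignant))
    tp = sum(1 for t, m in pairs if t and m)
    fn = sum(1 for t, m in pairs if not t and m)
    tn = sum(1 for t, m in pairs if not t and not m)
    fp = sum(1 for t, m in pairs if t and not m)
    return tp / (tp + fn), tn / (tn + fp)


def combine_either_positive(first, second):
    """Assumed rule: the combined result is positive if either test is."""
    return [a or b for a, b in zip(first, second)]


# Hypothetical data: 10 malignant LNs followed by 10 benign LNs
malignant = [True] * 10 + [False] * 10
fnac = [True] * 8 + [False] * 2 + [False] * 10              # misses 2 metastases
fna_tg = [False] + [True] * 9 + [False] * 9 + [True]        # 1 miss, 1 false positive
combined = combine_either_positive(fnac, fna_tg)

fnac_se, fnac_sp = se_sp(fnac, malignant)
tg_se, tg_sp = se_sp(fna_tg, malignant)
comb_se, comb_sp = se_sp(combined, malignant)
print(f"FNAC     Se={fnac_se:.2f} Sp={fnac_sp:.2f}")
print(f"FNA-Tg   Se={tg_se:.2f} Sp={tg_sp:.2f}")
print(f"Combined Se={comb_se:.2f} Sp={comb_sp:.2f}")
```

Under this rule the combination catches every metastasis that either test detects, but it also inherits the false positives of both, which is consistent in direction with the pooled pattern in Table 3 (Se rising from 0.82 to 0.97, Sp falling from 0.98 to 0.94 relative to FNAC alone).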
A quantitative measurement of Tg in LN aspirate samples is an objective indicator of DTC metastasis before or after thyroidectomy. However, no worldwide consensus has been reached on the diagnostic threshold value of FNA-Tg [26]. Some researchers have also mentioned that Tg values close to the cut-offs should be interpreted with caution: they may represent a minute metastatic focus or, conversely, before surgery they might represent a false positive due to contamination with blood [30]. To avoid interference from blood Tg, some studies have suggested using the FNA-Tg/serum-Tg ratio as the positivity criterion instead [33, 36]. Our subgroup analysis showed that, as a single test, FNA-Tg performed less accurately with the FNA-Tg/serum-Tg ratio than with the Tg level as cut-off, but it performed very well when combined with FNAC using the FNA-Tg/serum-Tg ratio as cut-off, especially in Sp (AUC 0.9860 and Sp 0.98). Whichever FNA-Tg cut-off was employed, diagnostic indices such as Se, PLR, NLR, DOR, AUC, and Q value of the combination of FNA-Tg and FNAC were generally better than those of FNA-Tg alone, again stressing the importance of combining FNA-Tg with FNAC in LNM detection in patients with DTC.
In a meta-analysis, intra-study variation and inter-study variation are the main causes of heterogeneity. According to our results, no threshold effect was detected among the included studies; however, the non-threshold effect evaluated by I2 and Cochran’s Q statistics was significant. Therefore, we adopted several measures to reduce the influence of heterogeneity on the overall conclusion: using a random effects model, reanalyzing data after deleting deviating studies, and conducting subgroup analyses. It should be noted that the test combination should be chosen according to the diagnostic purpose. For example, a test combination with high Se, like FNAC+FNA-Tg with a cut-off between 0 and 1.0 ng/ml, is suitable for screening in a large-scale population; a combination with high Sp, such as FNAC+FNA-Tg with FNA-Tg/serum-Tg > 1 as cut-off, is appropriate to assist in making a surgical plan. To avoid the risk of false negative results, close follow-up should be provided when a less sensitive test, such as FNAC, is applied alone preoperatively or postoperatively.
Our study has several strengths and limitations. We included a large number of LNs and patients to improve the robustness of our analysis, and we searched the gray literature and included full texts in languages other than English, which has increased the general applicability of our conclusions. Moreover, subgroup analyses were carefully conducted to explore the influence of patient status or FNA-Tg cut-offs on diagnostic performance. Admittedly, the number of included studies, especially the studies using the FNA-Tg/serum-Tg ratio as cut-off in Tg detection, was relatively limited. Additionally, a significant non-threshold effect existed in the incorporated data. Further subgroup analysis suggested that different FNA-Tg cut-offs could add to the inter-study heterogeneity, but this did not greatly affect the stability of the pooled outcomes, and the data still supported the conclusions on the whole.
Conclusion
In summary, FNA-Tg was more sensitive but less specific than FNAC in identifying LNM of DTC. The addition of FNA-Tg, especially the FNA-Tg/serum-Tg ratio, to FNAC could improve the diagnostic accuracy for LNM in patients with DTC and provide more comprehensive clues for preoperative staging and postoperative follow-up.
Abbreviations
- AUC: Area under the curve
- CI: Confidence intervals
- DOR: Diagnostic odds ratio
- DTC: Differentiated thyroid carcinoma
- ECLIA: Electrochemiluminescence immunoassay
- FN: False negative
- FNAC: Fine needle aspiration cytology
- FNA-Tg: Fine needle aspiration-thyroglobulin
- FP: False positive
- ICMA: Immunochemiluminometric assay
- IFMA: Immunofluorometric assay
- IMA: Immunometric assay
- IRMA: Immunoradiometric assay
- LNM: Lymph node metastases
- NLR: Negative likelihood ratio
- PLR: Positive likelihood ratio
- PTC: Papillary thyroid cancer
- QUADAS-2: Quality Assessment of Diagnostic Accuracy Studies-2
- Se: Sensitivity
- Sp: Specificity
- SROC: Summary receiver operating characteristic
- TN: True negative
- TP: True positivity
References
Marotta V, Malandrino P, Russo M et al (2020) Fathoming the link between anthropogenic chemical contamination and thyroid cancer. Crit Rev Oncol Hematol 150:102950
Siegel RL, Miller KD, Jemal A (2019) Cancer statistics, 2019. CA Cancer J Clin 69:7–34
Lee J, Song Y, Soh EY (2014) Prognostic significance of the number of metastatic lymph nodes to stratify the risk of recurrence. World J Surg 38:858–862
Schlumberger MJ (1998) Papillary and follicular thyroid carcinoma. N Engl J Med 338:297–306
Wang LY, Palmer FL, Nixon IJ et al (2015) Lateral neck lymph node characteristics prognostic of outcome in patients with clinically evident N1b papillary thyroid cancer. Ann Surg Oncol 22:3530–3536
Haugen BR, Alexander EK, Bible KC et al (2016) 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: the American Thyroid Association Guidelines Task Force on Thyroid Nodules and Differentiated Thyroid Cancer. Thyroid 26:1–133
Jiang HJ, Hsiao PJ (2020) Clinical application of the ultrasound-guided fine needle aspiration for thyroglobulin measurement to diagnose lymph node metastasis from differentiated thyroid carcinoma-literature review. Kaohsiung J Med Sci 36:236–243
Sohn YM, Kim MJ, Kim EK, Kwak JY (2012) Diagnostic performance of thyroglobulin value in indeterminate range in fine needle aspiration washout fluid from lymph nodes of thyroid cancer. Yonsei Med J 53:126–131
Salmaslioglu A, Erbil Y, Citlak G et al (2011) Diagnostic value of thyroglobulin measurement in fine-needle aspiration biopsy for detecting metastatic lymph nodes in patients with papillary thyroid carcinoma. Langenbecks Arch Surg 396:77–81
Kim M, Jeon MJ, Kim WG et al (2016) Comparison of thyroglobulin measurements using three different immunoassay kits: a BRAHMS Tg-Plus RIA kit, a BRAHMS hTg sensitive Kryptor kit, and a Beckman Coulter ACCESS immunoassay kit. Endocrinol Metab (Seoul) 31:462–468
Cho YY, Chun S, Lee SY, Chung JH, Park HD, Kim SW (2016) Performance evaluation of the serum thyroglobulin assays with immunochemiluminometric assay and immunoradiometric assay for differentiated thyroid cancer. Ann Lab Med 36:413–419
Glas AS, Lijmer JG, Prins MH, Bonsel GJ, Bossuyt PMM (2003) The diagnostic odds ratio: a single indicator of test performance. J Clin Epidemiol 56:1129–1135
Beynon R, Leeflang MM, McDonald S, et al (2013) Search strategies to identify diagnostic accuracy studies in MEDLINE and EMBASE. Cochrane Database Syst Rev. https://doi.org/10.1002/14651858.MR000022.pub3
Whiting PF, Rutjes AWS, Westwood ME et al (2011) QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 155:529–536
Bolboacă S, Jäntschi L (2005) Binomial distribution sample confidence interval estimation for positive and negative likelihood ratio medical key parameters. AMIA Annu Symp Proc 2005:66–70
Al-Hilli Z, Strajina V, McKenzie TJ et al (2017) Thyroglobulin measurement in fine-needle aspiration improves the diagnosis of cervical lymph node metastases in papillary thyroid carcinoma. Ann Surg Oncol 24:739–744
Bournaud C, Charrie A, Nozieres C et al (2010) Thyroglobulin measurement in fine-needle aspirates of lymph nodes in patients with differentiated thyroid cancer: a simple definition of the threshold value, with emphasis on potential pitfalls of the method. Clin Chem Lab Med 48:1171–1177
Cunha N, Rodrigues F, Curado F et al (2007) Thyroglobulin detection in fine-needle aspirates of cervical lymph nodes: a technique for the diagnosis of metastatic differentiated thyroid cancer. Eur J Endocrinol 157:101–107
Holmes BJ, Sokoll LJ, Li QK (2014) Measurement of fine-needle aspiration thyroglobulin levels increases the detection of metastatic papillary thyroid carcinoma in cystic neck lesions. Cancer Cytopathol 122:521–526
Jeon SJ, Kim E, Park JS et al (2009) Diagnostic benefit of thyroglobulin measurement in fine-needle aspiration for diagnosing metastatic cervical lymph nodes from papillary thyroid cancer: correlations with US features. Korean J Radiol 10:106–111
Jo K, Kim MH, Lim Y et al (2015) Lowered cutoff of lymph node fine-needle aspiration thyroglobulin in thyroid cancer patients with serum anti-thyroglobulin antibody. Eur J Endocrinol 173:489–497
Jung JY, Shin JH, Han BK, Ko EY (2013) Optimized cutoff value and indication for washout thyroglobulin level according to ultrasound findings in patients with well-differentiated thyroid cancer. AJNR Am J Neuroradiol 34:2349–2353
Kim MJ, Kim EK, Kim BM et al (2009) Thyroglobulin measurement in fine-needle aspirate washouts: the criteria for neck node dissection for patients with thyroid cancer. Clin Endocrinol (Oxf) 70:145–151
Lee YH, Seo HS, Suh SI et al (2010) Cut-off value for needle washout thyroglobulin in athyrotropic patients. Laryngoscope 120:1120–1124
Giovanella L, Ceriani L, Suriano S (2011) Lymph node thyroglobulin measurement in diagnosis of neck metastases of differentiated thyroid carcinoma. J Thyroid Res 2011:621839
Moon JH, Kim YI, Lim JA et al (2013) Thyroglobulin in washout fluid from lymph node fine-needle aspiration biopsy in papillary thyroid cancer: large-scale validation of the cutoff value to determine malignancy and evaluation of discrepant results. J Clin Endocrinol Metab 98:1061–1068
Shi JH, Xu YY, Pan QZ, Sui GQ, Zhou JP, Wang H (2015) The value of combined application of ultrasound-guided fine needle aspiration cytology and thyroglobulin measurement for the diagnosis of cervical lymph node metastases from thyroid cancer. Pak J Med Sci 31:1152–1155
Shin HJ, Lee HS, Kim EK, Moon HJ, Lee JH, Kwak JY (2015) A study on serum antithyroglobulin antibodies interference in thyroglobulin measurement in fine-needle aspiration for diagnosing lymph node metastasis in postoperative patients. PLoS One 10:e0131096
Sigstad E, Heilo A, Paus E et al (2007) The usefulness of detecting thyroglobulin in fine-needle aspirates from patients with neck lesions using a sensitive thyroglobulin assay. Diagn Cytopathol 35:761–767
Tang S, Buck A, Jones C, Sara Jiang X (2016) The utility of thyroglobulin washout studies in predicting cervical lymph node metastases: one academic medical center’s experience. Diagn Cytopathol 44:964–968
Yap NS, Maher R, Learoyd DL (2014) Any detectable thyroglobulin in lymph node biopsy washouts suggests local recurrence in differentiated thyroid cancer. Endocr Connect 3:150–155
Zhang HS, Wang RJ, Fu QF et al (2014) Diagnostic value of thyroglobulin measurement with fine-needle aspiration biopsy for lymph node metastases in patients with a history of differentiated thyroid cancer. Asian Pac J Cancer Prev 15:10905–10909
Li Q, Zhao BW, Lv JH et al (2016) Diagnostic value of fine-needle aspiration and thyroglobulin measurement in fine-needle aspiration for node metastasis in thyroid carcinoma before thyroidectomy. Chin J Ultrasound Med 32:400–403
Wu Y, Yu XQ, Hu HY, Zhu GX, Guo SH, Du FY (2018) Value of fine needle aspiration cytology combined with fine needle aspiration thyroglobulin measurements in evaluation of metastatic lymph nodes in patients with papillary thyroid carcinoma. J Clin Ultrasound in Med 20:615–628
Xu YY (2014) The value of joint application of ultrasonic imaging guided fine needle aspiration and thyroglobulin determination in diagnosis of postoperative lymph node metastasis of papillary thyroid cancer. Master Degree Thesis, Jilin University, Changchun, China
Zhou L, Zhang G, Zhang DQ et al (2017) Application and evaluation of fine needle aspiration biopsy and eluate detection for cervical lymph node dissection decision in patients with thyroid carcinoma. Chin J Pract Surg 37:1002–1006
Wang LY, Ganly I (2016) Nodal metastases in thyroid cancer: prognostic implications and management. Future Oncol 12:981–994
Pacini F, Fugazzola L, Lippi F et al (1992) Detection of thyroglobulin in fine needle aspirates of nonthyroidal neck masses: a clue to the diagnosis of metastatic differentiated thyroid cancer. J Clin Endocrinol Metab 74:1401–1404
Marotta V, Sapio MR, Guerra A, Vitale M (2011) BRAF mutation in cytology samples as a diagnostic tool for papillary thyroid carcinoma. Expert Opin Med Diagn 5:277–290
Funding
This study received funding from the Guangdong Basic and Applied Basic Research Foundation (grant numbers 2020A1515010098 and 2020A1515011374) and the Medical Scientific Research Foundation of Guangdong Province, China (grant number A2019116).
Ethics declarations
Guarantor
The scientific guarantor of this publication is Prof. Cai-yun He.
Conflict of interest
The authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.
Statistics and biometry
Statistical analyses were performed using Meta-DiSc1.4 software under the supervision of the statistical expert Prof. Cai-yun He.
Informed consent
Written informed consent was not required for this study because only published data were used.
Ethical approval
Institutional Review Board approval was not required because only published data were used and no animal subjects were involved in this study.
Study subjects or cohorts overlap
Studies with duplicate data were excluded from the meta-analysis.
Methodology
This study is a meta-analysis.
• prospective
• diagnostic study
• multicenter study
Cite this article
Liu, RB., Zhou, DL., Xu, BH. et al. Comparison of the diagnostic performances of US-guided fine needle aspiration cytology and thyroglobulin measurement for lymph node metastases in patients with differentiated thyroid carcinoma: a meta-analysis. Eur Radiol 31, 2903–2914 (2021). https://doi.org/10.1007/s00330-020-07400-9
DOI: https://doi.org/10.1007/s00330-020-07400-9