Introduction

The incidence of thyroid cancer has increased rapidly worldwide in the past three decades [1]. In 2020, the global age-standardized incidence rates of thyroid cancer were 10.1 per 100,000 women and 3.1 per 100,000 men [2]. Papillary thyroid cancer (PTC) is the most frequent and least aggressive histologic subtype of thyroid cancer, accounting for most new cases [3]. Despite its high incidence, PTC has a favorable prognosis, with a 10-year survival rate of up to 98% [4].

It has been reported that 30–80% of patients with PTC have cervical lymph node metastases (LNM) [5,6,7]. Although LNM is not associated with overall survival among patients with PTC [8], the presence or absence of LNM affects surgical decision-making. Consequently, it is important to distinguish benign from metastatic lymph nodes in order to avoid over-treatment. However, there is no optimal method for the diagnosis of LNM from PTC.

Current guidelines recommend that cervical lymph nodes be examined by ultrasonography (US) before thyroidectomy, and that suspicious lymph nodes undergo US-guided fine needle aspiration (FNA) [9]. However, in some cases, the tissue material is inadequate because of degeneration and cystic changes. A review demonstrated that 20–28.5% of FNA samples were non-diagnostic, and the false negative rate was 6–8% [10]. Accumulating data have shown that thyroglobulin measurement in eluates after FNA (FNA-Tg) is helpful in improving the diagnostic sensitivity of non-diagnostic samples [11, 12]. However, FNA-Tg procedures have not been standardized to date, and current guidelines make a weak recommendation on the addition of FNA-Tg to FNA [9]. Core needle biopsy (CNB) is a procedure that enables histological diagnosis with cellular architecture, and can decrease inconclusive diagnostic results [13]. However, few studies have evaluated its diagnostic performance in cervical LNM from thyroid cancer.

Thus, in the present study, we compared the diagnostic performance of FNA, FNA-Tg, the combination of FNA and FNA-Tg, and CNB in order to determine the optimal method for detecting cervical LNM from PTC.

Materials and methods

Study population

A total of 872 consecutive PTC patients with single or multiple suspicious cervical lymph nodes were retrospectively reviewed between January 2021 and April 2022. Patients were included if they underwent US-guided FNA, FNA-Tg or CNB for suspicious cervical lymph nodes and subsequently underwent central or lateral neck dissection. Suspicious metastatic lymph nodes have the following sonographic features on US: microcalcifications, partially cystic appearance, hyperechoic tissue resembling thyroid, round shape, and peripheral or diffusely increased vascularization on color Doppler images [9, 14]. The flowchart of the patient inclusion and exclusion process is presented in Fig. 1. Patients’ demographics, suspicious lymph node size, and the results of CNB, FNA, FNA-Tg and surgical pathology were recorded.

Fig. 1 Flowchart of suspicious cervical metastatic lymph nodes from PTC included in the study. PTC papillary thyroid cancer, CNB core needle biopsy, FNA fine needle aspiration, FNA-Tg thyroglobulin in washout after FNA

This retrospective study was approved by the Ethics Committee of Chinese People’s Liberation Army General Hospital (No. S2021-123-01), and all patients undergoing FNA, CNB or surgery provided written informed consent for these examinations or procedures.

US-guided FNA or FNA-Tg

Patients were placed in the supine position with the neck extended. US-guided FNA was performed on suspicious lymph nodes using 23- to 25-gauge needles (Hakko Co., Nagano, Japan). The needle was inserted obliquely within the transducer plane of view and moved back and forth five to ten times through the lymph node. No suction device was used during this process; cells moved into the needle via capillary action. Three passes were performed, each with a new needle. The contents of the needles were expelled onto glass slides and smeared using a second slide to spread the fluid across the surface. The slides were immediately immersed in 95% alcohol and Papanicolaou stained to show cellular detail. The slides were examined by experienced cytopathologists and categorized as (1) positive: presence of epithelial cells with cytological features of PTC, or with atypical cytological characteristics; (2) negative: absence of malignant cells or presence of blood cells. Puncture feeling was recorded during the FNA process and classified as “soft” (without any resistance during the FNA process) or as “hard” or “hard with grittiness” (with any degree of displacement of the nodule during the FNA process).

To measure FNA-Tg, a dedicated FNA pass was performed without smears, and the FNA needle was rinsed with 1 mL of 0.9% normal saline in a serum blood tube with heparin. Specimens were sent to the laboratory directly. Tg concentrations were measured using an automated electrochemiluminescence immunoassay (Cobas e 601, Roche Diagnostics, Mannheim, Germany). The minimum detectable Tg concentration in our laboratory was 0.2 ng/mL, and the maximum Tg concentration was 300 ng/mL.

US-guided CNB

Patients were placed in the supine position with the neck hyperextended. After local anesthesia with 2% lidocaine, CNB was performed under US guidance using an automatic biopsy gun (Bard Biopsy System, Tempe, AZ) with a 16-, 18- or 20-gauge cutting needle. The optimal throw was chosen according to the size of the suspicious lymph node and its proximity to vital structures. The specimens were immediately placed in 10% formalin and sent for histopathologic examination. An experienced pathologist examined the CNB specimens and categorized them as (1) positive: presence of PTC; (2) negative: absence of PTC.

After US-guided FNA or CNB, patients were closely observed for 1–2 h with effective compression of the puncture site for at least 30 min to prevent bleeding before discharge.

Because lymph nodes in the central compartment are more susceptible to contamination by circulating Tg in blood at the time of sampling, FNA-Tg was performed mainly in lateral cervical lymph nodes, particularly small, partially cystic ones, at our institution [15]. FNA and CNB are viewed as complementary methodologies at our institution. The utility of FNA and CNB depends on the clinical context, anatomic site, and technique.

Surgery

For patients treated for the first time, total thyroidectomy or lobectomy was performed under general anesthesia, and prophylactic central lymph node dissection was routinely performed. Lateral neck dissection was performed in patients with biopsy-proven metastatic lateral cervical lymphadenopathy or a high FNA-Tg concentration. For patients with recurrent PTC, an operation was performed when there was a positive biopsy result or the FNA-Tg concentration was high.

Evaluation of the diagnostic performance

The final diagnoses were confirmed by postoperative pathology. Given the retrospective nature of the study, we could not ensure that the lymph node biopsied and the lymph node surgically removed were the same one. If the biopsied lymph node was located in the same neck level as the surgically removed lymph node and their sizes were similar, we considered them to be the same node. Positive FNA and CNB results were considered true positive when postoperative pathology revealed malignancy and false positive when it did not. Negative FNA and CNB results were considered true negative when no metastatic lymph nodes were found on postoperative pathology and false negative otherwise.
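The classification rules above can be sketched as a small tally. This is a minimal illustration rather than the authors' code: the function name `tally_results` and the record format (one pair of booleans per lymph node, biopsy result first and postoperative pathology second) are assumptions made for the example, and the sample records are made up.

```python
def tally_results(records):
    """Count TP, FP, TN and FN from (biopsy_positive, pathology_malignant) pairs.

    The record format is hypothetical; each pair describes one lymph node.
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for biopsy_positive, pathology_malignant in records:
        if biopsy_positive and pathology_malignant:
            counts["TP"] += 1  # positive biopsy, malignancy on pathology
        elif biopsy_positive:
            counts["FP"] += 1  # positive biopsy, no malignancy on pathology
        elif pathology_malignant:
            counts["FN"] += 1  # negative biopsy, metastasis on pathology
        else:
            counts["TN"] += 1  # negative biopsy, no metastasis on pathology
    return counts


# Made-up records for illustration only (not study data).
example = [(True, True), (True, False), (False, True), (False, False), (True, True)]
print(tally_results(example))
```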

The diagnostic abilities of CNB alone, FNA alone, FNA-Tg alone, and the combination of FNA and FNA-Tg were assessed with respect to their sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy, and the area under the receiver operating characteristic (ROC) curve (AUC).
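For reference, these measures follow the standard definitions in terms of the numbers of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The symbols are introduced here for illustration and are not the authors' notation.

```latex
\begin{aligned}
\text{Sensitivity} &= \frac{TP}{TP + FN}, &
\text{Specificity} &= \frac{TN}{TN + FP}, \\
\text{PPV} &= \frac{TP}{TP + FP}, &
\text{NPV} &= \frac{TN}{TN + FN}, \\
\text{Accuracy} &= \frac{TP + TN}{TP + FP + TN + FN}.
\end{aligned}
```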

Statistical analysis

All data were analyzed using SPSS for Windows (version 21.0; IBM Corp.) and R (version 4.2.1). Categorical variables are reported as frequencies (%), and continuous variables are reported as mean ± standard deviation. The chi-square test or Fisher’s exact test was used to compare sensitivity, specificity, PPV, NPV, and accuracy. The R package “pROC” was used to assess AUCs, and the DeLong test was used to compare the AUCs. A P value lower than 0.05 was considered statistically significant.
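The analyses themselves were run in SPSS and R. Purely as an illustration of how two proportions (for example, two sensitivities) can be compared in a 2×2 table, the following standard-library Python sketch implements an uncorrected Pearson chi-square test and a two-sided Fisher's exact test. The function names and the example counts are hypothetical, the sketch treats the two groups as independent, and it is not a reproduction of the authors' analysis.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    Assumes that no row or column total is zero.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so that ties with the observed table count.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts, not study data: method A detects 40 of 50 metastatic
# nodes and method B detects 30 of 50.
stat, p_chi = chi_square_2x2(40, 10, 30, 20)
p_fisher = fisher_exact_2x2(40, 10, 30, 20)
print(f"chi-square = {stat:.2f}, P = {p_chi:.4f}; Fisher P = {p_fisher:.4f}")
```

Fisher's exact test is usually preferred when expected cell counts are small, which is one reason both tests are commonly reported together.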

Results

A total of 319 patients with PTC were included in this study. The mean age of included patients was 41.1 ± 12.0 years, and the female-to-male ratio was 1.9:1. Among them, 42 patients had a history of thyroidectomy.

Before lymph node dissection, 42 patients underwent FNA alone, 152 patients underwent the combination of FNA and FNA-Tg, and 125 patients underwent CNB alone (Table 1). No major or minor complications were encountered after FNA or CNB. One lymph node was biopsied in 275 patients before lymph node dissection, 2 in 40 patients, 3 in 3 patients, and 5 in 1 patient. Among the 369 biopsied lymph nodes, 105 were located in the central (level VI) neck compartment, and 264 in the lateral (levels II-V) neck compartment.

Table 1 The demographic characteristics of patients and the lymph node size before lymph node dissection

The mean size of the 240 lymph nodes undergoing FNA was 1.2 ± 0.7 cm. Among them, 55 were aspirated with a 23-gauge needle and 185 with a 25-gauge needle. On postoperative pathology, 184 cases (76.7%) were diagnosed as malignant, and 56 cases (23.3%) were diagnosed as benign lymph nodes. FNA had a sensitivity of 81.0% (149/184), a specificity of 80.4% (45/56), a PPV of 93.1% (149/160), an NPV of 56.3% (45/80), and an accuracy of 80.8% (194/240). The false positive rate and the false negative rate, expressed as the proportions of positive and negative FNA results that were incorrect, were 6.9% (11/160) and 43.7% (35/80), respectively. The puncture feeling was recorded in 234 cases. Among them, 55 lymph nodes had a puncture feeling of hard or hard with grittiness, of which 48 (87.3%) were malignant on postoperative pathology; the false positive rate of puncture feeling was therefore 12.7% (7/55). Of the 179 lymph nodes with a soft puncture feeling, 46 (25.7%) were benign, giving a false negative rate of 74.3% (133/179).
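The FNA figures above can be recomputed from the underlying 2×2 counts. The sketch below is illustrative only; `diagnostic_metrics` is a hypothetical helper, and the counts (149 true positives, 11 false positives, 45 true negatives, 35 false negatives) are those implied by the reported fractions. Note that the false positive and false negative rates quoted in the text (11/160 and 35/80) correspond to 100% minus the PPV and 100% minus the NPV, respectively.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return standard diagnostic measures, as percentages, from 2x2 counts."""

    def pct(numerator, denominator):
        return 100.0 * numerator / denominator

    return {
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "ppv": pct(tp, tp + fp),
        "npv": pct(tn, tn + fn),
        "accuracy": pct(tp + tn, tp + fp + tn + fn),
    }


# Counts implied by the FNA fractions reported above:
# 149/184 (sensitivity), 45/56 (specificity), 149/160 (PPV), 45/80 (NPV).
fna = diagnostic_metrics(tp=149, fp=11, tn=45, fn=35)
for name, value in fna.items():
    print(f"{name}: {value:.2f}%")
```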

FNA was combined with FNA-Tg measurement in 187 lymph nodes. The mean size of these lymph nodes was 1.3 ± 0.7 cm. Assessed independently, FNA-Tg alone had a higher specificity in diagnosing LNM from PTC with FNA-Tg > 55 ng/mL as the threshold than with FNA-Tg ≥ 1 ng/mL (P = 0.0024); however, its sensitivity was lower (P = 0.0351). No significant difference was found in PPV, NPV and accuracy (Table 2). As shown in Fig. 2a, the AUC with FNA-Tg > 55 ng/mL as the cutoff value was larger than that with FNA-Tg ≥ 1 ng/mL (0.782 vs. 0.636, P = 0.0005); therefore, in the subgroup analysis, we chose 55 ng/mL as the cutoff value of FNA-Tg. In this cohort, the sensitivity of FNA alone was 82.0% (123/150), the specificity was 81.1% (30/37), the PPV was 94.6% (123/130), the NPV was 52.6% (30/57), and the accuracy was 81.8% (153/187). No significant difference was found in sensitivity, specificity, PPV, NPV, and accuracy between FNA alone and FNA-Tg alone (P > 0.05). The combination of FNA and FNA-Tg had a sensitivity of 91.3% (137/150), a specificity of 73.0% (27/37), a PPV of 93.2% (137/147), an NPV of 67.5% (27/40), and an accuracy of 87.7% (164/187).

Table 2 Comparison between diagnostic performance of FNA-Tg ≥1 ng/mL and ≥55 ng/mL

Fig. 2 The area under the curve (AUC) of the receiver operating characteristic (ROC) for different methods of detecting suspicious lymph node metastasis from papillary thyroid cancer. a The AUC for FNA-Tg with 55 ng/mL as a threshold (red line) was higher than that with 1 ng/mL as a threshold (blue line); b The AUCs for the combination of FNA and FNA-Tg (red line), FNA alone (blue line), and CNB alone (cyan line) were 0.822, 0.807, and 0.744 respectively. FNA fine needle aspiration, FNA-Tg thyroglobulin in washout after FNA, CNB core needle biopsy

The mean size of the lymph nodes undergoing CNB was 1.6 ± 0.8 cm. Among them, 5 were biopsied with a 16-gauge cutting needle, 116 with an 18-gauge cutting needle, and 8 with a 20-gauge cutting needle. Of the 129 lymph nodes, 117 (90.7%) were finally diagnosed as malignant, and the remaining 12 (9.3%) were diagnosed as benign lymph nodes. CNB had a sensitivity of 88.0% (103/117), a specificity of 66.7% (8/12), a PPV of 96.3% (103/107), an NPV of 36.4% (8/22), and an accuracy of 86.0% (111/129). The false positive rate and the false negative rate were 3.7% (4/107) and 63.6% (14/22), respectively.

As shown in Table 3, the combination of FNA and FNA-Tg had higher sensitivity than FNA (P = 0.01). No significant difference was found in specificity, PPV, NPV and accuracy among FNA, CNB, and the combination of FNA and FNA-Tg (P > 0.05). As shown in Fig. 2b, there was no significant difference among the AUCs of FNA, the combination of FNA and FNA-Tg, and CNB (P > 0.05).
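For a test with a binary result, the empirical ROC curve has a single operating point, and the trapezoidal area under it equals the mean of sensitivity and specificity. The sketch below applies that identity to the fractions reported for FNA alone and for the combination of FNA and FNA-Tg, giving values of about 0.807 and 0.822, in line with the AUCs quoted for these two methods in Fig. 2b. Whether the pROC analysis was set up in exactly this way is an assumption, and `binary_test_auc` is a hypothetical helper name.

```python
def binary_test_auc(sensitivity, specificity):
    """Trapezoidal AUC of an ROC curve with a single operating point.

    The curve runs (0, 0) -> (1 - specificity, sensitivity) -> (1, 1).
    """
    fpr = 1.0 - specificity
    # Areas of the two trapezoids under the piecewise-linear curve.
    left = 0.5 * fpr * sensitivity
    right = 0.5 * (1.0 - fpr) * (sensitivity + 1.0)
    return left + right


fna_auc = binary_test_auc(149 / 184, 45 / 56)       # FNA alone
combined_auc = binary_test_auc(137 / 150, 27 / 37)  # FNA plus FNA-Tg
print(f"FNA alone: {fna_auc:.3f}, FNA + FNA-Tg: {combined_auc:.3f}")
```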

Table 3 Sensitivity, specificity, PPV, NPV and accuracy of FNA alone, CNB alone, and the combination of FNA and FNA-Tg

Discussion

Our study was the first to compare the diagnostic performance of US-guided FNA, FNA-Tg, the combination of FNA and FNA-Tg, and CNB in cervical LNM assessment in patients with PTC. We found that the combination of FNA and FNA-Tg had higher sensitivity (91.3%) than FNA alone. The diagnostic performance of FNA-Tg varied with its threshold. With 1 ng/mL as the cutoff value, FNA-Tg achieved higher sensitivity (92.0%) but lower specificity (35.1%). When the cutoff value of FNA-Tg was increased to 55 ng/mL, the specificity increased to 73.0%, whereas the sensitivity decreased to 83.3%. Furthermore, the AUC with FNA-Tg > 55 ng/mL as the cutoff value was larger than that with FNA-Tg ≥ 1 ng/mL (0.782 vs. 0.636, P = 0.0005). The diagnostic performance of CNB was similar to that of FNA. The results of our study provide some evidence to guide clinicians in managing suspicious cervical LNM from PTC.

FNA is a cytological examination for the detection of metastatic lymph nodes in general head and neck malignancies [16]. However, FNA samples may be non-diagnostic or even provide false negative results, especially in small or cystic metastatic lymph nodes. Furthermore, small lymph node size, lack of an epithelial component in cyst aspirates, and interference from lymphocyte infiltration can decrease the diagnostic accuracy of FNA [17]. A recent meta-analysis revealed that the sensitivity of FNA is just 0.813 [16], which is comparable to that in our study. In this study, the sensitivity of FNA was 81.0%, and the false negative rate of FNA was 43.7%, which is far from optimal. Studies have demonstrated that puncture feeling is of great value during FNA in the diagnosis of thyroid nodules [18, 19], in which “hard” and “hard with grittiness” were indicators of malignancy, while “soft” was an indicator of benignity. The false positive and negative rates of puncture feeling in differentiating thyroid nodules were 11.85% and 10.81%, respectively [18]. However, in our study, the false negative rate of puncture feeling was high (74.3%), which indicates that puncture feeling is not as helpful in the diagnosis of cervical LNM from PTC as it is in differentiating thyroid nodules.

Tg is a high molecular weight glycoprotein secreted by normal thyroid follicular cells or PTC cells [20]. Since it was first reported in 1992 [21], FNA-Tg measurement has been used to detect LNM from PTC. Several studies have demonstrated that FNA-Tg is more sensitive than FNA for detecting LNM from PTC [20, 22], and that the accuracy of FNA is improved when it is combined with FNA-Tg [16, 23, 24]. The results of our study are consistent with these reports. However, the cutoff value of FNA-Tg has not been well established to date. Previous studies have used various cutoff values of FNA-Tg ranging from 0.2 ng/mL to 77 ng/mL [11, 24,25,26,27,28], but none of them could differentiate benign from metastatic lymph nodes with complete reliability. Most studies used 1 ng/mL as the threshold to analyze the diagnostic value of FNA-Tg. When 1 ng/mL was used as the threshold, it had the highest diagnostic sensitivity [20]. Therefore, in this study, we compared two threshold values: 1 ng/mL and 55 ng/mL (the standard of our hospital). Our results showed that the former had higher sensitivity, but the latter had higher specificity. Several factors affect the accuracy of FNA-Tg, including Tg measurement methods, Tg assay kits, serum antithyroglobulin antibody level, serum thyroglobulin level, the presence or absence of the thyroid gland, and the characteristics of metastatic lymph nodes [10]. Therefore, better standardization of the criteria for patient selection, the FNA-Tg procedure, and the cutoff value is required for FNA-Tg.

US-guided CNB is a common method for disease diagnosis, which provides a histologic sample with its cytologic appearance and tissue architecture [29]. A meta-analysis revealed that CNB was able to detect malignancy in head and neck lesions with an overall accuracy of 96% [29]. However, few studies have evaluated its diagnostic performance in cervical LNM from PTC. Zhang et al. [30] reported that the sensitivity and specificity of CNB in predicting cervical LNM from PTC were 98.9% and 100%, respectively, which were higher than ours. The reason may be that the lymph nodes included in our study were smaller. Compared to FNA, CNB can decrease inconclusive diagnostic results [13] because the tissue core contains a greater number of cells. Therefore, CNB is more helpful for conducting immunohistochemical staining, cell subtyping of the primary tumor, or gene expression profiling. Furthermore, CNB has an advantage over FNA in diagnosing lesions previously treated by radiotherapy, in which a granulomatous response or severe fibrosis is expected [31]. The main concern with CNB is its potential risk of complications, such as hematoma, pain, and the seeding of malignant cells along the needle tract [32, 33]. In our study, no major or minor complications occurred.

Several limitations of this study should be considered. First, given the retrospective nature of the study, we could not perform a node-by-node analysis for all lymph nodes. Instead, we performed a level-by-level analysis. If the biopsied lymph node was located in the same neck level as the surgically removed lymph node and their sizes were similar, we considered them to be the same node. Despite this, it is impossible to determine whether a negative pre-operative cytology result is a true negative or a false negative in patients with metastatic nodes seen on surgical pathology, particularly if the ultrasound features are not obviously abnormal. Furthermore, cervical lymph nodes with negative biopsy results will not undergo surgery, although such lymph nodes may also harbor metastases. All of the above may affect the results. Therefore, further large-scale and high-quality prospective clinical trials are warranted to confirm our results. Second, the factors influencing FNA-Tg were not evaluated in this study, and further investigation should be carried out.

In conclusion, the current study demonstrated that FNA-Tg is useful for improving the sensitivity of FNA, but better standardization of the criteria for patient selection, the FNA-Tg procedure, and the cutoff value is required for FNA-Tg. The diagnostic performance of the combination of FNA and FNA-Tg was not different from that of CNB in detecting LNM from PTC. Clinicians should choose the optimal method according to clinical needs.