Introduction

Although fine-needle aspiration (FNA) is the gold-standard technique for the presurgical diagnosis of thyroid nodules, around 25% of cases lack the features needed for a definitive diagnosis and are classified as indeterminate [1]. Most indeterminate cases are referred for surgery, though only a minority (10–40%) prove to be malignant [2]. In recent decades, attention has focused on the preoperative molecular characterization of nodules, with the aim of improving the presurgical diagnosis of indeterminate thyroid nodules and thus reducing the number of unnecessary operations and their associated costs and risks. Accordingly, different tests have been developed, taking advantage of major advances in the knowledge of the genetic bases of thyroid cancer (TC). In this context, the Thyroid Cancer Genome Atlas [3] recently reported the extensive characterization of the most prevalent TC, namely papillary thyroid cancer (PTC), significantly reducing the number of tumors without a known genetic driver. Those findings allowed PTCs to be reclassified into 2 molecular subtypes, identified as BRAF-like and RAS-like. Genetic alterations associated with a BRAF-like gene expression profile, such as the BRAFV600E mutation and RET fusions, are virtually diagnostic of cancer. In contrast, RAS-like alterations, such as RAS, PTEN, and EIF1AX mutations and PPARG fusions, are associated with either malignant or benign follicular neoplasms [4, 5]. Mutations in TP53 or in the TERT promoter, in particular when associated with other tumor driver alterations, are frequently found in clinically aggressive thyroid cancer, including poorly differentiated and anaplastic thyroid carcinoma [6]. By contrast, copy number alterations (CNA) and mutations in mitochondrial DNA are characteristic of Hürthle cell carcinoma [7].

The first studies dealing with the preoperative molecular evaluation of FNA samples focused on the analysis of BRAFV600E, which is the most common PTC mutation [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46]. However, since many TCs are driven by other mutations, testing for BRAFV600E alone did not provide a sufficiently high negative predictive value (NPV) to avoid surgery for nodules negative for this mutation. In the same years, other authors proposed the combined evaluation of two or three genetic alterations, such as BRAFV600E and RET fusions [47, 48], or BRAFV600E, RET and TRK fusions [49]. The sensitivity of molecular testing was further improved through the introduction of gene panels, which became available for clinical use in the late 2000s. In addition to BRAFV600E, they tested for several other genes commonly mutated in TC, and these typically “rule-in” test panels were able to identify ~70% of cases as mutated. The first panel, reported by Nikiforov et al. in 2011, was a 7-gene molecular test (ThyroSeq® v0) composed of a panel of mutations (BRAF, N-, H-, K-RAS) and gene fusions (RET/PTC, PAX8/PPARG). In this seminal study they prospectively analyzed 247 AUS/FLUS and 214 FN/SFN nodules with histological follow-up, reporting a high specificity (97–99%) and a positive predictive value (PPV) of 88%, but a low sensitivity (57–63%) and an NPV of 86–94%, associated with a cancer prevalence of 14–27% and a residual cancer risk of 6–14% in samples with a negative result [50]. The advent of next-generation sequencing (NGS) technology promoted the expansion of genotyping panels for thyroid FNA cytology [51], with novel ThyroSeq® panels testing for a progressively increasing number of genetic alterations and achieving a higher sensitivity [52, 53]. In 2012, a “rule-out” test was introduced, namely the Afirma® test, which does not rely on detecting gene mutations but is based on the analysis of expression changes in 167 genes. The Afirma® test reports the result as either “benign” or “suspicious”, and has a high NPV [54].

Additional approaches for molecular testing include the analysis of microRNA (miRNA) expression. MiRNAs are small noncoding RNAs implicated in gene regulation, and several miRNAs have been found to be dysregulated in thyroid cancer [55,56,57,58,59]. Although different miRNAs have been proposed in different studies, 15 miRNAs could be considered the most accurate in discriminating benign from malignant lesions with high sensitivity and specificity [60].

Based on the results obtained by these molecular tests in the preoperative evaluation of thyroid nodules, international and national guidelines [61, 62] recommend genetic evaluation, whenever possible, for the diagnosis of indeterminate nodules. The main disadvantage of these tests is their high cost [63], which means they are rarely used in Europe. To overcome this limitation, some authors have reported data on more limited, customized “rule-in” panels that are able to detect the most frequent genetic alterations of TC, albeit with lower sensitivity than the large NGS and gene expression profile panels.

In the present review, the most recent available versions of commercial molecular tests are reported. The accuracy of these tests, their pros and cons, and their current use in clinical practice are analyzed. The reliability of custom panels is also described. Of note, all the data reported refer to indeterminate nodules, namely Bethesda classes III (atypia of undetermined significance/follicular lesion of undetermined significance, AUS/FLUS) and IV (follicular neoplasm/suspicious for follicular neoplasm, FN/SFN) [1], since the main indication for these tests is the differential diagnosis of this type of nodule.

Methods

Literature search

We performed a PubMed search for studies published between 2009 and 2019 that explored the performance of “rule-in” and “rule-out” panels including more than four genes and/or miRNAs, exclusively in AUS/FLUS or FN/SFN cytology. We also checked the references of each included paper to identify additional relevant publications.

Inclusion criteria for studies

  1. Indeterminate thyroid FNA results that included Bethesda classes AUS/FLUS or FN/SFN (more than 20 cases).
  2. Histopathologic diagnosis from surgical specimens as the reference standard for benign or malignant nodules.

Exclusion criteria for studies

  1. Opinions, reviews, commentaries, case reports, and studies with insufficient data.
  2. Absence of surgical histopathology results.
  3. Studies written in languages other than English.
  4. Studies on pediatric populations.
  5. Studies in which Bethesda III and IV categories cannot be separated from Bethesda class V.

Commercial tests

Three tests are commercially available in the United States, based on the analysis of DNA/RNA sequencing data, of mRNA or microRNA expression profiles, or on a combination of these methods: ThyroSeq® v3 (CBLPath, Inc, Rye Brook, New York, and University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania), Afirma® (Veracyte, Inc, South San Francisco, California), and ThyGenX/ThyraMIR (Interpace Diagnostics, Inc, Parsippany, New Jersey). The RosettaGX Reveal (Rosetta Genomics, Inc, Philadelphia, Pennsylvania) was recently removed from the market (Table 1).

Table 1 Characteristics of the most recent available versions of commercial molecular tests

ThyroSeq® v3

The ThyroSeq® v3 Genomic Classifier (GC), released for clinical use in 2018, is the enhanced version of the previous ThyroSeq® v2 [52]. The main advantages of the new version of this “rule-in” method are the larger number of gene mutation hotspots and gene fusions analyzed, the analysis of DNA copy number alterations (CNA), and an improved accuracy for the detection of oncocytic (Hürthle cell) tumors [64]. ThyroSeq® v3 is based on targeted next-generation sequencing of DNA and RNA to analyze 112 genes, providing information on more than 12,000 hotspot mutations and more than 120 fusions, gene expression alterations in 19 genes, and CNAs in 10 genomic regions. Quality control steps include gene expression analysis for markers to determine adequate thyroid follicular cell content, as well as markers to detect medullary thyroid carcinoma and non-thyroidal tissues (e.g., parathyroid tissue, metastatic carcinoma) (Table 1). The genomic classifier assigns each genetic alteration a score from 0 to 2 points, proportional to its association with cancer. GC scores of 0 or 1 are considered negative for malignancy (with the latter reported as “currently negative” to indicate nodules with low-risk mutations for which active surveillance and repeat FNA could be considered), while GC scores ≥ 2 are considered positive results. Among nodules with positive results, ThyroSeq® v3 provides further information on preoperative risk stratification based on the type of detected alterations and on their allelic frequency.
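The scoring logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alteration names and point values in `ALTERATION_POINTS` are hypothetical placeholders, and the per-alteration points are assumed to be simply summed; the actual weights and combination rule of the commercial classifier are not given here.

```python
from typing import Dict, Iterable

# Hypothetical point values (0-2) per alteration class. These are
# illustrative placeholders, not the weights used by the commercial test.
ALTERATION_POINTS: Dict[str, int] = {
    "high_risk_alteration": 2,
    "low_risk_alteration": 1,
    "alteration_of_no_known_risk": 0,
}


def gc_score(alterations: Iterable[str]) -> int:
    """Sum the points of the detected alterations (assumed additive).

    Alterations missing from the lookup table contribute 0 points.
    """
    return sum(ALTERATION_POINTS.get(name, 0) for name in alterations)


def interpret(score: int) -> str:
    """Map a total score to the reporting categories described in the text."""
    if score < 0:
        raise ValueError("score must be non-negative")
    if score == 0:
        return "negative"
    if score == 1:
        return "currently negative"
    return "positive"


if __name__ == "__main__":
    for sample in ([], ["low_risk_alteration"], ["high_risk_alteration"]):
        print(sample, "->", interpret(gc_score(sample)))
```

Separating the score from its interpretation keeps the threshold (≥ 2 is positive) in one place, so the cut-off can be inspected independently of how points are assigned.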

The test performance was validated in a multi-institutional, prospective, blinded study [65]. In that study, 257 nodules with indeterminate cytology were analyzed and resected tissue samples were obtained for histopathological diagnosis. ThyroSeq® v3 showed 94% sensitivity, 82% specificity, 97% NPV and 66% PPV among 247 Bethesda III/IV cases with a prevalence of malignancy of 28%. The new version of the test demonstrated an improved sensitivity, but lower specificity and PPV compared to the previous version (ThyroSeq® v2; 93% and 83%, respectively) [52]. ThyroSeq® v3 has been shown to be extremely useful in the identification of Hürthle cell carcinomas (NPV: 100%), while only 43% of adenomas were correctly classified.
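The accuracy measures quoted throughout this review (sensitivity, specificity, PPV, NPV) all derive from a 2×2 table of test result against histology. A minimal Python sketch of these definitions follows; the counts are hypothetical and are not taken from any of the studies cited here, and the class name `TwoByTwo` is likewise introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of molecular test result versus histology (reference standard)."""

    tp: int  # test positive, histology malignant
    fp: int  # test positive, histology benign
    fn: int  # test negative, histology malignant
    tn: int  # test negative, histology benign

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    @property
    def prevalence(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.fn) / total


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=90, fp=40, fn=10, tn=160)

if __name__ == "__main__":
    print(f"sensitivity {example.sensitivity:.1%}, specificity {example.specificity:.1%}")
    print(f"PPV {example.ppv:.1%}, NPV {example.npv:.1%}, prevalence {example.prevalence:.1%}")
```

Because PPV and NPV depend on the prevalence of malignancy in the cohort, the same test can yield different predictive values in different series, which is why the malignancy rate is reported alongside them.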

Post-validation studies are available only for ThyroSeq® v2 [52, 53, 66,67,68,69,70], and confirmed a high NPV (94.5%, 95% CI 92.1–96.8%), but reported lower sensitivity (87.9%, 95% CI 82.9–92.9%), specificity (71.2%, 95% CI 67.1–75.2%) and PPV (51.2%, 95% CI 45.4–57.1%) in comparison with the validation studies (Fig. 1 and Supplemental Table 1). Moreover, considering a pre-test probability of 25.6%, a positive post-test probability of 54.3% and a negative post-test probability of 5.5% were reached.

Fig. 1 Forest plots for sensitivity, specificity, and positive and negative predictive values (PPV, NPV) for ThyroSeq® v2. The first author and the year of publication are indicated

Afirma® gene expression classifier (GEC) and genomic sequencing classifier (GSC)

The Afirma® Gene Expression Classifier (GEC, Veracyte) is a microarray-based test that uses a proprietary algorithm to predict benign lesions (“rule-out” method). The algorithm involves 2 steps. The first step screens for the expression of 25 genes to identify rare neoplasms such as medullary thyroid carcinoma (MTC). Only samples not excluded at this stage proceed to the second step, which evaluates the expression profile of a further 142 genes to classify indeterminate thyroid nodules as either benign (GEC-B) or suspicious (GEC-S). The test was validated in a multicenter, prospective, blinded study [54] involving 210 nodules of the two indeterminate categories, Bethesda III and IV, with a pre-test malignancy rate of 24% and 25%, respectively. The authors showed high sensitivity (87%) but modest specificity (53%); the NPV was 95% and 94%, and the PPV 38% and 37%, in the two indeterminate categories, respectively. In contrast, one post-validation study recorded a high frequency of false-negative results [71]. It is worth noting that these results must be interpreted with caution because only a small fraction of GEC-B nodules undergo surgery in clinical practice. Moreover, benign Hürthle cell nodules, which represent a large proportion of the Bethesda III/IV categories, are frequently misclassified as GEC-S [72,73,74,75]. A meta-analysis of all the available studies using Afirma® with an available histological diagnosis [66, 71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90] showed a pooled sensitivity of 95.7% (95% CI 94.1–97.2%), specificity of 16.4% (95% CI 14.2–18.3%), PPV of 37.6% (95% CI 35.3–39.9%) and NPV of 87.7% (95% CI 83.4–91.9%) (Fig. 2 and Supplemental Table 2). Considering a pre-test probability of 34.5%, a positive post-test probability of 37.6% and a negative post-test probability of 12.3% were reached.

Fig. 2 Forest plots for sensitivity, specificity, and positive and negative predictive values (PPV, NPV) for the Afirma® Gene Expression Classifier (GEC). The first author and the year of publication are indicated

To overcome the modest specificity and PPV of GEC, the Afirma BRAF test was introduced, which assays the expression profile together with the BRAFV600E mutation [34]. However, the investigation of the BRAF mutation did not increase the PPV, mostly due to the low prevalence of classical variants of PTC in Bethesda III and IV nodules. Recently, the next-generation Afirma® Genomic Sequencing Classifier (GSC) has been developed to analyze the expression profile of 1115 genes with RNA-Seq methodology, including the possibility of detecting single nucleotide variants, fusions, and copy number variations in the coding region of the genome [91]. The GSC includes several quality control steps, such as screening for the expression profile of parathyroid cells and assessing the follicular cell content. The GSC can also detect mitochondrial transcripts and CNAs for the analysis of Hürthle cell lesions (Hürthle classifier). The GSC was validated on the same cohort used for the first-generation Afirma® GEC, showing increased specificity (from 53 to 68%) and PPV (from 38 to 47%) while maintaining high sensitivity and NPV (Table 1). Furthermore, the GSC showed a higher specificity and PPV in Hürthle cell adenomas compared to GEC. Independent reports comparing the performance of GSC with that of GEC confirmed these results [76,73,74, 92,93,94]. A broader test panel (Xpression Atlas) was developed to detect additional alterations involved in thyroid neoplasms (761 variants in 346 genes and 130 fusions) [95]. Of note, in both GSC and Xpression Atlas, mutations in the non-transcribed portion of the genome, such as those in the TERT promoter, are not included. Xpression Atlas was intended for Bethesda III/IV nodules with a GSC suspicious (GSC-S) result. However, the impact of the addition of novel variants on improving the risk stratification of thyroid nodules remains to be established.

The Afirma® GEC was developed to reduce the morbidity and the cost of repeated FNAC and/or of unnecessary thyroid surgery, but contrasting results have been obtained in different settings regarding its actual impact. Indeed, it has been reported that after the test became available the number of indeterminate cytologies increased without a significant reduction in surgical procedures [66, 75, 77,78,79,80, 96, 97], and the cost-effectiveness of the test in clinical practice has been questioned [8]. On the other hand, in hypothetical modeling, molecular testing was considerably more cost-effective than diagnostic lobectomy, with ThyroSeq® v3 being more cost-effective than GSC [98].

ThyGeNEXT/ThyraMIR®

ThyGeNEXT® is a targeted next-generation sequencing test developed by Interpace Diagnostics that evaluates mutations in 10 genes (BRAF, H-, K-, and N-RAS, TERT, ALK, GNAS, RET, PTEN, and PIK3CA) and 38 different gene fusions (involving ALK, BRAF, NTRK-1, -2, and -3, PPARG, RET, and THADA).

To increase the sensitivity and NPV of the genotyping panel, Interpace Diagnostics pairs this test with a complementary miRNA expression classifier called ThyraMIR®. Samples in which no mutations or gene fusions are detected by the targeted sequencing test undergo further risk stratification with ThyraMIR®, which is based on the expression pattern of 10 miRNAs (miR-29b-1-5p, miR-31-5p, miR-138-1-3p, miR-139-5p, miR-146b-5p, miR-155, miR-204-5p, miR-222-3p, miR-375, miR-551b-3p).
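The reflex logic described above, in which the miRNA classifier is applied only when sequencing finds no alteration, can be sketched as follows. The function names, the expression values and the threshold rule in `toy_mirna_classifier` are hypothetical stand-ins; the actual ThyraMIR® algorithm is proprietary and is not reproduced here.

```python
from typing import Callable, Dict, List

# Type alias: a classifier takes miRNA expression values and returns a call.
MirnaClassifier = Callable[[Dict[str, float]], str]


def toy_mirna_classifier(expression: Dict[str, float]) -> str:
    """Hypothetical stand-in for the miRNA classifier.

    Uses an arbitrary threshold on a single miRNA purely so the example
    runs; it has no diagnostic meaning.
    """
    return "positive" if expression.get("miR-146b-5p", 0.0) > 1.0 else "negative"


def combined_result(
    detected_alterations: List[str],
    mirna_expression: Dict[str, float],
    mirna_classifier: MirnaClassifier,
) -> str:
    """Return the combined call for one FNA sample.

    If sequencing detected any mutation or fusion, report it; otherwise
    fall back to the miRNA classifier for further risk stratification.
    """
    if detected_alterations:
        return "alteration detected: " + ", ".join(sorted(detected_alterations))
    return "no alteration detected; miRNA classifier " + mirna_classifier(mirna_expression)


if __name__ == "__main__":
    print(combined_result(["BRAF V600E"], {}, toy_mirna_classifier))
    print(combined_result([], {"miR-146b-5p": 0.4}, toy_mirna_classifier))
```

Passing the classifier in as a function keeps the reflex step separate from whatever model produces the miRNA call.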

The miRNA classifiers were developed using miRNA expression data determined by RT-qPCR on a case–control training set consisting of 240 surgical specimens [99].

The test includes expression analysis for transcripts to confirm the thyroid follicular cell content and detect sampling of parathyroid tissue and markers associated with medullary thyroid carcinoma (miR-375 and RET mutations) (Table 1).

The combined test was clinically validated using an earlier version of the NGS-based test called ThyGenX®, which analyzes 7 genes (BRAF, H-, K-, and N-RAS genes) and 3 gene fusions (PAX8-PPARG, RET-PTC1, and RET-PTC3), together with ThyraMIR®. Among 109 Bethesda III/IV cases with a 32% prevalence of cancer, ThyGenX/ThyraMIR® together demonstrated 89% sensitivity, 85% specificity, 94% NPV, 74% PPV, and a 61% benign call rate.

Banizs et al. 2019 [100] reported the addition of a further level to the two-level miRNA classifier described by Labourier et al. [99]. The authors showed that this miRNA sub-classification can support non-surgical management in patients with weak or no driver mutations and a low-level microRNA status, while supporting the need for diagnostic lobectomy in those with a high-level microRNA status.

Additional post-validation studies are needed to better determine the accuracy of ThyGeNEXT/ThyraMIR®.

Rosetta GX Reveal™

The Rosetta GX Reveal™ Thyroid Classifier (Rosetta Genomics, Philadelphia, PA), which is no longer commercially available, was a validated test that measured the expression pattern of 24 miRNAs, found to be up- or down-regulated in PTC, directly on RNA extracted from stained FNA smears prepared for initial cytological evaluation [101]. The advantage of the methodology was that it obviated the need to collect additional material for molecular testing after the fine-needle aspiration, since miRNAs were analyzed from the same sample used for cytological examination. The test used algorithms to classify indeterminate thyroid nodules as benign, suspicious for malignancy or positive for medullary carcinoma. Markers associated with thyroid epithelial cells were also included (Table 1).

The test was developed using a training set of 375 FNAB smears and was validated using a blinded multicenter retrospective cohort of 189 cytologically indeterminate cases, including 150 Bethesda III–IV cases, with their corresponding surgical specimens [102]. Considering classes III and IV, this validation study revealed 74% sensitivity and specificity, 43% PPV and 92% NPV, with a malignancy rate of 21%. Of note, since no Hürthle cell carcinomas were included in the validation study, the performance of Rosetta GX Reveal™ in detecting these tumors was not determined.

Walts et al. 2018 retrospectively compared the performance of the Afirma® GEC with that of Rosetta GX Reveal™ in a cohort of 80 Bethesda III–IV thyroid FNAs with surgical follow-up and a rate of malignancy of 20–23% [79]. Rosetta GX Reveal™ demonstrated a higher specificity than GEC (60.3% vs 9.5%) but a lower sensitivity (78% vs 94%). Interestingly, Rosetta GX Reveal™ outperformed GEC in the cohort of NIFTP and of Hürthle lesions. A retrospective study was performed in 2018 on a small cohort of 9 Bethesda III–IV thyroid FNAs with a prevalence of cancer of 30%, comparing the Rosetta GX Reveal™ and the ThyGenX/ThyraMIR® combination tests [103]. The 2 tests had similar sensitivities and NPV (85 vs 89%, and 100% for both), while Rosetta GX Reveal™ showed a higher specificity (86 vs 71%) and higher PPV (75 vs 60%).

Non-commercial tests

Although the clinical relevance of the above-described commercial tests has been widely recognized, their high cost has prevented their widespread adoption, particularly in European countries. As a consequence, “home-made”, customized molecular tests have been developed, many of them never reported in the literature, mainly testing BRAFV600E, RAS point mutations and RET, TRK and PPARG fusions by PCR and direct sequencing (Fig. 3 and Supplemental Table 3).

Fig. 3 Forest plots for sensitivity, specificity, and positive and negative predictive values (PPV, NPV) for non-commercial 5- and 7-gene panels. The first author and the year of publication are indicated

The first non-commercial panels reported in the literature were based on the analysis of the 7 most frequent genetic alterations in differentiated thyroid cancer (DTC), such as the first Nikiforov panel (BRAFV600E and BRAFK601E, RAS mutations at codons 12, 13, and 61, PAX8/PPARG, RET/PTC and TRK fusions). This panel was tested on 2 series, obtaining sensitivities of 60–100%, specificities and PPVs of 100%, and NPVs of 92–100% in the Bethesda III category, with a prevalence of malignancy of 14–17%, and a sensitivity of 77%, specificity and PPV of 100%, and NPV of 79% in the Bethesda IV category, with a prevalence of malignancy of 52% [104, 105]. In the same year, Cantara and co-authors screened the same molecular alterations in 41 indeterminate lesions with a sensitivity and a PPV of 86%, a specificity and NPV of 97% and a risk of malignancy of 17% [106], whereas Beaudenon-Huibregtse et al. found both a lower sensitivity (36% and 67%) and a lower NPV (56% and 86%) in a series of 41 indeterminate cases analyzed by means of the same 7-gene panel, with a risk of malignancy of 50% and 32% in the III and IV categories, respectively [107].

In 2017, results were reported from a large German cohort of 254 indeterminate cases analyzed for BRAF and RAS mutations and PAX8/PPARG and RET/PTC rearrangements, by pyrosequencing and quantitative PCR, respectively, on air-dried FNA smears [108, 109]. In the AUS/FLUS category, the sensitivity and NPV (58% and 90%, respectively) were comparable to those reported by Nikiforov, but the specificity (82%) and PPV (41%) were lower, with a risk of malignancy of 15%. In the FN/SFN category, the specificity (91%) was similar to that previously reported [104, 107], but the sensitivity was lower (27%), with a risk of malignancy of 17%. The detection of RAS/PAX8/PPARG genetic alterations in histologically benign nodules could have affected the specificity in all indeterminate categories, while the low sensitivity in the FN/SFN category was probably due to a very low mutation prevalence in follicular thyroid cancers and in follicular variant PTCs.

Bongiovanni et al. [110], after sampling by laser capture microdissection, applied the 7-gene panel prospectively and retrospectively on 23 FN/SFN, with a malignancy rate of 57%, showing sensitivity and PPV of 67% and specificity and NPV of 92%.

Censi et al. [111] analyzed H-, K-, and N-RAS, TERT promoter and BRAF gene mutations (5-gene panel) in a series of 199 consecutive indeterminate nodules, with a sensitivity, specificity, PPV, NPV and risk of malignancy of 50%, 78%, 37%, 84% and 22% in the AUS/FLUS category, and of 39%, 85%, 79%, 50% and 58% in the FN/SFN category, respectively. The frequent detection of RAS mutations in benign samples, the lack of rearrangement analysis and the introduction of the new NIFTP histopathologic nomenclature may have played a part in the low PPV obtained in this study.

The same 5-gene panel was more recently applied to 54 indeterminate nodules, showing lower sensitivity (44%) and NPV (67%), but higher specificity and PPV (93% and 85%) [112].

Overall, the pooled sensitivity, specificity, PPV and NPV of the 7-gene molecular test on Bethesda III/IV nodules were 61.3% (95% CI 54.3–68.2%), 95.2% (95% CI 93.7–96.7%), 76.5% (95% CI 69.7–83.2%) and 90.6% (95% CI 88.6–92.7%), respectively. Considering a pre-test probability of 20.3%, a positive post-test probability of 76.5% and a negative post-test probability of 9.4% were reached.

The pooled sensitivity of the 5-gene panel was 46.8% (95% CI 36.7–56.9%), specificity 86.3% (95% CI 81–91.6%), PPV 66.7% (95% CI 55.3–78%) and NPV 73.5% (95% CI 67.3–79.8%). Considering a pre-test probability of 36.9%, a positive post-test probability of 66.7% and a negative post-test probability of 26.4% were reached.
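The post-test probabilities quoted for the pooled estimates can be related to sensitivity, specificity and pre-test probability through likelihood ratios. The Python sketch below shows one standard way of doing this; it is offered as an illustration and is not necessarily the exact procedure used to produce the figures in this review. The function name `post_test_probabilities` is introduced here for the example.

```python
from typing import Tuple


def post_test_probabilities(
    sensitivity: float, specificity: float, pre_test: float
) -> Tuple[float, float]:
    """Return (positive, negative) post-test probabilities of malignancy.

    All arguments are proportions in the open interval (0, 1).
    """
    for name, value in (
        ("sensitivity", sensitivity),
        ("specificity", specificity),
        ("pre_test", pre_test),
    ):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")

    # Likelihood ratios of a positive and of a negative test result.
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity

    # Bayes' rule in odds form: post-test odds = pre-test odds x LR.
    pre_odds = pre_test / (1.0 - pre_test)
    positive_odds = pre_odds * lr_positive
    negative_odds = pre_odds * lr_negative

    # Convert odds back to probabilities.
    return positive_odds / (1.0 + positive_odds), negative_odds / (1.0 + negative_odds)


if __name__ == "__main__":
    # Pooled 7-gene panel estimates reported in the text.
    positive, negative = post_test_probabilities(0.613, 0.952, 0.203)
    print(f"7-gene panel: positive {positive:.1%}, negative {negative:.1%}")
```

With the pooled 7-gene inputs this yields approximately 76.5% and 9.4%, in line with the values reported above. A positive post-test probability computed this way equals the PPV, and the negative one equals 1 − NPV, at the stated prevalence.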

As expected, the 5- and 7-gene non-commercial panels are less sensitive, but more specific, than the commercial Afirma® and ThyroSeq® tests (Fig. 4).

Fig. 4 The pooled sensitivities, specificities, and positive and negative predictive values (PPV, NPV) for commercial and non-commercial tests

Several non-commercial panels for indeterminate cytologies have also been developed based on the analysis of different miRNAs, with miR-146 being the only one tested in all series (Supplemental Table 3) [50, 80, 104,105,106,107,108,109,110,111,112,113,114,115,116].

Shen et al. [113] identified and validated a set of four miRNAs (miR-146b, -221, -187 and -30d) in 30 AUS samples, obtaining a sensitivity of 63.6%, specificity of 78.9%, PPV of 64%, and NPV of 79%, with a prevalence of malignancy of 37%.

Santos et al. [114] developed a new molecular classifier test (mir-THYpe) that analyzes the expression profiles of 11 miRNAs (let-7a, miR-103, miR-125a-5p, let-7b, miR-145, RNU48, miR-146b, miR-152, miR-155, miR-200b, and miR-181b) obtained from the same FNA cytology smear slides used to classify the thyroid nodule as indeterminate. In the validation set, the mir-THYpe test reached a sensitivity of 100% and 83%, specificity of 82% and 79%, PPV of 25% and 38%, and NPV of 100% and 97%, with a cancer prevalence of 5% and 13%, in Bethesda III and IV nodules, respectively. Mazeh et al. analyzed the expression of 6 miRNAs (miR-21, -31, -146b, -187, -221 and -222) in 11 indeterminate FNA samples, and found a sensitivity of 89%, specificity of 100%, PPV of 100% and NPV of 66% [115], with a prevalence of malignancy of 63%.

Aside from these panels, which analyzed the expression of miRNAs in FNA cytologies, some authors investigated the use of circulating miRNAs, which would represent a simpler and less invasive procedure [117,118,119,120]. In particular, Pilli et al. [120] analyzed the expression of two miRNAs (miR-95, -190) in the serum of 72 patients with Bethesda III and IV FNAC and an available histological diagnosis, reaching a sensitivity of 71.9%, a specificity of 85%, a PPV of 79.3% and an NPV of 79.1%, with a prevalence of malignancy of 44%. Despite these promising results, the analysis of miRNAs in serum raises some concerns, such as the low level of miRNAs and the technical problems associated with the analysis of such samples.

Molecular testing of NIFTP

Noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) is an encapsulated or clearly delimited, noninvasive neoplasm with a follicular growth pattern and nuclear features of PTC. This entity was established in 2016 after a working group of thyroid experts reviewed the outcome of 108 patients with noninvasive follicular variant PTC not treated with radioactive iodine [121]. After a follow-up of at least 10 years no recurrence was recorded, and this peculiar entity was then reclassified as non-malignant. This reclassification aims to avoid overtreatment of patients with an indolent lesion. NIFTPs are associated with “RAS-like” mutations (RAS, BRAF K601E mutations, PAX8/PPARG, THADA fusions) [122], and share a gene expression profile with encapsulated follicular-variant PTC, minimally invasive follicular carcinoma and follicular adenoma [80]. Since all the commercial tests described here were developed prior to the nomenclature change, NIFTPs were classified as malignant in the validation sets. Accordingly, in both the validation studies and the “real-world” clinical settings, 95% and 80% of NIFTPs were classified as suspicious/malignant by GEC and ThyroSeq® v2, respectively (Supplemental Tables 1 and 2). The reclassification of NIFTP as a benign neoplasm would likely affect the predictive value of these tests.

Conclusions

The diagnosis of indeterminate lesions of the thyroid is a challenge in cytopathology practice. Indeed, up to 30% of cases lack the morphological features needed to provide a definitive classification. The molecular characterization of thyroid nodules has become easier and more exhaustive since the advent, in the last 10 years, of NGS and gene expression technologies, which have provided better stratification of patients. Two different categories of molecular tests have been developed: the ‘rule-out’ methods, which aim to reduce the avoidable treatment of benign nodules, and the ‘rule-in’ tests, whose purpose is to optimize surgical management (total thyroidectomy or loboisthmectomy). Although each test has different advantages and limitations in the evaluation of indeterminate FNA samples, their performance is progressively improving and they are predicted to become an integral part of thyroid nodule evaluation, especially if their cost is reduced. Finally, it should be highlighted that the genetic characterization of a thyroid nodule has a positive impact not only on the initial treatment but potentially on the follow-up of patients, too. Indeed, some molecular markers, including the most studied BRAF and TERT promoter mutations, have been shown to have prognostic value, and their evaluation is predicted to help in stratifying patients into distinct risk groups and in better assessing their outcome.

Moreover, in the era of targeted therapies, knowing the molecular signature of the tumor is crucial for the selection of the most appropriate antineoplastic compound.