Introduction

Fine-needle aspiration (FNA) for cytological evaluation is the pivotal tool in the management of thyroid nodules. The Bethesda System for Reporting Thyroid Cytopathology (TBRSTC) [1] is the most widely used system for classifying thyroid cytology worldwide, while the systems proposed by the British Thyroid Association (BTA) [2] and the Italian consensus for the classification and reporting of thyroid cytology (ICCRTC) [3] are used mainly in Europe. In theory, the aim of thyroid FNA is to discriminate cytologically benign nodules, which can be managed by clinical and ultrasound follow-up, from those warranting thyroidectomy because their cytology is consistent with malignancy. However, thyroid cytological assessment has some limitations, such as inconclusive reports (i.e., indeterminate samples) and FNA suspicious for malignancy (SFM). The latter category, namely Bethesda V [1], Thy4 [2], and TIR4 [3], represents a condition in which the likelihood of confirming cancer at histology ranges from 50 to 75% in TBRSTC [1], from 68 to 70% in BTA [2], and from 60 to 80% in ICCRTC [3]. Although these systems have been published in updated versions over time, the definition of the suspicious for malignancy category has not changed significantly [4,5,6]. Furthermore, the risk of malignancy (ROM) of this FNA category decreases when noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) is considered a non-malignant entity [7]. As a consequence, there is no general consensus on the best therapeutic management of patients with SFM. In this context, the most recent American Thyroid Association (ATA) guidelines suggest reducing both the extent of surgery and postoperative radioiodine treatment for many low-risk cancers, which increases the possibility of lobectomy as the initial surgical management [8].
Many factors could influence the choice between unilateral and bilateral thyroid surgery in these patients, and molecular testing is a major one [1, 8].

BRAF (V600E) is the most frequent genetic mutation in papillary thyroid cancer (PTC) and has been reported as a predictor of poor prognosis in these patients. Because PTC is the most frequent cancer type among nodules with preoperative FNA classified as Bethesda V [1], Thy4 [2], and TIR4 [3], there is a large literature on BRAF status testing in this setting. However, solid evidence on this topic is still lacking.

This study aimed to obtain more robust information in this field by conducting a systematic review and meta-analysis. The primary outcome was the reliability of BRAF (V600E) testing in detecting malignancy before surgery in nodules with an FNA reading of Bethesda V [1], Thy4 [2], or TIR4 [3]. The secondary outcome was the positive and negative predictive values (PPV and NPV, respectively) of the BRAF test for detecting malignancy, with surgical histology as the gold standard.

Methods

Conduct of review

The present systematic review and meta-analysis was conducted according to PRISMA guidelines [9].

Search strategy

A comprehensive literature search was conducted using the online databases of PubMed/MEDLINE and Scopus. The search aimed to find data on BRAF mutation analysis performed on cytological samples of nodules cytologically classified as suspicious for malignancy (i.e., Bethesda V [1], Thy4 [2], and TIR4 [3]). The online search used the following algorithm: (suspicious for malignancy AND molecular analysis) OR (Bethesda V AND BRAF) OR (Thy4 AND BRAF) OR (TIR4 AND BRAF) OR (suspicious thyroid lesions AND BRAF) OR (suspicious thyroid cancer AND BRAF) OR (suspicious for malignancy AND BRAF) OR (suspicious thyroid lesions AND molecular analysis) OR (suspicious thyroid cancer AND molecular analysis). No beginning date limit was used. The search was updated until June 15, 2019, and no language restrictions were applied. To expand the search, the references of the retrieved articles were also screened to identify additional studies.
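
The search algorithm above is a disjunction of nine two-term conjunctions, so it can be assembled programmatically to avoid transcription errors when rerunning it. The following is a minimal sketch only: the names `TERM_PAIRS`, `build_query`, and `QUERY` are illustrative, and database-specific syntax (field tags, phrase quoting) is not described in the text and is therefore omitted.

```python
# Term pairs copied from the search algorithm described in the text.
TERM_PAIRS = [
    ("suspicious for malignancy", "molecular analysis"),
    ("Bethesda V", "BRAF"),
    ("Thy4", "BRAF"),
    ("TIR4", "BRAF"),
    ("suspicious thyroid lesions", "BRAF"),
    ("suspicious thyroid cancer", "BRAF"),
    ("suspicious for malignancy", "BRAF"),
    ("suspicious thyroid lesions", "molecular analysis"),
    ("suspicious thyroid cancer", "molecular analysis"),
]


def build_query(pairs):
    """Join '(left AND right)' clauses with OR, mirroring the stated algorithm."""
    clauses = [f"({left} AND {right})" for left, right in pairs]
    return " OR ".join(clauses)


# Hypothetical variable holding the assembled search string.
QUERY = build_query(TERM_PAIRS)
```

The resulting string could then be pasted into each database's search box, after adapting it to that database's own query syntax.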

Study selection

As the main inclusion criterion, only original articles reporting BRAF status analyzed on FNA specimens after a cytological diagnosis of suspicious for malignancy according to the above systems [1,2,3] were included. Exclusion criteria were absent or incomplete BRAF analysis data, use of another classification system, BRAF testing performed on non-cytological specimens, series comprising fewer than ten cases, series comprising only malignant post-surgical diagnoses, and articles with overlapping data on patients or nodules. Two researchers (PT, LS) independently reviewed the titles and abstracts of the retrieved articles, applying the selection criteria; then, all authors independently reviewed the full text of the remaining articles to determine their final inclusion.

Data extraction

For each included study, the following information was extracted independently by two investigators (PT, LS) in a piloted form: (1) study data (authors, year and journal of publication, country of origin); (2) number of nodules with suspicious FNA; (3) cytological system used; (4) number of BRAF (V600E) mutated cases among nodules with suspicious FNA; (5) cytological preparation; (6) number of cancers and benign lesions at histology among patients operated upon. Data were cross-checked, and any discrepancies were discussed and resolved by mutual agreement.

Study quality assessment

The risk of bias of included studies was assessed independently by two reviewers (PT, LS) through the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool for the following domains: patient selection, index test, reference standard, and flow and timing. Risk of bias and concerns about applicability were rated as low, high, or unclear.

Statistical analysis

A proportion meta-analysis was performed to obtain the pooled rate of BRAF mutated cases on FNA samples among all suspicious cytologies, also stratified by subgroup. Moreover, a proportion meta-analysis was performed to obtain the pooled rate of BRAF mutated cases detected on FNA among all BRAF mutated cancers detected on histological specimens. Specifically, a lesion-based analysis was conducted. For statistical pooling of data, the DerSimonian and Laird method (random-effects model) was used [10]. Pooled data are presented with 95% confidence intervals (95% CI) and displayed using a forest plot. The I-square index was used to quantify the heterogeneity among the studies, and significant heterogeneity was defined as an I-square value > 50%. A funnel plot was constructed for each outcome, and publication bias was considered possible when smaller studies had, on average, results different from those of the larger ones. Statistical analyses were performed using the StatsDirect statistical software (StatsDirect Ltd; Cambridge, UK).
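
To make the pooling step concrete, the sketch below implements a DerSimonian and Laird random-effects estimate together with the I-square index. It is an illustration under stated assumptions, not the StatsDirect procedure used in the study: it pools raw proportions with binomial variances (no variance-stabilizing transformation), the function name `dersimonian_laird` is hypothetical, and the counts in the usage example are invented.

```python
import math


def dersimonian_laird(events, totals):
    """Pool proportions with the DerSimonian-Laird random-effects estimator.

    Simplifying assumption: raw proportions with binomial variance p(1-p)/n,
    so every study needs 0 < events < total, and at least two studies are needed.
    """
    props = [e / n for e, n in zip(events, totals)]
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1 / v for v in variances]
    fixed = sum(wi * pi for wi, pi in zip(w, props)) / sum(w)

    # Cochran's Q and the between-study variance tau^2.
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, props))
    df = len(props) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)

    # I-square: share of total variation attributable to heterogeneity (%).
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * pi for wi, pi in zip(w_re, props)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "tau2": tau2,
        "i2": i2,
    }


# Invented counts for four hypothetical studies (mutated cases / nodules tested).
result = dersimonian_laird(events=[12, 30, 45, 20], totals=[40, 50, 60, 80])
```

With these invented counts the study proportions differ widely, so `result["i2"]` exceeds the 50% threshold that the text uses to define significant heterogeneity.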

Results

Eligible articles

A total of 360 original articles were found using the above algorithm. After applying the above selection criteria, 34 studies were finally included in the review [11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44]. Figure 1 illustrates the flow of the search.

Fig. 1 Flowchart of study selection process. Full explanation of the terms used for the search is reported in the text. SFM, suspicious for malignancy

Qualitative analysis (systematic review)

The 34 included studies were published by authors from 12 different countries between 2009 and 2019. Overall, 1428 nodules with an FNA report of Bethesda V, Thy4, or TIR4 were found. Of these, 1287 (90.1%) lesions underwent surgery, and 1153/1287 (89.6%) of these were histologically proven to be malignant. The majority of studies (25/34) performed only the BRAF test, while the others performed one or more further genetic analyses alongside BRAF. The analysis of BRAF (V600E) status was always performed with commercial PCR kits. Most studies classified nodules according to TBRSTC. Data on nodule size were usually missing. Study sample size ranged from 11 to 142 nodules. Table 1 shows in detail the main characteristics of the included studies. Figure 2 schematically illustrates the data available from the included studies. Notably, 20 studies (58.8%) reported histologically proven diagnoses for both cancers and benign lesions.

Table 1 Summary of main characteristics of the 34 articles included in the meta-analysis
Fig. 2 Data found in the studies included in the review

Study quality assessment

As summarized in Supplemental Table 1, in many of the 34 studies, the overall risk of bias and the concerns about applicability were considered to be low. However, a high risk of bias and concerns regarding applicability for patient selection (and also for the flow and timing item) were present in five studies [18, 26, 39, 40, 43], where BRAF mutational analysis was not routinely and consecutively performed in all nodules cytologically suspicious for malignancy but only in selected cases. Risk of bias and applicability concerns for the index test were rated as high in two articles [20, 42], since a palpation-guided technique for FNA was adopted for part of the nodules. Moreover, although six studies [21, 22, 27, 30, 33, 44] had a high risk of bias for the reference standard, the accuracy of preoperative FNA in predicting malignant histology was within the ranges specified in the cytological classification systems [1,2,3]. Lastly, QUADAS-2 items were judged unclear when studies reported approximate or missing data.

Quantitative analysis (meta-analysis)

The meta-analysis of the prevalence of BRAF (V600E) mutation among nodules with SFM FNA (Bethesda V, Thy4, TIR4) was conducted by pooling the results of 1428 lesions from the 34 studies (Table 2). The rate of positive BRAF ranged from 15 to 85% across the series. The pooled rate was 47% (95% CI = 40 to 54). Heterogeneity was high (I² = 85.5%) (Fig. 3a). The funnel plot did not suggest significant publication bias (Fig. 3b). In view of the heterogeneity found among all articles, we performed a subgroup analysis considering only studies with more than 50 nodules. However, the pooled prevalence did not change (50%, 95% CI = 36 to 64), and heterogeneity among studies remained high (I² = 93%).

Table 2 Data available in the 34 articles included in the systematic review
Table 3 Pooled PPV and NPV of BRAF (V600E) testing performed on FNA read as suspicious for malignancy (i.e., Bethesda V [1], Thy4 [2], TIR4 [3])
Fig. 3 Meta-analysis of the prevalence of mutated BRAF (V600E) cases among all FNA samples read as suspicious for malignancy (i.e., Bethesda V [1], Thy4 [2], TIR4 [3]). a Pooled prevalence (95% CI); b funnel plot

Both PPV and NPV were calculated considering only the 20 articles [13, 14, 18,19,20,21,22, 25, 26, 29, 31, 33, 34, 37,38,39,40,41, 43, 44] in which the histological diagnoses of both cancers and benign lesions were detailed according to BRAF status. These articles comprised 903 nodules overall, of which 505 were BRAF (V600E) mutated and 398 were wild-type. Among these nodules, 812 were PTC (501 with mutated BRAF and 311 with wild-type BRAF) and 91 were benign. Pooled PPV and NPV of BRAF testing were 99% (95% CI, 97–99) and 24% (95% CI, 16–32), respectively. No significant publication bias was apparent (Table 3).
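
As a cross-check of the counts given above, the crude (unweighted) predictive values can be computed directly from the 2 x 2 table they imply. This is a sketch only: the function name `predictive_values` is hypothetical, and the crude figures are expected to differ slightly from the pooled random-effects estimates reported in the text, because pooling weights each study separately.

```python
def predictive_values(tp, fp, fn, tn):
    """Return crude (PPV, NPV) from the four cells of a 2 x 2 diagnostic table."""
    ppv = tp / (tp + fp)  # mutated and malignant / all mutated
    npv = tn / (tn + fn)  # wild-type and benign / all wild-type
    return ppv, npv


# Cells derived from the counts reported in the text.
tp = 501        # BRAF mutated, PTC at histology
fp = 505 - 501  # BRAF mutated, benign at histology (4)
fn = 311        # BRAF wild-type, PTC at histology
tn = 398 - 311  # BRAF wild-type, benign at histology (87)

ppv, npv = predictive_values(tp, fp, fn, tn)
print(round(ppv, 3), round(npv, 3))  # prints 0.992 0.219
```

The derived cells are internally consistent with the text: 4 + 87 gives the 91 benign lesions, and the four mutated benign cases match the false positives discussed later.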

Discussion

Whether BRAF (V600E) testing could help guide the choice of surgical approach in patients with SFM thyroid FNA is still unclear. To date, four meta-analyses have been reported on this topic [45,46,47,48]. All these reviews included data published up to 2014 from studies on indeterminate nodules (i.e., Bethesda III, Bethesda IV, Bethesda V). Jia et al. [45] did not report specific data on the SFM FNA category. Fnais et al. [46] included a limited number of studies and reported BRAF status only in histologically proven PTC. Su et al. [47] included clinical plus cytological follow-up alongside histology as the reference standard. Jinih et al. [48] did not detail results for SFM. Also, these studies did not consider the most recent classifications for thyroid FNA [1,2,3] as the only standard to identify SFM. Given these study designs, the present meta-analysis differs substantially from the earlier ones, and its results are new.

Our systematic review included a large number of studies evaluating BRAF testing in a selected sample of nodules with FNA read as SFM. Importantly, these studies performed BRAF mutation analysis on cytological samples. Following our search algorithm, we found a substantial number of articles reporting data on the use of the BRAF (V600E) test in FNA classified according to TBRSTC [1], BTA [2], or ICCRTC [3]. The overall number of nodules that underwent BRAF testing (i.e., 1428) was substantial, and 90.1% of these had histological follow-up. The pooled rate of mutated BRAF cases was 47%, and publication bias was not apparent (i.e., the rate of BRAF mutation was not influenced by the sample size of the studies). These results did not change when we considered only the subgroup of articles with larger samples. As a major result, even though the PPV of the BRAF test was excellent (99%), its NPV was very poor (24%). These findings are relevant for clinical practice.

The expected cancer rate of nodules with FNA suspicious for malignancy ranges from 50 to 80% according to available guidelines [1,2,3]. International guidelines [2, 8, 49,50,51] agree on the indication for surgery in these patients, but the optimal approach is widely debated. Recommendation 20 of the ATA guidelines [8] strongly indicates total thyroidectomy for nodules cytologically classified as SFM harboring BRAF (V600E), RET/PTC, and PAX8/PPARγ mutations. However, these guidelines state that mutational analysis of BRAF or the 7-gene mutation marker panel might be considered if such data would be expected to alter surgical decision-making (R17B; weak recommendation, moderate-quality evidence). Moreover, the ATA guidelines [8] state that thyroid lobectomy alone is sufficient for < 1 cm, unifocal, intrathyroidal carcinomas in the absence of prior head and neck irradiation, familial thyroid carcinoma, or clinically detectable cervical nodal metastases, and may serve as the initial procedure for thyroid cancer > 1 cm and < 4 cm without extrathyroidal extension and without clinical evidence of any lymph node metastases (R35; strong recommendation, moderate-quality evidence). In this context, the AACE/ACE/AME Task Force on Thyroid Nodules [49] recommends using BRAF mutational analysis or intraoperative histological examination as a guide for the extent of surgery, without specifying the type of surgery or whether clinical data should be considered in the surgical decision. The BTA 2014 guidelines [2] recommend diagnostic hemithyroidectomy for Thy4 and mention molecular tests only as a tool to assist in risk stratification, without providing details about genes. NCCN guidelines [50] state that total thyroidectomy or lobectomy with isthmusectomy should be indicated on the basis of clinical and morphological criteria and not on the basis of molecular diagnostics, which may be performed for Bethesda III and IV results.
Lastly, the European Thyroid Association (ETA) guidelines [51] report that selected mutations, such as BRAF, TERT, and TP53, need further investigation; nonetheless, the main text of these guidelines [51] asserts that the detection of BRAF (and, probably, also of TERT promoter and TP53 mutations) may favor total thyroidectomy if the clinico-pathological setting is appropriate, and that the identification of specific mutations, such as BRAF (V600E) and TERT, in thyroid nodules > 1 cm justifies a total thyroidectomy and possibly prophylactic central lymph node dissection.

The results of our meta-analysis indicate that routine BRAF testing is not warranted in all FNA read as SFM. In fact, the available literature suggests that this approach is likely not cost-effective, because the test is positive in fewer than 50% of these cases. However, when managing patients with clinico-pathological features favoring more extensive surgery, BRAF may reasonably be used as a rule-in test, because its positivity strongly indicates the presence of cancer. Conversely, in patients without further suspicious features, the BRAF test does not help corroborate the clinical indication for non-extended surgery (i.e., rule-out use of BRAF). For example, in a patient with a single nodule cytologically SFM and no evidence of extrathyroidal extension at ultrasound, a lobectomy may be considered, and this clinical decision cannot be significantly influenced by BRAF testing because of its poor NPV. According to the present results, R17 of the ATA guidelines should be regarded as supported by high-quality evidence, and the grade of recommendation might be revised.

In our pooled series, four BRAF mutated cases were false positives (i.e., benign at histology). This is a critical issue, since BRAF mutation is considered pathognomonic of PTC and the decision to move to total thyroidectomy plus lymphadenectomy may be based on this finding. One BRAF mutated case on cytology was an oncocytic adenoma [18] at histology. The other three cases showed a chronic lymphocytic thyroiditis, a multinodular hyperplasia [20], and an undefined benign lesion (thyroiditis or nodular hyperplasia) [39]. It is known that some highly sensitive BRAF detection methods can yield false positive results by detecting the mutation when it is present in only 2% of cells. It should also be remembered that BRAF is frequently mutated in other, more common neoplastic conditions, such as melanoma, colorectal carcinoma, lung carcinoma, and leukemia, which can affect the thyroid as metastases. Kuhn et al. [52] detected BRAF mutated cells in an FNA of the thyroid classified as SFM that turned out at histology to be primary Langerhans cell histiocytosis of the thyroid. In the latter examples, the BRAF mutation cannot be considered a false positive result; however, management could differ from that of a primary thyroid tumor.

Some limitations of the present findings should be mentioned. First, we have no data on the impact of the BRAF test on surgical decision-making. All included studies are retrospective, and no randomized controlled trial exists on this topic. Thus, whether BRAF status would have influenced the surgical approach is not known. Second, high heterogeneity was present, probably due to the high number of studies included. The strength of the present study was its design, which yielded clear data, unlike the previous reviews [45,46,47,48].

In conclusion, the present meta-analysis showed that BRAF (V600E) mutation was found in about one in two nodules with thyroid FNA read as suspicious for malignancy according to the most recent classification systems. The PPV of the BRAF test for predicting malignant lesions was excellent, while its NPV was very poor. Considering these evidence-based data, routine BRAF testing in FNA suspicious for malignancy cannot be recommended. In patients with an FNA report of Bethesda V, Thy4, or TIR4, BRAF may be useful to support a more extended surgical approach in selected cases with further suspicious clinical/ultrasound features. A role for BRAF testing in guiding less extensive surgery is not supported.