Introduction

Thyroid nodules are highly prevalent (20 % to 76 %) in the general population, and the occurrence increases with age [1]. Most are asymptomatic and can only be identified with imaging techniques. High-frequency (12–18 MHz) linear transducers provide high spatial resolution images, allowing detection of 1-mm cysts and recognition of solid nodules starting at 2–3 mm. Because of this fact and because of the increasing experience acquired in this field, ultrasound (US) has been established as the first detection tool of choice in thyroid studies [2].

US can identify thyroid lesions across a wide spectrum of morphology and size. This makes it difficult for the endocrinologist to select nodules for fine needle aspiration biopsy (FNAB) without, on the one hand, over-indicating the procedure or, on the other, excluding cases that do require diagnostic puncture. To assist the specialist, it is therefore necessary to establish US parameters with adequate predictive values, either to determine the presence of cancer or to identify benign lesions.

The thyroid imaging, reporting and data system (TIRADS) classification was created in our institution by a multidisciplinary team and published in 2009 [3] as an attempt to solve the problem of nodule selection for FNAB. It adapts to thyroid pathology the American College of Radiology's (ACR's) breast imaging reporting and data system (BI-RADS) [4], which is used universally today in breast imaging. The TIRADS classification is based on a prospective series of patients who underwent a sonographic study in which nodules were assigned to one of 10 sonographic patterns; each pattern establishes a malignancy risk through its corresponding TIRADS category (2 to 5). By keeping the malignancy rates of the BI-RADS system, we were able to apply a similar management algorithm (puncture for TIRADS 4 and 5 nodules, follow-up for nodules in the remaining categories). Nonetheless, the original work was inherently limited by its use of FNAB as the gold standard. FNAB cyto-histologic diagnosis includes a percentage of undetermined lesions whose final result (benign or malignant) could not be quantified, since surgery was not performed on all of them. Because of this uncertainty, we performed a validation study against a surgical reference standard to confirm the utility of our TIRADS classification.

The goal of the present study is to validate the US TIRADS classification in a prospective surgical series of 502 nodules by determining its sensitivity, specificity, predictive values and likelihood ratios, parameters that will allow confirmation of its role in the management of thyroid nodular pathology.

Materials and methods

A prospective cohort registry of patients undergoing thyroid US and thyroidectomy was conducted in our centre after Institutional Review Board approval.

TIRADS classification

The TIRADS classification was previously published [3]. Briefly, the following elements can be noted:

By analogy with the ACR's BI-RADS system [4], TIRADS assigns a score both to general thyroid pathology (TIRADS 1–6; Table 1) and to US-assessed thyroid nodules (TIRADS 2–6), with an increasing probability of a cancer diagnosis (e.g. TIRADS 1 is a normal exam, TIRADS 6 is a proven malignancy). The system assigns all US-detected thyroid lesions to groups (TIRADS 2–5), assuming the same malignancy risks established in the BI-RADS system. As a result, the classification also aligns their clinical management with that of BI-RADS (e.g. TIRADS 4 and 5: FNAB; TIRADS 2 and 3: follow-up).

Table 1 TIRADS categories. Malignancy risk established in this surgical series and the recommendations for clinical management (categories TIRADS 2–5)

Since there is no individual ultrasonographic sign that allows highly accurate prediction of malignancy on its own, TIRADS is based on 10 ultrasound patterns (Table 2). This allows the classification of virtually all thyroid nodules identified by US (Figures 1, 2, 3, and 4). To define and specify each US pattern, the following variables were considered: sonographic structure, echogenicity, shape, orientation, borders, capsule presence/absence, calcifications, hyperechoic spots and vascularization.

Table 2 Ten ultrasound patterns, their definitions and corresponding TIRADS category
Fig. 1

Colloid patterns: Type 1: Oval anechoic image with hyperechoic spot = colloid cyst (a-c). They are benign (TIRADS category 2) and do not require diagnostic aspiration or treatment. Type 2: Spongiform nodule, grid–shaped in which the thyroid parenchyma has become less compact locally, due to a larger amount of colloid material (d-f). Hyperechoic spots are always seen. They lack a capsule and are highly vascularized on color Doppler evaluation (f). Classified as benign (TIRADS category 2) and do not need aspiration. Type 3: Mixed hyperplastic colloid nodule, in which the fluid/colloid component (g) or the solid isoechoic component (h) can dominate. In the mainly cystic lesions, vegetating images, septa and thickened walls can be seen. Usually vascularized on color Doppler images (i). The presence of hyperechoic spots suggests benignity (TIRADS category 2). Without hyperechoic spots they are classified as TIRADS category 3

Fig. 2

Thyroiditis patterns: 1) Hashimoto's chronic thyroiditis on US (a–d): Gland of variable size, undulated surface, heterogeneous structure, decreased echogenicity and highly vascularized on color Doppler (d). Perithyroid lymph nodes (b, arrow) may be found. These are benign findings belonging to TIRADS category 2, with two variants: Hyperechoic pseudo-nodule variant (“white knight”) corresponds to an area rich in Hurthle cells (a, arrow). This is a benign finding, thus no diagnostic aspiration is required (TIRADS category 2). Hypoechoic pseudo-nodule variant (c, arrow): hypoechoic pseudo-nodules that appear to be different from the other thyroiditis foci dispersed within the parenchyma. TIRADS category 3. 2) De Quervain's subacute thyroiditis (e–h): Multiple, markedly hypoechoic lesions, irregular in shape, with indistinct, blurred margins (e, j arrows). Initially, the hypoechoic lesions are seen as poorly vascularized (h), while in the late, regeneration phase, their vascularization increases (i). During the first weeks, very small, round, markedly hypoechoic lymph nodes (g, short arrows) appear in infra-thyroid and peri-isthmic location. TIRADS category 2. De Quervain’s thyroiditis scar in transverse view (k) and longitudinal view (l)

Fig. 3

Neoplastic Patterns: nodules completely surrounded by a capsule (a-p). They can be solid or mixed (c,e), isoechoic (a,c,e-h,p), hyper- (b,i,k) or hypoechoic (d,l,o). Usually vascularized on color Doppler, with peripheral vessels, from which intranodular branches arise (h). Two variants: 1) Simple neoplastic pattern (e-h) distinguished by its continuous but thin hypoechoic halo. It has a low degree of suspicion (TIRADS category 4A) and usually corresponds to a colloid nodule in its early evolutionary stage. 2) Suspicious neoplastic pattern describes the solid or mixed nodules with a real continuous capsule (i-p), associated with other findings that increase its risk (irregular capsule (i,j), a hyperechoic nodule with homogeneous structure (i,k) – TIRADS category 4B). The alternation of hypo- and hyperechoic areas within a solid encapsulated nodule – “mosaic” aspect (m) – is suspicious for papillary cancer, as is a completely hypoechoic, encapsulated nodule (d,l,o). The presence of calcifications increases the degree of suspicion (o,p), including “egg-shell” peripheral calcifications (o)

Fig. 4

Malignant Patterns: Three variants: 1) Malignant Pattern A describes solid, hypoechoic nodules with irregular shape, microlobulated or irregular margins, no capsule (a-j). Approximately 65% of the nodules with this aspect prove malignant (TIRADS category 4B); the rest correspond principally to colloid nodules. The small malignant nodules are compact, round in shape (c,d) or sometimes taller than wide (b,e), and vessels are not visualized in their interior, only afferent vessels that penetrate the lesion (j), whereas the small colloid nodules can have an oval shape and are frequently seen as highly vascularized. The presence of calcifications, either microcalcifications (f,g) or coarse calcifications (h,i), increases the malignancy risk up to 77% (TIRADS category 4C). 2) Malignant Pattern B nodules (k-o) are highly suggestive of malignancy (>95%, TIRADS category 5). They are solid, hypo- or isoechoic, irregular in shape and margins, vascularized, without capsule. Microcalcifications are always present, distributed mainly towards the periphery of the lesion (k-n). There is a rare variant: when no nodule is visualized, only dispersed microcalcifications can be seen in the parenchyma (o). 3) Malignant Pattern C represents the less frequent US appearance of papillary carcinoma (p-t). It consists of solid (p) or mixed (q-t) isoechoic non-encapsulated nodules, which are always highly vascularized (t). At first sight, they can be misinterpreted as colloid nodules, but hyperechoic spots (the principal finding that characterizes colloid nodules) cannot be found. The presence of calcifications increases the suspicion of malignancy (q-s). A nodule with these characteristics should be aspirated (TIRADS category 4C)

Ultrasound technique and patient management

IU22 Gemini scanners (Philips Healthcare) with 5–12- and 5–17-MHz transducers and colour Doppler mode were used. Thyroid US and US-guided FNAB were performed by specialised radiologists, with more than 12 years of experience in both thyroid US and interventional procedures (EH, JN, CW).

The radiology report routinely includes TIRADS classification, expressing malignancy risk of the nodule found. In case of multiple nodules, the most suspicious one is considered.

Patients with nodules classified as TIRADS 4 or 5 undergo FNAB and cyto-histology evaluation. Our researchers used the “clot technique” [5], which consists of preparing smears and a cellblock from a blood-rich FNAB sample obtained through a single (exceptionally two) puncture using a 19–21-G needle under local anaesthesia. These cellblocks include tissue fragments for histopathology analysis.

Interpretation of the FNAB samples for cyto-histological analysis is based on Bethesda terminology that proposes various diagnostic categories according to malignancy risk and clinical management recommendations [6].

All patients with a surgical indication (malignancy, suspicion of malignancy, follicular neoplasia or follicular lesion in FNAB) are reassessed with a pre-operative US of the neck for nodal staging. A diagram of the cervical lymph nodes is added to the radiologist's report, pointing out the location of suspicious or clearly malignant nodes.

Thyroid surgical interventions are always total thyroidectomies; hemi-thyroidectomies are not performed in our institution. Because specimens from thyroidectomies performed for a TIRADS 4–5 indication usually also contained TIRADS 2 and 3 nodules, we were able to obtain histological correlation for nodules that would otherwise not have been operated on. This provided the pathological information on TIRADS 2–3 nodules for later comparison with the US patterns.

Study patient cohort

Between June 2009 and October 2012, consecutive patients who had undergone pre-operative loco-regional staging US and pathology studies (thyroidectomy specimens) registered at our institution were included. We excluded patients with incomplete surgical or pathological information, those undergoing surgery at other institutions, and nodules whose anatomo-pathological characterization was not possible due to tissue manipulation.

After staging with US, all nodules amenable to recognition on gross examination were drawn (with a maximum of 10 nodules per patient), including their spatial location, size, US pattern and TIRADS category, in a specially designed chart entitled “Ultrasonographic Chart” (Appendix A1 - Supplementary material). This included the primary nodule and any other nodule identifiable on US, regardless of US pattern or TIRADS category. The TIRADS score recorded in the database was not modified after the FNAB result became available.

The radiologist completed a second chart exclusively for the pathologist, entitled “Pathology Chart”, in which the same schematic representation of the nodules was kept without information on the US pattern or TIRADS category (Appendix A2 - Supplementary material). The pathologist (AC) had access only to this chart, in which the final pathology results (e.g. benign or malignant) for each of the identified nodules were recorded.

Relevant information from both charts was transcribed to a third chart, entitled “Concordance Chart”, by an independent team (IR, VS). The ultrasound pattern and TIRADS category of each nodule were linked to its final pathology result, thus completing the database. An anonymous correlative identification number was assigned to each identified, enrolled nodule.

Pathology

Pathology analysis of the thyroidectomy specimens was done by a single pathologist specialised in thyroid with more than 15 years of experience (AC).

The thyroidectomy specimens were fixed in formalin and sent to the pathology laboratory, together with the patient's clinical information and the “Pathology Chart”. In each sample, the upper pole of the right lobe was labeled. In the pathology laboratory the posterior surface of the gland was inked and sectioned in 1-2 mm thick slices, parallel to the sagittal plane. During this process, every nodule identified was labeled and submitted separately for analysis. Shape, size, colour, consistency, presence of capsule, calcification and localisation within the gland were recorded for each nodule. In every case, the findings were correlated with the diagram included in the “Pathology Chart”. Microscopic analysis of all identified nodules was performed and a histopathology diagnosis made. Pathology results were recorded in the “Pathology Chart” for each nodule separately.

Statistical analysis

Qualitative characteristics were described with their percentage distributions. For quantitative variables, means with standard deviation (SD) were used, or medians with interquartile ranges (IQR) in case of asymmetry. Normal distribution of quantitative variables was assessed using Shapiro–Wilk testing. Different cut-off points were analysed in order to evaluate the diagnostic capacity of the TIRADS classification, estimating sensitivity, specificity, positive and negative predictive values (PPV, NPV), and positive and negative likelihood ratios (PLR, NLR).

Interobserver agreement: A subset of 30 consecutive nodules was selected from our database, and 3 experienced radiologists independently examined and rated all US images available. The interobserver agreement was measured using a weighted kappa statistic with 95 % confidence intervals (CI). Levels of agreement included values ≤ 0 as no agreement; 0.01–0.20 as slight, 0.21–0.40 as fair, 0.41–0.60 as moderate, 0.61–0.80 as substantial, and 0.81–1.00 as almost perfect agreement.
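As an illustration of the agreement measure described above, the following minimal Python sketch computes a weighted kappa for one pair of raters and maps a kappa value to the agreement levels listed. The source does not state the weighting scheme or how the three raters were combined, so the linear weights, the pairwise design, and the function names `weighted_kappa` and `agreement_level` are assumptions made for illustration only.

```python
def weighted_kappa(rater_a, rater_b, categories):
    """Linear-weighted Cohen's kappa for two raters on an ordinal scale.

    Assumes at least two categories and that the raters do not both use
    one single identical category for every item (otherwise the expected
    disagreement is zero and kappa is undefined).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Observed joint proportions of (rating by A, rating by B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Disagreement weights: 0 on the diagonal, growing linearly with distance.
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]

    return 1.0 - observed_disagreement / expected_disagreement


def agreement_level(kappa):
    """Map a kappa value to the agreement levels listed in the text."""
    if kappa <= 0:
        return "no agreement"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ratings, not data from this study.
scale = ["2", "3", "4A", "4B", "4C", "5"]
reader_1 = ["2", "3", "4A", "4B", "4C", "5"]
reader_2 = ["2", "3", "4B", "4B", "4C", "5"]
print(agreement_level(weighted_kappa(reader_1, reader_2, scale)))
```

With three readers, one common option is to compute the statistic for each pair and summarise the pairwise values; whether that was done here is not stated in the source.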

Results

A total of 502 nodules (in 210 patients) that were identified in the gross specimen and were also classifiable by US were included. The average number of nodules per patient was 2.39 (±1.64). Median age was 46 years (IQR 18 years). Of the 210 patients, 164 were women (78.1 % of the sample group).

Nodules had a median size of 7 mm (3–60 mm) with an IQR of 7 mm. The overall distribution in TIRADS categories was as follows: 116 TIRADS 2 (23.11 %), 56 TIRADS 3 (11.15 %), 243 TIRADS 4 (48.41 %), and 87 TIRADS 5 (17.33 %). TIRADS category 4 was further divided into 4A (17 nodules, 6.99 %), 4B (78 nodules, 32.10 %) and 4C (148 nodules, 60.91 %).

The percentage of malignancy for each category was as follows: 0 % (0/116) in TIRADS 2, 1.79 % (1/56) in TIRADS 3, 76.13 % (185/243) in TIRADS 4 [considering subgroups: 5.88 % (1/17) in TIRADS 4A, 62.82 % (49/78) in TIRADS 4B, 91.22 % (135/148) in TIRADS 4C], and 98.85 % (86/87) in TIRADS 5 (Table 1). In this surgical series, the malignancy percentage corresponds to a global value of 54.18 % (272/502).

A cut-off point in TIRADS 4 for malignancy yielded a sensitivity of 99.6 % (CI 95 %: 98.9–100.0), specificity of 74.35 % (CI 95 %: 68.7–80.0), with a PPV of 82.1 % (CI 95 %: 78.0–86.3), an NPV of 99.4 % (CI 95 %: 98.3–100.0), a PLR of 3.9 (CI 95 %: 3.6–4.2) and an NLR of 0.005 (CI 95 %: 0.003–0.04).

Considering the TIRADS 4 subgroups and a cut-off point in TIRADS 4B, this provided a sensitivity of 99.3 % (CI 95 %: 98.2–100.0) and specificity of 81.3 % (CI 95 %: 76.3–86.3), with a PPV of 86.3 % (CI 95 %: 82.4–90.1), an NPV of 98.9 % (CI 95 %: 97.5–100.0), a PLR of 5.31 (CI 95 %: 4.99–5.65) and an NLR of 0.01 (CI 95 %: 0.005–0.04).
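The diagnostic parameters at both cut-off points can be recomputed from the per-category counts reported above. The Python sketch below is illustrative only: it assumes that a cut-off "in TIRADS 4" treats categories 4A and above as test-positive, and it uses a normal-approximation (Wald) interval clipped to [0, 1]. The source does not state how its confidence intervals were obtained; the Wald interval is an assumption that happens to agree with the reported intervals for sensitivity and specificity. The names `COUNTS`, `two_by_two`, `wald_ci` and `diagnostic_metrics` are hypothetical.

```python
import math

# (malignant nodules, total nodules) per TIRADS category, as reported above.
COUNTS = {
    "2": (0, 116),
    "3": (1, 56),
    "4A": (1, 17),
    "4B": (49, 78),
    "4C": (135, 148),
    "5": (86, 87),
}
ORDER = ["2", "3", "4A", "4B", "4C", "5"]


def two_by_two(cutoff):
    """Return (TP, FP, FN, TN) when `cutoff` and higher categories are positive."""
    positive = ORDER[ORDER.index(cutoff):]
    negative = ORDER[:ORDER.index(cutoff)]
    tp = sum(COUNTS[c][0] for c in positive)
    fp = sum(COUNTS[c][1] - COUNTS[c][0] for c in positive)
    fn = sum(COUNTS[c][0] for c in negative)
    tn = sum(COUNTS[c][1] - COUNTS[c][0] for c in negative)
    return tp, fp, fn, tn


def wald_ci(p, n, z=1.96):
    """Normal-approximation 95 % interval for a proportion (assumed method)."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def diagnostic_metrics(cutoff):
    """Sensitivity, specificity, predictive values and likelihood ratios."""
    tp, fp, fn, tn = two_by_two(cutoff)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "plr": sensitivity / (1.0 - specificity),
        "nlr": (1.0 - sensitivity) / specificity,
        "sensitivity_ci": wald_ci(sensitivity, tp + fn),
        "specificity_ci": wald_ci(specificity, tn + fp),
    }


for cutoff in ("4A", "4B"):
    m = diagnostic_metrics(cutoff)
    print(cutoff, {name: value for name, value in m.items() if not name.endswith("_ci")})
```

Under these assumptions the 2 × 2 tables are 271/59/1/171 (TP/FP/FN/TN) for the TIRADS 4 cut-off and 270/43/2/187 for the TIRADS 4B cut-off, and the resulting point estimates agree with the reported values to rounding.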

A subset of 30 nodules from the database was used for assessment of interobserver agreement, and weighted kappa results were 0.66 (95 % CI: 0.43–0.77).

Discussion

The routine use of thyroid US has led to increasing detection rates of thyroid nodules in the general population [7]. Although the vast majority of these nodules are benign, a significant number of diagnostic aspirations are performed to rule out malignancy. Defining reliable and reproducible US criteria is essential for optimal resource allocation and for decreasing the stress and anxiety of patients during subsequent studies. Robust US criteria will allow the selection of those nodules warranting diagnostic puncture due to higher malignancy risk. Moreover, a standardised language and reporting system for both radiologists and clinicians is needed.

Our researchers envisioned the adaptation of the BI-RADS concept to thyroid pathology. Thus, the novel TIRADS concept was created at our institution in 2000–2001, and first reported in 2009 [3]. Like the BI-RADS classification, TIRADS assigns a score on a scale of 1 to 6 for general thyroid pathology, and 2 to 6 for nodules, with an increasing possibility of malignancy. Moreover, its malignancy risk categorisation mirrors the BI-RADS system in terms of clinical management guidance (e.g. TIRADS 4 and 5: FNAB; the remaining categories: follow-up only).

Since the Koike et al. [8] and Kim et al. [9] studies in 2001, there has been an abundance of publications that have attempted to define the risk of malignancy of thyroid nodules, focused mainly on the characteristics of non-follicular lesions, as found in papillary cancer (representing 80 % of thyroid cancers). These studies [10–17] are based on the assessment of different US characteristics such as marked hypoechogenicity, irregular margins, presence of microcalcifications, increased vascularity and being taller than wide, among others. Nonetheless, on its own, none of these is enough to determine malignancy with an adequate predictive value. Therefore, the authors of these studies have proposed various formulae to combine them and improve the diagnostic yield. The TIRADS relies on US patterns (10 in total), which, in turn, result from combining different individual US characteristics.

The TIRADS acronym was used for the first time in 2009 [3]. Since then, Park et al. [15] used the TIRADS concept in a retrospective study to derive an equation with 12 variables, using multiple logistic regression analysis, resulting in 5 categories of assessment: T-US 1–5 with an increasing chance of malignancy. Russ et al. [16] published their TIRADS categories based on 24 sonographic characteristics. Their combination defined the TIRADS categories 1, 2, 3, 4A, 4B, 4C and 5. Their study was based on a retrospective analysis of 500 FNAB nodules from one observer at a single institution. Kwak et al. [17], in a retrospective study that included 1658 nodules, developed a predictive model based on US characteristics which, according to their analysis, are more frequently associated with malignancy: solid structure, hypoechogenicity, marked hypoechogenicity, microcalcifications, microlobulated or irregular margin, and being taller than wide. In their TIRADS proposal, TIRADS 3 lesions do not have any of these characteristics, while TIRADS 4A lesions have one, 4B lesions have two, 4C lesions have three and TIRADS 5 lesions have all of the suggested elements to suspect malignancy. In a more recent publication, Russ et al. [18] prospectively evaluated the diagnostic accuracy of their TIRADS system [16] on 4550 nodules with and without elastography, and estimated a 33.8 % reduction in indications for FNAB. Other authors have adopted this concept and have developed their own classification systems [19, 20]. As a corollary of all these efforts, various different TIRADS systems coexist at present.

Our TIRADS classification has the advantage of presenting US criteria to characterise all types of nodules, covering non-follicular as well as follicular histology for malignant nodules. Furthermore, our classification is not limited to identifying nodules with high malignancy risk; it also aims to categorise as benign those most likely to be so. This latter concept is of great relevance considering the very high prevalence of benign lesions. As we have shown in this surgical series, the TIRADS classification has an NLR of less than 0.1. This can safely translate to changing the clinical management for TIRADS 2 and some TIRADS 3 nodules, recommending a follow-up or "do not touch" approach.

In comparison to the other studies previously mentioned, TIRADS, as with BI-RADS, goes beyond a single-method static classification of thyroid nodules. In this system, the thyroid nodules are contextualised, integrating other factors (e.g. clinical, imaging findings, a nodule’s changes over time, previous FNABs results, etc.), increasing the classification’s robustness, and straightforwardness of its translation into clinical management decisions. Additionally, the TIRADS classification is applicable to thyroid pathology in general since it also considers different diffuse pathologies (e.g. Hashimoto's thyroiditis, De Quervain thyroiditis, Graves' disease) and varying clinical situations, such as pre-surgical staging or follow-up of patients with previous cancer surgery, etc. (Appendix A3 – Supplementary material).

Over time, we have incorporated slight modifications to optimise our system to clinical management. The TIRADS 4 category (as BI-RADS 4) includes a broad range from 5–95 % of malignancy risk (Table 1), and we consequently stressed a need for sub-categories. In the original publication, we defined TIRADS 4A and 4B. Afterwards, in order to mirror the BI-RADS system, and to ease the comparison of results with other studies, we proposed the use of TIRADS 4A, 4B and 4C categories which, in the present study, translate to malignancy risks of 5–10 %, 11–65 % and 66–95 %, respectively.
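The sub-category bands quoted above can be written as a simple lookup. This is only an illustrative sketch: the function name `tirads4_subcategory` is hypothetical, and assigning non-integer risks that fall between two bands (e.g. 10.5 %) to the higher band is an assumption, since the source gives integer ranges.

```python
def tirads4_subcategory(risk_percent):
    """Map a malignancy risk (in %) to a TIRADS 4 sub-category band."""
    if not 5 <= risk_percent <= 95:
        raise ValueError("risk outside the TIRADS 4 range of 5-95 %")
    if risk_percent <= 10:
        return "4A"
    if risk_percent <= 65:
        return "4B"
    return "4C"


# Malignancy rates observed in this series for each sub-category.
observed = {"4A": 5.88, "4B": 62.82, "4C": 91.22}
for category, risk in observed.items():
    print(category, tirads4_subcategory(risk) == category)
```

Each observed rate falls inside the band of its own sub-category, so the loop prints `True` three times.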

Different guidelines recommend diagnostic puncture of thyroid nodules larger than 10–15 mm [21–23]. Remarkably, the TIRADS classification considers US appearance features and does not incorporate the size of the nodules as relevant for FNAB indication. In the case of suspicious nodules (TIRADS 4 and 5), the smallest size to be sampled is determined by the operator's skills (greater than 3–4 mm in our experience). This approach is similar to that proposed by the 2015 ATA guidelines [24]. However, it is worth noting that the diagnosis of microcarcinomas (i.e. nodules less than 10 mm) is a matter of current international debate. On the other hand, FNAB may be performed in benign or probably benign patterns (TIRADS 2 and 3), as in nodules larger than 2–3 cm (e.g. mixed nodules with colloid pattern type 3), in order to confirm their benign nature and thereby help clinical management.

Like BI-RADS classification, the TIRADS system is continuously improving and evolving, and is able to be modified according to new evidence as it becomes available. This might include in the future elastosonography findings [25, 26], contrast-enhanced ultrasound [26, 27], PET findings, or other imaging techniques.

To our knowledge, this is the first study correlating US findings with final histopathology in the surgical specimen. Therefore, the present results on the diagnostic capacity of the TIRADS classification are not biased by the inherent inaccuracy of FNAB-based cytohistology results. Furthermore, in the present series, we collected information on the other non-suspicious nodules present in the resected specimen. This allowed us, for the first time, to correlate pathology findings with nodules categorised as TIRADS 2 that otherwise would not have been assessed histologically, and thus to confirm their non-malignant aetiology.

There are some limitations to our study. First, it is a single-institutional trial in a tertiary referral hospital. Also, since this is a surgical series of thyroidectomized patients (e.g. high suspicion of malignancy or follicular lesion/neoplasia confirmed with FNAB) there is overrepresentation of cancers (54.18 %), compared to an FNAB-based series of the general population (4–5 %) [1], including our own previously published FNAB-based study (14 %) [3]. However, as previously mentioned, this allowed us to have information of nodules with low suspicion grading (TIRADS 2) included in the resected thyroid assessment.

It is important to point out that, in our view, the TIRADS classification is not designed to replace FNAB as a diagnostic technique, but rather to provide a more objective and reproducible method for selecting those nodules that do require this procedure. By establishing a standardised language and coding system for radiologists and clinicians, it facilitates the clinical management and follow-up of thyroid nodules. Prospective trials evaluating multi-institutional use of TIRADS are warranted to assess its cost-effectiveness, and to establish TIRADS as a reference for thyroid nodule management and, in a broader sense, for thyroid pathology in general.

Conclusions

This study confirms the good concordance between TIRADS categories based on the ultrasound aspects of thyroid nodules and the final histopathology. Therefore, we could determine the malignancy risk of each category, thus demonstrating its potential clinical usefulness when used to guide clinical decisions. Nodules classified as TIRADS 4 and 5 warrant diagnostic aspiration (FNAB). Conversely, those nodules classified as TIRADS 2 and 3 harbour a very low malignancy risk (0–2 %) and, therefore, can be safely monitored, reducing the number of unnecessary procedures.