Introduction

Ultrasound (US) is recognized as the most important modality for assessing the malignancy risk of thyroid nodules. From the extensive experience of the last two decades, we know that thyroid nodules with specific US features, such as marked hypoechogenicity, taller-than-wide shape, irregular margins, microcalcifications, and extrathyroidal extension, are at high risk of being malignant [1, 2]. Conversely, nodules without high-risk US features can be considered at low to intermediate risk, and their likelihood of being cancer drops significantly. Since 2009, a number of US risk stratification systems (RSSs) have been developed to classify thyroid nodules based on their sonographic patterns. Most of these RSSs are labeled TIRADS, the acronym of Thyroid Imaging Reporting and Data System, by analogy with the system used for breast imaging (i.e., BIRADS). The first proposal came from a Chilean group [3]; later systems appeared in clinical practice guidelines for the management of thyroid nodules [4, 5] and in radiological position papers [6,7,8,9]. Table 1 summarizes the main differences among the RSSs. They include up to six categories, with the estimated risk of malignancy rising across the classes. Four of these [the American, European and Korean Thyroid Association (ATA, ETA, KTA, respectively) and the Chilean RSSs] [3,4,5,6,7,8] were conceived based on the nodule’s sonographic pattern, in that US features are combined into specific risk categories. The remaining one [American College of Radiology (ACR) RSS] [9] sums points assigned to each US feature (a point-based scoring system). As for the high-risk features, only hypoechogenicity, irregular margins, and microcalcifications are included in all of them.
With the advent of these systems, an increasing number of studies have been published on this topic, most of which attempted to compare the performance of these TIRADSs/RSSs. As the major finding, a meta-analysis reported that their reliability in selecting nodules for biopsy is quite similar [10]. In addition, the initial risk assessed by TIRADSs/RSSs remains stable over time [11].

Table 1 Major differences in terms of structure and high-risk characteristics among the RSSs considered in the present study

The present study aimed to systematically review the literature on this topic to account for the scientific production on TIRADSs/RSSs and ultimately address the following questions: how many articles and what types of studies have been published on the most widely used TIRADSs/RSSs [3,4,5,6,7,8,9]; which TIRADSs/RSSs have most often been compared and discussed in the published meta-analyses, original articles and other kinds of studies; and what is the geographic distribution of the publications in this field.

Materials and methods

Conduct of the review

The systematic review was conducted according to the PRISMA statement [12]. The checklist is reported as a supplemental file.

Search strategy

A three-step search strategy was planned. First, keywords were identified in PubMed. Second, PubMed and Scopus were searched using the following algorithm: “ATA ultrasound system OR (“TIRADS” OR “TI-RADS”) OR (“EU-TIRADS” OR “EU-TI-RADS”) OR (“K-TIRADS” OR “K-TI-RADS”) OR (“ACR-TIRADS” OR “ACR-TI-RADS”)”. Third, any citation of the above RSSs was recorded. The last search was performed on July 10th, 2021. No language or time restrictions were applied. Two investigators (GF, PT) independently searched for papers, screened titles and abstracts of the retrieved records, and finally selected the included references.
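The second step can be sketched in code. The following is only a minimal illustration, assuming that the query is assembled from the term pairs quoted above (with straight quotes) and that records exported from the two databases are merged by a normalized title. The record structure, the helper names (build_query, normalize_title, merge_records) and the title-based matching rule are hypothetical and are not part of the original method.

```python
# Hypothetical sketch: rebuild the boolean query quoted in the text and
# merge records from two databases, dropping duplicates by normalized title.

def build_query() -> str:
    """Return the search string from the text (straight quotes assumed)."""
    tirads_variants = [
        ("TIRADS", "TI-RADS"),
        ("EU-TIRADS", "EU-TI-RADS"),
        ("K-TIRADS", "K-TI-RADS"),
        ("ACR-TIRADS", "ACR-TI-RADS"),
    ]
    groups = [f'("{a}" OR "{b}")' for a, b in tirads_variants]
    return " OR ".join(["ATA ultrasound system", *groups])


def normalize_title(title: str) -> str:
    """Lowercase a title and keep only letters and digits."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def merge_records(*sources):
    """Concatenate record lists, keeping the first record for each title.

    Assumption: each record is a dict with at least a "title" key.
    """
    seen = set()
    merged = []
    for records in sources:
        for record in records:
            key = normalize_title(record["title"])
            if key not in seen:
                seen.add(key)
                merged.append(record)
    return merged


if __name__ == "__main__":
    print(build_query())
    # Placeholder records, for illustration only.
    pubmed = [{"title": "Example study A"}, {"title": "Example study B"}]
    scopus = [{"title": "Example Study A."}, {"title": "Example study C"}]
    print(len(merge_records(pubmed, scopus)))  # 3 unique records
```

Matching on titles alone is a simplification; in practice duplicates are usually also checked by identifiers and confirmed manually by the investigators.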

Data extraction

The following information was extracted independently and in duplicate by two investigators (GF, PT) using a piloted form: first author, year of publication, country of the first author, study type, and RSS(s) under investigation among those considered [3,4,5,6,7,8,9]. Data were cross-checked and discrepancies were resolved by discussion.

Data analysis

The number of articles evaluating each TIRADS/RSS was calculated. Then, the overall number of articles evaluating all systems was obtained. The type of article (i.e., original paper, meta-analysis, etc.) was recorded.

Results

Records found

Using the above search algorithm, 538 records were found; after exclusion of duplicates and of articles not meeting the inclusion criteria or not in the area of interest, 502 studies were ultimately included in the systematic review.

Article type

Among the articles included in the present systematic review, 13 were systematic reviews with meta-analysis, 424 were original papers, 34 were narrative reviews, 4 were guidelines or practical recommendations, and 27 were other kinds of articles (i.e., 12 comments, 2 editorials, and 13 case reports).

Analysis of inclusion of the different TIRADSs/RSSs in the published studies

The number of articles evaluating at least one RSS increased over time, with the curve steepening sharply from 2017. This trend applies to all RSSs. Overall, the TIRADSs/RSSs [3,4,5,6,7,8,9] were evaluated as follows: Horvath TIRADS 213 times, ACR-TIRADS 190, ATA 144, K-TIRADS 99, and EU-TIRADS 71. Two articles included all five systems. In addition, 21 articles evaluated four systems (i.e., ATA, ACR-, K-, and EU-TIRADS), 26 articles three systems, 92 articles two systems, and 361 only one of them. Figure 1 illustrates the number of articles published up to 2020; data for 2021 are not displayed because they refer to the first half of the year only.
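The tallies reported so far can be cross-checked against each other. The short sketch below is an illustrative check only, not part of the original analysis: it copies the counts from the text and verifies that the number of evaluations per system matches the number implied by how many systems each article evaluated, and that the article types add up to the 502 included studies. The variable names are our own.

```python
# Counts copied from the text; variable names are illustrative.
evaluations_per_system = {
    "Horvath TIRADS": 213,
    "ACR-TIRADS": 190,
    "ATA": 144,
    "K-TIRADS": 99,
    "EU-TIRADS": 71,
}

# Key: number of systems evaluated in an article; value: number of articles.
articles_by_systems_evaluated = {5: 2, 4: 21, 3: 26, 2: 92, 1: 361}

articles_by_type = {
    "systematic review with meta-analysis": 13,
    "original paper": 424,
    "narrative review": 34,
    "guideline or practical recommendation": 4,
    "other": 27,
}

total_articles = sum(articles_by_systems_evaluated.values())
evaluations_from_articles = sum(
    n_systems * n_articles
    for n_systems, n_articles in articles_by_systems_evaluated.items()
)
evaluations_from_systems = sum(evaluations_per_system.values())

print(total_articles)                  # 502
print(sum(articles_by_type.values()))  # 502
print(evaluations_from_articles)       # 717
print(evaluations_from_systems)        # 717
```

Both routes give 717 evaluations across 502 articles, so the per-system and per-article counts are mutually consistent.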

Fig. 1

Number of published articles evaluating the five TIRADSs/RSSs considered here [3,4,5,6,7,8,9]. Each point of each curve represents the occurrences of a given TIRADS/RSS in the articles published in that year

As shown in Fig. 2, the cumulative number of articles evaluating any system increased progressively over time for all TIRADSs/RSSs; as mentioned above, data for 2021 were not included because they refer to the first half of the year only.

Fig. 2

Cumulative number of articles evaluating the TIRADSs/RSSs over time. Each point of each curve represents the total number of articles evaluating a given TIRADS/RSS up to that time

Geographic distribution

The first authors of the 502 articles were from 45 different countries. The first author was from China in 25.7% of cases, the USA in 15.9%, the Republic of Korea in 14.3%, Italy in 6.9%, and each of the other 41 countries in less than 5%. By geographic area, the first author was Asian in 210 cases, European in 138, American in 113, Middle Eastern in 24, African in 11, and Oceanian in 6. Figure 3 shows these findings as a filled map.
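The shares above follow directly from the article counts. As a minimal sketch, assuming the regional counts given in this paragraph and the per-country counts listed in the caption of Fig. 3, the percentages can be recomputed as follows. The helper name share is our own, and one-decimal rounding may differ from the reported figures in the last digit.

```python
TOTAL_ARTICLES = 502  # studies included in the review

# First-author counts by geographic area, as reported in the text.
first_author_area = {
    "Asia": 210,
    "Europe": 138,
    "America": 113,
    "Middle East": 24,
    "Africa": 11,
    "Oceania": 6,
}

# Counts for the four most represented countries (Fig. 3 caption).
top_countries = {
    "China": 129,
    "USA": 80,
    "Republic of Korea": 72,
    "Italy": 35,
}


def share(count: int, total: int = TOTAL_ARTICLES) -> float:
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# The regional counts should cover every included article.
assert sum(first_author_area.values()) == TOTAL_ARTICLES

for name, count in {**first_author_area, **top_countries}.items():
    print(f"{name}: {count} articles ({share(count)}%)")
```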

Fig. 3

Filled map of the worldwide distribution of studies focused on TIRADSs/RSSs found by the present systematic review. The number of published articles per country was the following: China 129, USA 80, Republic of Korea 72, Italy 35, Germany 18, France 16, India 15, Turkey 15, Brazil 15, Switzerland 13, Poland 12, Canada 10, Australia 4, Egypt 4, UK 4, Israel 4, Spain 4, Philippines 3, Singapore 3, Taiwan 3, Norway 3, Romania 3, Russia 3, South Africa 3, Iran 3, Mexico 3, Argentina 2, Colombia 2, New Zealand 2, Saudi Arabia 2, Thailand 2, Malaysia 1, Pakistan 1, Austria 1, Belgium 1, Bulgaria 1, Czech Republic 1, Hungary 1, Ireland 1, Portugal 1, Cameroon 1, Chile 1, Nigeria 1, Tunisia 1, Uganda 1, Denmark 1

Discussion

The present study was conceived to analyze the interest of researchers in the US systems for the risk stratification of thyroid nodules. With this objective, we systematically searched for published articles evaluating the first system proposed by Horvath et al. [3], the system included in the ATA guidelines for the management of thyroid nodules and cancer [4, 5], and the most recognized TIRADSs, such as K- [6, 7], EU- [8], and ACR-TIRADS [9]. The results of the present systematic review may be of interest to researchers and readers focused on this field, and to the panelists of the systems themselves.

Three main findings were observed and are worthy of attention. First, a large number of articles have been published since the first RSS was released in 2009 [3]. They account for over 500 studies, the majority being original papers, but with a non-negligible number (n = 13) of meta-analyses. Second, a fast-growing number of studies evaluating or discussing the TIRADSs/RSSs was observed over time. This trend is particularly relevant for ACR-TIRADS [9], despite its more recent appearance in publication databases. The continued interest in the Horvath TIRADS is noteworthy: it was proposed over 10 years ago and, unlike the other systems, has not been proposed or endorsed by a scientific society. Third, the body of publications is distributed across Western and Eastern countries. Notably, investigators are Asian in 41% of cases and, in particular, from China in about one fourth. As for the latter, a Chinese RSS (i.e., C-TIRADS) was released in 2020 [13]: the interest of researchers in this system will become clearer in the near future. Overall, this remarkable scientific activity revolving around RSSs suggests the interest and the need of the scientific community in testing and/or validating tools that standardize and, hopefully, facilitate the management of thyroid nodules. This activity is undoubtedly driven by the epidemic proportions that thyroid nodular disease is reaching worldwide [14] and by the move towards an individualized approach tailored to the patient's risk [15].

Unfortunately, the present study cannot answer some other questions arising from clinical practice. In fact, whether these systems are used in routine practice is not known. Furthermore, whether these systems are easy to use during the clinical assessment of patients, and whether one system is easier and handier than the others, remains unexplored. In this context, we should consider the heterogeneous clinical scenarios across different continents and countries. For instance, in the United States and other countries neck US is generally performed by technicians or nurses and, in this setting, using a schematic point-based RSS, such as ACR-TIRADS, can help the correct acquisition of images and their subsequent interpretation.

Some potential limitations of the present data should be addressed. First, we created a specific algorithm (see above) and found a large number of articles. The fact that only 7% (36/538) of the records initially found were excluded because they did not meet the inclusion criteria provides indirect evidence that our data can be considered reliable. However, the present study was not a pure librarian search, since it was undertaken with the aim of giving the users of TIRADSs/RSSs a nearly complete picture of their diffusion in scientific research databases. Second, the count of published articles per country was based on the affiliation of the first author. In addition, potential overlap of series between two or more articles was not taken into account. This means that some particular cases (i.e., multicenter series) might have been neglected. Third, we considered only the most widely used TIRADSs/RSSs [3,4,5,6,7,8,9]. Even if this choice was arbitrary, we feel that it was in agreement with the worldwide diffusion of the TIRADSs/RSSs in clinical practice. Fourth, two of the systems included in our review were revised over time, and we considered both versions of them [4,5,6,7]. This might slightly affect the data for these RSSs.

Conclusions

The present systematic review showed that the number of scientific articles focused on US systems for risk stratification of thyroid nodules is high and has increased substantially over time. In addition, the interest of researchers in this field is well balanced between Western and Eastern countries. These data reveal the worldwide interest of researchers in testing and improving the accuracy of RSSs, and support the need for a collective and systematic effort to work out a universal classification system. A task force of experts representing all scientific societies working in the field is currently engaged in efforts to develop a unified, evidence-based, international system (i.e., I-TIRADS) [16].