Introduction

Nodular goiter is the most frequently detected disease of the thyroid gland, especially in iodine-deficient areas. A detailed history, physical examination, and serum thyroid stimulating hormone (TSH) measurement are the first steps in evaluating a patient with a thyroid nodule [1]. Ultrasonography (US) is very useful for detecting thyroid nodules but is not very accurate in differentiating malignant from benign lesions [2]. Several studies have used high-resolution US to determine features of a malignant thyroid nodule, such as hypoechogenicity, microcalcification, margin irregularity, increased blood flow in the nodule, local invasion, or regional lymphadenopathy [2–8]. Recently, elastosonography (ES), which is based on the fact that a hard and firm structure is associated with an increased risk of malignancy, has been introduced as a noninvasive method for the differential diagnosis of benign and malignant thyroid nodules. It depends on the principle that, under compression by an external force, softer tissues deform more readily than harder tissues [9–11].

The literature includes various studies that investigate the ES procedure in thyroid nodules [12–19]. In all of these studies, scoring was based on a color scale. The scores were then compared with cytological and/or histopathological data, and the sensitivity and specificity of the technique to differentiate benign and malignant nodules were calculated using these scores. ES scoring is a qualitative method. Recently developed second-generation ES devices not only score on a color scale but also measure the hardness of the tissue quantitatively. Based on the hypothesis that a quantitative value showing tissue stiffness may be more valuable for the differential diagnosis, Vorlander et al. [19] measured strain values in thyroid nodules with a second-generation ES device. The authors calculated strain values in 309 thyroid nodules and compared the results with the histopathological diagnosis. The nodules were divided into three groups according to strain values; negative predictive value (NPV) was 100% and positive predictive value (PPV) was 42%.

In this study, we aimed to determine strain indexes (SI) of thyroid nodules with a second-generation ES device, calculate the SI cut-off values for histopathologically determined benign and malignant nodules, and investigate the role for ES in the differential diagnosis of thyroid nodules.

Materials and methods

Study protocol

This prospective study was carried out between June 2009 and 2010. Patients admitted to our out-patient thyroid clinic were recruited for the study. Past medical history was obtained and thyroid examination was done in all patients. Sensitive TSH and thyroid autoantibody [thyroid peroxidase antibody (Anti-TPOAb) and thyroglobulin antibody (Anti-TgAb)] levels were measured. Thyroid US was performed by an experienced specialist (D.T.) using the same US machine. Fine needle aspiration biopsy (FNAB) was carried out by an experienced specialist (A.O.) in indicated nodules. Cytological evaluation was made by the same cytologist, who was experienced in thyroid nodule cytology (G.G., Associate Professor). FNAB was repeated in nodules with nondiagnostic cytology results. Patients were re-evaluated using US and Doppler US and discussed by a council of endocrinologists, general surgeons, and pathologists. This evaluation guided the surgeons in deciding the surgical approach in patients who needed an operation. Among the patients discussed in the council, operation was indicated in 550 patients with solitary or multiple thyroid nodules who had compression symptoms or malignant or suspicious cytology results. In the patients who did not meet the exclusion criteria, ES was performed by the same experienced specialist (B.C.), who was blinded to the cytology results. Two hundred ninety-two of the 550 patients underwent total thyroidectomy in our hospital, with the surgeries performed by the same surgical team (B.K., C.S., and A.K.). Histopathological evaluation was also made by the same cytopathologist who evaluated the cytology results.

The study protocol was approved by the local ethical committee (Ankara Ataturk Research and Education Hospital Local Ethical Committee Decision: Date: 01.06.2009; Decision Number: 10.06.2009). All patients were informed about the procedure by the same clinician (C.A.) before ES and gave written informed consent.

Exclusion criteria

Among the patients that were evaluated in the thyroid disease council and recommended to undergo surgery, patients younger than 15 years old, patients who had previous history of thyroid surgery or percutaneous invasive procedure for thyroid nodule or radiotherapy to head and neck region were excluded from the study. Other exclusion criteria were as follows: cardiac or pulmonary disease that may complicate surgery, pure cystic nodules, hemorrhagic nodules, multinodular coalescent nodules with undetermined sizes, anaplastic carcinoma, Riedel thyroiditis, and extensive cervical metastasis that may need radical neck dissection. Patients who refused surgery were also excluded.

Laboratory examination

Serum TSH, Anti-TPOAb, and Anti-TgAb levels were measured by chemiluminescence methods using commercial kits (Chemiluminescent Microparticle Immunoassay, CMIA®) with an Abbott Architect machine. Normal ranges for TSH, Anti-TPOAb, and Anti-TgAb were 0.35–4.94, <5.61, and <4.11 IU/ml, respectively.

Ultrasonography

Esaote color Doppler US (MAG Technology Co, Ltd. Model: 796FDII Yung-ho City, Taipei, Taiwan) and a superficial probe (Model No: LA523 13–4, 5.5–12.5 MHz) were used for standard US. Nodule localization, diameters (mm), volume, halo, echogenicity, marginal regularity, type of calcification, and vascularization pattern (stage 1: absence of blood flow; stage 2: peripheral vascularization; stage 3: intranodular vascularization; stage 4: marked intranodular vascularization) were evaluated using standard US. Stage 1 and 2 vascularization was classified as type 1, and stage 3 and 4 vascularization as type 2 [20].

US guided FNAB

FNAB was performed under US guidance using a General Electric Logiq pro 200 (Model number: 2270968, GE Healthcare Korea, Seongnam-SI, Gyeon GGI-DO, Korea) and a 5.5–7.5-MHz probe.

Elastosonography

US elastographic examination was performed using Hitachi EUB 7500 elastosonography (Hitachi Medical Corporation 4-14-1, Soto-Kanda, Chiyoda-ku, Tokyo, Japan) and three different superficial probes that were compatible with this machine (1st probe: 13-6: EUP-L54MA 6–13 MHz, Hitachi Medical Corporation 4-14-1, Soto-Kanda, Chiyoda-ku, Tokyo, Japan; second probe: 10-5:EUP-L53 5–10 MHz, Hitachi Medical Corporation 4-14-1, Soto-Kanda, Chiyoda-ku, Tokyo, Japan; third probe: 10-5:EUP-L53L 5–10 MHz, Hitachi Medical Corporation 4-14-1, Soto-Kanda, Chiyoda-ku, Tokyo, Japan). For each nodule, the appropriate sized probe was chosen.

None of the researchers of this study was a member or consultant of Hitachi Medical, General Electric or MAG Technology. Also, they were not members or consultants of any other company that manufactures an US machine with ES feature.

Elastosonography technique

The ES probe was placed on the neck first in transverse axis (TA) and then in longitudinal axis (LA). Measurements were done separately in both axes. The free-hand compression applied on the neck was standardized by real-time measurements displayed on a numerical scale (grade 1–5) by Hitachi EUB 7500 software. Accordingly, the optimal level of compression during ES evaluation was graded from 3 to 4. It is important to keep the level of compression constant throughout the examination. The technique is improper if score changes occur as a consequence of over-compression. The examiner monitored the optimal level of compression and score by real-time image analysis during the examination. All the images were stored and reviewed subsequently. The examiner highlighted a box which included the nodule and a sufficient amount of surrounding thyroid tissue. The US elastogram was displayed over the B-mode image in a color scale ranging from red to blue. Red indicated the highest elastic strain (softest tissue) and blue indicated no strain (hardest tissue). The elastosonographic image was matched with a color scale and classified using the elasticity score that was developed by Itoh et al. [21] (Fig. 1).

Fig. 1 Sample images illustrating strain index measurements in transverse and longitudinal axes

Scoring and measurement of strain and strain index

The probe was compressed with light pressure and relaxed approximately two times per second on the investigation area. During the compression and relaxation cycles, displacement of probe was 2–4 mm. ES score was accepted to be accurate when the image was maintained constant on the monitor for at least 15–20 s and determined in both axes.

Score 1 was entirely green, score 4 was entirely blue, and score 5 was entirely blue with a light blue halo. In nodules with these scores, which show a single color, we repeated the measurement four times and used the mean of the four measurements. To measure SI, the whole nodule was outlined with the marker of the system without crossing its margins, and then the measurements were made (Fig. 1a, d, e).

Score 2 was defined when more than 50% of the nodule was green with the remaining blue, and score 3 was defined when more than 50% of the nodule was blue with the remaining green. Thus, in nodules with strain scores 2 and 3, to determine the mean nodule strain we made two measurements from the green part and two measurements from the blue part of the nodule. The final score was the mathematical mean of these four measurements (Fig. 1b, c).

For measurement of parenchymal strain (A) and nodule strain (B), marker of the system was used. The thyroid parenchyma strain measurement was made from the similar parts (particularly soft green–red parts) in all patients. The ratio of nodule strain to parenchymal strain (B/A) was defined as SI (Figs. 1, 2).
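As an illustration only, the SI definition above can be written out as a short Python sketch. The function names and the numeric readings are hypothetical; they are not taken from the Hitachi software or from the study data. The sketch assumes that the strain of each region is the mean of its repeated readings.

```python
from statistics import mean


def mean_strain(readings):
    """Average the repeated strain readings taken from one region."""
    if not readings:
        raise ValueError("at least one strain reading is required")
    return mean(readings)


def strain_index(nodule_strain_b, parenchyma_strain_a):
    """Return SI = B / A (nodule strain over parenchymal strain)."""
    if parenchyma_strain_a <= 0:
        raise ValueError("parenchymal strain must be positive")
    return nodule_strain_b / parenchyma_strain_a


# Hypothetical device readings (arbitrary units), four per region.
# For a score 2 or 3 nodule these would be two green-part and two blue-part readings.
nodule_readings = [8.0, 10.0, 12.0, 14.0]
parenchyma_readings = [0.5, 0.5, 0.5, 0.5]

si = strain_index(mean_strain(nodule_readings), mean_strain(parenchyma_readings))
print(f"SI = {si:.3f}")  # SI = 22.000
```

Keeping the averaging step separate from the ratio makes it explicit that SI is computed from mean strains rather than from single readings.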

Fig. 2 Schematic representation showing application of the marker of the software (area surrounded by yellow circle) to measure the strain indices of the nodule and parenchyma; A Normal thyroid parenchyma (this marked area was determined independently from the location of the nodule and was used for repeated measurements), B Area in the nodule marked for the elastosonographic measurement of strain (B/A = SI)

Cytological and histopathological examination

Materials obtained by US guided FNAB were air-dried, stained with the May-Grünwald–Giemsa stain, and evaluated according to Bethesda system classification [22]. Accordingly, cytology results were classified as follows: (1) benign (nodular goiter, chronic lymphocytic thyroiditis, and hyperplastic/adenomatoid nodule); (2) follicular lesion of undetermined significance/atypia of undetermined significance; (3) follicular neoplasm/suspicious for follicular neoplasm, Hurthle cell neoplasm/suspicious for Hurthle cell neoplasm; (4) suspicious for malignancy; (5) malignant; and (6) nondiagnostic.

Localization of the nodule or nodules in the thyroid tissue that were evaluated by ES was shown to the surgeon by a simple drawing. The surgical team marked the nodule or nodules with a surgical suture before sending the specimen for histopathological examination, so that the suspicious area could be examined with particular interest.

Thyroidectomy specimens were examined macroscopically by the cytopathologist; photographs were taken and all the specimens were marked on a standard schedule scoring. In this way, it was possible to compare the previous US and cytology data with the histopathological findings. Histopathological evaluation was made using the 2004 WHO classification [23].

Statistical analysis

All statistical analyses were performed with SPSS 18.0 (SPSS Inc., IL, USA) statistical software. Before the statistical analysis, the data were checked for errors. Descriptive statistics were shown as frequency tables for qualitative data and as mean ± SD for quantitative data. Sensitivity (Sn), specificity (Sp), PPV, and NPV were calculated from the cross-tables. Relations between qualitative data were examined using the Chi-square test. The unpaired Student’s t test or the Mann–Whitney U test was used to compare continuous variables between two groups. A logistic regression analysis with the forward LR method was used, with the histopathological result as the dependent variable and the other relevant parameters, such as vascularization pattern, calcification, presence of halo, and echogenicity, as covariates. Receiver operating characteristic (ROC) curve analysis was applied to calculate the cut-off values of the LA strain values and the corresponding sensitivity and specificity. A P value less than or equal to 0.05 was considered statistically significant.
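The text does not state which criterion was used to select the optimal cut-off on the ROC curve. The sketch below assumes the Youden index (Sn + Sp − 1), which is one common choice and may differ from what was applied in SPSS. The SI values and labels are hypothetical, and all function names are illustrative.

```python
def roc_points(values, labels):
    """Return (cutoff, sensitivity, specificity) for every observed value.

    A nodule is called test-positive when its value is >= cutoff.
    labels: 1 = malignant, 0 = benign.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        points.append((cutoff, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(values, labels):
    """Pick the cutoff maximizing Sn + Sp - 1 (assumed criterion)."""
    return max(roc_points(values, labels), key=lambda p: p[1] + p[2] - 1)


def auc(values, labels):
    """AUC as the fraction of malignant/benign pairs ranked correctly (ties count 0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical SI values with their histopathological labels.
si_values = [2.0, 5.0, 9.0, 12.0, 20.0, 30.0, 45.0, 60.0]
malignant = [0, 0, 0, 1, 1, 0, 1, 1]

cutoff, sn, sp = youden_cutoff(si_values, malignant)
print(cutoff, sn, sp)              # 12.0 1.0 0.75
print(auc(si_values, malignant))   # 0.875
```

The pairwise AUC used here is equivalent to the Mann–Whitney U statistic divided by the number of malignant/benign pairs.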

Results

The study included 391 nodules in 292 patients with a mean age of 46.08 ± 11.89 years (19–77). There were 242 (82.88%) women and 50 (17.12%) men, and the difference in age between women (45.86 ± 11.85 years) and men (47.16 ± 12.11 years) was not statistically significant (t = 0.706; P = 0.481).

All patients were euthyroid at the time of investigation and mean serum level of TSH was 1.32 ± 0.98 IU/ml. Anti-TPOAb and/or Anti-TgAb were positive in 139 (47.60%) patients.

According to Bethesda classification of cytology, among 391 nodules in 292 patients, 146 (37.34%) were in group 1, 93 (23.79%) were in group 2, 55 (14.07%) were in group 3, 38 (9.72%) were in group 4, 31 (7.92%) were in group 5, and 28 (7.16%) were in group 6.

Final histopathological diagnosis was malignant in 125 (31.97%) nodules of 106 patients and benign in 266 (68.03%) nodules of 186 patients. Of 125 malignant nodules, 100 were papillary carcinoma (70 classical papillary carcinoma, 24 follicular variant, 3 solid trabecular variant, 2 columnar variant, and 1 oncocytic variant), 15 follicular carcinoma, 9 Hurthle cell carcinoma, and 1 medullary carcinoma. Of 266 benign nodules, 219 had a final diagnosis of nodular goiter, 13 follicular adenoma, 9 hyperplastic nodule, 16 cellular adenomatous nodule, 8 hyperplastic cellular adenomatous nodule, and 1 adenomatoid oncocytic nodule.

According to elastosonographic evaluation, nodules with scores of 1, 2, or 3 were classified as probably benign and those with scores of 4 or 5 as probably malignant. These classifications were then compared in all cases with the histopathological diagnosis, and the sensitivity and specificity of ES scoring were calculated from the results.

Distribution of ES scores in all the nodules was as follows: score 1 in 30 (7.67%), score 2 in 109 (27.88%), score 3 in 102 (26.09%), score 4 in 111 (28.39%), and score 5 in 39 (9.97%) nodules (Table 1).

Table 1 Distribution of types of nodules according to elastosonography scores and final histopathological diagnosis

Of 266 nodules that were confirmed to be benign on histopathological examination, 189 (71.05%) had an ES score of probably benign, while 77 (28.95%) had an ES score of probably malignant. Among 125 malignant nodules, 52 (41.60%) were in the group of probably benign and 73 (58.40%) were in the group of probably malignant according to ES scoring classification. Scoring was significantly related with histopathological diagnosis (χ2 = 36.513; P < 0.001) (Table 2). However, the discrimination for score 1 was 90.00%, while it was 58.97% for score 5. In addition, discrimination for benign nodules was lower for score 2 (74.31%) compared to score 3 (79.41%).
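The sensitivity and specificity of ES scoring follow directly from the four counts given above. The minimal sketch below uses only those counts; the function name is illustrative, and "positive" is taken to mean an ES score of 4 or 5.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute Sn, Sp, PPV, and NPV from a 2x2 cross-table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts from the text: malignant nodules scored probably malignant (73) or
# probably benign (52); benign nodules scored probably malignant (77) or
# probably benign (189).
metrics = diagnostic_metrics(tp=73, fp=77, fn=52, tn=189)

print(f"Sn = {metrics['sensitivity']:.2%}")  # Sn = 58.40%
print(f"Sp = {metrics['specificity']:.2%}")  # Sp = 71.05%
```

The same dictionary also holds the PPV and NPV implied by these counts.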

Table 2 Sensitivity and specificity of elastosonography scoring

Mean SI of malignant and benign nodules were compared. In TA, mean SI was 24.47 ± 25.36 in malignant and 14.99 ± 23.25 in benign nodules and the difference was statistically significant (P < 0.001). Since the data were not normally distributed, we compared them with Mann–Whitney U test. In LA, mean SI of malignant nodules (43.50 ± 32.80) was significantly higher compared to benign nodules (18.83 ± 26.24) (P < 0.001).

Including all nodules, the optimal SI cut-off value using TA was 8.817 (AUC = 66.0 ± 0.03%; 95%CI: 60.4–71.6%) (Sn: 61.3%, Sp: 63.5%). In this axis, we calculated SI cut-off value corresponding to 90% sensitivity as 1.045 (Sp: 10.2%), and to 90% specificity as 41.878 (Sn: 22.58%). Since discrimination values of ROC analysis for TA measurements were low (≤66.0%), LA measurements were used for further analysis.

In LA, mean SI of all nodules was 26.72 ± 30.70 (0.39–151.00). The optimal SI cut-off value in this axis for all the nodules was calculated as 16.709. Sensitivity and specificity corresponding to this value were 73.4 and 70.0%, respectively (PPV: 53.5%, NPV: 84.9%, AUC = 75.4 ± 0.03%; 95%CI: 70.2–80.5%). In LA, the SI cut-off value for 90% sensitivity was 4.516 (Sp: 35.7%) and the SI cut-off value for 90% specificity was 55.250 (Sn: 34.7%) (PPV: 39.8%, NPV: 88.8%) (Fig. 3).
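PPV and NPV depend on the proportion of malignant nodules in the sample as well as on Sn and Sp. As a consistency check, the sketch below recomputes them for the optimal LA cut-off from the reported Sn (73.4%), Sp (70.0%), and the prevalence implied by 125 malignant nodules out of 391. The function name is illustrative; because the inputs are rounded, the results should agree with the reported 53.5% and 84.9% only to within rounding.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from Sn, Sp, and disease prevalence via Bayes' rule."""
    tp = sensitivity * prevalence               # true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)   # false-positive fraction
    tn = specificity * (1 - prevalence)         # true-negative fraction
    fn = (1 - sensitivity) * prevalence         # false-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


prevalence = 125 / 391  # malignant nodules / all nodules
ppv, npv = predictive_values(0.734, 0.700, prevalence)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Because of this dependence on prevalence, the predictive values reported here would not transfer unchanged to a population with a different malignancy rate.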

Fig. 3 Rectangle: the optimal cut-off value for strain index in longitudinal axis in all nodules (16.709). Circle (Ο): cut-off value for strain index in longitudinal axis in all nodules corresponding to 90% sensitivity (4.516). Sn sensitivity, Sp specificity, Ac accuracy

When we compared the histopathological findings of the nodules with the features in B-mode gray-scale US, vascularization patterns in Doppler US, scores of elastosonographic evaluation, and SI, we found that calcification, halo, and echogenicity were significantly related with histopathological diagnosis. In contrast, presence of autoantibodies, vascularization pattern, marginal irregularity, nodule volume, and the ratio of anteroposterior to transverse diameter (AP/T) were not related with histopathological diagnosis. Sensitivity, specificity, PPV, and NPV for the parameters that were significantly related to histopathological diagnosis are given in Table 3.

Table 3 Sensitivity and specificity of B-mode ultrasonography features to predict malignancy

B-mode US features that were significantly related with histopathological diagnosis were combined, and cut-off values for SI of nodules with these features were calculated. Accordingly, the optimal SI cut-off value for hypoechoic nodules with microcalcifications and without a halo (n = 104) was 17.020 (AUC = 86.5 ± 0.04%; P < 0.001; 95%CI: 79.4–93.6%). For this value, sensitivity was 84.3% and specificity was 81.1% (Fig. 4). In these nodules, when sensitivity was increased to 90%, the SI cut-off value in LA decreased to 5.537 (Sp: 61.3%).

Fig. 4 ROC curve for hypoechogenic nodules with microcalcification and without a halo

In hyperechoic nodules with microcalcification and with a halo (n = 41), optimal SI cut-off value in LA was calculated as 14.809 (AUC = 78.9 ± 0.10%; P = 0.038; 95%CI: 58.7–99.1%) (Sn and Sp were 80.0 and 72.2%, respectively). In these nodules, when sensitivity was increased to 90%, SI cut-off value in LA was found to be 5.488 (Sp: 41.7%) (Fig. 5).

Fig. 5 ROC curve for hyperechogenic nodules with microcalcification and without a halo

In hypoechoic nodules with microcalcification and with a halo (n = 193), optimal SI cut-off value in LA was determined as 14.400 (AUC = 72.8 ± 0.04%; P < 0.001; 95% CI: 64.9–80.8%). Sensitivity and specificity of this value were 67.2 and 71.1%, respectively. In these nodules, when 90% sensitivity was desired, SI cut-off value in LA would be 4.265 (specificity 33.3%) (Fig. 6).

Fig. 6 ROC curve for hypoechogenic nodules with microcalcification and a halo

Discussion

The American Society of Radiologists in US has recently defined malignancy-related US features of thyroid nodules as presence of microcalcifications, hypoechogenicity, marginal irregularity, absent halo sign, solid pattern, intranodular vascularization, and shape (taller than wide) [2]. Malignant nodules tend to be much harder than benign nodules. As stated in recent reports, firm or hard consistency is associated with an increased risk of malignancy [24–26]. However, conventional US does not give information about the hardness of the nodule.

ES is a new diagnostic technique that evaluates tissue stiffness, and it was recently applied to the diagnosis of thyroid nodules [12–19] (Table 4). Previous studies used different techniques, devices, or methods, but a common feature of these studies was that ES scoring was made according to a specific color scale including 4–6 groups (yellow–red, green-green/yellow, green/blue, and blue). While some of the studies compared ES scoring with FNAB results [14, 27], the studies by Lyshchik, Rago, Rubaltelli, Hong, and Vorlander compared ES scoring with the final histopathological diagnosis [12, 13, 16, 17, 19]. Sensitivity and specificity of this technique to differentiate benign and malignant thyroid nodules were reported to range between 36 and 100% in different studies.

Table 4 Methods used in and results of studies on elastosonography in the thyroid nodules

Since both classical palpation and ES scoring, which may be viewed as “ultrasonographic palpation”, depend mostly on the examiner's experience, they have low reliability compared with objective assessments. Based on the hypothesis that a quantitative method of evaluating tissue stiffness may be more valuable for the differential diagnosis, “strain index” measurement has been proposed. In the literature, there are three studies investigating the SI of thyroid nodules. In the first study, Lyshchik et al. [12] evaluated 31 nodular goiter patients with real-time elastography implemented on a clinical US scanner and off-line processing of strain images that were reconstructed from radiofrequency echo data stored during US examination. The authors concluded that the SI value obtained from off-line processed elastograms was the strongest independent predictor of thyroid malignancy. This criterion had a specificity of 96% and a sensitivity of 82%. Although this method is labor intensive and time consuming, this study was the first to define SI in thyroid nodules.

In the second study, by Dighe et al. [15], final histopathological diagnosis was available in only 20 of 53 nodules. They used carotid arterial pulsation as the compression source and measured the strain observed in the thyroid nodule. A stiffness index higher than 18 in a nodule was suggested to be predictive of malignancy, with a sensitivity of 87.8% and a specificity of 77.5%. The limitations of the study were the lack of final histopathological diagnosis in a considerable number of patients and the indirect measurement of the stiffness index. These two studies were costly, labor intensive, and time consuming, probably because they did not use a second-generation real-time US.

Second-generation ES devices not only allow scoring according to a color scale but also measure the stiffness of the tissue quantitatively. Vorlander et al. reported the first study that used a second-generation ES device to evaluate strain values in thyroid nodules [19]. They calculated strain values in 309 nodules and compared the results with histopathological findings. The nodules were classified as hard, intermediate, and soft according to strain values. NPV and PPV were calculated as 100 and 42%, respectively. The authors indicated that the most important factor limiting their study was the inclusion of cystic and roughly calcified nodules, which may affect strain values. In addition, they emphasized that false negativity rates were higher in follicular thyroid cancer and follicular adenoma.

In our study, with the help of a second-generation ES device, we aimed to determine the SI of thyroid nodules, calculate the SI cut-off value for histopathologically confirmed benign and malignant nodules, and investigate the diagnostic role of ES in the differential diagnosis of thyroid nodules. We also calculated SI cut-off values of thyroid nodules according to morphological features that were determined by US and Doppler US. Including 391 nodules, this study has the largest sample size among the studies to date.

Our results showed that scoring with ES provided significant discrimination when final histopathology was used as the reference standard. However, although the expected result was a higher rate of malignancy prediction with increasing ES scores, we found lower discrimination with higher ES scores (90% for score 1 and 58.97% for score 5). In other words, the performance of score 5 in predicting malignancy was lower than the performance of score 1 in predicting benignity. The reason for this result, which is in contrast with the literature, might be that ES scoring is a qualitative assessment method. In addition, previous studies on ES have included a select population of patients who were already planned to undergo surgery for suspicious cytology or large nodule size. This might have contributed to the high predictive value of ES scoring in other studies.

In this study, we used “strain index” measurement as a more objective parameter for the differential diagnosis of benign and malignant nodules. Until now, no terminology or measurement method that determines the degree of stiffness with ES has been widely accepted. We made two measurements, from the parenchyma (A) and the nodule (B), with the help of the device and used the ratio of B to A (B/A) to compare nodule stiffness with parenchyma stiffness. To decrease inter-assay variability, we repeated the measurements four times and used the mean values. For terminology, we referred to the study of Lyshchik et al., who defined SI for the first time. Our study differed from the study of Vorlander et al. in the method of nodule stiffness measurement and in nomenclature. Those authors calculated A/B and defined the result as “Strain Value”. Another important difference was the exclusion of cystic nodules in our study. We included calcified nodules but calculated a different SI cut-off value for these nodules.

According to our results, the optimal SI cut-off value for all the nodules was 16.709 (sensitivity: 73.4%, specificity: 70.0%). Sensitivity and specificity of ES scoring were 58.4 and 71.0%, respectively. Thus, we showed that SI had a higher sensitivity and a similar specificity when compared with ES scoring. In addition, we found that SI values calculated according to the morphological features of nodules in gray-scale US had a higher sensitivity and specificity to predict malignancy. In hypoechoic nodules with microcalcifications and without a halo, for a cut-off value of 17.020, sensitivity was 84.3% and specificity was 81.1%. We think SI values determined by ES might prevent unnecessary FNAB when assessed together with the ultrasonographical features of nodules. Future studies evaluating SI values with ultrasonographical, cytological, and histopathological findings may guide surgical interventions. An important limitation of this study was that, when calculating SI cut-off values, we classified nodules into only two main groups: malignant and benign. Malignant nodules were mainly papillary thyroid cancer (80%) and its variants. We think determination of SI cut-off values for different thyroid cancer types in studies including larger sample sizes will be more helpful for the differential diagnosis.

This is the first study that calculates SI cut-off values in thyroid nodules and determines the role of these values for the differential diagnosis of thyroid nodules. This preliminary study showed that measurement of SI with ES as a noninvasive procedure may assist conventional methods for the differential diagnosis of benign and malignant thyroid nodules.