Introduction

With its incidence increasing in recent years, breast cancer has emerged as a leading cause of cancer death in women [1]. To detect breast cancer at an early stage, especially in dense breasts, ultrasound (US) has become a popular screening modality [2, 3]. The US Breast Imaging-Reporting and Data System (BI-RADS) recommends interventional management of breast lesions with an expected likelihood of malignancy of more than 2%. However, nearly 90% of these biopsies yielded benign results in the ACRIN 6666 trial [4]. Because most of the lesions that underwent biopsy were benign, this inadequate specificity is undesirable [5]. To improve the specificity of BI-RADS, breast imagers have searched for additional inexpensive and noninvasive methods to downgrade some benign breast lesions from biopsy to follow-up. The potential to miss breast cancer when a downgrade is carried out is a major concern; therefore, breast lesions of high suspicion for malignancy (BI-RADS category 4c or 5) are not considered for downgrading because the morphological features on grayscale US alone are sufficient to prompt biopsy [6, 7]. BI-RADS category 4a lesions, which make up at least 25% of breast lesion biopsies [8], are deemed appropriate for downgrading to surveillance with additional methods.

Previous studies showed that malignant lesions tend to be stiffer than benign lesions [9, 10]. US elastography has been widely used to depict tissue stiffness [7, 11, 12, 13]. Strain elastography and shear wave elastography are two different types of elastography [14]. Strain elastography, which includes elastic imaging (EI) and virtual touch tissue imaging (VTI), assesses tissue deformability, whereas shear wave elastography, which is used for virtual touch tissue quantification (VTIQ), quantitatively measures the propagation speed of low-frequency shear waves [15]. These techniques are available on many commercial ultrasound scanners, such as those from Siemens, Philips, and SuperSonic Imagine. The specificity of any single type of elastography has been confirmed to be higher than that of US BI-RADS [16, 17, 18], whereas the sensitivity is significantly lower than that of US BI-RADS [19], which indicates that some breast cancers would be incorrectly downgraded if only a single type of elastography were added to conventional US [17, 18, 20].

Since different types of elastography have their own merits and demerits, combining them may allow each type to compensate for the weaknesses of the others. Previous studies showed that combinations of different types of elastography could improve sensitivity across all BI-RADS 4 categories when compared with a single type of elastography, but the cut-off values for BI-RADS category 4a lesions might differ substantially from those for all BI-RADS 4 categories [18, 21]. Therefore, the purpose of this study was to determine whether a combination of different types of elastography could be used to downgrade BI-RADS category 4a lesions accurately while still avoiding missed cancers.

Materials and methods

For this prospective institutional review board-approved study, verbal informed consent was obtained from all patients between January 2016 and May 2018. This study was registered at the Research Data Deposit public platform (http://www.researchdata.org.cn), with all key raw data updated and an approved RDD number of RDDA2018000751.

Study participants

Eligible participants were female patients of at least 20 years of age with one or more US-detected breast lesions classified as BI-RADS category 4a (Fig. 1). The exclusion criteria were as follows: pregnant or lactating women, those with ipsilateral breast implants or a history of ipsilateral breast surgery, those receiving radiation therapy or chemotherapy for any cancer, those with masses larger than 3 cm in diameter and deeper than 3 cm in depth, those who declined informed consent or biopsy, and those with incomplete information or unqualified images.

Fig. 1

Study flow diagram

Image acquisition

Conventional US and US elastography data were acquired with the Siemens S2000 ultrasound system (Siemens Medical Solutions, Mountain View, CA, USA) equipped with a 9L4 linear transducer. Two investigators (Y.N.H. and Y.B.L.), who had 2 and 5 years of experience, respectively, in breast US and at least half a year of experience in elastography, performed all examinations.

With patients in a supine position, the probe was moved slightly on the breast to identify the target lesion, and at least two orthogonal images and color Doppler images were obtained. The final US BI-RADS assessments were recorded according to the expected probability of malignancy [22]. Three types of elastography, namely EI, VTI, and VTIQ, were performed for each lesion along its longest diameter. For each type of elastography, the probe was applied perpendicular to the skin with minimal pressure to limit precompression.

EI, induced by cardiovascular and respiratory pulsation, was displayed as a color map according to the degree of tissue displacement, with colors from red through green to blue representing increasing stiffness. The region-of-interest (ROI) box was centered on the target lesion and included the superficial pectoral muscle layer and subcutaneous fat, with a distance of more than 5 mm between its lateral borders and the lesion boundaries [9, 23].

VTIQ, induced by acoustic radiation force impulse (ARFI), was used to measure the shear wave velocity (SWV) of the lesions. Patients were asked to suspend respiration for 3 to 5 s while ARFI was applied. A quality map, displayed in green, yellow, and red to represent high, intermediate, and low quality, respectively, was obtained first to assess the quality of the SWV measurement; intermediate- and low-quality areas were avoided when measuring SWV. The display was then switched to an SW-velocity map, in which colors represented SWV from low (blue) through intermediate (green or yellow) to high (red). The numeric SWV value was displayed in m/s. For each lesion, at least three ROIs (2 × 2 mm) were placed over the stiffest portion of the lesion to measure SWV.

VTI, induced by ARFI, was displayed as a grayscale image [21] in which bright represented soft tissue and dark represented hard tissue. Participants were also asked to hold their breath for 3 to 5 s when VTI was generated.

Image analysis

The elastography images were analyzed independently by two radiologists (J.H.Z. and J.H.) who had 15 and 5 years of experience, respectively, in breast US and at least 2 years of experience in US elastography. Any disagreements were resolved by consensus after joint review of the images. The reviewers were blinded to other imaging results and to the pathologic results. For EI, elasticity scores, which were based on the strain distribution in the breast lesion and its surrounding tissue, were categorized as follows: score 1, evenly red or green; score 2, predominantly green with focal blue spots; score 3, equal amounts of green and blue; score 4, predominantly blue; score 5, shadowed blue in the lesion and its surrounding tissue [23] (Fig. 2).

Fig. 2

Elasticity imaging scores of the lesions: a score 1; b score 2; c score 3; d score 4; e score 5

For VTI, breast lesions were scored from 1 to 5, indicating stiffness from soft to hard: score 1, almost entirely bright (0–20% dark portions); score 2, predominantly bright (20–40% dark portions); score 3, equal levels of dark and bright (40–60% dark portions); score 4, predominantly dark (60–80% dark portions); and score 5, almost completely dark (≥ 80% dark portions) [18, 24] (Fig. 3).

Fig. 3

Virtual touch tissue imaging elasticity scores of the lesions: a score 1; b score 2; c score 3; d score 4; e score 5
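
The VTI score bands can be made explicit with a small sketch. In the study the scores were assigned visually by the readers, so the function below is only an illustration: the name `vti_score`, the numeric `dark_fraction` input, and the choice to treat each lower bound as inclusive (suggested by the "≥ 80%" wording for score 5) are all assumptions, not part of the protocol.

```python
def vti_score(dark_fraction: float) -> int:
    """Map the dark (stiff) fraction of a lesion, 0.0-1.0, to a VTI score 1-5."""
    if not 0.0 <= dark_fraction <= 1.0:
        raise ValueError("dark_fraction must lie between 0 and 1")
    # Upper bounds of scores 1-4; lower bounds are inclusive (assumption).
    for score, upper in enumerate((0.20, 0.40, 0.60, 0.80), start=1):
        if dark_fraction < upper:
            return score
    return 5  # 80% dark or more


print([vti_score(f) for f in (0.10, 0.20, 0.55, 0.80)])  # [1, 2, 3, 5]
```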

For VTIQ, the quality map was checked again, and, if qualified, the highest SWV was chosen to represent the stiffest part of the lesion [15, 25, 26].

Intra-observer and inter-observer agreement

To test intra-observer agreement, one radiologist (J.H.) scored the same 30 breast lesions on EI and VTI on two different days. Inter-observer agreement was investigated by having two radiologists (J.H.Z. and J.H.) independently score another 30 breast lesions on the same day. Agreement on EI or VTI was defined as a consistent classification of a lesion as malignant or benign according to the cut-off value. When it was difficult to judge the EI or VTI score, the lesion was assigned the higher score to reduce the chance of missing breast cancer.

Statistical analysis

All lesions underwent US-guided core needle biopsy or excision biopsy and were confirmed by histopathology. MedCalc (version 15.2.2 for Windows; Mariakerke, Belgium) and SPSS (version 20.0 for Windows; SPSS, Inc., Chicago, IL, USA) were used for statistical analysis. For BI-RADS, a lesion was considered malignant when it was classified higher than BI-RADS category 3. For each single type of elastography, the cut-off value was the one that maximized the Youden index; for combined elastography, a lesion was assessed as malignant if any one of the included types of elastography exceeded its cut-off value. The areas under the receiver operating characteristic curve (AUCs) were calculated and compared using the method of DeLong et al. The sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were then calculated and compared with the McNemar test. Kappa values were calculated to assess agreement and were interpreted as follows: < 0.00, poor; 0.00–0.20, slight; 0.21–0.40, fair; 0.41–0.60, moderate; 0.61–0.80, substantial; and 0.81–1.00, almost perfect. Differences were considered significant when the P value was less than 0.05.
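
The combined rule can be illustrated with a minimal sketch. The cut-offs are those reported in the Results (EI score > 3, VTI score > 3, SWV > 3.30 m/s), and the example lesion uses the values given in the caption of Fig. 4; the function and dictionary names are hypothetical and do not come from the study software.

```python
# Cut-offs as reported in the Results (a value is positive when it exceeds them).
CUTOFFS = {"EI": 3, "VTI": 3, "VTIQ": 3.30}


def is_positive(method: str, value: float) -> bool:
    """A single elastography type is positive when it exceeds its cut-off."""
    return value > CUTOFFS[method]


def combined_positive(readings: dict) -> bool:
    """Combined rule: positive if any included elastography type is positive."""
    return any(is_positive(m, v) for m, v in readings.items())


# Lesion from Fig. 4: EI score 4, VTI score 3, highest SWV 2.78 m/s.
lesion = {"EI": 4, "VTI": 3, "VTIQ": 2.78}
print(combined_positive({k: lesion[k] for k in ("EI", "VTI")}))    # True
print(combined_positive({k: lesion[k] for k in ("VTI", "VTIQ")}))  # False
```

Under this rule a lesion is downgraded only when every included type is at or below its cut-off, which is why adding a second type can only keep or raise sensitivity and keep or lower specificity.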

Results

Participants and lesions

A total of 458 women (mean age, 43 years; standard deviation, 10 years; range, 20–76 years) with 494 BI-RADS category 4a lesions (mean size, 13 mm; standard deviation, 6 mm; range, 5–30 mm) were included in the final analysis. Pathologically, 49 (9.9%) lesions were malignant and 445 (90.1%) were benign. Neither patient age nor lesion size differed between patients with benign lesions and those with malignant lesions. The histopathologic details of the breast lesions are shown in Table 1.

Table 1 Histological features of the lesions confirmed by pathology

EI, VTI, and VTIQ features

For EI, score 1 was found in 14 lesions (2.8%), which were all benign; score 2 was found in 100 lesions (20.2%), including 99 benign lesions and 1 malignant lesion; score 3 was found in 205 lesions (41.5%), including 198 benign lesions and 7 malignant lesions; score 4 was found in 134 lesions (27.1%), including 106 benign lesions and 28 malignant lesions; score 5 was found in 41 lesions (8.3%), including 28 benign lesions and 13 malignant lesions. For EI, a low score favored benign lesions: 311 of the 319 lesions (97.5%) with a score of 3 or lower were benign, compared with 134 of the 175 lesions (76.6%) with a score higher than 3 (P < 0.001).

For VTI, score 1 was found in 10 lesions (2.0%), which were all benign; score 2 was found in 119 lesions (24.1%), which were all benign; score 3 was found in 257 lesions (52.0%), including 244 benign lesions and 13 malignant lesions; score 4 was found in 92 lesions (18.6%), including 66 benign lesions and 26 malignant lesions; score 5 was found in 16 lesions (3.2%), including 6 benign lesions and 10 malignant lesions. Most benign lesions (373 of 445, 83.8%) were classified as score 1 to score 3 for VTI, and most malignant lesions (36 of 49, 73.5%) were classified as score 4 or score 5 (P < 0.001).

For VTIQ, the mean SWV was 2.96 ± 0.89 m/s (range 1.08–6.39 m/s) for benign lesions and 4.19 ± 1.64 m/s (range 2.26–8.86 m/s) for malignant lesions (P < 0.001) (Table 2).

Table 2 Findings between malignant and benign BI-RADS category 4a breast lesions by EI, VTI, and VTIQ

Diagnostic performances

The cut-off values were EI score > 3 with a maximum Youden index of 0.536, VTI score > 3 with a maximum Youden index of 0.573, and SWV > 3.30 m/s with a maximum Youden index of 0.408. The sensitivity, specificity, positive predictive value, negative predictive value, accuracy, and AUC for various methods are shown in Table 3. Since all BI-RADS category 4a lesions were considered malignant, US BI-RADS achieved a sensitivity of 100% (49 of 49) but a specificity of 0% (0 of 445).
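
As a worked check, the EI and VTI cut-offs can be reproduced from the per-score lesion counts reported above. The sketch below is an illustration, not the MedCalc procedure used in the study; it evaluates the rule "score > cut-off" for each candidate cut-off and picks the one with the largest Youden index J = sensitivity + specificity − 1. The function name is hypothetical.

```python
# Lesion counts per score (1-5), taken from the EI and VTI results above.
COUNTS = {
    "EI":  {"benign": [14, 99, 198, 106, 28], "malignant": [0, 1, 7, 28, 13]},
    "VTI": {"benign": [10, 119, 244, 66, 6],  "malignant": [0, 0, 13, 26, 10]},
}


def youden_by_cutoff(benign, malignant):
    """Return {cutoff: (sensitivity, specificity, J)} for the rule 'score > cutoff'."""
    out = {}
    for cutoff in range(1, 5):
        sens = sum(malignant[cutoff:]) / sum(malignant)  # malignant with score > cutoff
        spec = sum(benign[:cutoff]) / sum(benign)        # benign with score <= cutoff
        out[cutoff] = (sens, spec, sens + spec - 1)
    return out


for name, c in COUNTS.items():
    table = youden_by_cutoff(c["benign"], c["malignant"])
    best = max(table, key=lambda k: table[k][2])
    sens, spec, j = table[best]
    print(f"{name}: score > {best}, sens={sens:.3f}, spec={spec:.3f}, J={j:.3f}")
# EI: score > 3, sens=0.837, spec=0.699, J=0.536
# VTI: score > 3, sens=0.735, spec=0.838, J=0.573
```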

Table 3 Diagnostic performance of conventional US and combinations of different types of elastography

Among the single types of elastography, VTI achieved the highest AUC of 0.836 and the best specificity of 83.8%, while EI achieved the highest sensitivity of 83.7%. The specificities of EI, VTI, and VTIQ were significantly higher than that of US BI-RADS (69.9%, 83.8%, and 75.5% vs. 0%, respectively; P < 0.001), while the sensitivities were significantly lower than that of US BI-RADS (83.7% vs. 100%, P = 0.016; 73.5% vs. 100%, P < 0.001; 65.3% vs. 100%, P < 0.001).
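
The AUC of an ordinal score can be computed directly from the per-score counts. The sketch below assumes that the VTI AUC was obtained from the 5-point score itself (the text does not state this explicitly) and uses the standard equivalence between the empirical AUC and the probability that a malignant lesion scores higher than a benign one, with ties counted as one half. The function name is hypothetical.

```python
def auc_from_counts(benign, malignant):
    """Empirical AUC for an ordinal score: P(mal > ben) + 0.5 * P(mal == ben)."""
    wins = 0.0
    for i, n_mal in enumerate(malignant):
        lower = sum(benign[:i])  # benign lesions with a strictly lower score
        ties = benign[i]         # benign lesions with the same score
        wins += n_mal * (lower + 0.5 * ties)
    return wins / (sum(benign) * sum(malignant))


# VTI lesion counts per score (1-5), from the VTI results.
vti_benign = [10, 119, 244, 66, 6]
vti_malignant = [0, 0, 13, 26, 10]
print(round(auc_from_counts(vti_benign, vti_malignant), 3))  # 0.836
```

Under that assumption the counts give 18239 / 21805 ≈ 0.836, in line with the reported VTI AUC.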

Among the combinations of different types of elastography, including EI + VTI, EI + VTIQ, and VTI + VTIQ, EI + VTI yielded the highest sensitivity of 98% and an AUC of 0.815 with a specificity of 64.9%. There was no significant difference in sensitivity between EI + VTI or EI + VTIQ and US BI-RADS (98% vs. 100%, P = 1.000; 93.9% vs. 100%, P = 0.25), while the sensitivity of VTI + VTIQ was lower than that of US BI-RADS (85.7% vs. 100%, P = 0.016). The specificities of all of the above combinations of two types of elastography were significantly higher than that of US BI-RADS (64.9%, 58.9%, and 66.3% vs. 0%, respectively; P < 0.001). When compared with EI + VTIQ, EI + VTI showed significantly better specificity (P < 0.001).
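
The sensitivity comparisons for the combinations can be illustrated with an exact McNemar test. This is a sketch under two assumptions: that the exact (binomial) two-sided form of the test was used, which the text does not specify, and that the discordant pairs are exactly the cancers missed by each combination, since US BI-RADS classified all 49 cancers as positive. The missed-cancer counts (1, 3, and 7 of 49) follow from the reported sensitivities; the function name is hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Cancers called positive by US BI-RADS but negative by the combination.
missed = {"EI + VTI": 1, "EI + VTIQ": 3, "VTI + VTIQ": 7}
for name, b in missed.items():
    print(name, round(mcnemar_exact(b, 0), 3))
# EI + VTI 1.0
# EI + VTIQ 0.25
# VTI + VTIQ 0.016
```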

Adding elastography to US BI-RADS

When a single type of elastography (EI, VTI, or VTIQ) was added to downgrade BI-RADS category 4a lesions, 319 (64.6%), 386 (78.1%), and 353 (71.5%) of the 494 lesions were downgraded, respectively, but 8 (16.3%), 13 (26.5%), and 17 (34.7%) of the 49 malignant lesions, respectively, were downgraded incorrectly.
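
The EI and VTI figures above follow from the per-score counts, and the short sketch below makes the two denominators explicit: the proportion downgraded is relative to all 494 lesions, whereas the proportion downgraded incorrectly is relative to the 49 malignant lesions. VTIQ is omitted because per-lesion SWV values are not listed in the text; the names are hypothetical.

```python
TOTAL_LESIONS, TOTAL_CANCERS = 494, 49

# Lesions per score (1-5) as (benign, malignant), from the EI and VTI results.
SCORES = {
    "EI":  [(14, 0), (99, 1), (198, 7), (106, 28), (28, 13)],
    "VTI": [(10, 0), (119, 0), (244, 13), (66, 26), (6, 10)],
}


def downgrade_summary(per_score, cutoff=3):
    """Count lesions at or below the cut-off (downgraded) and cancers among them."""
    low = per_score[:cutoff]
    downgraded = sum(b + m for b, m in low)
    missed = sum(m for _, m in low)
    return downgraded, missed


for name, per_score in SCORES.items():
    d, m = downgrade_summary(per_score)
    print(f"{name}: {d} downgraded ({d / TOTAL_LESIONS:.1%}), "
          f"{m} cancers missed ({m / TOTAL_CANCERS:.1%})")
# EI: 319 downgraded (64.6%), 8 cancers missed (16.3%)
# VTI: 386 downgraded (78.1%), 13 cancers missed (26.5%)
```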

When a combination of two types of elastography (EI + VTI, EI + VTIQ, or VTI + VTIQ) was added to downgrade BI-RADS category 4a lesions, 290 (58.7%), 266 (53.8%), and 303 (61.3%) of the 494 lesions were downgraded, respectively, and 1 (2.0%), 3 (6.1%), and 7 (14.3%) of the 49 malignant lesions, respectively, were downgraded incorrectly.

Intra-observer and inter-observer agreement

For EI evaluation, the Kappa values were 0.842 [standard error (SE): 0.107] for inter-observer agreement and 0.923 (SE: 0.075) for intra-observer agreement (both P < 0.001).

For VTI evaluation, the Kappa values were 0.867 (SE: 0.091) for inter-observer agreement and 0.933 (SE: 0.065) for intra-observer agreement (both P < 0.001).
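
For readers unfamiliar with the statistic, the sketch below shows how Cohen's kappa is computed from an agreement table and how the interpretation bands from the Statistical analysis section apply. The 2 × 2 table is hypothetical and is not study data (the underlying agreement tables are not reported); the function names are likewise illustrative.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table (rows: rater A, cols: rater B)."""
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    expected = sum(
        (sum(table[i]) / n) * (sum(row[i] for row in table) / n)
        for i in range(len(table))
    )
    return (observed - expected) / (1 - expected)


def kappa_label(kappa):
    """Interpretation bands as listed in the Statistical analysis section."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical 2 x 2 table for 30 lesions (benign/malignant calls); not study data.
example = [[20, 1], [1, 8]]
k = cohen_kappa(example)
print(round(k, 3), kappa_label(k))               # 0.841 almost perfect
print(kappa_label(0.842), kappa_label(0.867))    # reported inter-observer values
```

By these bands, all four reported kappa values (0.842, 0.923, 0.867, and 0.933) fall in the "almost perfect" range.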

Discussion

The BI-RADS system has received widespread acceptance for the characterization of breast lesions, and according to this system, a large number of breast lesions should be biopsied. However, nearly two-thirds of these biopsies yielded benign results [8]. Therefore, there is a great need to develop additional methods to reduce unnecessary biopsies of benign lesions. The main concern when considering a downgrade is missing cancer; therefore, downgrading breast lesions of high suspicion for malignancy (BI-RADS category 4c or 5) is not recommended, whereas BI-RADS category 4a lesions, which have a low suspicion for malignancy, are thought to be suitable for downgrading from biopsy to surveillance by additional methods [8].

As an emerging technology, elastography has been shown to be promising in distinguishing malignant breast lesions from benign ones [9, 10]. Previous studies showed that elastography could be applied to downgrade the BI-RADS category to reduce unnecessary biopsies [27]. In our study, three types of elastography were assessed for downgrading US BI-RADS category 4a breast lesions. Although the specificity was significantly improved, the sensitivity of any single type of elastography was significantly lower than that of US BI-RADS, which was concordant with the results of previous studies [17, 18, 20]. Applying a single type of elastography to downgrade BI-RADS category 4a lesions improved the specificity from 0% to 69.9–83.8% and avoided biopsy in 64.6–78.1% of lesions; however, 16.3–34.7% of breast cancers were missed, which is unacceptable in clinical practice.

Since different types of elastography have their own merits and demerits, and since breast lesions show different features on different types of elastography, a combination of these techniques may achieve better performance. The current study showed that applying combinations of different types of elastography to downgrade BI-RADS category 4a lesions yielded a sensitivity of 85.7–98%, reducing the proportion of missed cancers to 2.0–14.3%. In terms of sensitivity, there was no significant difference among US BI-RADS, EI + VTI, and EI + VTIQ, while the specificity of EI + VTI was significantly better than that of US BI-RADS and EI + VTIQ, which indicates that the combination of EI and VTI is a promising approach for downgrading BI-RADS category 4a lesions without significantly increasing the risk of missed cancer. In this study, only 1 cancer was missed when EI + VTI was used to downgrade BI-RADS category 4a lesions. In this regard, reducing unnecessary biopsies by the combined use of EI and VTI might help women with BI-RADS category 4a lesions. Previous studies showed that combinations of VTI and VTIQ could improve the sensitivity of elastography, but those studies included all BI-RADS category 4 lesions (4a, 4b, and 4c) [18], whereas downgrading breast lesions of high suspicion for malignancy (BI-RADS category 4c) by any additional method is not recommended [7, 8].

EI and VTI are two different types of strain elastography that are used to assess the deformability of breast lesions; therefore, they may be complementary to each other when used in combination to evaluate breast lesions [27]. EI is induced by physiological vibration, while VTI is triggered by an acoustic radiation force impulse [28]. As demonstrated in our study, applying EI to downgrade BI-RADS category 4a lesions missed 8 cancers, 7 of which were detected by VTI. Applying VTI to downgrade BI-RADS category 4a lesions missed 13 cancers, 12 of which were detected by EI (Fig. 4). By considering these two types of elastography together for BI-RADS category 4a lesions, sensitivity was largely preserved, and higher specificity and AUC were achieved.

Fig. 4

A 46-year-old woman with invasive lobular cancer. a Conventional ultrasound shows a solid, hypoechoic, irregular lesion with a well-defined margin; this lesion was classified as Breast Imaging-Reporting and Data System (BI-RADS) category 4a. b The elasticity imaging (EI) score of the lesion is 4. c The virtual touch tissue imaging (VTI) score of the lesion is 3. d In virtual touch tissue quantification (VTIQ) mode, the highest shear wave velocity (SWV) is 2.78 m/s

The diagnostic performance of VTIQ in this study was not as good as that in previous studies [18, 21]. One possible explanation is that VTIQ is not especially sensitive for small lesions, particularly those with a diameter of less than 1 cm [7], and 44.7% of the lesions in our study were smaller than 1 cm in diameter. Because the size distribution and pathologic composition of breast lesions differ between studies [12, 23, 29], it is difficult to compare our results accurately with those of previous studies. In addition, an absolute VTIQ value is not clinically practical because it is difficult to judge a lesion as benign or malignant when its SWV is near the cut-off value.

Some limitations of this study need to be addressed. First, this was a single-center study, and the number of participants was relatively small; as a result, a full spectrum of breast diseases could not be covered. Therefore, further multicenter studies will be needed to validate the results of this study. Second, BI-RADS category 3 lesions were not enrolled because the follow-up time for most of these lesions was less than 12 months, and few of these lesions were biopsied. Furthermore, since multiple studies showed that the malignancy rate of BI-RADS 3 lesions was less than 0.7% [30, 31, 32], downgrading BI-RADS 4a lesions is more important than upgrading BI-RADS 3 lesions.

Conclusion

In summary, compared with a single type of elastography, a combination of different types of elastography significantly improved the sensitivity and decreased the risk of incorrectly downgrading breast cancers. The combination of EI and VTI showed a sensitivity similar to that of US BI-RADS with significantly improved specificity, demonstrating that this combination is a potential way to downgrade BI-RADS category 4a lesions.