Introduction

Breast cancer is the most commonly diagnosed cancer in Chinese women, with an estimated 367,900 new cases in 2018 [1]. Breast cancer in China is characterized by a rising trend and a relatively early age at diagnosis. Advanced stage at presentation leads to poor survival [2, 3], with substantial economic and societal effects, especially in resource-limited areas [4]. Early detection of breast cancer can significantly improve disease-specific survival [5]. However, according to population-based cancer registries in China, less than 1% of breast cancer cases were detected by screening [6]. In this context, improving early cancer detection among women attending outpatient clinics is particularly important in China, especially in resource-limited areas.

Breast density reflects the content of fibroglandular tissue in the breast. More than half of Chinese women aged 45–65 years were categorized as having dense breasts [7, 8]. Women with dense breasts not only have a significantly increased risk of breast cancer but also experience lower sensitivity of mammography (MG) for breast cancer detection because of the masking effect of dense tissue [9]. Thus, additional imaging modalities, such as ultrasonography, tomosynthesis, or magnetic resonance imaging (MRI), are needed to improve breast cancer detection in women with dense breasts.

The high infrastructural needs and cost of tomosynthesis and MRI have limited their use in resource-limited areas. Handheld ultrasound (HHUS) is widely available and relatively inexpensive [10], and there is limited evidence that it can improve the detection of breast cancer, either in a screening setting [11, 12] or in a diagnostic setting [13]. However, HHUS is labor-intensive and time-consuming, and its performance is highly dependent on the skill of the operator. Even in China, where ultrasonologists are independent of radiologists and have long experience with breast ultrasound for breast cancer detection and diagnosis, the performance of ultrasonologists varies widely [14, 15]. Unlike HHUS, the automated breast ultrasound system (ABUS) has a reproducible and less operator-dependent process for image acquisition. In addition, image acquisition can be dissociated from interpretation: images can be acquired by a trained operator and interpreted by any qualified ultrasonologist or radiologist at a different location (through cloud sharing of images) or at a different time. This can decrease variability and improve reproducibility [16]. ABUS also provides a three-dimensional (3D) representation of the whole breast, and the reconstructed coronal plane has been shown to improve diagnostic accuracy [17]. ABUS makes it possible to apply ultrasonography more broadly and to overcome some of the limitations imposed by limited health resources. ABUS performs comparably to or better than HHUS [18,19,20]. However, to date, no study has compared the diagnostic performance of ABUS and HHUS in outpatient women with dense breasts, either alone or as an adjunct to MG. Thus, the aim of this study was to evaluate the diagnostic performance of ABUS and HHUS, both separately and in combination with mammography, in Chinese women with dense breasts attending outpatient clinics.

Methods

This study was approved by the Institutional Review Board of Cancer Hospital, Chinese Academy of Medical Sciences, and all five participant hospitals. Written informed consent was obtained from all participants. This study was registered in the Chinese Clinical Trial Registry (ChiCTR1800017908).

Study design and study participants

This is a cross-sectional, multicenter clinical study conducted at five tertiary-care hospitals in China between February 2016 and March 2017. The study methodology has been described in detail in our earlier publication [21]. In brief, a total of 1973 women, aged 30–69 years, attending outpatient clinics at the five hospitals were recruited for a clinical diagnostic study aimed at comparing HHUS to ABUS. The participants included 680 women aged 30–39 years and 1273 women aged 40–69 years. Each participant had a clinical breast examination performed by physicians, followed by ABUS and HHUS performed by technicians and ultrasonologists, respectively. Women aged 40 years or older also received conventional mammography. Demographic data and information on breast cancer-related risk factors were collected at enrollment via a face-to-face interview using a structured questionnaire. All examinations and image interpretation were performed after the interview.

Only women with mammographically dense breasts were included in the present analysis of the diagnostic performance of ABUS and HHUS. Breast density was visually assessed by the radiologists interpreting the mammograms according to the Breast Imaging Reporting and Data System (BI-RADS) density categories. BI-RADS a (fatty) and b (scattered fibroglandular elements) were defined as nondense breasts, while BI-RADS c (heterogeneously dense) and d (extremely dense) were defined as dense breasts. In total, 937 women aged 40–69 years with dense breasts were included in this study.

Image interpretation

Images from each modality were interpreted according to the BI-RADS classification. The ultrasonologist or radiologist interpreting one modality was blinded to the results of the other examinations. The highest BI-RADS category on ABUS, HHUS, and MG was considered the imaging diagnostic result for that participant. For analytical purposes, BI-RADS categories 1–3 were considered a negative finding and BI-RADS categories 4–5 a positive finding. To avoid verification bias, additional magnetic resonance imaging (MRI) or biopsy was performed, according to patient preference, in women with BI-RADS category 3 and in a randomly selected 10% of those with BI-RADS categories 1–2. Women with BI-RADS categories 4–5 on any imaging modality, or with BI-RADS category 3 together with an abnormal MRI, had a core biopsy performed within three months of the abnormality being detected. If more than one lesion was found in both breasts, a single final assessment was recorded based on the lesion with the worst features.
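The dichotomization rule described above can be illustrated with a short sketch. This is not the study's analysis code (the analyses were run in SAS); the function names and the use of the 4A/4B/4C subcategory labels as strings are assumptions made for illustration only.

```python
# Illustrative sketch only: names and category labels are hypothetical.
# BI-RADS assessment categories ordered from lowest to highest suspicion.
BIRADS_ORDER = ["1", "2", "3", "4A", "4B", "4C", "5"]


def is_positive(category: str) -> bool:
    """BI-RADS 4-5 counts as a positive finding, BI-RADS 1-3 as negative."""
    return BIRADS_ORDER.index(category) >= BIRADS_ORDER.index("4A")


def worst_category(categories: list[str]) -> str:
    """Return the highest (most suspicious) BI-RADS category in the list."""
    return max(categories, key=BIRADS_ORDER.index)


def adjunct_positive(mg_category: str, us_category: str) -> bool:
    """Adjunct rule: a woman is test-positive if either MG or ultrasound is positive."""
    return is_positive(mg_category) or is_positive(us_category)


if __name__ == "__main__":
    # A participant read as BI-RADS 2 on MG and 4A on ultrasound.
    print(worst_category(["2", "4A"]))   # highest category across modalities
    print(adjunct_positive("2", "4A"))   # positive under the either-test rule
```

Under this rule, a BI-RADS 3 reading on every modality is counted as negative, which is why the verification of BI-RADS 3 findings by MRI or biopsy matters for the sensitivity estimates.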

Equipment

All ABUS scans were acquired with Invenia ABUS (GE Healthcare, WI, USA). The 15 cm ultra-broadband transducer automatically applies compression across the whole breast and acquires images from different views, such as lateral, anteroposterior, and medial. The workstation then reconstructs the breast and displays 3D volumes as 2-mm-thick coronal slices from the skin to the chest wall.

The HHUS images were acquired with the Aixplorer system (Supersonic Imagine, Aix en Provence, France), GE LOGIQ9 (GE Healthcare, WI, USA), iU22 Ultrasound System (Philips Medical System, WA, USA), and S2000 (Siemens Medical Solutions, CA, USA).

The mammograms were obtained with Fujifilm FDR MS-2500 (Fujifilm Corp, Tokyo, Japan), GE Senographe DS (GE Healthcare, WI, USA), and Hologic Selenia (Hologic, MA, USA).

Histopathology

Histopathological diagnosis was performed at the pathology department of each hospital where the participants were enrolled. The histopathology was assessed by qualified pathologists following surgery or core biopsy, with each specimen undergoing formalin fixation followed by paraffin embedding.

Statistical methods

The unit of analysis was the individual participant. Continuous variables were described as mean and standard deviation (SD) and were compared with the t-test. Categorical variables were described as percentages and were compared with the Chi-square test, or with Fisher’s exact test for variables with low expected cell counts. Sensitivity, specificity, false-positive rate (FPR), positive predictive value (PPV), negative predictive value (NPV), and area under the curve (AUC) of the receiver operating characteristic (ROC) were calculated to evaluate the diagnostic performance of HHUS and ABUS separately, and when used as an adjunct to MG. Although women underwent all three modalities, we simulated the diagnostic use of ultrasound as an adjunct to MG in women with dense breasts (Fig. 1). The gold standard for evaluation of the different diagnostic methods was final breast histopathology. The agreement of ABUS and HHUS was estimated with both percent agreement, calculated as the number of agreements divided by the total number, and the κ statistic, which also accounts for chance agreement [22]. Statistical significance was evaluated with two-sided tests, with 0.05 set as the threshold for significance. All analyses were performed in SAS 9.4 software (SAS Institute Inc., Cary, NC).
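As an illustration of the two agreement measures, the following minimal sketch computes percent agreement and an unweighted Cohen's κ from paired readings. It is not the SAS code used in the study; the function name and the example readings are hypothetical, and whether the study used a weighted or unweighted κ is not stated here.

```python
from collections import Counter


def percent_agreement_and_kappa(ratings_a, ratings_b):
    """Percent agreement and unweighted Cohen's kappa for two paired rating lists."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of participants given the same category by both readers.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Made-up readings for 100 participants (NOT study data).
    modality_a = ["pos"] * 45 + ["neg"] * 55
    modality_b = ["pos"] * 40 + ["neg"] * 5 + ["pos"] * 5 + ["neg"] * 50
    agreement, kappa = percent_agreement_and_kappa(modality_a, modality_b)
    print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

In the made-up example the two modalities agree on 90 of 100 readings, but about half of that agreement would be expected by chance, so κ is noticeably lower than the raw percent agreement.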

Fig. 1 Diagnostic performance of automated breast ultrasound and handheld ultrasound in women with dense breasts

Results

Based on mammographic breast density, 937 women aged 40 years and above were classified as having dense breasts. The mean age of these participants was 49.1 years (SD: 6.8). Of the 937 women, 221 women (23.6%) were diagnosed with breast cancer, including 200 with invasive breast cancer and 21 with ductal carcinoma in situ (DCIS). Palpable lesions were present in 292 (31.2%) of the total participants. Women with breast cancer tended to be older than women free of cancer (50.5 vs. 48.7 years, p < 0.01) at enrollment and were more likely to be postmenopausal (42.1% vs. 32.3%, p = 0.01) (Table 1).

Table 1 Characteristics of outpatients aged 40–69 years old with dense breasts on mammography

In women with dense breasts on MG (N = 937), if ABUS was used as an adjunct test (women positive on either test underwent biopsy), the sensitivity was as high as 99.10% (95% CI 96.77–99.89%) and the specificity was 86.87% (95% CI 84.18–89.26%). The PPV and NPV were 69.97% (95% CI 64.56–75.00%) and 99.68% (95% CI 98.85–99.96%), respectively. The combination of HHUS and MG obtained similar sensitivity (99.10%, 95% CI 96.77–99.89%) and NPV (99.67%, 95% CI 98.82–99.96%), but lower specificity (84.92%, 95% CI 82.08–87.46%) and PPV (66.97%, 95% CI 61.59–72.05%), indicating HHUS had higher false-positive results (Table 2). The AUC of the combination of ABUS and MG was higher than the combination of HHUS and MG (0.93 and 0.92, respectively, p < 0.01). We observed a high percent agreement between ABUS and HHUS of 0.94 (κ = 0.85, 95% CI 0.81–0.89, p < 0.01) in categorizing the radiological findings with BI-RADS scores in all women with mammographic dense breasts.
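The point estimates above follow directly from a 2 × 2 table against histopathology. The sketch below is illustrative only: the cell counts are not reported in the text and were back-calculated here from the reported percentages and totals (221 cancers and 716 women without cancer), so they should be read as an assumption. The AUC line assumes a single dichotomized operating point, for which the trapezoidal AUC equals the mean of sensitivity and specificity; whether the study computed AUC this way is not stated.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "fpr": 1 - specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # AUC of a single binary operating point (trapezoidal rule).
        "auc": (sensitivity + specificity) / 2,
    }


# Back-calculated counts (assumption, not reported in the text).
abus_plus_mg = diagnostic_metrics(tp=219, fn=2, fp=94, tn=622)
hhus_plus_mg = diagnostic_metrics(tp=219, fn=2, fp=108, tn=608)

if __name__ == "__main__":
    for name, m in [("ABUS + MG", abus_plus_mg), ("HHUS + MG", hhus_plus_mg)]:
        summary = ", ".join(
            f"{key} {100 * m[key]:.2f}%"
            for key in ("sensitivity", "specificity", "ppv", "npv")
        )
        print(f"{name}: {summary}, AUC {m['auc']:.2f}")
```

With these assumed counts, the sketch reproduces the reported sensitivity, specificity, PPV, NPV, and AUC for both combinations, and it makes explicit that the two strategies differ only in the number of false positives (94 versus 108).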

Table 2 Diagnostic performance of ABUS and HHUS as adjuncts to mammography in outpatients aged 40–69 years old with dense breasts on mammography

In women with MG-negative dense breasts (N = 678, 72.36%), 30 additional cancer cases were detected using ultrasound (28 cases detected by both HHUS and ABUS, 1 case by HHUS only, and 1 case by ABUS only). The incremental cancer detection rate was identical for ABUS and HHUS, at an estimated 42.8 (95% CI 28.8–60.9) per 1000 ABUS examinations or 1000 HHUS examinations. Among the 30 breast cancers detected by ultrasound, 27 were invasive breast cancer and 3 were DCIS. Nineteen of these women were classified as BI-RADS 1–2 and 11 as BI-RADS 3 by MG. The average size of detected lumps was around 17 mm. More details are shown in Table 3. Only one case was missed by both ultrasound and mammography; it was detected by MRI.
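The incremental detection rate can be checked with simple arithmetic: each ultrasound modality detected 29 of the 30 cancers (28 shared plus 1 unique) among 678 MG-negative women. The sketch below also computes an exact (Clopper-Pearson) binomial interval by bisection. The choice of interval method is an assumption, since the text does not state which method was used, and the function names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (
            math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
            + i * math.log(p) + (n - i) * math.log1p(-p)
        )
        total += math.exp(log_pmf)
    return min(total, 1.0)


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for x events in n trials."""

    def solve(k, target):
        # binom_cdf(k, n, p) decreases as p increases, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if x == 0 else solve(x - 1, 1 - alpha / 2)
    upper = 1.0 if x == n else solve(x, alpha / 2)
    return lower, upper


if __name__ == "__main__":
    detected, examined = 29, 678  # cancers found per modality, MG-negative women
    rate = 1000 * detected / examined
    low, high = clopper_pearson(detected, examined)
    print(f"{rate:.1f} per 1000 (95% CI {1000 * low:.1f}-{1000 * high:.1f})")
```

The point estimate is 29/678, or about 42.8 per 1000 examinations, matching the reported value. The interval from this sketch may differ slightly from the reported one if a different method was used in the original analysis.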

Table 3 Characteristics of ultrasound-detected breast cancers in women with mammography-negative dense breasts

The diagnostic performance of ABUS and HHUS in women with MG-negative dense breasts is shown in Table 4. ABUS produced 25 false-positive results and HHUS produced 39 (FPR = 3.86% and 6.03%, respectively). Of these false-positive findings, 19 were found by both ABUS and HHUS. After review by an ultrasonologist (XL), it was considered reasonable, given their characteristics on ABUS or HHUS, to classify these lesions as suspicious findings; the characteristics included a complex cystic and solid echo pattern, intraductal lesion, fibroadenoma with multiple coarse calcifications in postmenopausal women, and uncircumscribed margins with angular features. Specifically, the twenty HHUS-only false-positive interpretations were all classified as BI-RADS 3 by ABUS and BI-RADS 4A by HHUS, indicating low suspicion for malignancy (2–10%). These cases displayed indistinct margins on HHUS but benign features on ABUS. The percent agreement between ABUS and HHUS in MG-negative women was 0.95, with an estimated κ of 0.75 (95% CI 0.66–0.84, p < 0.01).

Table 4 Diagnostic performance of ABUS and HHUS in women with mammography-negative dense breasts

Discussion

In our multicenter hospital-based study, ultrasound as an adjunct to mammography significantly improved the breast cancer detection rate in women with dense breasts. The sensitivity was very high irrespective of the type of ultrasound. The incremental cancer detection rate in MG-negative dense breasts was 42.8 per 1000 ultrasound examinations. ABUS and HHUS showed high agreement in breast cancer detection in women with dense breasts. However, additional ABUS and HHUS resulted in 25 and 39 false-positive cases, respectively, leading to unnecessary biopsies.

Our results suggest that supplemental ultrasound can lead to improved detection of breast cancer in Chinese women with dense breasts, as has been demonstrated consistently in previous studies [12]. In another hospital-based study from Korea [23], the cancer detection rates in women with MG-negative dense breasts were 151.1 per 1000 diagnostic ultrasound examinations in symptomatic women (21/139) and 22.4 per 1000 screening ultrasound examinations in asymptomatic women (10/446), supporting the usefulness of ultrasound for detecting malignancy in MG-negative dense breasts. Even in screening or asymptomatic populations, where the cancer detection rates were much lower than in ours, ultrasound still showed incremental detection. In the American College of Radiology Imaging Network (ACRIN) 6666 study, supplemental ultrasound added the detection of 5.3 cancers per 1000 women with elevated breast cancer risk [24]. In a comparative trial of Adjunct Screening With Tomosynthesis or Ultrasound in Women With MG-Negative Dense Breasts (ASTOUND-2), adjunct ultrasound had an incremental cancer detection rate of 4.9/1000 screens (95% CI 3.21–7.19). Because the proportion of breast cancer cases in our study was high (221/937), the high incremental detection rate we observed is to be expected.

Although the diagnostic yield was increased with additional ultrasound, these studies highlighted the importance of the trade-offs between the detection of additional cases and unnecessary interventions due to higher false-positive outcomes when using adjunct imaging [25]. Youk et al. [23] reported a false-positive rate of 5.2% by ultrasound in MG-negative dense breasts, which was comparable to that present in our study with ABUS (3.86%) or HHUS (6.03%). The ultrasonographic characteristics that lead to positive findings indicated that ultrasound has some advantages over MG in detecting certain breast diseases, such as intraductal papilloma, and identifying cystic components associated with a mass. It is possible that the pressure of ABUS’s probe makes the indistinct margin on HHUS not visible on ABUS, leading to more false-positive interpretations by HHUS based on our definition of “positive finding (BI-RADS 4A and above)”. Besides, shadowing from dense parenchyma was the leading cause of false-positive ABUS interpretations in another study [26], while poor visibility and shadowing caused by inadequate contact was the most common cause of image misinterpretation as false-negative readings in ABUS [27].

In line with prior studies [28, 29], our results among women with dense breasts indicate high agreement (κ = 0.85) between the two ultrasound modalities and high reliability as well. The ultrasonologists involved in our study were from tertiary hospitals; they have long experience with HHUS and relatively shorter experience with ABUS. The high agreement we observed between these two modalities indicates that ABUS may serve as a replacement for HHUS in China. Given its standardized image acquisition process, ABUS may play a more significant role in areas where ultrasound examinations are conducted by technicians or radiologists rather than ultrasonologists. These findings also have important public health implications for the early detection of breast cancer in resource-limited settings: on the one hand, the high prevalence of dense breasts and the high proportion of patients presenting at an early age (due to the demographic profile) decrease the sensitivity of MG; on the other hand, MG or MRI is not affordable in these areas. At the same time, ABUS has some advantages over HHUS in breast cancer detection. First, acquisition can be separated from interpretation, so images can be read later and remotely [30]. Second, ABUS demonstrated better interobserver agreement than HHUS [31], which makes it a promising tool for overcoming the poor standardization and reproducibility of HHUS results [31]. Third, the reading time of ABUS was shown to be as short as 2.9–9 min [32], and the reading times of experienced and non-experienced users are comparable [33]. Performance can be further improved by 33% with computer-aided detection software [34]. Patients also reported a positive experience with ABUS, reporting less pain during examination [35]. Implementation of ABUS, however, faces some barriers. First, automated ultrasound is more expensive than HHUS, and additional training is required [30]. Second, the current iteration of ABUS is unable to perform axillary scanning, which could provide more diagnostic information on breast cancer.

To the best of our knowledge, this is the first study to evaluate the diagnostic performance of ABUS and HHUS in women with dense breasts in China. ABUS, HHUS, and MG were interpreted independently in this comparative study, which avoided potential bias to some extent. Furthermore, this is a multicenter study whose participants are representative of women attending outpatient clinics for breast cancer diagnosis in China.

The limitations of our study should also be taken into consideration. First, this is a cross-sectional study; there is no follow-up information for women considered to be non-cases. This may lead to an overestimation of the test sensitivities due to verification bias. The high skill level of the ultrasonologists in these five tertiary hospitals and the high prevalence of breast cancer in this study may also contribute to the overestimation of sensitivities. However, if the tests did not perform satisfactorily in this high-prevalence setting (not the case in our study), they would be unlikely to perform well in real-life settings. On the other hand, women with probably benign findings were confirmed by MRI or biopsy; thus, the chance of false negatives should be very low. Second, we did not collect detailed information on pathology results, which prevented us from analyzing stage and hormone receptor status. A study with long-term follow-up is needed to evaluate the effects of ultrasound examination on breast cancer prognosis and survival. Third, because the ultrasonologists in our five tertiary-care hospitals represent the highest level of expertise, their performance may not be representative of real-world performance. Whether ultrasound, especially ABUS, given its high reproducibility and lower operator dependency, performs well in resource-limited areas, or as a primary diagnostic or screening tool, needs further exploration in cohort studies. Our study does, however, provide theoretical support for using automated ultrasound as a primary breast cancer detection modality.

In conclusion, ABUS and HHUS show high agreement in breast cancer detection in women with dense breasts. Used as adjuncts to MG, they can significantly increase diagnostic performance in women with MG-negative dense breasts. Given the high prevalence of dense breasts and the multiple advantages of ABUS over HHUS, such as less operator dependence and greater reproducibility, ABUS shows great potential for use in early breast cancer detection, especially in resource-limited areas.