Introduction

Mammography has been used as the primary imaging investigation to diagnose breast cancer by virtue of its high sensitivity. However, mammographic sensitivity can be reduced in specific circumstances, and dense breast tissue is regarded as one of the important factors affecting the accuracy of mammography [1–4]. To detect mammographically occult cancers, several studies have evaluated ultrasound as an adjunct to mammography in women with dense breasts since the 1980s [5]. With advances in ultrasound equipment and techniques, recent studies have reported that ultrasound can detect a substantial number of mammographically occult cancers (supplemental detection yields of 2.7–4.6 per 1,000 women screened with ultrasound) [6], which encourages the supplemental use of ultrasound in dense breasts [1–4, 6]. However, those studies evaluated the performance of ultrasound based on the detection rather than the characterization of lesions.

Since the introduction of the Breast Imaging Reporting and Data System (BI-RADS) for ultrasound to standardize terminology for describing and classifying lesions [7], several studies have assessed the reliability of the BI-RADS lexicon or classification in evaluating masses on ultrasound for the likelihood of malignancy and have reported good performance of this reporting system [8–15]. However, those studies evaluated ultrasound findings regardless of the mammographic result, and there has been no report on the performance of BI-RADS on ultrasound in conjunction with negative mammography in dense breasts.

The purpose of this study was to assess the performance of breast ultrasound based on the BI-RADS final assessment categories when mammography is negative in women with dense breasts.

Materials and methods

Study population

This study was conducted with institutional review board approval and a waiver of patient informed consent because this study was retrospective.

From July 2001 through June 2005, 23,579 consecutive bilateral whole-breast ultrasound examinations were performed at our institution, where all patients referred for ultrasound underwent hand-held bilateral whole-breast examinations. After reviewing the institutional database, we selected 3,820 examinations that were performed as an adjunct to negative mammography (BI-RADS category 1 or 2) in dense breasts. Dense breast was defined as BI-RADS density category 3 (breast tissue is heterogeneously dense, approximately 51–75% glandular) or 4 (breast tissue is extremely dense, >75% glandular) [7]. We excluded 2,313 cases that had neither surgical biopsy nor at least 2 years of follow-up ultrasound (n = 1,954), or for which a nonmalignant core needle or fine needle biopsy result was neither confirmed by surgical biopsy nor followed by at least 2 years of follow-up ultrasound (n = 359). Therefore, a total of 1,507 ultrasound examinations in 1,046 women following negative mammography in dense breasts constituted the basis of this study (Fig. 1). Of the 1,046 women, 752 underwent ultrasound once and 294 had two or more ultrasound examinations.

Fig. 1 Flow chart of selection protocol

Image acquisition and interpretation

Screen-film mammography was performed with dedicated equipment (DMR; General Electric Medical Systems, Milwaukee, WI). Standard craniocaudal and mediolateral oblique views were routinely obtained, and additional mammographic views were obtained as needed. Before ultrasound was performed, all mammograms were interpreted by one of nine radiologists with fellowship training (n = 7) or extensive clinical experience of 4–8 years (n = 2) in breast imaging. The findings and final assessment category of each mammogram were recorded prospectively according to BI-RADS.

Hand-held bilateral whole-breast ultrasound was systematically performed by one of these nine experienced radiologists. The examiner knew the results of clinical examination and mammography at the time of the ultrasound examination. High-resolution ultrasound units with 7.5- or 12-MHz linear array transducers (HDI 5000 or 3000, Philips-Advanced Technology Laboratories, Bothell, WA; Logiq 9, GE Medical Systems, Milwaukee, WI) were used. The findings and final assessment category of each ultrasound examination were recorded prospectively according to BI-RADS by the radiologist who performed the examination. Before 2003, BI-RADS for ultrasound had not been established, and ultrasound findings were classified prospectively into five categories according to the risk of malignancy, similar to mammographic BI-RADS [13, 16]. When more than one mass was found in both breasts, a single final assessment was made based on the mass with the most suspicious features.

Management

We recommended routine annual follow-up mammography in women with a category 1 or 2 lesion, follow-up ultrasound after 6 months followed by annual examination in women with a category 3 lesion, and immediate biopsy in women with a category 4 or 5 lesion; in some cases, however, tissue sampling was performed at the request of the patient or clinician, regardless of the radiologic recommendation. Biopsies were performed by fine needle aspiration, ultrasound-guided core needle biopsy, or surgical excision. The choice of excisional biopsy rather than percutaneous biopsy was based on the preference of the surgeon. Fine needle aspiration was indicated for complicated cysts. Ultrasound-guided core needle biopsies were performed using an automated gun (Pro-Mag 2.2, Manan Medical Products, Northbrook, IL) and a 14-gauge Tru-cut needle with a 22-mm throw (SACN™ Biopsy Needle, Medical Device Technologies, Gainesville, FL) or using an 8- or 11-gauge vacuum-assisted device (Mammotome; Ethicon Endo Surgery, Cincinnati, OH).

After biopsy, the radiologist assessed the concordance of the pathological results with the imaging findings, and specific recommendations were made for the patients and the referring physicians [17]. Malignancies were accepted as the final diagnosis, and patients were immediately recommended to have definitive treatment. High-risk lesions (e.g. atypical ductal hyperplasia, lobular neoplasia, radial sclerosing lesion, papillary lesions with atypical features, possible phyllodes tumors) and benign lesions (i.e. neither malignant nor high-risk) with imaging-pathologic discordance resulted in a recommendation for surgical excision. Patients with concordant benign lesions were recommended to have follow-up ultrasound according to the management of category 3 lesions.

Data analysis

After review of medical records and radiologic reports, clinical and radiological variables for each examination were coded. The collected clinical variables were age, associated symptoms, and personal history of breast cancer. For radiological variables, breast density on mammography and the ultrasound-based BI-RADS category were noted. According to the results of biopsy or follow-up ultrasound, the cancer rate for each BI-RADS category (the number of cases with cancer divided by the total number of examinations per category) was calculated. The reference standard comprised the results of surgical excision, a malignant pathologic result at core biopsy, and the results of at least 2 years of follow-up ultrasound. For cancer cases, we also recorded the nodal status, size, and stage of the cancer, based on the American Joint Committee on Cancer staging system [18]. In addition, false-negative examinations, defined as cases with a pathologically confirmed malignant lesion assigned to ultrasound-based BI-RADS category 1, 2, or 3, were analyzed. Cancers diagnosed more than 2 years after ultrasound were excluded from the false-negative cases [19].
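The cancer rate defined above is a simple proportion. As a minimal sketch, the calculation can be reproduced from the pooled counts quoted in the Results text (5 cancers among 1,390 category 1, 2, or 3 examinations; 38 cancers among 117 category 4 or 5 examinations). The dictionary layout and the function name `cancer_rate` are illustrative assumptions, not part of the study's software.

```python
# Pooled counts quoted in the Results text of this paper.
# The labels and function name below are illustrative, not the authors'.
counts = {
    "category 1-3": {"cancers": 5, "examinations": 1390},
    "category 4-5": {"cancers": 38, "examinations": 117},
}


def cancer_rate(cancers, examinations):
    """Number of cases with cancer divided by examinations, in percent."""
    return 100.0 * cancers / examinations


rates = {name: cancer_rate(**c) for name, c in counts.items()}

# The overall rate pools every category.
total_cancers = sum(c["cancers"] for c in counts.values())
total_exams = sum(c["examinations"] for c in counts.values())
overall_rate = cancer_rate(total_cancers, total_exams)

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
print(f"overall: {overall_rate:.1f}%")
```

Under the definition used in this study, the rate in the pooled category 1, 2, or 3 group is also the false-negative rate, and the rate in the pooled category 4 or 5 group is the PPV of a biopsy recommendation.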

Furthermore, screening and diagnostic examinations were segregated. Screening examinations involved asymptomatic women, who were further divided into a general and a treated population (i.e. periodic surveillance of an asymptomatic cancer patient treated with breast conservation surgery or mastectomy). Diagnostic examinations were segregated according to the indication for examination: short-interval follow-up of a probably benign lesion, or workup of a palpable mass or bloody nipple discharge [20]. For the screening-general, screening-treated, and diagnostic groups, the cancer rate for each BI-RADS category was analyzed separately as described above. Medical audit data were also obtained: abnormal interpretation rate, positive predictive value (PPV), cancer detection rate per 1,000, rate of nodal metastasis, rate of early stage cancer (stage 0 or 1), and mean size of invasive cancer.

Statistical comparisons were performed using the chi-square or Fisher’s exact test for categorical data and the Kruskal-Wallis test for continuous data. Statistical analysis was performed with computerized statistical software (PASW Statistics, ver. 17.0.2, SPSS Inc., Chicago, IL). For all analyses, results were considered statistically significant if the p value was 0.05 or less.
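As an illustration of the categorical comparison, the sketch below computes a Pearson chi-square statistic for a three-group by two-outcome table in plain Python, using the category 4 or 5 counts quoted in the Results text (10 of 51, 7 of 29, and 21 of 37). It is not the PASW procedure used in the study, and whether the authors applied the chi-square or Fisher's exact test to this particular table is not stated. The function names are hypothetical, and the closed-form p value is valid only for 2 degrees of freedom.

```python
import math


def chi_square_kx2(events, totals):
    """Pearson chi-square for a k-group x 2-outcome table.

    Returns the statistic and its degrees of freedom (k - 1).
    """
    non_events = [t - e for e, t in zip(events, totals)]
    grand_total = sum(totals)
    column_sums = (sum(events), sum(non_events))
    stat = 0.0
    for e, n, t in zip(events, non_events, totals):
        for observed, column_sum in ((e, column_sums[0]), (n, column_sums[1])):
            expected = t * column_sum / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat, len(totals) - 1


def p_value_df2(stat):
    """Upper-tail chi-square probability; exact only for 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Cancers among category 4 or 5 examinations in the screening-general,
# screening-treated, and diagnostic groups (counts from the Results text).
events = [10, 7, 21]
totals = [51, 29, 37]

stat, df = chi_square_kx2(events, totals)
p = p_value_df2(stat)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p:.4f}")
```

With these counts the p value falls well below the 0.05 threshold, in line with the significant difference in PPV among groups reported in the Results. For sparse tables, such as the false-negative counts, an exact test would be the more appropriate choice.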

Results

Cancer rate and false-negative result

Clinical and radiological variables are listed in Table 1. The age of the study population ranged from 21 to 74 years (mean, 47.5 ± 7.8 years; median, 47 years). Of the 1,507 examinations, 931 (61.8%) were performed in patients with a personal history of breast cancer. For ultrasound examinations following negative mammography, BI-RADS categories 1, 2, and 3 accounted for 92.2% (1,390 of 1,507) and BI-RADS categories 4 and 5 accounted for 7.8% (117 of 1,507) (Table 2). A total of 146 biopsies were performed: fine-needle aspiration in 4, core biopsy in 95, and surgical excision in 47. Of these, 43 lesions were confirmed as malignant, and 16 lesions were confirmed as benign at surgical excision. The remaining 1,448 cases, which were neither surgically excised nor diagnosed as malignant at fine-needle aspiration or core biopsy, had at least 2 years of follow-up ultrasound (mean, 39.5 ± 11.4 months; range, 24–81 months; median, 37 months) and were confirmed as benign. Therefore, the cancer rate in this study group was 2.9% (43 of 1,507), and the cancer rate differed significantly among ultrasound-based BI-RADS categories (p < 0.0001) (Table 2).

Table 1 Clinical and radiological variables
Table 2 Ultrasound based BI-RADS category and cancer rate

Cancers assigned to category 4 or 5 accounted for 88.4% of all cancers in this study group. The cancer rate for category 4 or 5 was 32.5% (38 of 117), and all four category 5 lesions were confirmed as malignant. The remaining five cancers were assigned to category 1, 2, or 3 at ultrasound (0.4%, 5 of 1,390) and were therefore false-negative examinations (Table 2). The details of those false-negative ultrasound examinations are summarized in Table 3. Three patients (60.0%) had a history of breast cancer, and no patient had associated symptoms. The mean size of cancer, measured pathologically, was 9.3 ± 0.96 mm (range, 8–10 mm; median, 9.5 mm). All but one were diagnosed as malignant at follow-up ultrasound, and the mean delay in diagnosis was 8.2 ± 5.2 months (range, 0–12 months; median, 11 months). The woman who was diagnosed with cancer without any delay (case 3 in Table 3) underwent mammography and breast ultrasound because of diffuse hot uptake in both breasts at whole-body 18F-FDG PET, but both mammography and ultrasound were negative. Ultrasound-guided core needle biopsy of the breast parenchyma was performed immediately, and Burkitt lymphoma was confirmed.

Table 3 False-negative cases

Screening versus diagnostic examination

Table 4 summarizes the frequency and cancer rate of each ultrasound-based BI-RADS category for screening and diagnostic examinations. Of the 1,507 ultrasound examinations, 446 (29.6%) were screening examinations in the general population, 922 (61.2%) were screening examinations in the treated population, and 139 (9.2%) were diagnostic examinations: short-interval follow-up of a probably benign lesion (n = 78, 56.1%), workup of a palpable mass (n = 58, 41.7%), or workup of bloody nipple discharge (n = 3, 2.2%). In all groups, the cancer rate differed significantly among BI-RADS categories (p < 0.0001). The cancer rate also differed significantly among the three groups (p < 0.0001) and was highest in the diagnostic group (15.8%, 22 of 139). By BI-RADS final assessment category, cancer rates in category 1, 2, or 3 (i.e. false-negative rates) were not significantly different among the three groups: 0.3% (1 of 395) in the screening-general group, 0.3% (3 of 893) in the screening-treated group, and 1.0% (1 of 102) in the diagnostic group (p = 0.543). In contrast, cancer rates in category 4 or 5 (i.e. PPV of biopsy recommended) were significantly different: 19.6% (10 of 51) in the screening-general group, 24.1% (7 of 29) in the screening-treated group, and 56.8% (21 of 37) in the diagnostic group (p = 0.001). Table 5 summarizes the clinical outcomes of ultrasound based on BI-RADS category in women with mammographically negative dense breasts. Abnormal interpretation rate, PPV of biopsy performed, cancer detection rate, rate of early stage cancer, and size of invasive cancer differed significantly among the three groups and were highest in the diagnostic group. Regarding cancer characteristics, the proportion of larger, more advanced-stage cancers was highest in the diagnostic group.
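The per-group figures above follow directly from the counts given in the text. The sketch below recomputes them. It assumes, for illustration only, that the abnormal interpretation rate is the share of examinations assigned category 4 or 5 and that the cancer detection rate counts cancers among category 4 or 5 examinations per 1,000 examinations; the class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GroupCounts:
    examinations: int    # all ultrasound examinations in the group
    positive: int        # examinations assigned category 4 or 5
    true_positive: int   # category 4 or 5 examinations with cancer
    false_negative: int  # category 1, 2, or 3 examinations with cancer


# Counts taken from the Results text.
groups = {
    "screening-general": GroupCounts(446, 51, 10, 1),
    "screening-treated": GroupCounts(922, 29, 7, 3),
    "diagnostic": GroupCounts(139, 37, 21, 1),
}


def audit(g):
    """Return audit metrics for one group (definitions are assumptions)."""
    negative = g.examinations - g.positive
    cancers = g.true_positive + g.false_negative
    return {
        "abnormal_interpretation_pct": 100.0 * g.positive / g.examinations,
        "ppv_pct": 100.0 * g.true_positive / g.positive,
        "false_negative_pct": 100.0 * g.false_negative / negative,
        "detection_per_1000": 1000.0 * g.true_positive / g.examinations,
        "cancer_rate_pct": 100.0 * cancers / g.examinations,
    }


results = {name: audit(g) for name, g in groups.items()}

for name, metrics in results.items():
    summary = ", ".join(f"{key}={value:.1f}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

Keeping the raw counts separate from the derived metrics makes it easy to confirm that the category 1, 2, or 3 denominators (395, 893, and 102) are the group totals minus the category 4 or 5 examinations.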

Table 4 Ultrasound BI-RADS category and cancer rate for screening and diagnostic examination
Table 5 Medical audit based on ultrasound based BI-RADS category for screening and diagnostic examination

Discussion

Dense breast parenchyma may mask noncalcified, nondistorted tumors because such tumors may have x-ray attenuation similar to that of fibroglandular tissue [21]. Supplemental imaging studies such as ultrasound and MRI can be used to detect those mammographically occult cancers in dense breasts. Compared with MRI, ultrasound is relatively inexpensive, usually requires no contrast agent, is well tolerated, and is widely available [4]. Moreover, the detection benefit of supplemental ultrasound in mammographically occult cancers can increase with increasing grades of breast density because most breast cancers are relatively hypoechoic within a background of hyperechoic fibroglandular tissue [1, 22, 23]. In clinical practice, ultrasound is performed as an adjunct to mammography in women with dense breasts, and supportive data have been reported [1–4, 6]. However, most of these reports were not based on BI-RADS for ultrasound. Although the American College of Radiology Imaging Network (ACRIN) protocol 6666 has reported the performance of screening ultrasound in women at elevated risk of breast cancer using BI-RADS [4], positive mammographic examinations were included, and mammographic and ultrasound interpretations were independent. The present study evaluated the performance of BI-RADS-based ultrasound supplementary to negative mammography in women with dense breasts. Also, because our study population was heterogeneous (Table 1), the results were analyzed separately according to risk of breast cancer and indication for examination (i.e. screening and diagnostic).

In accordance with the definition in BI-RADS [7], a lesion assigned to BI-RADS category 3 should have less than a 2% risk of malignancy, and a lesion assigned to BI-RADS category 5 has a high probability (at least 95%) of being cancer. A lesion assigned to BI-RADS category 4, therefore, has a probability of malignancy ranging from 2% to 95%. In this study, the cancer rate was 0.6% in category 3, 30.6% in category 4, and 100% in category 5, which conformed to those defined ranges and differed significantly among BI-RADS categories. In the summary of published studies (Table 6), the mean cancer rate for each BI-RADS final assessment was within the range of the defined probability of malignancy, and our result was well in line with those studies. Compared with the results of previous mammographic studies (category 2, 1.5%; category 3, 3.8%; category 4, 34.0%; category 5, 83.6%) [14], our study showed a better result for the prediction of malignancy.
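The conformity check described above can be written out explicitly. The following sketch compares the cancer rates quoted in this paragraph with the BI-RADS probability ranges as paraphrased here. The bounds are treated as inclusive for simplicity, and the names `EXPECTED_RANGE` and `within_range` are illustrative assumptions.

```python
# (lower bound, upper bound) of the probability of malignancy in percent,
# as paraphrased in the paragraph above; bounds are treated as inclusive.
EXPECTED_RANGE = {3: (0.0, 2.0), 4: (2.0, 95.0), 5: (95.0, 100.0)}

# Cancer rates in percent quoted in the paragraph above.
ultrasound_rates = {3: 0.6, 4: 30.6, 5: 100.0}
mammography_rates = {3: 3.8, 4: 34.0, 5: 83.6}


def within_range(category, rate_pct):
    """True if the observed rate lies inside the range for the category."""
    low, high = EXPECTED_RANGE[category]
    return low <= rate_pct <= high


ultrasound_ok = {c: within_range(c, r) for c, r in ultrasound_rates.items()}
mammography_ok = {c: within_range(c, r) for c, r in mammography_rates.items()}

print("ultrasound:", ultrasound_ok)
print("mammography:", mammography_ok)
```

On these figures, every ultrasound category falls inside its range, whereas the quoted mammographic rates for categories 3 and 5 fall outside theirs, which is the sense in which the ultrasound result is described as better.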

Table 6 Cancer rates for ultrasound based BI-RADS category in published studies

In most previous studies that evaluated the performance of the BI-RADS final assessment by ultrasound, false-negative results were not assessed because BI-RADS category 1 or 2 was not included and ultrasound examinations with less than 2 years of follow-up were enrolled (Table 6). However, the actual true-negative and false-negative rates are determined by at least a 2-year follow-up of benign cases [24]. BI-RADS category 3 lesions need at least 2 years of stability at follow-up before they can be changed to category 2, benign [7]. In the present study, therefore, ultrasound examinations assigned to BI-RADS category 1, 2, or 3 were enrolled only if they had at least 2 years of follow-up ultrasound, to prevent the performance characteristics of ultrasound from being inflated, and the false-negative rate was 0.4% (5 of 1,390). In the study by Kim et al. [13] and the ACRIN protocol 6666 [4], the false-negative rates were 0.2% (9 of 3,701) and 0.4% (9 of 2,331), respectively, which were comparable to our result. Among our false-negative results, delays in the diagnosis of cancer ranged from 0 to 12 months. Nevertheless, the cancers were diagnosed during the routine or scheduled imaging follow-up period, and all ductal carcinomas were diagnosed at an early stage. Appropriate follow-up might help to avoid a significant delay in diagnosis.

The study population was heterogeneous, including a mixture of screening and diagnostic examinations and a high proportion of patients with a history of cancer. For mammographic examinations, substantial differences in outcomes are found when auditing screening versus diagnostic examinations, and some of these differences have been shown to be statistically significant [20, 25]. In this study, therefore, the examinations were segregated into three groups (i.e. screening-general, screening-treated, and diagnostic) and analyzed separately, and significant differences among groups were found in most parameters (Tables 4 and 5). However, all groups showed similarly low false-negative rates. In the medical audit, all three groups showed better performance, with higher PPV of biopsy performed and higher cancer detection rate, than the previous results of mammography benchmarks and screening ultrasound [4, 26, 27] (Table 7). Regarding cancer characteristics, most cancers in the screening groups were diagnosed at an early stage, compatible with previous results [4, 26, 28]. Cancers in the diagnostic group, however, were likely to be more advanced at diagnosis. In diagnostic examinations, the clinical findings, especially a palpable lump at presentation, play an important role in the management of a lesion, in addition to the ultrasound findings. In diagnostic mammography benchmarks [27], PPV and cancer detection rate were higher and cancers were more advanced for palpable lump evaluation than for other indications. Of the 21 cancers assigned to category 4 or 5 at diagnostic ultrasound in this study, 15 (71.4%) were palpable. The presence of a palpable lump might influence the result of ultrasound evaluation and account for the different results between the screening and diagnostic groups.

In the screening-general group, the cancer detection rate was high (22.4 per 1,000) compared with other studies and even with the screening-treated group (Table 7). In general, the rate of cancers detected in patients screened for the first time (prevalent cancers) should be much higher than in a population that has been screened previously (incident cancers) [29]. Compared with the screening-treated group, which had regular follow-up examinations after cancer treatment (incidence screening), ultrasound in the screening-general group was more likely to be a single prevalence screen and to detect more cancers. Moreover, as mentioned above, a 2-year follow-up was required to identify false-negative results in our study population, which may have excluded many possibly benign lesions and led to a relatively high cancer detection rate in the screening-general group.

Table 7 Comparison of audit result in three groups with the previous results of mammography benchmarks and screening ultrasound

Although the addition of ultrasound to mammography increases the diagnostic yield, the main potential limitation of ultrasound as an adjunct to mammography is an increase in false-positive results (i.e. biopsy with benign results) [4, 30]. In the ACRIN protocol 6666 [4], the false-positive rate for mammography plus ultrasound (10.4%, 275 of 2,637) was higher than that for mammography alone (4.4%, 116 of 2,637). Of 136 women who had suspicious findings biopsied based on ultrasound alone, only 12 (8.8%) were diagnosed with breast cancer. In our study, 79 of the 117 BI-RADS category 4 or 5 cases on ultrasound were confirmed as benign and were therefore false-positive results (5.2%, 79 of 1,507). The PPV of BI-RADS category 4 or 5 on ultrasound was 32.5% (38 of 117), which is comparable with the 34% PPV from the Breast Cancer Surveillance Consortium report [31] and the 25–40% PPV recommended by the AHRQ [32]. The lower false-positive rate and higher PPV of biopsies based on ultrasound may support the use of ultrasound in mammographically negative dense breasts.

However, lack of uniformity and a shortage of qualified personnel could be barriers to the widespread implementation of additional ultrasound in mammographically dense breasts. Automated whole-breast ultrasound may be one solution: standardized, uniform image sets can be acquired by less-trained personnel, allowing shorter and more efficient use of time by the physicians interpreting the studies. Regarding its diagnostic performance, in a recent study [33], automated whole-breast ultrasound in women with dense breasts and/or at elevated risk of breast cancer resulted in a significant improvement in cancer detection compared with mammography alone (3.6 per 1,000; 38.4% PPV for biopsy), and 90% of the invasive cancers detected were smaller than 20 mm. However, it produces hundreds of images to be reviewed by the radiologist and stored, which is still resource-intensive and must be considered in the overall cost-effectiveness [34].

Our study had some limitations. First, selection bias may exist because this was a retrospective study and 2,313 cases that lacked surgical biopsy confirmation and at least 2 years of follow-up ultrasound were excluded. Also, our institution is a tertiary care hospital where the proportion of patients with a history of breast cancer is high, and patients with a history of cancer treatment are expected to be more compliant with long-term follow-up ultrasound, which might have resulted in the large proportion of examinations in patients with a personal history of breast cancer (61.8%, 931 of 1,507). Second, there can be discrepancies between the radiologists performing ultrasound examinations because ultrasound is an operator-dependent examination and interobserver variability can exist. However, good interobserver agreement for the ultrasound BI-RADS final assessment has been reported, and interobserver variability is unlikely to have substantially influenced the result of this study [8, 12]. Third, nonvisualization of breast cancer on mammography may be due to factors other than dense breast tissue, such as poor positioning, tumor histology, and tumor size. Still, ultrasound as an adjunct to mammography may also be valuable in such settings.

In conclusion, breast ultrasound based on the BI-RADS final assessment as an adjunct to negative mammography can be useful for predicting malignancy in women with dense breasts. Proper classification of the BI-RADS final assessment on breast ultrasound will help referring physicians, radiologists, and patients to understand their management options and implications.