Introduction

The goal of preoperative imaging of breast cancer (BC) is to provide an accurate size assessment of the index lesion and identify additional multifocal, multicentric, and bilateral cancers to plan the type and extent of breast surgery [1,2,3]. Contrast-enhanced magnetic resonance imaging (CEMRI) is considered the most sensitive technique to achieve this aim and is frequently used in tertiary referral centers [4]. However, preoperative CEMRI has not been universally adopted because of its still limited indications [5], its controversial impact on oncological outcomes [6], and availability issues.

Not surprisingly, the use of "first-line" digital mammography (DM) alone or in combination with digital breast tomosynthesis (DBT) [7] is still a common strategy to achieve both diagnosis and preoperative assessment of BC, despite its lower accuracy than CEMRI in predicting cancer size [8] and detecting additional disease, particularly in dense breasts [9]. In recent years, contrast-enhanced mammography (CEM) has emerged as an alternative tool for preoperative assessment, demonstrating contrast-enhancing tumor lesions over a background of fibroglandular tissue suppressed by spectral saturation [10]. CEM has been shown to approximate CEMRI in assessing BC size and additional disease in the preoperative setting [11], potentially reducing false-positive findings compared to CEMRI [12, 13].

Preoperative assessment using CEM might be of value when CEMRI is not available or is contraindicated due to patient-related factors (e.g., claustrophobia or safety issues in the presence of medical devices). To our knowledge, only one study [14] has compared CEM with first-line techniques, showing that CEM alone was superior to DM and DBT in identifying BC multifocality (sensitivity 92.5% vs. 53.8% and 77.0%, respectively), while DBT was the best tool to assess BC size. Of note, those authors included only dense breasts in the analysis. Overall, it is unclear whether CEM can replace DM and DBT in the preoperative setting, especially in terms of cancer size assessment, and whether the expected increase in cancer detection rate (CDR) can be obtained over the entire spectrum of breast density.

The purpose of this study was to compare the combination of DM and DBT (DM + DBT) versus CEM as tools to detect BC and assess tumor size in a preoperative setting.

Methods

Study population and standard of reference

The referring Institutional Review Board approved this study as an interim retrospective branch of an ongoing prospective trial investigating the role of CEM in the preoperative assessment of women with contraindications to CEMRI, which is the standard preoperative imaging technique in our center. The acquisition of informed consent from patients was waived for this retrospective branch of the trial.

Between May 2019 and March 2020, we consecutively included women with biopsy-proven BC referred for preoperative CEM because of one or more of the following contraindications to CEMRI: (1) presence of foreign bodies or medical devices documented to be "unsafe" in the magnetic resonance imaging environment; (2) presence of foreign bodies or medical devices documented to be "conditional" located in the breast region, potentially compromising the image quality of CEMRI because of artifacts; (3) claustrophobia; (4) patient refusal to undergo CEMRI; (5) comorbidities associated with age > 65 years that put CEMRI at risk of being incomplete or unfeasible; (6) risk of nephrogenic systemic fibrosis and/or hypersensitivity reaction to gadolinium-based contrast medium based on the recommendations of the European Society of Urogenital Radiology (ESUR) [15]. We included only patients who had undergone both DM and DBT as first-line diagnostic tools no earlier than one month before CEM.

Exclusion criteria were: (1) unavailability of surgical and post-surgical data (n = 5); (2) neoadjuvant treatment before surgery (n = 3); (3) incomplete breast imaging (including bilateral DM, DBT and CEM) (n = 3); (4) presence of post-biopsy changes such as hematomas, possibly impairing tumor size evaluation at CEM (n = 6).

The final population consisted of seventy-eight women (mean age 68.7 years; range 52–85 years). Indications for the initial imaging leading to the diagnosis of BC, and in turn to preoperative CEM, were as follows: palpable breast nodule in 18 women, nipple discharge in 1 woman, and self-referred screening in 54 women. The remaining 5 women were referred to our center because of abnormal findings detected in the organized screening program. Patients underwent unilateral mastectomy in 35 cases, bilateral mastectomy in 3 cases, unilateral breast-conserving surgery in 39 cases, and bilateral breast-conserving surgery in 1 case.

The reference standard was pathologic analysis of the surgical specimens. Analysis was performed by one of three pathologists (5–25 years of experience in breast pathology) according to the College of American Pathologists guidelines [16].

Imaging protocols

DM and DBT were performed on one of two different systems (Giotto TOMO, Internazionale Medico Scientifica, or AW5000 3D Selenia Dimensions, Hologic). Both examinations included bilateral craniocaudal (CC) and mediolateral oblique (MLO) views, with synthetic mammography reconstruction for both views in the case of DBT. Acquisition parameters of DBT are shown in Supplementary Table 1. Although synthetic mammograms were provided along with DBT, they were not used for the purpose of the study.

CEM was performed on a dedicated system (Selenia Dimensions, Hologic, Inc) 1–2 min after intravenous administration of 1.5 mL/kg of iodinated contrast medium (Omnipaque 350 mgI/mL, General Electric, or Iomeprol 350 mgI/mL, Bracco), followed by a bolus of saline solution (20 mL). Injection was performed under remote control (Accutron CT-D, Medtron) at a rate of 2 mL/s. For each breast, we acquired CC and MLO views in the following order: CC and MLO views of the affected side, CC and MLO views of the contralateral side, and finally, a second CC and/or MLO view of the affected side, at the discretion of the attending radiologist. The total examination time for a set of four views was about 10–15 min.

Image analysis

Image analysis was carried out independently by four readers, none of whom had reported the CEM examinations during clinical activity. Readers included two breast radiologists with more than 10 years of experience [reader 1 (R1) and reader 2 (R2)], and two radiology residents with 3 years of experience in breast imaging [reader 3 (R3) and reader 4 (R4)]. A study coordinator, not involved in image evaluation, independently presented to each reader DM + DBT (as parts of a single examination) or CEM images. Examinations of the same patient were presented in random order in separate sessions at least one month apart. Image evaluation was performed on a Picture Archiving and Communication System (PACS) workstation (Suitestensa Ebit Srl, Esaote Group Company). R1–R4 were aware of the preoperative purpose of the examinations, but blinded to lesion number and location, as well as final pathology results.

For each patient and imaging tool, R1–R4 reported the number, size (cm), location (breast and quadrant), and BI-RADS category of findings, using mammography descriptors in the case of DM + DBT or low-energy CEM images, and MRI descriptors in the case of recombined CEM images [17, 18]. No quantitative analysis was performed for CEM. Any finding categorized as BI-RADS 4 or BI-RADS 5 was assessed as suspicious. When analyzing DM and DBT, a BI-RADS category was provided. Measurements were performed in the view (and imaging set in the case of DM + DBT) showing the largest lesion size.

During the DM + DBT reading sessions, R1 was also asked to categorize breast density according to the BI-RADS Atlas Fifth Edition [17]. For the study purpose, breast density was dichotomized as low (BI-RADS category A and B) versus high (category C and D).

Statistical analysis

Since the Shapiro–Wilk test showed non-normal distribution of continuous variables, we reported them as median and interquartile range (IQR) values. Proportions were reported together with 95% confidence intervals (95%CI).
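As an illustration of the descriptive statistics described above, the following minimal Python sketch summarizes a continuous variable as median and IQR and attaches a 95% CI to a proportion. It is not the MedCalc procedure used in the study: the quartile rule (the default of `statistics.quantiles`) and the Wilson score interval are assumptions, since the text does not state which methods were applied, and the function names and sample values are hypothetical.

```python
import math
import statistics


def median_iqr(values):
    """Return (median, Q1, Q3) for a list of at least two measurements.

    Assumption: quartiles follow the default 'exclusive' rule of
    statistics.quantiles; other software may interpolate differently.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    return statistics.median(values), q1, q3


def wilson_ci(successes, total, z=1.96):
    """Wilson score 95% CI for a proportion (assumed method, not from the study)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical lesion sizes in cm (illustrative values, not study data).
sizes_cm = [0.6, 0.8, 1.2, 1.9, 2.5]
size_median, size_q1, size_q3 = median_iqr(sizes_cm)

# Hypothetical proportion: 29 of 78 cases.
ci_low, ci_high = wilson_ci(29, 78)
```

An exact (Clopper–Pearson) interval would be a reasonable alternative and generally gives slightly wider bounds for small samples.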

Based on the matching with the standard of reference, we calculated the CDR of both imaging strategies as the ratio between correctly identified cancers [true-positives (TP)] and the total number of pathologically confirmed cancers [TP + false negatives (FN)]. Suspicious imaging findings were categorized as false-positives (FP) if not confirmed at second-look ultrasound and/or targeted biopsy, and if not corresponding to any lesion found on the surgical specimen or during the follow-up. On this basis, we also calculated the complement of positive predictive value (1-PPV), defined as the ratio between FP and the total number of positive assignments (TP + FP). 1-PPV was intended as an alternative metric to specificity to weigh the effect of FPs on accuracy. Indeed, specificity would have been ambiguous to calculate in a 100% cancer prevalence context, i.e., in the absence of true negatives in the proper sense.
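The two metrics defined above reduce to simple ratios, sketched below in Python. The function names and the counts in the example are hypothetical and serve only to illustrate the definitions; they are not taken from the study data.

```python
def cdr(tp: int, fn: int) -> float:
    """Cancer detection rate: TP / (TP + FN)."""
    confirmed = tp + fn
    if confirmed == 0:
        raise ValueError("no pathologically confirmed cancers")
    return tp / confirmed


def one_minus_ppv(tp: int, fp: int) -> float:
    """Complement of positive predictive value: FP / (TP + FP)."""
    positives = tp + fp
    if positives == 0:
        raise ValueError("no positive assignments")
    return fp / positives


# Hypothetical counts for one reader and one imaging strategy (not study data).
example_tp, example_fn, example_fp = 90, 10, 15
example_cdr = cdr(example_tp, example_fn)              # 90 / 100
example_1ppv = one_minus_ppv(example_tp, example_fp)   # 15 / 105
```

Because both ratios depend only on TP, FN, and FP, they remain well defined in a cohort without true negatives, which is the reason 1-PPV is preferred over specificity here.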

The Mann–Whitney U test was used to compare cancer size obtained with DM + DBT versus CEM on an intra-reader basis, with the significance level set at 0.05. The comparison included only the findings visible with both imaging strategies. Bland–Altman analysis was performed to evaluate the agreement in size determination of each imaging modality versus pathological analysis. Given the non-normal distribution of the data, logarithmic transformation was performed, expressing mean differences in size and related limits of agreement as dimensionless values. The analysis was integrated with the calculation of the intraclass correlation coefficient (ICC), using the following reference values for the degree of agreement [19]: < 0.40 = poor; 0.40–0.59 = fair; 0.60–0.74 = good; 0.75–1.00 = excellent.
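A minimal Python sketch of the agreement analysis described above follows. It assumes natural logarithms and the conventional limits of agreement of mean ± 1.96 SD of the paired differences; the text does not specify the logarithm base or the exact MedCalc settings, so these choices, the function names, and the sample measurements are all hypothetical. The ICC itself is not computed here; only the mapping of an ICC value to the agreement categories listed above is shown.

```python
import math
import statistics


def bland_altman_log(imaging_cm, pathology_cm):
    """Return (mean difference, lower LoA, upper LoA) on the natural-log scale.

    Differences are ln(imaging) - ln(pathology), so the results are
    dimensionless. Assumption: limits of agreement are mean +/- 1.96 SD.
    """
    if len(imaging_cm) != len(pathology_cm) or len(imaging_cm) < 2:
        raise ValueError("need two paired samples of equal length (>= 2)")
    diffs = [math.log(i) - math.log(p) for i, p in zip(imaging_cm, pathology_cm)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


def icc_category(icc: float) -> str:
    """Map an ICC value to the agreement categories used in the text."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical paired sizes in cm (illustrative values, not study data).
imaging = [1.0, 1.6, 0.9, 2.2, 1.3]
pathology = [1.1, 1.4, 0.8, 2.0, 1.5]
mean_diff, loa_low, loa_high = bland_altman_log(imaging, pathology)
```

Working on the log scale makes the differences proportional rather than absolute, which is why the resulting mean difference and limits of agreement carry no unit.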

Commercially available software was used to perform calculations (MedCalc software bv, version 18.11.16, Ostend, Belgium).

Results

Breast density and lesions characteristics

R1 assessed breasts as dense in 29/78 (37.2%) patients and non-dense in 49/78 (62.8%) patients.

Pathologic examination found 100 BC (78 index lesions and 22 additional lesions), showing the features reported in Table 1. Median size was 1.2 cm (IQR 0.8–1.9 cm) for all lesions, 1.4 cm (IQR 1.0–2.1 cm) for index lesions, and 0.8 cm (IQR 0.6–1.1 cm) for additional lesions. Overall, cancer size was ≤ 10 mm in 44/100 BC (44%), corresponding to 29/78 index lesions and 15/22 additional lesions.

Table 1 Distribution of the histological types of the 100 malignant lesions included in the analysis

No FN cases emerged during clinical and imaging follow-up of non-suspicious findings not included in the surgical specimen. Imaging follow-up was performed with DM and/or DBT and/or ultrasound. The median follow-up duration was 18 months (range 14–24 months).

CDR of DM + DBT and CEM

R1, R2, R3, and R4 found a total of 116, 110, 99, and 98 suspicious findings on DM + DBT, and 117, 105, 107, and 109 suspicious findings on CEM, respectively. DM + DBT and CEM showed comparable CDR for index lesions, while CEM showed higher CDR for additional lesions, regardless of the reader (Table 2). Overall, the CDR of CEM was higher than that of DM + DBT in less experienced readers (R3 and R4), while it was comparable in experienced readers (R1 and R2). Figure 1 illustrates an example case.

Table 2 Cancer detection rate (CDR) and complement of the positive predictive value (1-PPV) achieved by the four readers using digital mammography plus digital breast tomosynthesis (DM + DBT) or contrast-enhanced mammography (CEM)

Fig. 1 Seventy-four-year-old woman undergoing preoperative CEM for grade 2 invasive ductal carcinoma (IDC) in the inner quadrants of the right breast. All readers identified the index lesion (arrows) presenting as a mass in the mediolateral oblique (MLO) view on digital mammography (DM) (a), and in the MLO (b) and craniocaudal (CC) (c) views on digital breast tomosynthesis (DBT). On CEM (MLO view in [d], and CC view in [e]), all readers detected two additional cancer lesions: a mass lesion less than 1 cm in size (dashed arrows), and a linear non-mass enhancement (arrowheads), both of which were grade 2 IDC at pathologic evaluation

The number and characteristics of false negatives are detailed in Supplementary Table 2. Regardless of the imaging modality, the median size of false negatives was lower than 1 cm.

When stratifying analysis according to breast density (Table 3), CEM showed higher CDRs than DM + DBT in dense breasts, mainly because of increased detection of additional lesions. There was no substantial difference in CDR in non-dense breasts, though CEM provided higher CDR than DM + DBT for R3 and R4 in the subset of additional lesions.

Table 3 CDR and 1-PPV values achieved by the four readers in non-dense versus dense breasts using DM + DBT or CEM

1-PPV of DM + DBT and CEM

1-PPVs are shown in Table 2, while details on false-positive cases are reported in Supplementary Table 2. Overall, 1-PPVs of DM + DBT and CEM were similar on an intra-reader basis (range 0.09–0.19 versus 0.10–0.18, respectively), corresponding to a comparable number of FPs.

Agreement with pathology in cancer size assessment

Median (IQR) size of suspicious findings using DM + DBT and CEM was 1.5 cm (1.0–2.2) and 1.3 cm (0.9–2.0) for R1, 1.2 cm (0.9–1.7) and 1.3 cm (0.9–1.9) for R2, 1.2 cm (0.8–1.9) and 1.3 cm (0.8–1.9) for R3, and 1.8 cm (1.2–2.5) and 1.4 cm (1.0–2.1) for R4. Cancers found with both DM + DBT and CEM were 91/110 by R1, 82/100 by R2, 89/100 by R3, and 84/100 by R4, with no significant difference in cancer size assessment on an intra-reader basis (p = 0.590 for R1, p = 0.512 for R2, p = 0.655 for R3, and p = 0.066 for R4).

Bland–Altman plots expressing the agreement in size between DM + DBT or CEM versus pathological examination are shown in Supplementary Fig. 1. Regardless of the reader, the analysis showed comparably minimal mean differences in size assessment (range −0.00 to 0.13 for DM + DBT, and 0.03 to 0.08 for CEM). Expected limits of agreement, i.e., expected discrepancies with pathological assessment, were of a comparable range, independently of the reader (e.g., −0.34 to 0.48 for DM + DBT, and −0.31 to 0.44 for CEM in the case of R1). The agreement with pathology expressed by ICC values was in the range of “good” for both DM + DBT and CEM in the case of R1 [0.73 (95% CI 0.41–0.82) and 0.73 (95% CI 0.59–0.82), respectively] and R2 [0.71 (95% CI 0.56–0.80) and 0.61 (95% CI 0.46–0.72), respectively]. R3 and R4 showed higher agreement with pathology when measuring cancers with CEM [“good” agreement, with ICC of 0.69 (95% CI 0.54–0.79) and 0.72 (95% CI 0.56–0.82), respectively] than with DM + DBT [“fair” agreement, with ICC of 0.58 (95% CI 0.36–0.72) and 0.56 (95% CI 0.27–0.72), respectively].

Discussion

We found CDR values of CEM in line with the sensitivity ranges reported in studies where the technique was compared with CEMRI (0.72–1.00) [13, 20, 21]. In particular, CEM achieved higher CDR than DM + DBT for additional lesions, most of which were ≤ 1.0 cm in size, and mainly in the setting of dense breasts. The superiority of CEM occurred independently of readers' experience and was not counterbalanced by an increased number of false-positives. Additionally, we found that CEM was comparable to DM + DBT in predicting cancer size at pathological examination, emphasizing its role as a reliable preoperative tool. As suggested in a recent meta-analysis by Suter et al. [23], this might be the case at least when CEMRI is not available and/or feasible. Of note, we could not precisely define the impact of CEM in surgical terms (e.g., mastectomy rate), given the retrospective and multireader design. However, our results reasonably support the potential impact of CEM on surgical planning observed by previous authors [22], and are in line with what is expected from preoperative imaging even from the surgeon’s perspective, e.g., definition of lesion extent and screening for additional disease [4].

Helal et al. [14] compared CEM to DM and DBT in the preoperative assessment of 98 women with dense breasts, showing 0.92 versus 0.53 and 0.77 CDR for additional disease, respectively. Differences in study design make our results difficult to compare, as we analyzed DM and DBT together and examined both breasts rather than the affected one only. However, we observed a similar advantage in using CEM when assessing additional lesions in the subset of patients with dense breasts. Of importance, the difference in CDR for additional disease was in favor of CEM even in the subset of non-dense breasts in less experienced readers (R3 and R4), while there was no added value of CEM in the case of experienced readers (R1 and R2).

Our results suggest that CEM might be reserved for the preoperative assessment of women with dense breasts regardless of readers' experience, while it might be avoided in non-dense breasts if experienced radiologists from large-volume centers analyze the images. A breast-density-based approach might shorten the management process and maximize its cost-effectiveness. Today, CEMRI is considered the standard technique for preoperative evaluation of cancer patients with dense breasts; if other studies confirm our findings, preoperative use of CEM might be reserved at least for patients who have contraindications to CEMRI or who refuse it.

According to Bland–Altman analysis, CEM was comparable to DM + DBT in assessing cancer size, independently of readers' experience. Indeed, we found similar mean differences and expected discrepancies (i.e., limits of agreement) in size compared to pathological examination. For example, considering R1 results, it can be shown that, for a hypothetical cancer measuring 0.8 cm with both imaging strategies, the mean difference/limits of agreement with pathology would be 0.04 times/−0.31 times to 0.40 times for DM + DBT (i.e., 0.03 cm/from 0.25 cm below to 0.32 cm above), and 0.05 times/−0.29 times to 0.38 times for CEM (i.e., 0.04 cm/from 0.23 cm below to 0.30 cm above). Our results differ from those of Helal et al. [14], who found DBT to be superior to CEM in predicting cancer size. However, the two studies are difficult to compare because of the different statistical methodology. We believe that Bland–Altman analysis provided a more insightful comparison by showing the absolute magnitude of the difference in size between pathology and each imaging modality.
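The arithmetic behind the R1 example in the preceding paragraph can be reproduced with a short Python sketch. It follows the paragraph's own reading of the dimensionless Bland–Altman values as multipliers of the measured size; the function name is hypothetical, and the input values are those quoted in the paragraph for a 0.8 cm lesion.

```python
def proportional_to_cm(measured_cm, mean_ratio, lower_ratio, upper_ratio):
    """Convert proportional Bland-Altman values into cm for a given measured size.

    Each dimensionless value is treated as a fraction of the measured size,
    as done in the worked example in the text.
    """
    return (
        measured_cm * mean_ratio,
        measured_cm * lower_ratio,
        measured_cm * upper_ratio,
    )


# R1 values quoted in the text for a hypothetical 0.8 cm cancer.
dm_dbt_cm = proportional_to_cm(0.8, 0.04, -0.31, 0.40)
cem_cm = proportional_to_cm(0.8, 0.05, -0.29, 0.38)

# Rounded to two decimals for comparison with the figures quoted in the text.
dm_dbt_rounded = tuple(round(v, 2) for v in dm_dbt_cm)
cem_rounded = tuple(round(v, 2) for v in cem_cm)
```

A negative lower value corresponds to a measurement "below" pathology and a positive upper value to a measurement "above" it, matching the wording of the example.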

Contrary to previous authors [24], we found that CEM does not overestimate lesion size; if confirmed by future studies, these results might support the use of CEM for preoperative evaluation in dense breasts. Importantly, CEM was associated with higher ICC values than DM + DBT in R3 and R4 readings, suggesting that it may improve cancer size prediction in less experienced readers.

This study has some limitations. First, there was possible detection bias because readers were aware of the preoperative setting, although blinded to the index lesion. However, our choice reflects the clinical scenario. We observed no increase in 1-PPV using CEM, suggesting that, while some risk of false-positives is inherent to the preoperative assessment, it was reasonably limited using either DM + DBT or CEM. Second, the study population is relatively small, with a minority of dense breasts, likely as an effect of having enrolled older patients based on the inclusion criteria. Larger studies with a better balance in breast density should validate our results, which should be considered preliminary. However, we believe these results can be of interest in the preoperative scenario, especially in settings in which CEMRI is contraindicated or not available. Third, imaging follow-up for non-suspicious findings was shorter than two years. This might have influenced our results by missing potential false-negative cases. One might assume that this risk was reasonably minimal, given the 81% negative predictive value recently reported for CEM [25, 26]. However, we acknowledge that further studies with longer follow-up of findings classified as benign are needed to support our data.

In conclusion, this study yielded two main results. First, CEM had a higher CDR than DM + DBT, in particular in the identification of additional small cancer lesions, which may influence surgical planning for BC. However, there was no added value in the subset of non-dense breasts as interpreted by more experienced readers, suggesting that, in large-volume centers, adding preoperative CEM might be indicated in the case of dense breasts only. Second, CEM and DM + DBT achieved comparable cancer size assessment, with CEM increasing the agreement with pathological size in less experienced readers. Given the lack of a direct comparison with CEMRI, our results can be generalized to those scenarios in which this technique is unfeasible and/or unavailable.