Introduction

Neoadjuvant chemotherapy (NAC) is a well-established treatment option for patients with locally advanced breast cancer (LABC) [1, 2]. The main advantage of NAC is the ability to ablate or reduce some tumors initially assessed as unresectable or as requiring mastectomy, enabling a post-NAC alternative of breast-conserving surgery (BCS) [3]. Additional advantages include in vivo monitoring of response to treatment, prediction of prognosis, and a reduction in the need for axillary dissection [4,5,6].

Detailed and precise imaging post-NAC is critical in defining the presence and extent of residual disease and in directing further treatment. While physical examination, mammography, and ultrasound are limited in diagnostic precision [7,8,9], MRI is established as the most accurate imaging modality and is currently recommended by the American College of Radiology and the European Society of Breast Imaging for evaluation of response to NAC [10, 11]. MRI has the benefit of both quantifying the amount of residual tumor and of depicting patterns of response, which define the scope of residual tumor within the breast [9], providing crucial information for surgical planning following NAC. Breast MRI has a high sensitivity and PPV in depicting residual disease at 86% and 93%, respectively, albeit with a more limited NPV of 65% that reduces the overall diagnostic accuracy to 84% [12]. The ability of MRI to predict pathologic complete response (pCR) to NAC in invasive breast cancer is reported to range between 52 and 74% [13, 14], with a recent meta-analysis of 57 studies involving MRI data on 5811 women showing a pooled sensitivity for pCR prediction of 0.64 [15]. Thus, current policy recommends surgical sampling following NAC even if no residual MRI enhancement is seen. While differing accuracy of MRI prediction of pCR has been shown for various tumor grades and molecular subtypes [14, 16], to our knowledge, no studies have so far examined tumor response prediction in calcified compared to non-calcified breast cancer.
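For readers less familiar with the diagnostic metrics quoted above, the following minimal sketch shows how sensitivity, PPV, NPV, and accuracy are derived from a 2 × 2 table of MRI findings against pathology. The class name, function name, and counts are hypothetical and chosen only for illustration; the counts are not taken from reference [12].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 counts: residual disease on MRI vs residual disease on pathology."""
    tp: int  # MRI shows residual disease, pathology confirms it
    fp: int  # MRI shows residual disease, pathology finds none
    fn: int  # MRI shows no residual disease, pathology finds some
    tn: int  # MRI shows no residual disease, pathology agrees


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard diagnostic accuracy measures as fractions."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Hypothetical counts for illustration only (an assumption, not study data).
example = ConfusionCounts(tp=86, fp=6, fn=14, tn=26)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The sketch illustrates why a high PPV can coexist with a limited NPV: when residual disease is common, even a modest number of false negatives weighs heavily against the small pool of negative MRI readings.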

The prevalence of mammographic calcifications in breast cancer is estimated to be 38%, and of those undergoing NAC 19% [17,18,19]. Current surgical policy recommends excision of the full extent of calcifications following NAC, regardless of numerous observations reporting limited correlation between residual calcifications on mammography and response to NAC [17,18,19,20]. Thus, it is possible that some patients undergo more extensive surgery than required. The present study focuses on this issue, exploring whether complete resolution of MRI enhancement following NAC is sufficient to indicate a narrower scope of surgery in patients with mammographic calcifications.

Materials and methods

Patient selection and data collection

This study included patients with LABC who underwent NAC followed by surgery at a single institution between 2011 and 2018 with pre- and post-NAC MRI performed at that same institution. Patients underwent standard anthracycline/taxane-based chemotherapy with the addition of biological treatment if HER2-enriched. For inclusion, patients also had to have a pre-NAC mammogram and a comprehensive biopsy pathology report. To isolate mammographic calcifications as the sole distinction between groups in our study, patients with prior history of breast cancer, breast surgery, or metastatic breast cancer and patients who underwent neoadjuvant hormonal therapy were excluded. Following these criteria, a search of our hospital database identified a total of 262 patients during the study period who underwent NAC and surgery at our institution. Three male patients were excluded. One hundred and forty patients referred from outside facilities were excluded because their imaging examinations were not available for review on our system. Based on the pre-treatment mammogram, patients were dichotomized by presence or absence of tumoral calcifications. The extent of calcifications on mammogram and the scope of enhancement on pre-treatment MRI needed to correspond for study inclusion. Five patients were excluded due to mammographic calcifications being more extensive than MRI enhancement. The final study group included 114 patients (Fig. 1).
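The exclusion steps described in this paragraph can be checked with a short tally. This is only an arithmetic sketch; the variable names are hypothetical, while the counts are those stated in the text.

```python
# Cohort flow using the counts stated in the text.
initial_cohort = 262
exclusions = {
    "male patients": 3,
    "referred from outside facilities, imaging unavailable": 140,
    "calcifications more extensive than MRI enhancement": 5,
}

remaining = initial_cohort
for reason, n_excluded in exclusions.items():
    remaining -= n_excluded
    print(f"excluded {n_excluded:>3} ({reason}); {remaining} remain")

final_cohort = remaining
print(f"final study group: {final_cohort}")
```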

Fig. 1

Flowchart presenting inclusion and exclusion of 262 patients who were treated with neoadjuvant chemotherapy followed by surgery at a single institution between 2011 and 2018. NAC, neoadjuvant chemotherapy; Mx, mastectomy

Data was retrospectively collected and included clinical demographics, tumor characteristics, NAC treatment protocol, and type of surgery. Imaging data comprised the extent of calcifications on mammogram and the pre- and post-NAC pathologic enhancement on MRI. Histological findings of pre-treatment biopsy and post-NAC surgical specimens were obtained. Data was anonymized and loaded onto an electronic spreadsheet using the Excel© software program (Microsoft Corporation). Due to the retrospective nature of this study, the need for informed consent was waived by our institutional ethics committee (reference 0299–18-HMO).

Imaging

The institution in which the study was performed is a large referral center for breast cancer; thus, mammography was performed either onsite or at one of several outpatient clinics, resulting in acquisition on a variety of mammography machines. Mammograms were 2-D digital examinations and included standard CC and MLO views of the breasts. Magnification views were not available for all patients and were thus not included in the evaluation. Mammograms from outside facilities were loaded on a dedicated workstation and reviewed by one of the participating authors.

Study inclusion criteria required that all MRI examinations be performed onsite, prior to or up to 1 week after initiation of NAC and within 2 weeks of its completion. MRI was acquired utilizing a standard protocol on a 3-Tesla (3-T) scanner (Magnetom Trio, Siemens) with a dedicated 8-channel breast coil (Siemens). The protocol included dynamic contrast-enhanced bilateral axial T1-3D-VIBE fat saturated sequences, reconstructed at 2 mm thickness prior to and at 4 time points after injection of 0.1 mmol/kg of body weight of gadopentetate dimeglumine (Magnevist, Bayer) or gadoterate meglumine (Dotarem, Guerbet) at a 2-cc/s rate, followed by a 20-cc saline flush. Post-contrast scanning was initiated 20 s after the beginning of injection. Dynamic subtraction series and MIP reconstructions were derived. Additionally, bilateral axial non-fat saturated T2W and sagittal turbo spin-echo T2W-SPAIR sequences were obtained. Sequence parameters were as previously reported from our institution [21]. Diffusion-weighted imaging was not applied. All MRI studies were reviewed by one of the participating authors. The entire dynamic series was evaluated during MRI analysis.

Pre-treatment mammograms and MRI were correlated to identify patients in whom the extent of suspicious pleomorphic calcifications corresponded with the extent of suspicious MRI enhancement. Only patients showing such correspondence were included in the “calcification present” arm of the study, a design intended to ensure that the entire extent of calcified disease was represented on the MRI, with no non-enhancing DCIS component. No residual enhancement on post-treatment MRI was documented as radiological complete response (rCR). Mammography and MRI examinations were evaluated separately by two independent readers with 20 (T.S.) and 9 (Y.A.L.) years of experience in breast imaging. Discrepancies were resolved by consensus. Analysis of quantitative data was performed applying the more senior author’s measurements.

Pathology

Pre-treatment biopsy samples were acquired either by ultrasound (US)–guided core needle biopsy (CNB) utilizing a 14-gauge needle with a 2.2-cm throw (Magnum Biopsy Gun, Bard) and obtaining at least 3 samples or by mammography-guided stereotactic biopsy performed on a dedicated prone table (Multicare, Hologic) using a 9-gauge VAB system utilizing a 12-cm (12- or 20-mm sampling chamber) needle (ATEC, Hologic) with at least 10 samples obtained. Suspicious lymph nodes in the ipsilateral axilla were sampled by semi-automated 14-gauge CNB device (TEMNO evolution, Merit Medical) with 2 cores obtained.

Tumor histology of pre-NAC biopsy specimens was classified using the WHO classification of breast tumors as invasive ductal carcinoma (IDC), invasive lobular carcinoma (ILC), or other [22]. Presence of ductal carcinoma in situ (DCIS) was also documented. Grading was performed for IDC according to the Nottingham grading system [23], while due to controversy regarding histological grading in ILC [24, 25], it was not routinely performed in this study. Immunohistochemical analysis included receptor status for estrogen (ER), progesterone (PR), and HER2, with FISH amplification as needed.

Final post-NAC surgical pathology was used as the standard of reference to specify pCR, which was defined as no residual invasive tumor. Thus, for evaluation of MRI prediction of response, residual “DCIS only” was considered pCR. MRI prediction of response was correlated with surgical pathology regarding the presence of residual invasive disease and, if it differed from pathology, was defined as either overestimation or underestimation.

Statistical analysis

The association between two categorical variables was determined using the chi-square test or the Fisher exact test. Quantitative variables between two independent groups were compared using the independent-samples t test. Quantitative variables between three or more independent groups were compared using ANOVA. All tests applied were two-tailed, and a p value smaller than 0.05 was considered statistically significant. The confidence interval was calculated at the 95% confidence level. Cohen’s kappa coefficient was applied to determine interobserver agreement. Statistical analysis was performed using commercial statistical software (SPSS Statistics for Windows, version 22.0, IBM) and the Excel© software program (Microsoft Corporation).
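As an illustration of the interobserver measure named above, the following sketch computes Cohen's kappa for two readers from paired categorical ratings. It is not the SPSS procedure used in the study; the function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(reader_a, reader_b):
    """Cohen's kappa for two raters assigning one category per case."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("both readers must rate the same non-empty set of cases")
    n = len(reader_a)
    # Observed agreement: fraction of cases where both readers chose the same category.
    observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: product of each reader's marginal frequencies, summed over categories.
    freq_a, freq_b = Counter(reader_a), Counter(reader_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical ratings for ten cases (illustration only, not study data).
ratings_a = ["calc"] * 5 + ["none"] * 5
ratings_b = ["calc"] * 4 + ["none"] * 6
kappa = cohens_kappa(ratings_a, ratings_b)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects the raw agreement rate for the agreement expected by chance, which is why it is preferred over simple percentage agreement when category frequencies are unbalanced.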

Results

Patient and tumor characteristics

Table 1 shows the patient and tumor characteristics of the 114 patients included in the study. On mammography at diagnosis, 62 (54%) patients showed no calcifications, comprising group A, and 52 (46%) patients presented with calcifications, comprising group B. There was no significant difference between the two sub-groups regarding clinical factors such as age, family history, and menopausal status, with most patients being young premenopausal women.

Table 1 Patient and tumor characteristics overall and in each of the groups separately. Group A consisted of patients with no mammographic calcifications. Group B consisted of patients with mammographic tumoral calcifications

On pathology at diagnosis, IDC was by far the most common histologic type, with only 3 patients having ILC, none of whom demonstrated calcifications on mammogram. Tumors displaying an ER-negative/HER2-positive receptor status more commonly exhibited calcifications (33%, n = 17 calcified vs 13%, n = 8 non-calcified; p < 0.05), whereas with triple-negative pathology, calcifications were rarely found (6%, n = 3 calcified vs 33%, n = 20 non-calcified; p < 0.05). Otherwise, the two groups did not differ in pathology features including tumor size, lymph node involvement, and presence of a DCIS component (Table 1).

Following NAC, 66/114 (58%) patients underwent mastectomy compared with 48/114 (42%) who had BCS performed. Patients with calcifications in their tumors had more extensive surgery, undergoing mastectomy in 67% compared to 50% if no calcifications were present, just short of reaching statistical significance (p = 0.06).
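The comparison of mastectomy rates can be sketched as a 2 × 2 chi-square test. The counts below (35/52 and 31/62 mastectomies) are back-calculated from the reported percentages and sum to the reported total of 66, but they are an assumption rather than figures given in the text. The source also does not state which test variant was applied to this comparison; the sketch uses the uncorrected Pearson statistic, for which the p value should come out at approximately 0.06. The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one case")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Assumed counts, back-calculated from 67% of 52 and 50% of 62 (not stated in the text).
calcified_mastectomy, calcified_bcs = 35, 17
non_calcified_mastectomy, non_calcified_bcs = 31, 31

statistic, p_value = chi_square_2x2(
    calcified_mastectomy, calcified_bcs,
    non_calcified_mastectomy, non_calcified_bcs,
)
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

A continuity-corrected or exact test on the same table would give a somewhat larger p value, so the choice of variant matters for a borderline result such as this one.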

Imaging characteristics

Imaging characteristics are summarized in Table 2. At diagnosis, 14 (23%) of the 62 patients with no calcifications (group A) had normal mammograms and 50/62 (81%) displayed a mammographic mass, more commonly than in patients with calcifications (group B) (26/52, 50%; p < 0.05). Architectural distortion was rare in both groups. The average size of a mass on mammography was approximately 3 cm (range 0.6–9 cm) and was similar in both groups (p = 0.226). Calcifications found in group B extended over 0.3–10 cm (average 4.3 ± 2.96). Post-NAC mammograms were available in 21/52 patients with pre-NAC calcifications (40%), 10 of them in complete responders to NAC. Calcifications showed no interval change in size after treatment including in patients with no viable residual disease on pathology.

Table 2 Pre-treatment imaging characteristics on mammography and MRI, overall and in each of the groups separately. Group A consisted of patients with no mammographic calcifications. Group B consisted of patients with mammographic tumoral calcifications

On pre-treatment MRI, non-mass enhancement (NME) was significantly more common and more extensive in patients with calcifications (62%, n = 32 calcified vs 29%, n = 18 non-calcified; p < 0.05) while mass enhancement was more common in those without (90%, n = 56 non-calcified vs 81%, n = 42 calcified; p < 0.05). The size of mass enhancement was similar in both groups, as was the presence of multifocal/multicentric disease (group A 58% vs group B 62%; p = 0.651).

MRI prediction of response to NAC

Table 3 reports four types of response to NAC on pathology: complete response (pCR) in 43/114 (38%) patients, partial response in 60/114 (53%), residual DCIS with no invasive tumor in 8/114 (7%), and residual tumor in an axillary lymph node with no tumor in the breast in 3/114 (3%) patients. No significant difference in type of response on pathology was found between group A and group B (p = 0.634). Among the patients with residual disease, average tumor size was 3.2 ± 2.32 cm (range 0.4–10), comparable in both sub-groups (p = 0.869).

Table 3 Response to neoadjuvant chemotherapy on pathology and on MRI and the correlation between them. Group A included patients with no pre-treatment mammographic calcifications. Group B consisted of patients with pre-treatment mammographic tumoral calcifications

Complete MRI response post-NAC (rCR), defined as no residual enhancement, was found in 40% (46/114), with residual disease depicted in 60% (partial response 53%, 60/114; no response 7%, 8/114). Response on MRI was comparable between groups A and B (p = 0.837). Residual NME was more extensive than residual mass enhancement (mean 4.4 ± 2.51 vs 2.1 ± 1.41, respectively), consistently in both sub-groups. MRI estimation of response was concordant with pathology in 69%, both overall and in each group on sub-group analysis (Table 3, Fig. 2). When discordant, the overall rates of overestimation and underestimation were equivalent (16% and 15% respectively), with overestimation slightly more common in group A (A = 18% vs B = 13.5%) and underestimation in group B (A = 13% vs B = 17.5%), though the difference was not statistically significant (p = 0.42). Among patients with rCR, the presence of calcifications did not affect the mastectomy rate (p = 0.743); however, there was a trend towards more extensive BCS in patients with calcifications, with a lumpectomy specimen size of 33 ± 17.32 cc compared to 19.3 ± 13.53 cc without calcifications (p = 0.078). Furthermore, among the 22 patients with calcifications demonstrating rCR, the scope of calcifications was significantly larger in those who underwent mastectomy compared to BCS (6.5 ± 3.32 cm, range 0.7–10, n = 13 vs 1.3 ± 1.22 cm, range 0.4–4, n = 9 respectively; p < 0.05).
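The concordance figures above can be tallied as follows. The per-group counts of overestimation and underestimation are back-calculated from the reported percentages and are an assumption, not values stated in the text; with these counts, group B underestimation works out to about 17.3% rather than the reported 17.5%, so small rounding differences should be expected. The names used are hypothetical.

```python
# Counts back-calculated from the reported percentages (an assumption, not source data).
groups = {
    "A (no calcifications)": {"n": 62, "over": 11, "under": 8},
    "B (calcifications)": {"n": 52, "over": 7, "under": 9},
}


def summarize(n, over, under):
    """Concordant cases are those neither overestimated nor underestimated by MRI."""
    concordant = n - over - under
    return {
        "concordant": concordant,
        "concordant_pct": 100 * concordant / n,
        "over_pct": 100 * over / n,
        "under_pct": 100 * under / n,
    }


per_group = {name: summarize(**counts) for name, counts in groups.items()}
overall = summarize(
    n=sum(g["n"] for g in groups.values()),
    over=sum(g["over"] for g in groups.values()),
    under=sum(g["under"] for g in groups.values()),
)

for name, result in {**per_group, "overall": overall}.items():
    print(
        f"{name}: concordant {result['concordant_pct']:.1f}%, "
        f"over {result['over_pct']:.1f}%, under {result['under_pct']:.1f}%"
    )
```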

Fig. 2

A 26-year-old patient with IDC grade 3 ER-negative PR-negative HER2-positive breast cancer. Pre-treatment MLO mammogram (a) shows extensive calcifications in the right upper breast (white arrows) and a post-biopsy clip (black arrowhead). Pre-treatment MRI T1W subtracted MIP image (b) shows extensive enhancement (arrows) correlating with the mammographic calcifications. Complete radiological resolution of enhancement is demonstrated on post-treatment MRI (c). The patient underwent right mastectomy to ensure complete excision of the calcifications. On surgical pathology, complete pathologic response was determined, thus establishing good correlation between MRI prediction and surgical pathology findings

Analysis of response by immunohistochemical subtypes (Table 4) shows an improved response to NAC in patients with ER-negative/Her2-positive tumors compared to other subtypes, both on pathology and on MRI, with pCR and rCR of 72% (p < 0.05). This may be related to the addition of biological therapy to standard NAC in these patients. There was no significant difference regarding MRI/pathology concordance (56.5–76%; p = 0.444) as well as overestimation and underestimation rates in discordant cases, for any of the receptor subtypes.

Table 4 Response to neoadjuvant chemotherapy on pathology and on MRI and the correlation between them by immunohistochemical subtype

Imaging interpretation

All images were reviewed by two independent readers with 20 (T.S.) and 9 (Y.A.L.) years of experience in breast imaging. There was initial disagreement regarding dichotomization by mammographic calcifications in 8/114 patients (7%, kappa = 0.95). Quantification of the extent of calcifications showed no difference between the readers (p = 0.326). Similarly, measurements of mass size and extent of NME on MRI were comparable (p = 0.626 and p = 0.574 respectively). There was substantial agreement regarding the nature of response on MRI (kappa = 0.86) with discrepancy in 11/114 (10%) patients, including discordance between partial and no response in 8 and disagreement between partial and complete response in 3 (1 with motion artifact and 1 with moderate BPE on post-NAC MRI). All discrepancies were resolved by consensus.

Discussion

The current study provides evidence that MRI accurately recognizes pCR and correctly correlates with post-NAC surgical pathology in the majority of our patients, regardless of the presence or absence of mammographic calcifications. In contrast, mammographic calcifications are inadequate predictors of residual invasive disease. Current surgical policy promotes complete excision of all suspicious mammographic calcifications in patients following NAC regardless of MRI evaluation of response [18, 26]. We question this notion. In our study, a cohort in whom pre-NAC MRI enhancement comprehensively covered the mammographic calcifications and a comparison cohort with enhancing tumors lacking calcifications were equivalent in patient and tumor characteristics, in response to NAC on surgical pathology, and in the ability of MRI to correctly estimate this response. Nevertheless, surgery was more extensive in patients with widespread calcifications. The current study supports the view that, at least in the subset of patients represented, post-NAC surgical planning should be based on MRI assessment of response rather than on mammographic tumoral calcifications.

Calcifications on mammography following NAC are confusing as they may represent residual invasive disease, residual DCIS, or benign post-treatment necrosis. Feliciano et al reported that the majority of calcifications in 90 breast cancer patients post-NAC were benign (62%) [18]. An et al reported that residual calcifications did not correlate with residual malignancy in 45% of 29 patients following NAC, and that the extent of residual calcifications was in poor agreement with the size of residual disease. In their study, in 36% of patients with residual disease on pathology, calcifications did not correlate with viable tumor [20]. In our study, 20/52 (38%) patients with calcifications achieved pCR, and in non-complete responders, the scope of calcifications was larger than that of residual disease on pathology, supporting these observations.

The current study displays some variations in imaging characteristics of calcified vs non-calcified tumors. Specifically, tumors without calcifications more commonly exhibited masses both on mammography and on MRI, while tumors demonstrating calcifications were more commonly depicted on MRI as NME. ER-negative/Her2-positive tumors displayed more calcifications compared to other molecular subtypes, while triple-negative tumors were rarely calcified. Additionally, the ER-negative/Her2-positive subtype demonstrated a significantly better response to NAC, reaching 72% pCR. None of these variances affected the accuracy of MRI to assess response to NAC.

The ability of MRI to correctly predict pCR is high, reported as up to 74% in a study of 746 patients [13], similar to the 69% concordance between rCR and pCR in the current study in both groups. A meta-analysis of 44 studies examining the accuracy of MRI for depicting residual disease showed an overall AUC of 0.88, slightly differing by definition of pCR to include or exclude a DCIS component [27]. Thus, MRI is recognized as the most accurate imaging modality in predicting pCR, and current studies suggest that de-escalation of post-NAC treatment might be extended to a point where no surgery is applied in specific subsets of excellent treatment responders, established as no residual MRI enhancement [28, 29]. Likewise, based on the uniformity of MRI prediction of response in both groups of this study, we propose that de-escalation of surgical management of patients with calcifications should be evaluated in prospective clinical trials, consistent with the paradigm that MRI enhancement, not calcification per se, predicts the presence of residual disease.

Proponents of complete excision of calcifications, regardless of MRI evaluation, argue that leaving calcifications in the breast may leave behind low-grade DCIS, which often does not enhance on MRI. Choi et al analyzed factors associated with false-negative MRI prediction of complete response to NAC in 209 patients, 41% with mammographic calcifications. They found that tumors displaying calcifications reached pCR significantly less often than non-calcified tumors (31% vs 69%) [14]. This finding differs from our results, which showed equivalent response to NAC in calcified and non-calcified tumors (38% and 37% respectively). However, the definition of residual disease in their study included DCIS, which was present as the sole residual pathology in 43% of false-negative MRI evaluations. A recent study in 115 patients with complete MRI resolution of disease demonstrated no association between residual mammographic calcifications post-NAC and invasive cancer, although a relationship with residual DCIS was shown [30]. In our study, only the specific subset of patients with tumors exhibiting MRI enhancement throughout the volume of all detected calcifications was explored, and previous studies have indicated that higher-grade clinically relevant DCIS enhances on MRI [31] and is hence addressed by our study design. In this select cohort, DCIS was present on initial biopsy in 25% of patients (20% without and 32% with calcifications), yet residual DCIS following NAC was found in merely 7% (8% without and 6% with calcifications), suggesting that the majority of DCIS responded to treatment. Also, it has been documented that only about half of DCIS demonstrates mammographic calcifications [32]; thus, residual DCIS may persist regardless of calcification status of the disease. Correspondingly, false-negative MRI estimation of pCR occurred in our study in 15% of patients. However, this incorrect assessment was equal in patients with and without calcifications.
Thus, the approach of post-NAC tissue sampling to evaluate pathological response may be equally indicated in both groups. Currently, such sampling is usually achieved by limited surgery around a pre-treatment localizing clip. Alternatively, initial studies suggest the possibility that large core percutaneous biopsy sampling may substitute upfront surgery in this scenario [33]. Furthermore, it is argued in the literature that residual calcifications may complicate follow-up [18], although surveillance can be achieved with MRI to detect potential recurrences, as is currently performed in patients without calcifications. In the rare case of residual disease present among non-resected calcifications undetected by MRI, these likely represent lower-grade DCIS for which clinical relevance is questionable. MRI surveillance could potentially pick up any change in tumor aggressiveness and prompt biopsy for further evaluation.

There were several inherent limitations to our study. Being an observational retrospective study, control of unmeasured confounders was limited. For example, post-NAC mammograms were available in less than half of the patients with initially proven mammographic calcifications. However, no changes in calcifications were observed where data was available and calcifications were poor predictors of NAC response. Furthermore, given the lack of clear guidelines, selection bias regarding referral to MRI is probable, specifically in patients who may not be referred to MRI because a clinical decision to perform mastectomy is applied upfront, as was encountered in a significant number of patients with extensive calcifications who were excluded from our study (Fig. 1). In a large proportion of our study population, mammograms were performed at various external facilities on a variety of mammography units, running the risk of technical non-uniformity. Furthermore, the follow-up in the current study is so far too short to determine recurrence rates (less than 5 years) and remains a topic for future investigation. Finally, the relatively small number of patients, specifically in some of the subgroups, may have limited the statistical power of our analysis.

In summary, our observations show comparable MRI prediction of response in both subgroups and question the more extensive surgery applied in patients with calcified disease. We suggest relying on post-NAC MRI findings rather than on the magnitude of mammographic calcifications in planning the extent of surgery in the designated specific cohort of patients, with surveillance for recurrence based on MRI rather than mammography. Prospective studies to investigate a corresponding reduction in the scope of surgery are warranted. Further examination of such an approach in women with calcifications that extend beyond MRI enhancement may also be indicated to define the effect of low/intermediate-grade DCIS on MRI prediction.