Monitoring the treatment effect of neoadjuvant chemotherapy on breast cancer has been the subject of multiple studies because of its implications for individual patient care, clinical trials, and pharmacologic studies. Physical examination, conventional two-dimensional (2D) ultrasound (US), mammography, and magnetic resonance imaging (MRI), alone or in combination, have all been investigated with regard to their accuracy and efficacy in evaluating tumor response during and after chemotherapy.1–9 To date, commonly used imaging modalities and varying examination techniques, even in combination, both over- and under-estimate residual tumor volumes after neoadjuvant chemotherapy (Table 1). There is too little consistency among studies to even suggest which modalities tend to over- or under-estimate residual disease.

Table 1 Studies examining tumor assessment methods, 1997–2013

Uniform recommendations for the assessment of solid tumor response to treatment were initially published in 1979 in the WHO Handbook for Reporting Results of Cancer Treatment.10 The original WHO Handbook defined four response categories: complete response (CR), partial response (PR), stable disease, and progression of disease, which have become the basis of current solid tumor evaluation. That initial outline has been revised and modified into the current Response Evaluation Criteria in Solid Tumors (RECIST), published in 2000.11 These criteria have become the universal standard for not only the WHO but also the National Cancer Institute and the European Organization for Research and Treatment of Cancer. The updated 2009 RECIST guidelines further detail the complexities of MRI, which introduces more variability because of the multiplanar nature of the images, variations in scanners, and motion artifact. Because of these issues, a computed tomography (CT) scan is generally preferred, although MRI is considered acceptable and is usually utilized for breast cancer assessment because CT is a poor imaging modality for the breast in general. However, MRI is not without its critics, and concerns about its accuracy introduce sufficient inconsistency to reconsider the application of the RECIST criteria as an appropriate measure, particularly with regard to breast cancer.12,13 One study suggests that MRI volumetric studies are more accurate than applying RECIST criteria to assess the response to neoadjuvant treatment in breast cancer.13 The RECIST criteria do not advocate standard 2D US for lesions that are not easily accessible clinically; however, US has been suggested for use in assessing lymph nodes, subcutaneous lesions, and thyroid nodules. The guidelines explicitly discourage the use of US because of the subjective nature of the studies and the unreliable reproduction of a real-time examination on hard copy. Without defined anatomic landmarks, the 2D US evaluation is operator dependent and inconsistently reproducible.11

Three-dimensional (3D) US has been investigated for both solid organ analysis and vascular evaluation. As conventional US images are in two dimensions, there are several different ways in which 3D US images could be obtained.12,14,15 Freehand 3D US with magnetic field sensor tracking has been used by many investigators. In this technique, a transmitter placed near the patient produces a spatially variable magnetic field; the field is sensed by a receiver attached to the US transducer, generating a spatial map for the third dimension image reconstruction. However, the tracking accuracy of this equipment may be compromised by the presence of other magnetic fields or electrical interference, and these devices are not widely available or used. Mechanical 3D US scanning and, more recently, digital 3D US, can also be performed by an apparatus that rotates the US transducer over the desired organ while acquiring 2D images at regular intervals. This method allows for rapid image reconstruction during acquisition. Of note, 4D US refers to real-time 3D US imaging, integrating motion, which is feasible in newer machines.16 The addition of digital technology over the older mechanical hardware improves coverage, speed of acquisition, and image resolution.

We hypothesized that 3D US would be equivalent to MRI in detecting pathologic residual in-breast disease after systemic therapy. Even if only equivalent in accuracy, 3D US is advantageous over MRI with regard to cost, time, ease of interpretation by multiple clinicians, accessibility, and avoidance of contrast agents. Our objective was to demonstrate the feasibility of using 2D or 3D US to assess response to treatment by correlating post-treatment tumor volumes obtained on US and MRI with the final pathologic tumor volume.

Materials and Methods

This was an Institutional Review Board-approved prospective trial of women diagnosed with incident invasive ductal breast cancer who received standard-of-care neoadjuvant chemotherapy based on National Comprehensive Cancer Network (NCCN) guidelines at a single institution. All patients were seen at a single multidisciplinary breast cancer program at an NCI-designated comprehensive cancer center from 2008 to 2012. Each patient had an initial consultation with one of four fellowship-trained breast surgeons at our institution and with a medical oncologist for planned systemic preoperative chemotherapy. Exclusion criteria were clinical breast masses smaller than 3 cm, clinical skin involvement (including inflammatory breast cancer), inability to undergo MRI, and no plan for surgery at our institution. Patients with multicentric breast cancers were offered entry into the study if the largest lesion was greater than 3 cm on clinical assessment by physical examination or imaging. Cases with radiologic lesions greater than 3 cm but without a sonographic mass correlate on initial diagnostic imaging were excluded because of concerns regarding extensive ductal carcinoma in situ (DCIS) without a measurable invasive component. Patients provided consent prior to the initiation of systemic therapy (Fig. 1).

Fig. 1 MRI and two-dimensional ultrasound result of breast lesion. These images show significant similarities regarding residual disease dimensions, and both were concordant with pathological results. MRI magnetic resonance imaging (images courtesy of Dr. Blaise Mooney)

For all enrolled subjects, 2D and 3D US and MRI images of pre- and post-neoadjuvant tumors were obtained in addition to serial physical examinations; all imaging studies were archived electronically. All investigational 3D US images were obtained on a single machine (GE Voluson 730 Pro 3D/4D) by one of two breast surgeons (MCL and JVK). For uniformity, the physician performing the 3D US also examined the affected breast at the same visit. 3D US images were archived and reviewed at the time of the study (MCL) and independently reviewed by a second surgeon (SJG). All 2D US, mammographic, and MRI studies were reviewed for clinical purposes by a dedicated breast radiologist prior to initiating systemic therapy, and again prior to definitive breast surgery. Outside images of adequate quality were not repeated. All subjects had comprehensive imaging within 30 days prior to initiation of chemotherapy and again within 30 days of completion of chemotherapy, prior to breast surgery. After accrual closure, the stored conventional images were re-reviewed by a breast radiologist (BM) for the purposes of this study.

Surgical pathology was reviewed and reported by breast fellowship-trained pathologists. For the purposes of our analysis, estrogen receptor (ER) and progesterone receptor (PR) staining <5 % were considered negative; Her2neu status was determined by immunohistochemistry (IHC) and verified by in situ hybridization, either fluorescent (FISH) or dual-imaging (DISH). In the cases of indeterminate Her2neu positivity on IHC, in situ hybridization results were used for treatment planning. In addition to comparison between pre- and post-treatment imaging, residual tumor on imaging was compared with residual invasive disease on surgical pathology. In situ lesions (i.e. DCIS) were not considered evaluable residual disease.

Statistical Analysis

Patients’ demographic and clinical characteristics were summarized using descriptive statistics. Differences in tumor volume between post-treatment imaging and pathology were compared using the non-parametric Wilcoxon signed-rank test. US to MRI agreement was determined by the kappa coefficient. Tumor volumes in the ER, PR, and Her2neu subgroups were compared using the Kruskal–Wallis test.
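As an illustration of the agreement statistic, Cohen's kappa for two modalities that each classify a patient as pCR or residual disease can be sketched as follows. This is a minimal sketch and not the analysis code used in the study; the function name `cohen_kappa` and the counts in the example table are hypothetical and are not study data.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] is the number of patients placed in category i by the
    first modality (e.g. US) and in category j by the second (e.g. MRI).
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of patients on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance, from the row and column totals.
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only (NOT study data):
# rows = first modality (pCR, residual), columns = second modality.
example = [[20, 5], [10, 15]]
kappa = cohen_kappa(example)
print(f"kappa = {kappa:.2f}")
```

On the commonly used Landis and Koch scale, kappa values of 0.61–0.80 are described as substantial agreement.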

Results

Forty-two patients were enrolled between September 2008 and September 2012; three patients withdrew or did not have complete imaging, therefore 39 patients had evaluable data. Table 2 summarizes the patient population. Among the 37 patients who had ER/PR/Her2neu data, 18 (49 %) cases were ER positive, 14 (38 %) were PR positive, and 4 (11 %) patients were Her2neu positive. A total of 17 (46 %) were triple-negative tumors. All subjects were females, with a mean age of 46.9 (range 24–64) years at diagnosis. All patients received neoadjuvant chemotherapy.

It is important to note that for lesions less than 20 cm³, there was no statistically significant difference between the different imaging modalities on the Wilcoxon signed-rank test (p > 0.05). Among 11 (28 %) cases with pathologic complete response (pCR), the 2D US correctly predicted pCR in six (54.5 %) cases, and the 3D US predicted pCR in five (45.5 %) cases, compared with eight (72.7 %) cases when using MRI. This represents substantial agreement between US and MRI in predicting pCR (kappa = 0.62).

MRI had the lowest (57 %) positive predictive value (PPV) for detecting a pCR after neoadjuvant chemotherapy. However, this modality showed the highest (88 %) negative predictive value (NPV) of this study (Table 3). For detection of pCR, MRI had a false positive rate of 43 % (6 of 14), which was also the highest in this series. The false negative rate of 14 % was also higher than that of the other modalities tested. Interestingly, the false positives for pCR corresponded to missed tumors up to 1.6 cm in size (range 0.01–1.6 cm). In other words, despite no evidence of residual disease on MRI, suggestive of pCR, final pathology actually demonstrated residual tumors measuring up to 1.6 cm. This is significantly higher than for the other modalities compared (Table 4). Regarding the detection of the presence of residual disease, a false positive rate of 13.6 % was noted for MRI.
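The MRI predictive values above can be reproduced from a 2 × 2 table. The sketch below is illustrative only: the four counts are back-calculated from figures reported in the text (11 pCR cases, eight of them identified by MRI, and six of 14 MRI-predicted pCR cases with residual disease on pathology) under the assumption that all 39 evaluable patients had an MRI classification. The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrTable:
    """2 x 2 counts for imaging prediction of pCR versus final pathology."""
    tp: int  # imaging predicted pCR, pathology confirmed pCR
    fp: int  # imaging predicted pCR, pathology found residual invasive disease
    fn: int  # imaging showed residual disease, pathology confirmed pCR
    tn: int  # imaging showed residual disease, pathology found residual disease

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)


# Assumed counts, derived from the text rather than taken from Table 4:
# tp = 8, fp = 6 (6 of 14), fn = 11 - 8 = 3, tn = 39 - 14 - 3 = 22.
mri = PcrTable(tp=8, fp=6, fn=3, tn=22)

print(f"PPV         = {mri.ppv():.0%}")          # 8 / 14
print(f"NPV         = {mri.npv():.0%}")          # 22 / 25
print(f"Sensitivity = {mri.sensitivity():.1%}")  # 8 / 11
print(f"1 - PPV     = {1 - mri.ppv():.0%}")      # the 6-of-14 figure in the text
```

Under these assumed counts, the 43 % reported as the false positive rate for pCR equals 1 − PPV (6 of 14), that is, the share of MRI-predicted pCR cases with residual disease on pathology.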

Table 2 Features of enrolled subjects (n = 39)
Table 3 Sensitivity, specificity, PPV and NPV to assess pathologic complete response after neoadjuvant treatment for breast cancer
Table 4 True and false positive/false negatives for pathologic complete response after neoadjuvant treatment for breast cancer

For the 2D US, the PPV for detecting a pCR after neoadjuvant chemotherapy was 75 %. The NPV of conventional 2D US was 84 %; these predictive values are similar to those of the MRI examination (Table 3). Remarkably, the false positive rate was 25 %, the lowest of this series. On closer investigation, the false positives for pCR were noted to be smaller than 1 cm³ on final pathology. For the detection of residual disease, a false positive rate of 21.7 % was noted.

The 3D US showed intermediate predictive values; the PPV was 63 % and the NPV was 81 % (Table 3). For detection of a pCR, this US modality showed a false positive rate of 38 %. Of interest, the false positives (tumors missed) had the smallest tumor size (0.003–0.24 cm³) of this series. On the other hand, the ability to detect residual disease had a false positive rate of only 8 %, the lowest of this series (Table 4).

When stratified by receptor status, the accuracy of 3D US was related to triple-negative disease. ER-negative tumors had a significantly higher proportion (40 vs. 0 %) of post-treatment 3D US agreement with pathologic volume than ER-positive tumors (Fisher’s exact test p = 0.0068). Similarly, triple-negative tumors had a significantly higher proportion (46 vs. 0 %) of post-treatment 3D US agreement with pathologic volume compared with ER+/Her2− and ER+/Her2+ tumors. PR-negative malignancies trended towards agreement of post-treatment 3D US with pathology (31.6 vs. 0 %; Fisher’s exact test p = 0.0585). The accuracy of the MRI and 2D US modalities did not significantly change with receptor status.
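For readers unfamiliar with the test used for these subgroup comparisons, a two-sided Fisher's exact test on a 2 × 2 table can be sketched as below. This is a minimal sketch, not the study's analysis code; the function name `fisher_exact_two_sided` is hypothetical, and the example table uses textbook counts rather than study data, because the per-subgroup counts are not given in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows could be receptor subgroups and columns agreement / no agreement
    between imaging and pathology. The p value is the sum of the
    hypergeometric probabilities of every table with the same margins
    that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


# Textbook counts for illustration only (NOT study data).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")
```

An exact test is a natural choice here because one subgroup in each comparison had a 0 % agreement rate, so at least one cell of the table is zero and large-sample approximations are unreliable.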

Discussion

MRI has been adopted as the primary method for evaluating response to neoadjuvant chemotherapy in breast cancer,17 with limited prospective supporting data, likely secondary to the presumed inferiority of other imaging modalities. Unfortunately, MRI continues to have significant limitations, including cost, availability, and patient acceptance. Moreover, the accuracy of this test has been questioned and its ability to reliably detect residual disease after neoadjuvant treatment has been found to be less than ideal,18 and independent factors such as tumor markers can further compromise the accuracy of this test, including overestimation of residual disease.19,20

For clinicians, an overestimation of residual disease may not be as critical as missing residual tumor, especially given the potential consequences of an incomplete resection. Given the advances in systemic therapy and the increasing rates of pCR, particularly with Her2neu-targeted agents, observational monitoring has been proposed in lieu of surgery, and the reliability of imaging is crucial to such an endeavor.21 Therefore, it is critical to understand the limits of the various imaging modalities in terms of predictive value. Compared with other tests, the accuracy, PPV, and NPV of MRI to assess pCR have shown suboptimal results.22,23 Retrospective studies have suggested that MRI is not superior to other modalities in the evaluation of post-neoadjuvant treatment response.24,25 Given the growing role of neoadjuvant chemotherapy for breast cancer patients, the importance of accurate modalities for ongoing evaluation and close monitoring of these patients cannot be overemphasized.

US circumvents many of the difficulties observed with breast MRI; in particular, accessibility, cost, and patient comfort are improved over MRI. In addition to provider acceptance, the ability to obtain an in-clinic evaluation is of prime importance to physicians and may facilitate improved monitoring of tumor response during repeated clinic visits. However, the paucity of prospective data comparing these modalities, and absolute lack of prospective studies evaluating the potential use of 3D US, formed the basis of our current prospective trial.

Clearly, post-chemotherapy imaging is an ongoing debate, and the results of these studies significantly impact surgical decision making, particularly given the fact that some patients elect neoadjuvant chemotherapy with the goal of pursuing breast-conserving therapy.26 Physical examination is notoriously inaccurate in the evaluation of response to neoadjuvant treatment; therefore, diagnostic breast imaging forms the cornerstone for many surgical decisions. In addition to the prediction of residual disease and the associated false-positive and false-negative results, an accurate assessment of the volumetric residual disease is key in the decision-making process between patient and surgeon.

Our results demonstrate a suboptimal predictive value of MRI for pCR and an associated higher volumetric discrepancy relative to final pathology. This challenges the current paradigm of breast MRI as the single best option to evaluate treatment response. Our study supported the use of both 2D and 3D US modalities as non-inferior in the prediction of residual disease, and volumetric error was significantly smaller compared with MRI for both types of US examinations. From the patient’s perspective, US involves no ionizing radiation, eliminates the need for intravenous contrast, reduces expense and, because of its availability and ease of interpretation, may facilitate detailed clinical evaluations throughout the course of the treatment.

Arguably, the strongest benefit of the MRI examination is confirmation of clinical findings and provision of descriptive findings, including a volumetric estimation, based on fixed imaging with defined landmarks. As previously mentioned, a limitation of 2D US is the lack of fixed anatomic landmarks, which may hamper or complicate interpretation by providers not performing the examination. However, the 3D US examination eliminates some of the variability in interpretation by generating an entire volumetric analysis of the lesion in question. With this source of variability reduced, multiple providers have access to uniform imaging, both for assessment of response and for surgical planning. In this manner, the 3D modality minimizes operator dependency while maintaining the more global benefits of US.

Conclusions

Further studies to confirm the non-inferiority of US in this application are recommended. In particular, consideration should be given to evaluating US as a modality directed towards response assessment of triple-negative breast cancers.