Introduction

Breast cancer is one of the most common causes of cancer mortality in women and accounts for up to 28 % of all new cancer cases among women [1]. For breast cancer evaluation, breast magnetic resonance imaging (MRI) is considered to be the most sensitive method and has been gaining popularity as an adjunctive imaging tool to conventional imaging including mammography [2–4], even though the role of preoperative MRI remains controversial [5]. Indeed, with the use of an intravenous contrast agent, breast MRI has been shown to offer high sensitivity of up to 94-100 % for the detection of breast cancers and has no limitations in dense breasts [6, 7]. However, the specificity of MRI is known to be limited, ranging from 60 to 70 %, with a high false-positive benign biopsy rate [7, 8]. In addition, breast MRI has other known limitations, such as high cost, longer examination time and lower availability compared with other breast imaging tools.

Digital breast tomosynthesis (DBT) is an emerging digital mammography technique that provides cross-sectional data. DBT provides thin-section tomographic images reconstructed from multiple low-dose projection images acquired from varying angles of X-rays passing through the breast [9, 10]. Through this technology, DBT may improve lesion visibility, consequently yielding higher sensitivity and specificity compared with conventional mammography [11–13]. Until now, direct comparison between DBT and MRI has rarely been performed, and the limited data thus far have shown only comparable performance [14]. We believe that comparison of DBT and MRI in terms of sensitivity, the number of false-positives and positive predictive value (PPV) is warranted at this time to widen our understanding of DBT as an emerging imaging system.

Therefore, the purpose of this study was to compare the diagnostic performance of DBT and breast MRI as adjunctive imaging to mammography in women with known breast cancers.

Materials and methods

This study was a retrospective review of prospectively enrolled patients. The study protocol was approved by the institutional review board, and written informed consent for DBT was obtained prospectively from all participants.

Patients

Between March and December 2012, 220 consecutive patients with breast cancer underwent digital mammography with DBT as well as breast MRI for preoperative evaluation. Among these patients, 48 were excluded: 38 underwent biopsies through surgical excision or vacuum-assisted core needle biopsy prior to image acquisition, six were treated with neoadjuvant chemotherapy, one had a breast augmentation, and three were lost to follow-up. This resulted in a total of 172 patients constituting our study population. In these 172 patients, a total of 184 breast cancers were identified. In addition to the 172 unifocal breast cancers, six cancers were identified in the contralateral breast and another six in the ipsilateral breast (in a different quadrant from the index cancer); these were confirmed separately prior to surgery.

The mean age of the patients was 51.3 years (range, 22-78 years). Of the 172 patients, 109 women (63.4 %) had palpable masses, 60 were asymptomatic (34.9 %) and the other three women (1.7 %) had symptoms of pain, nipple retraction and skin thickening.

Image acquisition

Digital mammography was performed with a full-field digital mammography unit with integrated DBT acquisition (Selenia Dimensions; Hologic, Bedford, MA, USA). Participants underwent bilateral two views (craniocaudal and mediolateral oblique) in the Combo mode: Digital mammography and DBT images were obtained with a single breast compression for each projection. In terms of radiation exposure, for a breast with a compressed thickness of 5.0 cm and a 50 % glandular fraction, DBT acquisition resulted in an 8 % higher mean glandular dose (MGD) per view than a digital mammography acquisition (1.30 and 1.20 mGy, respectively).
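As a quick arithmetic check of the dose comparison above, the following minimal sketch recomputes the relative increase from the two per-view MGD values quoted in the text. The function and variable names are illustrative only and are not part of the study.

```python
# Relative increase in mean glandular dose (MGD) per view, using the two
# values quoted in the text for a 5.0 cm breast with 50 % glandular fraction.

def relative_increase(new: float, reference: float) -> float:
    """Return the fractional increase of `new` over `reference`."""
    return (new - reference) / reference

dbt_mgd_mgy = 1.30  # DBT acquisition, per view (from the text)
dm_mgd_mgy = 1.20   # digital mammography acquisition, per view (from the text)

increase = relative_increase(dbt_mgd_mgy, dm_mgd_mgy)
print(f"DBT vs digital mammography: {increase:.1%} higher MGD per view")
```

The result, roughly 8.3 %, is consistent with the rounded figure of 8 % reported above.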

All MRIs were performed using a 1.5-T MR imager (Signa; GE Medical Systems, Milwaukee, WI, USA) with a dedicated eight-channel bilateral breast coil (GE Medical Systems) with the patient placed in the prone position. Dynamic contrast-enhanced MR examinations included one precontrast and five postcontrast acquisitions with bilateral sagittal imaging using a fat-suppressed T1-weighted three-dimensional fast spoiled gradient-echo sequence with parallel imaging. Gadobutrol (Gadovist; Bayer Healthcare, Berlin, Germany) was injected into an antecubital vein using an automated injector (Spectris Solaris; Medrad, Maastricht, Netherlands), at a dose of 0.1 mmol/kg and a rate of 2 mL/s, followed by a 20-mL saline flush. The time interval between digital mammography with DBT and MRI ranged from 0 to 61 days (mean, 9.6 days). The order of the exams depended on the scheduling availability of our imaging suites and was not randomised. A clip was not inserted before MRI.

Performance study design

Three radiologists each independently undertook three separate reading sessions covering the images of all patients. Each reading session contained one-third of the image sets of mammography alone, DBT plus mammography and MRI plus mammography, which were randomised and presented in alternating order. The radiologists, all from an academic institution, had 8, 8 and 12 years of clinical experience in mammography and MRI, and 3, 3 and 2 years in DBT, respectively. Reading sessions were separated by more than 4 weeks to minimise any learning bias. The radiologists were aware of the overall goal of the study before the reading sessions; they knew that the patients had been diagnosed with breast cancer and that diagnostic performance would be estimated on a per-lesion basis. They were therefore instructed to document all visible lesions, even when the most suspicious lesion was clearly seen, to support the assumption underlying the PPV estimates that each lesion is independent of the presence of another positive lesion in the same patient. They were blinded to the locations of all breast lesions as well as to the results of other imaging modalities and clinical data. All radiologists were asked to record the number and location of all abnormalities using a graphical interface and standardised template to prevent lesion misallocation. For each lesion, radiologists assigned a confidence probability for malignancy using the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) categories: BI-RADS 1 (negative), BI-RADS 2 (benign), BI-RADS 3 (probably benign), BI-RADS 4A (low suspicion for malignancy), BI-RADS 4B (moderate suspicion for malignancy), BI-RADS 4C (high suspicion for malignancy) and BI-RADS 5 (highly suggestive of malignancy). Cases assigned a BI-RADS category of 1, 2 or 3 were considered normal or benign, and those assigned a BI-RADS category of 4 or 5 were considered abnormal or malignant.
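The dichotomisation rule described above can be sketched as follows. This is an illustration only: the seven category labels come from the text, whereas the 1 to 7 ordinal scale and the function names are assumptions introduced here for clarity, not the study's actual scoring software.

```python
# BI-RADS categories used in the reading sessions, ordered by increasing suspicion.
BIRADS_ORDER = ["1", "2", "3", "4A", "4B", "4C", "5"]

def is_positive(category: str) -> bool:
    """True when a category counts as abnormal or malignant (4A or greater)."""
    return BIRADS_ORDER.index(category) >= BIRADS_ORDER.index("4A")

def ordinal_rating(category: str) -> int:
    """Map a category to a 1-7 ordinal score (hypothetical scale for rating analyses)."""
    return BIRADS_ORDER.index(category) + 1

# Example: three marks recorded by one reader on one image set.
marks = ["3", "4A", "5"]
print([is_positive(m) for m in marks])     # which marks count as test-positive
print([ordinal_rating(m) for m in marks])  # ordinal scores for a rating-based analysis
```

Keeping the ordered ratings alongside the binary call allows the same marks to feed both the rating-based analysis and the sensitivity and PPV calculations.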

Reference standard

All patients underwent surgical excision of their primary cancer and sentinel lymph node biopsy or axillary dissection. Lesions with malignant results after biopsy or surgical excision were considered positive. Additional lesions detected by either modality with concordant benign biopsy results, and those that did not undergo biopsy but showed no evidence of breast malignancy after 1 year of clinical or imaging follow-up, were considered negative. Lesion matching between each imaging modality and pathology was performed off-site in consensus by two breast radiologists using the pathology reports of the surgical specimens and biopsy samples as well as the standardised templates used in the image review. Lesions seen on different imaging modalities were considered to be the same lesion when they were located within 2 cm of each other, and the location of a breast lesion was considered accurate if it was not more than 2 cm from the location of the lesion described at lumpectomy or mastectomy. Surgical pathology and core specimens were reviewed by one breast pathologist.
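The 2-cm matching criterion above can be illustrated with a small distance check. In the study, matching was done in consensus by two radiologists using standardised templates; the coordinate representation below (positions in centimetres in a shared reference frame) and the function name are hypothetical and serve only to make the rule concrete.

```python
import math

MATCH_THRESHOLD_CM = 2.0  # matching distance stated in the text

def same_lesion(pos_a, pos_b, threshold_cm=MATCH_THRESHOLD_CM):
    """Treat two marks as the same lesion when they lie within the threshold distance.

    Positions are (x, y, z) tuples in centimetres; this coordinate system is an
    assumption made for illustration, not the template used in the study.
    """
    return math.dist(pos_a, pos_b) <= threshold_cm

dbt_mark = (3.0, 4.0, 1.0)       # hypothetical mark on DBT
mri_mark = (4.0, 5.0, 2.0)       # hypothetical mark on MRI, about 1.7 cm away
distant_mark = (6.0, 4.0, 1.0)   # hypothetical mark 3.0 cm away

print(same_lesion(dbt_mark, mri_mark))      # within 2 cm
print(same_lesion(dbt_mark, distant_mark))  # beyond 2 cm
```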

Data collection and statistical analysis

Clinical, radiological and histopathological findings of all patients were reviewed. Mammographic breast density was determined according to BI-RADS breast density grading [15], and lesion characteristics on each modality were evaluated by two radiologists in consensus.

The primary objective was to compare the diagnostic performance of DBT and MRI as adjunctive imaging to mammography in women with known breast cancers. For each interpretation session, a jackknife alternative free-response receiver operating characteristic (JAFROC) analysis was performed using JAFROC software, version 4.1. The mean diagnostic accuracy was calculated based on the mean figure of merit (FOM), defined as the probability that a lesion localisation is rated higher than the highest rated non-lesion localisation on normal images. JAFROC calculates the mean FOM of each modality and the difference between the means with 95 % confidence intervals. To evaluate the differences between the image interpretation sessions, the P value associated with an F-test from an analysis of variance model was examined.
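To make the FOM definition above concrete, the following minimal sketch computes a Wilcoxon-type point estimate from two lists of ratings. It is not the JAFROC software and does not reproduce its jackknife or analysis-of-variance steps. The ratings are invented for illustration, and the conventions that an unmarked lesion or an image without non-lesion marks receives a rating of 0, and that ties count one half, are assumptions stated here rather than details taken from the study.

```python
def jafroc_fom(lesion_ratings, highest_nl_ratings):
    """Estimate P(lesion rating > highest non-lesion rating), counting ties as 1/2.

    lesion_ratings: one rating per lesion (0 if the lesion was not marked).
    highest_nl_ratings: per image, the highest rating given to a non-lesion
    localisation (0 if the image has no non-lesion marks).
    """
    total = 0.0
    for lesion in lesion_ratings:
        for non_lesion in highest_nl_ratings:
            if lesion > non_lesion:
                total += 1.0
            elif lesion == non_lesion:
                total += 0.5
    return total / (len(lesion_ratings) * len(highest_nl_ratings))

# Hypothetical ratings on a 1-7 ordinal scale; 0 means "no mark".
lesions = [5, 4, 0, 3]
highest_non_lesions = [0, 2, 4]

fom = jafroc_fom(lesions, highest_non_lesions)
print(f"FOM = {fom:.3f}")
```

A FOM of 1.0 would mean every lesion was rated above every highest non-lesion mark, which is why values close to 1 indicate better lesion-level discrimination.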

The sensitivities and positive predictive values (PPVs) of each imaging modality for the detection of breast cancers were calculated on a per-lesion basis in the on-site evaluation by one reader, and were assessed in the off-site evaluation by each reader using the number of lesions assigned a BI-RADS category 4 or 5 from the total number of breast cancers. In addition, we analysed sensitivities in subgroups of women with a single lesion and women with two or more lesions (non-single lesions). McNemar’s test and Fisher’s exact test were utilised to compare the sensitivities and PPVs for mammography alone, DBT plus mammography and MRI plus mammography.
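The per-lesion measures and the paired comparison described above can be sketched as follows. Only the total of 184 cancers comes from the text; every other count is hypothetical. The exact binomial form of McNemar's test is shown as one common variant, and the text does not state which variant was used.

```python
from math import comb

def sensitivity(true_positives: int, total_cancers: int) -> float:
    """Fraction of cancers assigned BI-RADS 4 or 5."""
    return true_positives / total_cancers

def ppv(true_positives: int, false_positives: int) -> float:
    """Fraction of BI-RADS 4 or 5 lesions that were malignant."""
    return true_positives / (true_positives + false_positives)

def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar P value from the discordant pair counts b and c."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

TOTAL_CANCERS = 184  # from the text
tp, fp = 160, 10     # hypothetical counts for one reader and one modality

print(f"sensitivity = {sensitivity(tp, TOTAL_CANCERS):.3f}")
print(f"PPV = {ppv(tp, fp):.3f}")

# Hypothetical discordant pairs: cancers detected by only one of two modalities.
p_value = mcnemar_exact(b=2, c=10)
print(f"McNemar exact P = {p_value:.4f}")
```

Because both modalities are read on the same lesions, only the discordant pairs carry information about the difference, which is why the test ignores lesions detected by both or by neither.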

Significance testing on the lesion level and patient level was conducted using generalised estimating equations (GEEs) with a logit link and an independent working correlation structure to adjust the effect of clustering on radiologists and patients. GEEs were utilised to compare the sensitivities and PPVs for mammography alone, DBT plus mammography and MRI plus mammography. The proportion of patients with at least one false-positive finding was calculated for the false-positive lesions [16, 17] and compared with GEE. A two-tailed P value of less than 0.05 was considered to indicate a significant difference.

Results

Patients

All patient and lesion characteristics are described in Table 1. Malignant cases included 153 invasive ductal carcinomas with or without ductal carcinomas in situ (DCIS), 15 DCIS, 5 mucinous carcinomas, 4 invasive lobular carcinomas, 3 invasive papillary carcinomas, 2 medullary carcinomas, 1 apocrine carcinoma and 1 secretory carcinoma. The mean size of invasive cancers was 2.1 cm (standard deviation [SD], 1.2 cm) and the mean size of DCIS was 3.6 cm (SD, 1.6 cm).

Table 1 Characteristics of patients and lesions

Histopathological confirmation was available for 23 benign lesions obtained through excisional biopsy (n = 18) and core-needle biopsy (n = 5). They included 11 cases of fibrocystic disease, 4 fibroadenomas, 2 florid ductal hyperplasias, 2 atypical ductal hyperplasias, 2 columnar cell changes and 2 lobular carcinomas in situ (LCIS). Lesions with benign biopsies were followed up with mammography and/or ultrasound and MRI at 1 year (mean follow-up duration, 18.6 months; range, 11.4–24.8 months).

Breast compositions of the patients according to BI-RADS, as recorded during the initial clinical interpretation, were almost entirely fatty in six cases (3.5 %; BI-RADS composition a), scattered fibroglandular densities in 32 (18.6 %; BI-RADS composition b), heterogeneously dense in 101 (58.7 %; BI-RADS composition c) and extremely dense in 33 (19.2 %; BI-RADS composition d). Among all 184 cancers, 102 presented as masses, 28 as microcalcifications, 27 as architectural distortion and 18 as asymmetry, and 9 showed no abnormality on DBT plus mammography. On MRI plus mammography, 167 cancers presented as masses and 17 as non-mass enhancement.

Diagnostic performance

JAFROC FOMs for mammography alone, DBT plus mammography and MRI plus mammography for each radiologist as well as the pooled FOM of all radiologists are listed in Table 2. FOMs for mammography alone, DBT plus mammography and MRI plus mammography ranged from 0.883 to 0.926, 0.931 to 0.943, and 0.972 to 0.990, respectively. The diagnostic performance of DBT plus mammography was significantly higher than that of mammography alone in two of three radiologists (radiologist 1 and 2, respectively, P = 0.0003 and P = 0.0032). Furthermore, the diagnostic performance of DBT plus mammography (FOM = 0.937) was significantly higher than that of mammography alone (FOM = 0.900, P = 0.0013) according to the pooled analysis. The diagnostic performance of DBT plus mammography was significantly lower than that of MRI in two of three radiologists (radiologists 2 and 3, P = 0.0338 and P < 0.0001). In the pooled analysis, the diagnostic performance of DBT plus mammography (FOM = 0.937) was significantly lower than that of MRI plus mammography (FOM = 0.978, P = 0.0006).

Table 2 JAFROC analysis of mammography alone, tomosynthesis plus mammography and MRI plus mammography

The sensitivities of the three imaging sets for each radiologist as well as for all radiologists are listed in Table 3. The sensitivities of mammography alone, DBT plus mammography and MRI plus mammography ranged from 76.1 to 81.5 %, 84.8 to 90.2 % and 97.3 to 98.9 %, respectively. In the pooled data, the sensitivity of DBT plus mammography (88.2 %; 95 % CI, 83.4-91.8 %) was significantly higher than that of mammography alone (78.3 %; 95 % CI, 72.5-83.1 %; P < 0.0001) and lower than that of MRI plus mammography (97.8 %; 95 % CI, 95.5-99.0 %; P < 0.0001) (Fig. 1). Among 172 patients, radiologists assigned only one suspicious finding (BI-RADS 4A or greater) in 108 patients regardless of imaging modality, and two or more suspicious findings were assigned in 76 patients for either mammography alone, DBT plus mammography, or MRI plus mammography. When we evaluated the sensitivities of each modality in the subgroup of patients with only one suspicious lesion versus the subgroup with two or more lesions (Supplementary table 1), overall sensitivity was significantly higher with DBT plus mammography than with mammography alone in both subgroups. The sensitivities were higher with MRI plus mammography than with DBT plus mammography in both subgroups across radiologists; the difference reached statistical significance for one radiologist in the single-lesion subgroup and for all three radiologists in the non-single-lesion subgroup.

Table 3 Sensitivity and positive predictive value of mammography alone, tomosynthesis plus mammography and MRI plus mammography
Fig. 1 Images of a 56-year-old woman with invasive ductal carcinoma in her left breast. a Mediolateral oblique conventional digital mammography shows no significant abnormality. b Mediolateral oblique DBT images show a significant architectural distortion with a mass in the upper breast. c Sagittal T1-weighted contrast-enhanced subtraction MR image shows an irregular mass

In contrast to the sensitivity values, DBT plus mammography showed fewer false-positive findings and higher PPVs than MRI plus mammography (Table 4, Figs. 2 and 3). There were a total of 68 false-positive findings that were assessed as BI-RADS 4A or greater by at least one radiologist in any image interpretation session. These lesions remained stable during the follow-up period. When comparing the number of false-positive lesions between mammography alone and DBT plus mammography, radiologists 1, 2 and 3 recorded 11, 12 and 2 false-positive lesions on mammography alone, respectively. Among them, 5 (45.5 %), 8 (66.7 %) and 0 lesions were assessed as negative on DBT plus mammography. In contrast, 8, 10 and 2 additional false-positive lesions were detected on DBT plus mammography by radiologists 1, 2 and 3, respectively. The PPVs for mammography alone, DBT plus mammography and MRI plus mammography ranged from 92.2 to 98.7 %, 90.7 to 97.5 % and 84.8 to 93.8 %, respectively. The PPV of DBT plus mammography was not significantly different from that of mammography alone for any of the three radiologists (all P > 0.05) but was higher, with borderline statistical significance, than that of MRI plus mammography for two of three radiologists (P = 0.0577 and P = 0.0588 for radiologists 1 and 3, respectively). In the pooled data, the PPV of DBT plus mammography (93.3 %; 95 % CI, 90.4-95.4 %) was significantly higher than that of MRI plus mammography (89.6 %; 95 % CI, 85.9-92.4 %, P = 0.0282) but was not significantly different from that of mammography alone (94.5 %; 95 % CI, 90.9-96.8 %, P = 0.2978).

Table 4 False-positive findings of mammography alone, tomosynthesis plus mammography and MRI plus mammography
Fig. 2 False-positive finding on MR images of a 56-year-old woman. No abnormality was detected on a mediolateral oblique digital mammography and b DBT images. c Sagittal T1-weighted contrast-enhanced subtraction MR images show a 2-cm non-mass enhancement that was assessed to be a suspicious finding by all radiologists. This lesion was excised and was finally diagnosed as fibrocystic disease at histological examination

Fig. 3 Images of a 56-year-old woman with a contralateral, additional cancer (shown) in her right breast and an index cancer in her left breast (not shown). None of the radiologists was able to identify the additional cancer on either a digital mammography or b DBT images. c Sagittal T1-weighted contrast-enhanced subtraction MR images show a 0.9-cm enhancing mass, which was assessed to be a suspicious finding by all radiologists. This lesion was confirmed as a 1-cm invasive ductal carcinoma at histological examination

Discussion

In our study, we found that MRI plus mammography had higher diagnostic performance with higher cancer detection rate than DBT plus mammography; however, DBT plus mammography showed higher PPV with a lower number of false-positives compared with MRI plus mammography. The FOM of DBT plus mammography was significantly lower than that of MRI plus mammography (P = 0.0006), and this lower diagnostic performance was attributed to the lower sensitivity of DBT plus mammography compared with MRI plus mammography.

The inferiority of DBT in terms of sensitivity may be inherent and was, in fact, expected, as DBT plus mammography is performed without a contrast agent and is thus unable to reflect the neoangiogenesis of malignant lesions, which can be depicted with gadolinium-enhanced MRI [18, 19]. Despite the improved lesion visibility provided by DBT via reduction of the superposition of breast tissue, DBT is still an X-ray projection technique and thus cannot fully reveal a lesion perimeter obscured by surrounding tissues, as shown in our results. This finding is also consistent with a previous report of a modest reduction in the sensitivity of DBT in dense breast tissue compared with non-dense tissue, while the sensitivity of MRI did not differ according to breast tissue density [14]. Nonetheless, the advantages of MRI need to be balanced against its disadvantages of high false-positive rates and low PPV, as observed not only in our study but also in previous studies [20, 21]; false-positive examinations can lead to an increased number of biopsies and conversion to unnecessary mastectomy [16]. In a previous comparison study involving DBT and MRI in a cancer-only population, the addition of MRI to combined mammography with DBT and ultrasound did not significantly improve sensitivity. However, in that study, ultrasound information was provided in addition to DBT, so that a direct comparison between DBT and MRI could not be inferred [14]. Several other studies have evaluated the role of DBT for MRI-detected additional lesions [22, 23]. According to those results, DBT increased detection and characterisation of MRI-detected additional lesions, suggesting that combined use of both imaging modalities might improve overall diagnostic performance for breast lesion evaluation.

As in previous studies, DBT plus mammography showed clearly higher diagnostic performance compared with mammography alone. These findings support previous studies and validate the benefit of adding DBT to digital mammography in terms of sensitivity [11, 24–27]. In contrast, we found that the PPVs of DBT plus mammography in our study were not significantly higher than those of mammography alone, which is not concordant with the results of screening-type studies showing that DBT provided lower recall rates and improved specificity [25, 26, 28]. We believe that this discordance may be associated with our study design, in which every patient had at least one abnormality of interest. Furthermore, underestimation of the actual number of false-positive lesions is also possible owing to our study design, which included a cancer-only study population [29]. However, in a previous study using FROC analysis, six of eight radiologists also showed higher estimated maximum false-positive rates [30].

There are several limitations to our study. First, the patients included in this study were all breast cancer patients, and high-risk or average-risk women who did not have breast cancer were not included. In addition, the radiologists were aware before image interpretation that all patients were likely to have at least one breast cancer, which may have introduced a bias. Thus, our results may not be generalisable to a screening population. Second, as we included a cancer-only patient group, JAFROC methodology was used to assess diagnostic performance. FROC takes into consideration the location of the suspected abnormality and allows more than one location to be identified as suspicious; as a result, the FROC approach enables detection of within-subject (location-based) differences in diagnostic performance and possibly mimics a population that has both benign and malignant lesions. However, the current FROC approaches ignore the possible difference in the distributions of numbers and ratings of false-positive marks, and the different types of satisfaction of search in positive versus negative images (or cases) [29]. Thus, the sensitivities and PPVs in our study might not reflect real clinical practice. To address the issue of lesion independence, we performed a subgroup analysis; the higher sensitivity of MRI than of DBT and the lowest sensitivity of mammography were noted in both the single-lesion and non-single-lesion groups. Third, not all lesions were surgically confirmed, and lesion verification is strictly associated with test positivity, thereby potentially resulting in overestimation of the actual sensitivity of both imaging modalities by reducing the number of false-negative lesions. Whereas a pathological reference was available for all cancer cases, fewer than half of the benign cases were confirmed by pathological examination, although all benign lesions remained stable during the follow-up period.

In conclusion, DBT provided lower diagnostic performance than MRI as adjunctive imaging to mammography. However, DBT was found to be a valuable imaging modality with higher diagnostic performance than mammography alone and higher PPV than MRI. If validated in larger studies, these findings may help to inform preoperative workup in future practice.