Introduction

Breast cancer is the most frequent cancer among women worldwide [1]. Cancer staging is essential to define a patient’s therapy, management, and prognosis. Nevertheless, about 10–15% of these cancers are missed by conventional imaging tools. Moreover, too many biopsies are performed for benign findings [2]. Therefore, greater diagnostic accuracy and more specific diagnostic tools are needed. Magnetic resonance imaging (MRI) provides higher sensitivity than conventional imaging techniques in the evaluation of breast lesions [3], while specificity can vary among centres [4, 5]. Positron emission tomography computed tomography (PET-CT) with 2-[fluorine-18]-fluoro-2-deoxy-D-glucose (FDG) has limited sensitivity for small and low-grade lesions and lymph-node involvement [6]. However, PET-CT devices with time of flight (TOF) and point-spread function (PSF) reconstruction promise higher resolution [7]. Previous evidence has demonstrated the value of prone positioning in FDG-PET of patients with breast cancer [8], and preliminary results of fused prone FDG-PET and MRI suggest improved diagnostic accuracy [9, 10]. However, the clinical role of PET-MRI for initial staging of breast cancer has not yet been established.

Therefore, the aim of this study was to assess the diagnostic accuracy of MRI, prone FDG PET-CT with TOF and PSF reconstruction, and MRI fused with FDG PET-CT in primary invasive breast cancer staging.

Materials and methods

Patient population

Forty-five women with newly diagnosed invasive breast cancer referred for staging between November 2012 and February 2015 were included. The institutional Ethics Board approved this single-site, retrospective study. Women aged 18–80 years with primary breast cancer who were referred for MRI and FDG PET-CT staging were eligible. Pregnant patients and those with inflammatory breast carcinoma were excluded. Patient age, menopausal status, hormone replacement therapy (HRT), tumour size, histology and molecular subtype were recorded. The median period between the breast biopsy and FDG PET-CT was 2 days (range 0–23). MRI and PET-CT were performed as close together in time as possible, with an interval of no more than 7 days. The final diagnosis of identified lesions was based on 14-gauge core biopsy in 52 (49%) lesions, vacuum-assisted biopsy in three (2%), fine-needle aspiration (FNA) biopsy in 16 (15%), surgical specimens in 19 (18%), and ≥24 months of follow-up (range 24–40) in 17 (16%).

MRI

MRI examinations were performed on a 1.5-T Aera scanner (Siemens Medical Solutions, USA) with a dedicated 4-channel breast coil (Noras, Höchberg, Germany) using axial T2-weighted short-tau inversion recovery (STIR) sequences (TR/TE 5000/57 ms; TI 170 ms). After intravenous administration of 0.15 mmol/kg of gadolinium (Magnevist®, Bayer, Germany), a dynamic FLASH 3D postcontrast study (5 frames, 60-s temporal resolution, TR/TE 4.82/1.85 ms, fat saturated) was performed.

FDG PET-CT imaging

Images were acquired on a Biograph mCT scanner (Siemens Medical Solutions, Hoffman Estates, IL, USA). The PET system consists of four rings of 48 lutetium oxyorthosilicate (LSO) blocks covering an axial field of view (FOV) of 21.8 cm, combined with a 64-slice spiral CT scanner. A detailed description of this PET-CT system and its performance characterisation can be found elsewhere [11]. Patients were required to fast for 6 h before the injection of 4.4 MBq/kg of FDG. Blood glucose levels were <6.7 mmol/l. An unenhanced CT from the base of the neck to the upper abdominal region was obtained with Care Dose4D (120 reference mAs) at 120 kVp, rotation time 0.5 s, pitch index 1, slice thickness 3 mm with a 1.5-mm slice increment. The CT scan was acquired with the patient in the prone position using the same breast MRI coil. PET TOF acquisition was performed (2 bed positions, 3 min each) and was reconstructed with a PSF iterative algorithm (3 iterations, 21 subsets) into a 200 × 200 matrix.

Image interpretation

MR images were retrospectively evaluated by a dedicated breast radiologist (16 years of breast MR experience) on a workstation with TrueD software (Siemens, Germany) according to the American College of Radiology Breast Imaging Reporting and Data System (BI-RADS) lexicon [12]. Lesions were classified as positive when BI-RADS was ≥4. FDG PET-CT images were retrospectively interpreted by a board-certified nuclear medicine physician (18 years of FDG PET-CT experience) on a dedicated PET workstation equipped with fusion software (Syngo VG30, Siemens, Germany). The intensity of FDG uptake was scored using a 5-point visual score (1 no uptake; 2 less than normal tissue in the contralateral breast, benign; 3 slightly higher than normal tissue in the contralateral breast, probably benign; 4 moderately higher than normal tissue in the contralateral breast, probably malignant; 5 markedly higher than normal tissue in the contralateral breast, highly suggestive of malignancy). Lesions were classified as positive when the visual score was ≥4. The maximum standardised uptake value (SUVmax) was calculated using volumes of interest. Both specialists read the anonymised images retrospectively, were blinded to the results of the other test and were unaware of the pathological diagnosis of the lesions and of their follow-up status.
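
As an illustration of the quantities described above, the following minimal Python sketch computes a body-weight-normalised SUV and applies the ≥4 positivity cut-off. The text does not state how the workstation computed SUVmax, so the body-weight normalisation, the decay correction of the injected activity to scan time, the function names and the input values are all assumptions made for illustration only.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18, in minutes


def suv_bw(conc_kbq_per_ml, injected_mbq, weight_kg, minutes_post_injection):
    """Body-weight SUV (assumed normalisation, not confirmed by the study).

    The injected activity is decay-corrected to the time of the scan, and
    tissue density is taken as 1 g/ml so that the ratio is dimensionless.
    """
    decay = math.exp(-math.log(2) * minutes_post_injection / F18_HALF_LIFE_MIN)
    remaining_kbq = injected_mbq * 1000.0 * decay
    weight_g = weight_kg * 1000.0
    return conc_kbq_per_ml / (remaining_kbq / weight_g)


def is_positive(score, cutoff=4):
    """Dichotomise a BI-RADS category or 5-point visual score (positive if >= 4)."""
    return score >= cutoff


# Hypothetical example: 60-kg patient injected with 4.4 MBq/kg, scanned 60 min later,
# with a hypothetical maximum voxel concentration of 10 kBq/ml.
example_suv = suv_bw(conc_kbq_per_ml=10.0, injected_mbq=4.4 * 60,
                     weight_kg=60.0, minutes_post_injection=60.0)
print(f"SUV = {example_suv:.2f}, visual score 3 positive? {is_positive(3)}")
```

Taking the maximum of such voxel values within a volume of interest would give SUVmax under the same assumptions.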

Fused FDG PET and MR images

PET and MRI data were transferred to a workstation equipped with dedicated software (TrueD, Siemens, Germany) for 3D volume registration and fusion; landmarks such as the patient contour, bones or heart were used to ensure correct fusion. Visual inspection of the fusion was performed using anatomic criteria that included similar size and breast contour, as well as good structure overlap. The correlation of images was classified by consensus as excellent (total correspondence of the different structures), acceptable (<10-mm displacement) or poor (>10-mm displacement). Subsequently, lesions were re-evaluated on a 5-point scale and categorised by consensus of both specialists as 1 negative; 2 benign; 3 probably benign; 4 suspicious for malignancy; 5 highly suggestive of malignancy. Lesions were considered positive when they received a score ≥4.

Statistical analysis

Quantitative data were expressed as mean ± standard deviation (SD) when the distribution was normal, or as median and interquartile range (IQR) otherwise. Qualitative data were described using relative frequencies and their respective 95% confidence intervals (CI). The correlation between tumour size and SUVmax was assessed using the Spearman correlation coefficient. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and accuracy of MRI, PET-CT, and MRI fused with prone FDG-PET were calculated using data from the reference standard. The unpaired Student's t-test or the Mann–Whitney U test was used to compare variables between groups, and chi-square analyses were used to assess differences in frequencies. The areas under the receiver-operating-characteristic (ROC) curves (AUC) were obtained for MRI, PET-CT and MRI fused with prone FDG-PET. Comparison between the AUCs was carried out using a nonparametric approach. P values <0.05 were considered to represent a significant difference.
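
To make the AUC concrete for ordinal reading scores such as BI-RADS categories or the 5-point visual score, the sketch below estimates it nonparametrically as the probability that a malignant lesion receives a higher score than a benign one, counting ties as one half (the Mann–Whitney formulation). The scores are hypothetical and are not study data. The sketch covers only the AUC estimate, not the test used to compare AUCs, which the text does not specify further.

```python
def auc_mann_whitney(positive_scores, negative_scores):
    """Nonparametric AUC: P(score_pos > score_neg) + 0.5 * P(score_pos == score_neg)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute one half
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical 5-point scores, for illustration only
malignant_scores = [5, 5, 4, 4, 3]
benign_scores = [1, 2, 2, 3, 4]

auc = auc_mann_whitney(malignant_scores, benign_scores)
print(f"AUC = {auc:.3f}")
```

With these hypothetical scores, 22.5 of the 25 malignant–benign pairs are ranked correctly, giving an AUC of 0.900.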

Results

Forty-five consecutive women (90 breasts), mean age 52 ± 8 years (range 34–79), 20 menopausal (HRT in 8), were evaluated. Cancer was histologically proven in 49 breasts (31 focal, 9 multifocal and 9 multicentric), as four patients had bilateral disease (one contralateral DCIS) (Table 1). In addition to the 49 known index lesions, 23 malignant satellite lesions and 35 benign findings were detected. The most common benign lesion was fibroadenoma, found in 12 (11%) cases. Mastectomy was performed in three patients (one bilateral) and breast-conserving surgery in 13 (2 bilateral). The remaining 29 patients (1 bilateral) received systemic treatment.

Table 1 Baseline characteristics of 49 primary tumours in 45 patients

Of the 107 lesions assessed, 22 were found in the contralateral breast. An additional unsuspected contralateral synchronous DCIS was detected by both modalities, and 21 lesions were benign. Eleven patients with 12 findings underwent follow-up for at least 2 years, and ten lesions underwent FNA: inflammatory (5), fibroadenoma (2), hyperplasia (2), xanthogranuloma (1). Lesion diameter on MRI ranged from 3 to 84 mm (3–29 mm in benign lesions, and 5–84 mm in malignant lesions), with a median of 18.5 mm (IQR 10–27.5 mm). The median size of malignant lesions (20 mm, IQR 15–36) was significantly larger (p < 0.001) than that of benign lesions (median 6 mm, IQR 4–13).

A summary of the results for breast MRI, prone FDG PET-CT, and breast MRI fused with prone FDG-PET is shown in Table 2. Receiver-operating characteristic (ROC) curves are illustrated in Fig. 1. AUCs were 0.850 for FDG PET-CT, 0.953 for MRI and 0.983 for fused images (p = 0.002). The AUC for fused images was higher than that for MRI, but the difference was not statistically significant (p = 0.112). The AUC for PET was significantly lower than that for MRI (p = 0.016) and for fused images (p = 0.001). Concordant results between MRI and FDG PET-CT were observed in 87 (81%) lesions. The 20 (19%) discrepant lesions are presented in Table 3. Six of the seven false-positive (FP) lesions on MRI were correctly categorised as negative on PET, and nine of the ten FP lesions on PET were not seen on MRI. There were three true-positive (TP) lesions on MRI that did not demonstrate FDG uptake on PET and two TP lesions on PET that were not identified on breast MRI.
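
The concordance figures can be cross-checked against the discrepancy breakdown given in the text. The short sketch below is only an arithmetic check using the counts reported above; the dictionary labels are paraphrases chosen for illustration.

```python
total_lesions = 107

# Discrepant readings between MRI and PET-CT, as reported in the text
discrepant = {
    "FP on MRI, negative on PET": 6,
    "FP on PET, not seen on MRI": 9,
    "TP on MRI, no FDG uptake on PET": 3,
    "TP on PET, not identified on MRI": 2,
}

n_discrepant = sum(discrepant.values())
n_concordant = total_lesions - n_discrepant


def percent(n):
    """Percentage of all assessed lesions, rounded to the nearest integer."""
    return round(100 * n / total_lesions)


print(f"discrepant: {n_discrepant} ({percent(n_discrepant)}%)")
print(f"concordant: {n_concordant} ({percent(n_concordant)}%)")
```

The four categories sum to the 20 discrepant lesions, leaving 87 concordant lesions, in agreement with the reported 19% and 81%.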

Table 2 Summary of diagnostic performance for breast MRI, prone FDG-PET-CT and breast MRI fused with prone FDG-PET
Fig. 1 Receiver operating characteristic (ROC) curves for BI-RADS MRI, FDG-PET and fused MRI-PET as predictors of malignancy. BI-RADS Breast Imaging Reporting and Data System, MRI magnetic resonance imaging, FDG 2-[fluorine-18]-fluoro-2-deoxy-D-glucose, PET positron emission tomography, AUC area under the ROC curve, CI confidence interval

Table 3 Discrepant lesions in the retrospective reading of breast MRI and prone FDG PET-CT studies

Evaluation of MRI and fused images of primary tumour

In a per-lesion analysis of the 107 lesions (Table 4), MRI alone was TP in 70 (66%), true negative (TN) in 28 (27%), false negative (FN) in two (1%) and FP in seven (6%); one FP occurred in a patient on HRT. One BI-RADS-3 lesion was malignant at histology [invasive ductal carcinoma (IDC) 9 mm]. Six BI-RADS-4 lesions (35%) were benign and 11 (65%) were malignant. Only one (2%) of 60 BI-RADS-5 lesions was benign (fibroadenoma). An example of an FP lesion on MRI is shown in Fig. 2.

Table 4 The diagnostic performance of MRI BI-RADS for detecting primary breast cancer on a per-lesion-based analysis
Fig. 2 Axial slices (a) at the level of a 13-mm BI-RADS-4 lesion detected in the left breast (thick arrow) in a 45-year-old woman with IDC luminal A (index lesion, thin arrow). Prone PET scan (b) demonstrated no FDG uptake (score 1). The lesion was categorised as benign (score 2) by consensus on fused images (c). At core needle biopsy, the finding was an inflammatory lesion. BI-RADS Breast Imaging Reporting and Data System, IDC invasive ductal carcinoma, PET positron emission tomography, FDG 2-[fluorine-18]-fluoro-2-deoxy-D-glucose

Fusion quality was visually rated as excellent in 29 patients (64.4%) and acceptable in 15 (33.3%). Fused images correctly characterised 103 (96%) lesions: TP in 70 (65.5%) and TN in 33 (31%). They were FP in one fibroadenoma and one sclerosing adenosis, and FN in two (2%) satellite lesions of a patient with multicentric IDC. Fused images incorrectly scored as positive one fibroadenoma with FDG uptake that had not initially been noted as positive on breast MRI, but they were helpful in correctly characterising six benign lesions that had been identified on MRI as BI-RADS 4 (Table 5). Considering only lesions classified as BI-RADS 4 on MRI, those positive on fused images had a significantly higher percentage of malignancy than those that did not show increased metabolism (p = 0.02). MRI and fused images were statistically independent (McNemar test, p < 0.01).

Table 5 Diagnostic performance of breast MRI and MRI with prone FDG-PET for detecting breast tumours

PET in suspicious lesion evaluation

The median size of positive lesions was 20 mm (IQR 15–36), being larger for TP (21 mm, IQR 15–36) than for FP (13 mm, IQR 8–29), but this difference was not statistically significant (p = 0.08). PET correctly identified 69 malignant lesions (Table 2). There were three FN results (Table 3), corresponding to a 5-mm IDC luminal B/human epidermal growth factor receptor 2 (B/HER-2) in a patient with multifocal disease (score 1), a 10-mm IDC luminal B in a patient with multicentric disease (score 1) and a 19-mm DCIS in a patient with multifocal disease (score 3; Fig. 3). Of the seven FP findings on MRI, PET was also FP in one but correctly identified as TN the other six, which had been scored BI-RADS 4 on MRI. Two TP lesions with increased metabolic activity on PET were not seen on MRI.
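
The per-lesion counts reported in the text are sufficient to recompute the standard accuracy measures for each modality. The sketch below does so under stated assumptions: the MRI and fused-image counts are taken directly from the text, whereas the PET-CT true-negative count (25) is not stated and is derived here as 107 − 69 TP − 3 FN − 10 FP. The class and function names are illustrative, and the values in Table 2 remain the authoritative results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    tp: int
    tn: int
    fp: int
    fn: int


def metrics(c: Counts) -> dict:
    """Standard 2x2 diagnostic accuracy measures, as fractions between 0 and 1."""
    total = c.tp + c.tn + c.fp + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "PPV": c.tp / (c.tp + c.fp),
        "NPV": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


MRI = Counts(tp=70, tn=28, fp=7, fn=2)     # counts reported in the text
PET = Counts(tp=69, tn=25, fp=10, fn=3)    # tn derived (assumption): 107 - 69 - 3 - 10
FUSED = Counts(tp=70, tn=33, fp=2, fn=2)   # counts reported in the text

for name, counts in (("MRI", MRI), ("PET-CT", PET), ("Fused", FUSED)):
    row = ", ".join(f"{k} {100 * v:.1f}%" for k, v in metrics(counts).items())
    print(f"{name}: {row}")
```

Under these assumptions, the recomputed PET-CT sensitivity is 69/72 and the recomputed accuracy of fused images is 103/107, both of which round to 96%.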

Fig. 3 Axial slices of a 19-mm lesion detected only on MRI (a) and categorised BI-RADS 5 (thick arrow) in a 48-year-old woman with IDC (index lesion, thin arrow). The lesion demonstrated score 3 on PET scan (b), was categorised as malignant (score 4) by consensus on fused images (c) and was confirmed by core needle biopsy to be DCIS. BI-RADS Breast Imaging Reporting and Data System, IDC invasive ductal carcinoma, PET positron emission tomography, DCIS ductal carcinoma in situ

SUV values of TP lesions (median 2.6; range 0.8–18.1) were higher than those of FP lesions (median 1.6; range 0.8–7.6), although the difference did not reach statistical significance (Mann–Whitney U, p = 0.08); the median SUV value in the contralateral breast was 0.7 (range 0.5–0.9). A statistically significant association between elevated SUVmax and Ki-67 expression was found (p = 0.003). FDG uptake was higher in lesions >50 mm in diameter than in smaller lesions (p = 0.001). Spearman’s correlation coefficient for tumour size and SUVmax was 0.56 (p < 0.001). FDG uptake was higher in lesions with tumour grade III than in those with tumour grade I or II, but the differences were not statistically significant.
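
For readers unfamiliar with the rank correlation used for tumour size and SUVmax, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks, using only the Python standard library. The paired values are hypothetical and are not the study data; the function names are illustrative.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical lesion sizes (mm) and SUVmax values, for illustration only
size_mm = [5, 10, 19, 20, 36, 84]
suv_max = [0.8, 1.6, 1.2, 2.6, 7.6, 18.1]

rho = spearman(size_mm, suv_max)
print(f"Spearman rho = {rho:.3f}")
```

Because the coefficient depends only on ranks, it is insensitive to the skewed distribution of SUVmax, which is one reason a rank-based measure suits data of this kind.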

Discussion

We present the results of a retrospective study comparing MRI, prone FDG PET-CT and MRI fused with PET imaging in primary tumour staging of patients with breast cancer. MRI had higher sensitivity than FDG PET-CT; however, specificity, PPV, NPV and accuracy increased in fused images. Regarding the clinical applicability of these findings, a high PPV and accuracy are very important in presurgical staging to avoid underestimating the clinical stage; thus, evaluation based on fused images might be more accurate.

The diagnostic value of PET-MRI has been previously addressed: Pinker et al. compared PET-CT and MRI for primary breast lesions [13]. According to their data, differentiation of benign and malignant breast tumours improves when several MRI and PET parameters are combined. In addition, PET-MR could lead to a reduction of unnecessary breast biopsies recommended by MRI. The authors found that the AUC for MRI was 0.925 and that of combined PET-MR was 0.935; however, the difference was not statistically significant (p = 0.317). The results of our study, obtained using the same breast MRI coil for PET-CT together with TOF and PSF reconstruction, are slightly better, as the AUC was 0.953 for MRI and 0.983 for MRI fused with PET; however, there was no statistically significant difference between evaluation with MRI alone and with fused images (p = 0.112).

In breast cancer patients, MRI has the highest sensitivity for primary tumours [14]. Moy et al. fused prone PET breast scans with MRI in 23 patients and found a sensitivity and specificity of 92% and 52%, respectively, for MRI alone, and 63% and 95%, respectively, for fused images [10]. However, inaccurate coregistration of images due to position changes in the breast between MRI and PET occurred in most patients. Subsequently, FDG-PET and MR fusions were helpful in image-guided biopsy [15]. In our study, performing PET-CT with the same breast MRI coil allowed the images to be fused precisely. With a fully integrated PET-MR unit capable of simultaneous acquisition of breast PET and MRI, all imaging information can be obtained in a single session, which not only provides more comfort to patients but also makes studies shorter and significantly reduces patient exposure to radiation [16].

Previous studies demonstrated the lower sensitivity of FDG-PET for low-grade tumours and lesions <10 mm because of the limited spatial resolution of PET scanners [6]. In our sample, only one DCIS was FN on FDG PET-CT. In addition, FDG uptake was lower in tumours with positive hormone-receptor expression than in less-differentiated tumours of the triple-negative subtype, as previously reported [17]. We observed a wide range of FDG uptake values, reflecting the heterogeneity of glucose metabolism within breast cancer. The SUVmax has been shown to be significantly influenced by histological grade and hormonal receptor status of the primary tumour [18]. FDG-PET demonstrates elevated glucose metabolism associated with malignancy that results in increased FDG uptake through up-regulation of membrane glucose transporters and increased enzymatic activity of hexokinase [19]. Although FDG-PET has proven to be highly useful in the diagnosis and staging of a variety of malignancies, its reported sensitivity for breast cancer ranges between 64 and 96% [20]. Sensitivity for PET-CT in our study was 96%. Differences between our results and the literature may be explained by the increased detectability achieved with TOF and PSF reconstruction. PSF-based PET reconstruction effectively improves spatial resolution and image quality. Prieto et al. described the characteristics of the PET-CT tomograph combining TOF acquisition and PSF reconstruction, which improves not only image quality, with greater contrast, but also spatial resolution and sensitivity, enabling detection of very small lesions [7]. Positron emission mammography promises a more accurate metabolic evaluation of breast lesions, thanks to a reduced distance between the patient’s breast and the detection plates [21] and an in-plane spatial resolution of 2 mm [22].

The main limitations of our study are the small number of patients, which limits the statistical analysis, and its retrospective design (we retrospectively read the anonymised images), which unavoidably introduces selection bias. Furthermore, diffusion-weighted imaging sequences were not available in all patients and were not reviewed. This is a limitation, because these sequences could have increased MRI specificity. In addition, prone PET requires an extra device and additional radiation from CT, which could be avoided with integrated PET-MRI devices.

In conclusion, this study shows that MRI fused with FDG-PET is more accurate than FDG-PET alone in primary tumour staging of breast cancer patients and increases MRI specificity. Larger prospective studies are needed to define the role of fused PET-MRI in staging, as well as its effect on treatment and patient outcome.