Introduction

The term breast cancer includes a complex and heterogeneous variety of pathologies, with different histological subtypes, as described according to the WHO classification [1]. However, these histological differences are not helpful in predicting clinical behavior or response to treatment [2].

Several characteristics help to distinguish the clinical behavior of different types of cancer, such as histological grade, presence of lymph node metastasis or vascular invasion, but more information on the behavior and prognosis of breast cancer is still needed. In the past few years, the evaluation of biomarkers has been suggested for this purpose. The most frequently used biomarkers include estrogen (ER) and progesterone receptors (PgR), HER2 status and the proliferation index Ki67. The information obtained with these markers allows breast cancer to be classified into subtypes with different characteristics that receive specific treatments. Molecular markers seem to be strong predictors of prognosis and response to therapy [2, 3].

Dynamic contrast-enhanced breast magnetic resonance imaging (MRI) has been increasingly used, thanks to its ability to give information on both morphology and vascular pattern, and to its high sensitivity and specificity [4–6]. New MRI techniques, such as diffusion-weighted imaging (DWI), also provide functional information that can be related to tumor biology [7]. Both qualitative and quantitative information can be obtained using DWI. Dedicated software can calculate the apparent diffusion coefficient (ADC). ADC maps provide quantitative information on the motion of water molecules in tissue. ADC values tend to be lower in malignant lesions, where motion is restricted.

Several studies have shown that DWI can differentiate benign from malignant lesions [8–10]. Malignant lesions present a lower ADC value than benign lesions and normal breast tissue [8, 9]. Recommended thresholds to distinguish benign from malignant breast lesions vary from 0.90 to 1.81 × 10−3 mm2/s in the literature [11, 12]. Some studies [13, 14] showed that changes in the ADC might help to define response to neoadjuvant chemotherapy earlier than other imaging modalities. These findings led to the hypothesis that DWI could be related to the expression of biomarkers: if that were the case, DWI could be used not only to diagnose breast cancer, but also to identify more aggressive disease.

The aim of our study was to evaluate whether quantitative ADC values correlate with different levels of Ki67 expression, histology, grade and clinical–pathological subtype in breast cancer. Inter- and intra-reader variability was also evaluated in a subset of patients.

Materials and methods

Patient selection

All patients with a diagnosis of breast cancer who met the inclusion criteria and underwent breast MRI in our Institution between April 2013 and November 2013 were included in the study. Approval for this study was obtained from the Institutional Review Board and informed consent was waived due to the retrospective study design.

Inclusion criteria were: diagnosis of invasive breast cancer at image-guided needle biopsy; breast MRI performed before or at least 2 weeks after biopsy, to ensure absence of post-procedural artifacts; surgery performed within 3 weeks after MRI; and availability of complete immunohistochemistry pattern with biomarkers (hormonal receptors status, HER-2, Ki67/Mib1). Exclusion criteria were: diagnosis of ductal carcinoma in situ without invasive component and presence of significant artifacts in the DWI sequence. Examinations excluded because of significant artifacts were those in which image quality was too low to clearly identify the target lesion.

MRI study

Breast MRI examination was performed on a 1.5 T magnet (Magnetom Avanto, Siemens Medical System, Erlangen, Germany; software NUMARIS 4 version Syngo MR B17) with a dedicated, bilateral, four-channel coil. The MRI study consisted of a DWI sequence acquired before contrast medium injection, a STIR T2-weighted sequence, and a T1-weighted sequence acquired once before and five times after intravenous contrast medium administration (0.1 ml/kg Gadobenate Dimeglumine 0.5 M). The T1-weighted sequence was a 3D fast low-angle shot sequence with repetition time 9 ms, echo time 4.76 ms, field of view 340 × 340 mm, slice thickness 2 mm and matrix 512 × 512.

DWI was acquired in the transverse plane using single-shot echo planar imaging (SS-EPI) with SPAIR fat suppression, TR 7100 ms, TE 84 ms, FOV 330 × 165 mm, matrix 164 × 85 pixels, in-plane spatial resolution 2 × 2 mm2, slice thickness 4 mm, 24 slices, NEX 5, b values 0 and 1000 s/mm2, and an acquisition time of 2′29″.
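With two b values such as 0 and 1000 s/mm2, ADC is commonly estimated from the mono-exponential model S_b = S_0 · exp(−b · ADC). The sketch below illustrates that calculation for a single voxel. It is an assumption for illustration only: the workstation computed the ADC maps automatically and its exact algorithm is not described here. The function name and the signal values are hypothetical.

```python
import math


def adc_two_point(s0: float, sb: float, b: float = 1000.0) -> float:
    """Mono-exponential ADC (mm^2/s) from the signal at b = 0 (s0)
    and the signal at a non-zero b value (sb), with b in s/mm^2."""
    if s0 <= 0 or sb <= 0 or b <= 0:
        raise ValueError("signals and b value must be positive")
    return math.log(s0 / sb) / b


# Hypothetical voxel signals in arbitrary units, not data from this study.
adc = adc_two_point(s0=1000.0, sb=400.0, b=1000.0)
print(f"ADC = {adc * 1e3:.2f} x 10^-3 mm^2/s")
```

A stronger signal drop between b = 0 and b = 1000 s/mm2 gives a higher ADC, which is why tissue with restricted diffusion appears with low ADC values.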

Pathological analysis

Pathological analysis was conducted on surgical specimens after surgery. Lesions were classified according to the WHO system [1]. Grading was defined according to the Elston-Ellis classification system [13] and hormonal receptor status was defined using immunohistochemistry with monoclonal antibody approved by UK NEQUAS—breast hormonal receptor module.

HER-2 status was evaluated according to published guidelines [16], considering as positive a 3+ result at immunohistochemistry in more than 30 % of the cancer cells (HercepTest). When the result was equivocal, FISH was used.

The Ki67 proliferation index was measured with the monoclonal antibody Mib1, by reporting the percentage of reactive cells among 2000 cells selected randomly from the periphery of the lesion.

Image analysis

Two readers with more than 5 years of experience in breast imaging and breast MRI reviewed the images in consensus to define the target lesion and measure ADC values. To assess intra-observer variability, one of the two readers repeated the measurements in 40 cases: the first 20 consecutive Ki67-positive and the first 20 consecutive Ki67-negative patients. The washout period was more than 2 months. To evaluate inter-reader agreement, a third reader with more than 2 years of experience in breast MRI performed measurements on the same 40 cases in a separate session. All readers were blinded to the histological subtype and biomarker status of the lesions.

MRI examinations were evaluated on a dedicated workstation (Syngo MultiModality Workplace, Leonardo, Siemens Healthcare, Erlangen, Germany). For each patient, the study coordinator selected the target lesion. When more than one malignant lesion was present, only the one with the most suspicious features was selected. The study coordinator assisted the readers in measuring the target lesion. Readers manually placed the region of interest (ROI) in the solid portion of each lesion. Care was taken to avoid areas of T2 shine-through, such as cystic or necrotic portions of the tumor, shown as areas of high signal intensity on T2w images and the ADC map. The ADC value was automatically calculated when the ROI was drawn, and only mean ADC values were considered. ROI diameter ranged from 6 to 12 mm. Small ROIs were used in small lesions to safely avoid surrounding tissue.

Statistical analysis

Since the ADC values did not follow a normal distribution according to the Shapiro–Wilk test, we used median and interquartile range as summary statistics.

Patients were divided into two groups according to the Ki67 percentage: <20 % was considered low (Ki67-negative), while ≥20 % was considered high (Ki67-positive), according to the St. Gallen Consensus Meeting [3]. Medians of the ADC values of the two groups were compared using the Mann–Whitney test.
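The grouping and comparison described above can be sketched as follows. This is a minimal illustration, not the MedCalc procedure used in the study: it uses a normal approximation with tie correction for the two-sided p value, which is an assumption, and all data values and function names are hypothetical.

```python
import math
from statistics import median


def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """U statistic of sample x and two-sided p value (normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        raise ValueError("all observations are identical")
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    return u1, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical (Ki67 %, ADC in 10^-3 mm^2/s) pairs, not data from this study.
records = [(5, 1.10), (10, 1.02), (15, 0.98), (12, 1.13),
           (30, 0.85), (45, 0.78), (20, 0.90), (75, 0.73)]
KI67_CUTOFF = 20  # % threshold stated in the text
low = [adc for ki67, adc in records if ki67 < KI67_CUTOFF]
high = [adc for ki67, adc in records if ki67 >= KI67_CUTOFF]

u_high, p_value = mann_whitney_u(high, low)
print(f"median ADC, low Ki67: {median(low):.2f}; high Ki67: {median(high):.2f}")
print(f"U = {u_high:.1f}, two-sided p = {p_value:.4f}")
```

For samples this small an exact test would normally be preferred; the approximation is used here only to keep the sketch dependency-free.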

The ADC values were also stratified into different subgroups according to: histology (IDC vs. ILC vs. rare types), grade (G1 vs. G2 vs. G3 and G1 + G2 vs. G3) and clinical–pathological classification (Luminal A vs. Luminal B-HER2 negative vs. Luminal B-HER2 positive vs. HER2-enriched vs. Triple Negative and in particular Luminal A vs. Luminal B-HER2 negative). The Kruskal–Wallis test was used for multiple comparisons and the Mann–Whitney test was used to compare two groups. The α level was 0.05.
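For a three-group comparison such as G1 vs. G2 vs. G3, the Kruskal–Wallis statistic can be sketched as below. This is an illustrative, dependency-free version under stated assumptions: the p value uses the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(−H/2) and is only rough for small samples. The function name and the ADC values are hypothetical, not study data.

```python
import math


def kruskal_wallis_3groups(g1, g2, g3):
    """Tie-corrected Kruskal-Wallis H and chi-square (df = 2) p value."""
    groups = [list(g1), list(g2), list(g3)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("every group needs at least one observation")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Midrank of each distinct value = average of its 1-based positions.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    rank_sums_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sums_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    p = math.exp(-h / 2)  # chi-square survival function for 2 degrees of freedom
    return h, p


# Hypothetical ADC values (10^-3 mm^2/s) per grade, not data from this study.
grade1 = [1.10, 1.06, 1.15]
grade2 = [0.95, 0.93, 0.99]
grade3 = [0.86, 0.80, 0.88]
h_stat, p_value = kruskal_wallis_3groups(grade1, grade2, grade3)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

A comparison across five clinical–pathological subtypes would need the chi-square distribution with 4 degrees of freedom, which this three-group sketch does not cover.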

ADC values were also used to calculate a receiver operating characteristic (ROC) curve and the corresponding area under the curve (AUC). The optimal cutoff to distinguish patients with low Ki67 levels from those with high Ki67 levels was retrospectively calculated from the data distribution on the ROC curve.
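A minimal sketch of this ROC analysis follows. Two points are assumptions, since the text does not specify them: the cutoff is chosen by maximizing the Youden index (sensitivity + specificity − 1), and a lesion is called Ki67-positive when its ADC is at or below the cutoff. Function names and values are hypothetical.

```python
def auc_lower_is_positive(positive, negative):
    """AUC for a marker whose lower values indicate the positive class.
    Equals the probability that a negative case has a higher value than
    a positive case, counting ties as one half."""
    wins = sum((n > p) + 0.5 * (n == p) for p in positive for n in negative)
    return wins / (len(positive) * len(negative))


def youden_cutoff(positive, negative):
    """Cutoff maximizing sensitivity + specificity - 1 (Youden index).
    A case is classified positive when its value is <= cutoff."""
    best = None
    for cutoff in sorted(set(positive) | set(negative)):
        sensitivity = sum(p <= cutoff for p in positive) / len(positive)
        specificity = sum(n > cutoff for n in negative) / len(negative)
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical ADC values (10^-3 mm^2/s), not data from this study.
ki67_high = [0.73, 0.78, 0.85, 0.90, 0.97]
ki67_low = [0.88, 0.98, 1.02, 1.10, 1.13]

auc = auc_lower_is_positive(ki67_high, ki67_low)
cutoff, sens, spec = youden_cutoff(ki67_high, ki67_low)
print(f"AUC = {auc:.2f}")
print(f"cutoff = {cutoff:.2f}, sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Because the cutoff is derived from the same data it is then evaluated on, the resulting sensitivity and specificity are optimistic estimates.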

Inter- and intra-reader agreement was assessed with the intraclass correlation coefficient (ICC) and Bland–Altman plots.
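The quantities behind a Bland–Altman plot can be sketched as below: the bias is the mean of the paired differences, and the limits of agreement are taken here as bias ± 1.96 standard deviations, which is the conventional definition and an assumption about how the study software defines them. The readings are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    differences = [a - b for a, b in zip(first, second)]
    bias = mean(differences)
    sd = stdev(differences)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical repeated ADC readings (10^-3 mm^2/s), not data from this study.
reading_1 = [0.90, 1.00, 0.85, 1.10]
reading_2 = [0.85, 0.98, 0.90, 1.00]

bias, lower, upper = bland_altman(reading_1, reading_2)
print(f"bias = {bias:.3f}, limits of agreement = {lower:.3f} to {upper:.3f}")
```

A bias close to zero indicates no systematic difference between readings, while the width of the limits shows how far two individual measurements may disagree.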

Statistical analysis was performed using commercially available statistical software (MedCalc Software version 9.1.0.1, Ostend, Belgium).

Results

A total of 118 patients with a diagnosis of breast cancer at needle biopsy were evaluated. Three were excluded due to artifacts on MR images. In all three cases, artifacts were due to errors in the suppression of signal from fat, in patients with almost entirely fatty breasts. The remaining 115 patients with a known invasive malignant lesion (mean age 57.8 years, range 37–81 years) were included in the study.

Lesion size of the target lesions ranged from 8 mm to 90 mm (mean 20.4 mm).

Histology of the lesions was: 85 invasive ductal carcinoma (73.9 %), 17 invasive lobular carcinoma (14.8 %), 13 other subtypes (11.3 %: 6 invasive ducto-lobular carcinoma, 4 mucinous carcinoma, 1 invasive ductal carcinoma cribriform type, 1 papillary carcinoma, 1 apocrine ductal carcinoma). Histological grade was: G1 in 21 patients (18.3 %), G2 in 59 patients (51.3 %) and G3 in 35 patients (30.4 %).

At breast MRI, the majority of the lesions (108; 93.9 %) appeared as mass-like, while 7 (6.1 %) were non-mass-like.

Overall, median ADC value was 0.93 × 10−3 mm2/s (interquartile range 0.83–1.06).

Ki67 was positive in 53 cases (46.1 %) and negative in 62 cases (53.9 %). Twelve cancers (10.4 %) were HER2 positive. Hormonal receptor status, HER2 status and clinical–pathological analysis are shown in Table 1.

Table 1 Characteristics of the tumors included in the study

ADC values differed significantly according to Ki67 status (p < 0.0001), with a median ADC of 0.86 × 10−3 mm2/s (interquartile range 0.75–0.92) for Ki67-positive and 1.03 × 10−3 mm2/s (interquartile range 0.92–1.13) for Ki67-negative lesions. Examples are given in Figs. 1 and 2.

Fig. 1

A 57-year-old patient with multifocal IDC grade 2. At the immunohistochemical analysis, Ki67 expression was 75 % (>20 %), ER was positive (90 %), PgR was negative and HER2 status was 3+. The subtype was classified as Luminal B HER2-positive. a T1w image before contrast medium (CM) injection; b T1w image with CM shows multifocal cancer with intense enhancement; c subtracted image; d diffusion-weighted image at b = 0 s/mm2 in which the two lesions demonstrate the same signal intensity as the normal breast parenchyma; e diffusion-weighted image at b = 1000 s/mm2 in which the two lesions appear hyperintense; f ADC map: the largest lesion, included in the study, shows an ADC value of 0.73 × 10−3 mm2/s

Fig. 2

A 51-year-old patient with IDC grade 2. At the immunohistochemical analysis, Ki67 expression was 5 % (<20 %), ER was positive (90 %), PgR was positive (90 %) and HER2 status was negative. The subtype was classified as Luminal A. a T1w image without contrast medium (CM) injection; b T1w image with CM shows a lesion with intense enhancement; c subtracted image; d diffusion-weighted image at b = 0 s/mm2 in which the lesion was not visible; e diffusion-weighted image at b = 1000 s/mm2 in which the lesion shows a mildly hyperintense signal; f ADC map: the lesion shows an ADC value of 1.13 × 10−3 mm2/s

No significant differences in ADC values were found when comparing different histological subtypes (ILC vs. IDC vs. rare histotypes, p = 0.157). Median ADC values for the three groups were 0.93 × 10−3, 0.93 × 10−3 and 1.07 × 10−3 mm2/s, respectively.

When evaluating the three different grades (G1 vs. G2 vs. G3), a significant difference was found (p = 0.005). Median ADC for the three grades was: 1.06 × 10−3 mm2/s for G1, 0.93 × 10−3 mm2/s for G2, 0.86 × 10−3 mm2/s for G3. In the comparison of G1 + G2 vs. G3 lesions, the Mann–Whitney test showed a significant difference (p = 0.0015); the median in the two subgroups was 0.96 × 10−3 and 0.86 × 10−3 mm2/s, respectively.

ADC values were also significantly different among the subgroups defined by clinical–pathological classification (p < 0.0001). Median ADC was: 1.03 × 10−3 mm2/s for Luminal A, 0.87 × 10−3 mm2/s for Luminal B-HER2 negative, 0.75 × 10−3 mm2/s for Luminal B-HER2 positive, 0.88 × 10−3 mm2/s for HER2-enriched lesions and 0.88 × 10−3 mm2/s for triple negative lesions.

When comparing median of ADC for Luminal A vs. Luminal B-HER2 negative, the difference was significant (p < 0.0001). Results are summarized in Table 2.

Table 2 Median ADC values of the different subgroups of our population

The ROC curve obtained by comparing ADC values in lesions with low Ki67 and lesions with high Ki67 showed an AUC of 0.776, which defines the test as moderately accurate in the differentiation (Fig. 3). Using a cutoff value for ADC of 0.95 × 10−3 mm2/s, we obtained a sensitivity of 83 % and a specificity of 66.1 % in distinguishing lesions with a high proliferation index from those with a low proliferation index.

Fig. 3

ROC curve obtained comparing ADC values for Ki67-positive and Ki67-negative lesions

The intraclass correlation coefficient showed only moderate intra- and inter-reader agreement (r = 0.623 and r = 0.5482, respectively). Bland–Altman plots (Fig. 4) showed an intra-reader bias of 0.05 (range 0.32/−0.22) and an inter-reader bias of 0.04 (range 0.34/−0.26).

Fig. 4

Bland–Altman plots for intra-observer (a) and inter-observer (b) variability, calculated on a subset of 40 patients, 20 consecutive patients with negative Ki67 and 20 consecutive patients with positive Ki67. R1 consensus reading, R2 third reader, I first reading, II second reading

Discussion

We found significant differences in ADC values according to Ki67 expression, grade and clinical–pathological classification. In particular, cancers with a high proliferation index (Ki67 ≥ 20 %) and a high grade presented lower ADC values.

The median of the ADC values of malignant lesions analyzed in this study (0.93 × 10−3 mm2/s) was comparable to the values obtained in other published studies, which range from 0.89 × 10−3 to 1.07 × 10−3 mm2/s [8, 17–19].

Ki67 is a marker expressed in all phases of the cell cycle, except G0, and is often used to measure the proliferation activity of cells [20, 21]. High levels of Ki67 are associated with less cell differentiation and worse prognosis [22]: the difference in ADC values between Ki67 groups therefore has the potential to identify more aggressive disease. The same applies to G3 cancers compared to G1 and G2. This could be related to the lower cell differentiation and higher cellularity of high-grade lesions [23–25].

Published studies on the correlation between ADC and various tumor characteristics show wide variability in design and results. Different cutoffs have been used to distinguish low Ki67 from high Ki67, with values ranging from 14 to 20 %. Choi [26], using a cutoff of 20 %, obtained results similar to those of our study, with a lower ADC value for high Ki67 (0.89 × 10−3 vs. 0.93 × 10−3 mm2/s).

Onishi [27], in a small subset of patients, used a cutoff level of 14 %, and found a correlation between ADC values and proliferation index only for mucinous carcinomas.

On the other hand, Martincich [19] did not find significant differences in the ADC values between high and low proliferation index, with median ADC of 1.08 × 10−3 mm2/s and 1.03 × 10−3 mm2/s, respectively. In that study, though, the two populations were very different in size (171 vs. 21 cases) and the cutoff value for Ki67 was 14 %. Similar results were also obtained by Jeh [28], using a cutoff of 15 %, and Kim [29], who did not use a cutoff value but interpreted Ki67 as a continuous variable. In our study, no differences were found when ADC value was correlated with histology: the same result was also reported by Martincich and Kim [19, 29].

The level of expression of Ki67 is also used to distinguish Luminal A and Luminal B-HER2-negative cancers: Luminal A type is characterized by a low proliferation index compared to Luminal B-HER2 negative. In our study, ADC values were higher in patients with Luminal A type compared to Luminal B-HER2 negative, thus confirming our hypothesis. To our knowledge, this is the only study that addresses this topic; Martincich [19] found a significant difference in a multivariate analysis of subgroups, but only evaluated HER2-enriched tumors vs. other subgroups.

Using an ADC cutoff value of 0.95 × 10−3 mm2/s, we obtained a sensitivity of 83.0 % and a specificity of 66.1 % in differentiating Ki67-positive from Ki67-negative cancers.

Interestingly, Mori et al. [30] evaluated Ki67 in patients with Luminal type breast cancers, and obtained results similar to those of our study, using a different subset of cases and a different method to position ROIs. Using an ADC threshold of 1.097 × 10−3 mm2/s, sensitivity, specificity, positive predictive value, and negative predictive value were 82 % (36 of 44), 71 % (30 of 42), 75 % (36 of 48), and 79 % (30 of 38), respectively.
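The four figures reported by Mori et al. follow from a single 2 × 2 table. The sketch below reconstructs that table from the fractions quoted above (36 true positives, 8 false negatives, 30 true negatives, 12 false positives, the latter two derived as 44 − 36 and 48 − 36) and recomputes the metrics; the function name is hypothetical.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, PPV and NPV from 2 x 2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly identified
        "specificity": tn / (tn + fp),  # negatives correctly identified
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
    }


# Counts derived from the fractions quoted in the text for Mori et al. [30].
metrics = diagnostic_metrics(tp=36, fn=8, tn=30, fp=12)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

The derived counts are internally consistent: 30 + 12 gives the 42 negative cases and 30 + 8 gives the 38 negative calls quoted in the text.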

A tendency towards lower ADC values in high-grade cancers has already been found in various studies, with or without significant differences between groups [19, 29, 31–33].

On the other hand, Kamitani et al. [34] compared ADC values with stage, vascular invasion, lymph node status, hormonal receptor status, nuclear grade and HER2. They found higher ADC values for node-positive, ER-negative and PR-negative cancers, but no significant differences were found when analyzing vascular invasion, nuclear grade and HER2 status.

Intra- and inter-reader variability showed no systematic difference and an acceptable range when using Bland–Altman plots. Other studies showed a low inter-reader variability in evaluating ADC of breast lesions [35, 36].

In our study, we found a correlation between high proliferation index and low ADC values: it is likely that ADC values calculated in breast cancer might be useful to evaluate response to neoadjuvant chemotherapy, and also to predict the response to treatment [13, 14, 37, 38]. This could be an interesting future application, which needs further studies to be confirmed.

The most relevant limitations of this type of study are the lack of standardization of the immunohistochemistry panel used to define the proliferation index and the other markers used to classify breast cancer, and the lack of consensus on the cutoff value to define Ki67 positivity. Another important problem, when evaluating DWI, is the lack of standardization of the sequences: different vendors use different parameters and there is no standard method to measure ADC. Three cases (2.5 %) were excluded because of artifacts in DWI and ADC maps. Artifacts were related to incorrect fat suppression, and thus could possibly have been avoided by more careful sequence planning. When possible, sequences with artifacts were repeated, thus allowing for an overall high image quality in diffusion-weighted images.

In conclusion, DWI could become an important tool to study breast pathologies, thanks to its rapid acquisition times and its ability to provide important information concerning the aggressiveness and biology of the disease. Measurement reproducibility and reader agreement remain suboptimal and may limit this application of DWI.