Introduction

In recent years, diffusion-weighted imaging (DWI) has been increasingly investigated as a magnetic resonance imaging (MRI) technique for the evaluation of breast lesions [1–3]. DWI measures the random (Brownian) movement of water molecules and thereby depicts the diffusivity of the examined tissues, providing a surrogate marker for tissue microstructure and cell density [4]. This diffusivity can be quantified by calculating the apparent diffusion coefficient (ADC). Low ADC values indicate restricted diffusion, e.g. caused by high cellular density, as seen in many malignant breast tumours [5].

In addition to the application of DWI for the differentiation of invasive and non-invasive breast cancers [2] or for the identification of viable lesion parts for 1H-MR spectroscopy [6], several studies have shown the ability of ADC to differentiate between benign and malignant breast lesions [7, 8]. A recent meta-analysis has shown a wide range of results, with sensitivities ranging from 52 to 100 % and specificities from 33 to 100 % [9].

The reasons for this wide range are most likely attributable to different hardware (field strength, gradients) [10, 11], examination parameters (e.g. different b-values) [12], the DWI field of view [13] or even different post-processing systems [14].

ADC is usually measured by placing regions-of-interest (ROIs) inside the breast lesion. To date, no consensus about how to choose these ROIs exists, although studies on DWI in colorectal cancer have shown that ROI-placement significantly influences ADC-values [15].

Different ROI-placement approaches have been described for DWI in breast imaging: 2D-ROIs covering the part of the lesion displaying the lowest ADC-values [3, 16]; 2D-ROIs covering the whole lesion in one slice [17]; or 3D-ROIs covering the whole lesion [4, 18].

Before the body of evidence on quantitative DWI of the breast can be used to provide general diagnostic thresholds for clinical practice, it is mandatory to obtain data about measurement-dependent ADC disparities.

Furthermore, different authors have investigated either minimum ADC (ADCmin) [19, 20] or mean ADC (ADCmean) [21, 22], which will most probably lead to different measured ADC-values and may also influence diagnostic performance.

Thus, the aim of this study was to:

  a) Investigate whether different ROI-placement approaches influence ADC-values;

  b) Compare the diagnostic performance and clinical feasibility of these approaches to differentiate benign from malignant breast lesions using ADCmin, ADCmean, and maximum ADC (ADCmax);

  c) Investigate the measurement times and the inter- and intra-reader reproducibility for these measurement approaches.

Materials and methods

The institutional review board of the Medical University Vienna approved our study. Written, informed consent was obtained from all patients.

Patients

Between September 2007 and December 2011, 626 patients with suspicious findings on mammography or ultrasound, i.e. asymmetric density, architectural distortion, breast mass or microcalcifications (BI-RADS 0, further imaging warranted; BI-RADS 4, suspicious abnormality; BI-RADS 5, highly suggestive for malignancy) underwent breast MRI at our institution. Of these, 150 consecutive patients who fulfilled the following inclusion criteria were retrospectively evaluated: breast lesion visible on MRI; lesion size ≥5 mm; histopathological proof of the assessed lesion; and no diagnostic or therapeutic intervention prior to the MRI examination. Of these, three patients were excluded due to technical failure of DWI, resulting in a study population of 147 patients with 149 breast lesions (mean age: 53.2 years, SD 13.4, range 24–86).

Imaging

All patients underwent 3 T-MRI (Tim Trio, Siemens, Erlangen, Germany) in the prone position, using a four-channel breast coil (InVivo, Orlando, FL, USA). In pre-menopausal women, MRI was performed in the second week of the menstrual cycle [23].

A standardized MRI examination was applied, including a T2w-sequence, a pre-contrast diffusion-weighted sequence and a dynamic contrast-enhanced (CE) (0.1 mmol/kg body weight Gd-DOTA, Dotarem®, Guerbet, France) T1w-sequence [4].

For DWI, an axial, three-acquisition-direction, trace diffusion-weighted, double-refocused, single-shot echo-planar imaging (EPI) sequence with inversion recovery and fat-suppression was used (TR/TE/TI 13700/83/220 ms; FOV 340 × 117 mm; 40 slices at 3.5 mm; matrix 192 × 64 [50 % oversampling]; two averages; b-values 50 and 850 s/mm²; bandwidth 1446 Hz/pixel; 3:19 min) [12].

Monoexponential ADC-maps were automatically calculated by the MR imaging system software from the DWI-images at the two b-values, b1 and b2, using the formula \( \mathrm{ADC}=\frac{ \ln \left(\frac{\mathrm{S}2}{\mathrm{S}1}\right)}{\mathrm{b}1-\mathrm{b}2} \), where S1 and S2 are the signal intensities at b1 and b2, respectively.
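The two-point calculation can be illustrated for a single voxel with the following minimal Python sketch. It is not the scanner software's implementation; the function name `adc_two_point` and the example signal intensities are hypothetical, and only the b-values (50 and 850 s/mm²) are taken from the protocol above.

```python
import math


def adc_two_point(s1: float, s2: float, b1: float = 50.0, b2: float = 850.0) -> float:
    """Monoexponential ADC in mm^2/s from signals s1, s2 at b-values b1, b2 (s/mm^2)."""
    if s1 <= 0 or s2 <= 0:
        raise ValueError("signal intensities must be positive")
    if b1 == b2:
        raise ValueError("b-values must differ")
    # ADC = ln(S2 / S1) / (b1 - b2); both numerator and denominator are
    # negative when the signal decays with increasing b-value.
    return math.log(s2 / s1) / (b1 - b2)


# Hypothetical voxel: signal drops from 1000 (b = 50) to 400 (b = 850).
adc = adc_two_point(1000.0, 400.0)
print(f"{adc * 1e3:.3f} x 10^-3 mm^2/s")  # prints 1.145 x 10^-3 mm^2/s
```

A stronger signal drop between the two b-values yields a higher ADC, whereas a voxel that retains most of its signal at the high b-value yields a low ADC, i.e. restricted diffusion.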

Data analysis

Three radiologists independently performed all ROI-placements:

  • Reader 1 (R1): 4 years of experience in breast MRI.

  • Reader 2 (R2): 2 years of experience in breast MRI.

  • Reader 3 (R3): 3 months of experience in breast MRI.

All were blinded to the histopathological results. R1 repeated the measurements three months after the initial reading.

A second person recorded the measurement time for each reader. Time recording was started when the reader had definitely identified the lesion of interest.

High b-value DWI images were visually assessed for hyperintensities corresponding to enhancing lesions on CE-MRI. Lesion size was determined on the CE-MRI images and was defined as the largest diameter of the lesion. OSIRIX (Pixmeo SARL, Bernex, CH) [24] was used for the assessment of the ADC-maps.

Three different ROI-placement approaches were applied by each reader (Figs. 1, 2 and 3).

Fig. 1

Schematic representation of the different ROI-placement approaches. Image (a) shows a breast lesion with heterogeneous ADC-values. The different ROI (white lines) placement approaches were: a small-2D ROI covering the part of the lesion with the lowest ADC-values (b); a large 2D-ROI covering the whole lesion on one slice (c); and a 3D-ROI covering the whole lesion on all slices (d). ADC apparent diffusion coefficient, ROI region of interest

Fig. 2

Grade 3 invasive ductal cancer (thick white arrows) verified by histopathology in the right breast of a 39-year-old woman. Axial ADC-map (a) and corresponding axial contrast-enhanced T1 image (b). Image (c) shows the small 2D-ROI (black/white arrow), as well as the ROI covering the whole lesion in this slice (thin white arrow) on the ADC-map. Three-dimensional rendering of the whole lesion volume covered by the 3D-ROI (d). ADC apparent diffusion coefficient, ROI region of interest

Fig. 3

Fibroadenoma (thick white arrows) verified by histopathology in the right breast of a 45-year-old woman. Axial ADC-map (a) and corresponding axial contrast-enhanced T1 image (b). Image (c) shows the small 2D-ROI (black/white arrow), as well as the ROI covering the whole lesion in this slice (thin white arrow) on the ADC map. Three-dimensional rendering of the whole lesion volume covered by the 3D-ROI (d). ADC apparent diffusion coefficient, ROI region of interest

  a) One manually drawn 2D-ROI, covering the area, visually assessed, with the lowest ADC-values inside the lesion (small 2D-ROI).

  b) One 2D-ROI, covering the whole lesion on the slice, visually assessed, with the lowest ADC-values (large 2D-ROI).

  c) Multiple, manually drawn large 2D-ROIs on each slice containing the lesion of interest, which were then combined to create a 3D-ROI using the OSIRIX ROI-volume-tool.

Partial volume effects due to surrounding tissue and necrosis, as identified from the T2w- and CE-MRI, were avoided as far as possible by sparing the lesion borders (Figs. 1, 2 and 3) and drawing the ROIs around obvious necrosis. Clip artifacts were not an issue, as all MRI examinations were performed before any biopsy.

ADCmin, ADCmean, ADCmax and ROI sizes were determined. The reference standard was histopathology in all cases. Histopathology specimens were obtained either by image-guided biopsy or open surgery. Histopathological analysis was performed by two board-certified pathologists who specialized in breast pathology. Diagnostic procedures were performed according to the European guidelines for quality assurance in breast cancer screening and diagnosis [25].
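How ADCmin, ADCmean and ADCmax depend on ROI extent can be illustrated with a minimal Python sketch. It assumes that the ADC-values of all voxels inside an ROI are available as a plain list; the function name `roi_adc_summary`, the voxel size and all ADC-values (in × 10⁻³ mm²/s) are hypothetical and are not study data.

```python
from statistics import mean


def roi_adc_summary(adc_values, voxel_size):
    """Return minimum, mean and maximum ADC and the ROI size for one ROI.

    adc_values: ADC of every voxel inside the ROI.
    voxel_size: area (2D-ROI) or volume (3D-ROI) of a single voxel.
    """
    values = list(adc_values)
    if not values:
        raise ValueError("ROI contains no voxels")
    return {
        "adc_min": min(values),
        "adc_mean": mean(values),
        "adc_max": max(values),
        "roi_size": len(values) * voxel_size,
    }


# Hypothetical lesion: each larger ROI contains the voxels of the smaller one.
small_2d = [0.82, 0.85, 0.88]             # area with the lowest ADC-values
large_2d = small_2d + [1.05, 1.20, 1.45]  # whole lesion on one slice
roi_3d = large_2d + [1.10, 1.60, 2.10]    # whole lesion on all slices

for name, voxels in [("small 2D", small_2d), ("large 2D", large_2d), ("3D", roi_3d)]:
    print(name, roi_adc_summary(voxels, voxel_size=1.0))
```

In this constructed example, ADCmin is identical for all three ROIs because the larger ROIs contain the small one, while ADCmean and ADCmax rise as more heterogeneous tissue is included.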

Statistical analysis

Statistical analysis was performed using SPSS 19.0 (IBM Corp., Armonk, NY, USA), MedCalc 12 (MedCalc Software, Ostend, Belgium), and the online receiver operating characteristics (ROC) software StAR [26]. All calculations were performed on a per-lesion basis. Nominal data, such as lesion type, are presented using absolute frequencies and percentages. Differences in ROI-sizes between readers were calculated using a paired samples t-test. Differences in ADC and diagnostic performance between the different measurement approaches were calculated from the averaged results of all three readers (one result per reader). After testing for normal distribution using the Kolmogorov-Smirnov test for independent variables and the McNemar test for paired sample variables, metric data like ADC are presented using median and interquartile ranges. Differences in average ADC-values for different groups and subgroups (e.g. stratified by size) were calculated using the Mann-Whitney U and Wilcoxon’s matched pairs test. ROC-analysis was performed to compare diagnostic performance between the readers, ROI-types, ADC-parameters and between mass and non-mass lesions.
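For orientation, the area under the ROC curve is numerically equivalent to the normalised Mann-Whitney U statistic, which the following minimal Python sketch computes directly. It is not the procedure implemented in SPSS, MedCalc or StAR; the function name `auc_low_is_malignant` and the ADC-values (in × 10⁻³ mm²/s) are hypothetical.

```python
def auc_low_is_malignant(benign, malignant):
    """AUC for a marker in which LOWER values indicate malignancy.

    Equals the probability that a randomly chosen malignant lesion has a
    lower ADC than a randomly chosen benign lesion (ties count as 0.5).
    """
    if not benign or not malignant:
        raise ValueError("both groups must contain at least one lesion")
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m < b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(benign) * len(malignant))


# Hypothetical ADC-values, not study data.
benign = [1.40, 1.55, 1.20, 1.70]
malignant = [0.85, 0.95, 1.25, 1.00]
print(auc_low_is_malignant(benign, malignant))  # prints 0.9375
```

An AUC of 0.5 corresponds to no discrimination between benign and malignant lesions, and 1.0 to complete separation of the two groups.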

Pearson’s correlation coefficient was used to correlate measurement time and lesion size. Intra-class correlation (ICC) was used to determine inter- and intra-reader agreement. A p-value ≤0.05 was considered a significant result.
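The ICC model is not specified above. As an illustration only, the following Python sketch computes one common variant, ICC(2,1) (two-way random effects, absolute agreement, single measurement). The choice of this variant, the function name `icc_2_1` and the example readings are assumptions, not a description of the analysis performed in this study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per lesion and one column per reader."""
    n = len(ratings)
    if n < 2:
        raise ValueError("at least two lesions are required")
    k = len(ratings[0])
    if k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("every lesion needs the same number (>= 2) of readings")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA decomposition of the total sum of squares.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between lesions
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between readers
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical ADCmin readings (x 10^-3 mm^2/s): rows are lesions, columns readers.
readings = [
    [0.82, 0.85, 0.80],
    [1.40, 1.35, 1.45],
    [0.95, 1.00, 0.90],
    [1.60, 1.55, 1.70],
]
print(round(icc_2_1(readings), 3))
```

Values close to 1 indicate that most of the variance in the measurements stems from differences between lesions rather than from differences between readers.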

Results

Detailed histopathological results are displayed in Table 1.

Table 1 Histopathological results of the patient cohort, by enhancement pattern

There were 29 (19.5 %) non-mass and 120 (80.5 %) mass lesions according to CE-MRI. Median lesion size was 23 mm (range: 5–106 mm). Malignant lesions (5–106 mm, median 26.0 mm) were significantly larger (p < .001) than benign lesions (5–60 mm, median 14.5 mm). Non-mass lesions (median size 31 mm) were significantly larger (p = .009) than mass lesions (22 mm).

Comparison of ADC-values of the different ROI-placement approaches

The results for the comparison of the different approaches are displayed in Fig. 4.

Fig. 4

Boxplots showing the minimum, mean and maximum ADC-values of the different ROI placement approaches. ADC-values for benign and malignant lesions were significantly different for all ROI placement approaches and parameters (p < .001), except for maximum ADC of the 3D-ROI (p = .125); ADC values are displayed in × 10⁻³ mm²/s. ADC apparent diffusion coefficient, ROI region of interest, Min minimum, Max maximum

ADC-values of benign and malignant lesions were significantly different for all minimum, mean and maximum values, independent of the ROI-placement approach (p < .001), except for the ADCmax of the 3D-ROI (p = .125).

ADCmin, ADCmean and ADCmax of benign (p < .001-.003) and malignant (p < .001) lesions differed significantly between the three ROI-placement approaches. The only exception was for the ADCmean of the large 2D-ROI versus the 3D-ROI (p = .346).

ROC analysis

The results of the ROC analysis are displayed in Table 2, Fig. 5 and Supplemental Table 1. Overall, areas under the receiver operating characteristic curve (AUC) ranged between .583 (ADCmax of the 3D-ROI) and .956 (ADCmin of the large 2D-ROI). Generally, ADCmin performed best, independent of the ROI-placement approach, followed by the ADCmean using 2D-ROIs, while the ADCmean of the 3D-ROIs and ADCmax performed significantly worse.

Table 2 Areas under the ROC curve and inter- and intra-reader ICCs for the different measurement approaches

Fig. 5

ROC curves and AUC (in brackets) for the minimum, mean and maximum ADC of the different ROI placement approaches – small 2D-ROI (a), large 2D-ROI (b) and 3D-ROI (c). In general, minimum and mean ADC of the 2D-ROIs performed best, while maximum ADC performed worst. Min minimum, Max maximum, ROC receiver operating characteristics, ROI region of interest, AUC area under the curve, ADC apparent diffusion coefficient

ADC performed worse in non-mass lesions (AUC .637–.728) than in mass lesions (AUC .959–.97) when using ADCmean, independent of the ROI-placement approach (.004 < p < .016), and when using ADCmax obtained from the small 2D-ROIs (AUC .642 vs. .913, p = .015). For the other combinations of ROI-placement approaches and ADC-parameters, AUC did not differ significantly (.051 < p < .509) between mass and non-mass lesions (Supplemental Table 2).

When looking at the mass lesions only, ADCmin and ADCmean performed comparably well (AUC .952–.982), independent of the ROI-placement approach (.063 < p < .098), with the exception of ADCmin obtained from the large 2D-ROI (AUC .982), which performed significantly better than ADCmin obtained from the 3D-ROI (AUC .952, p = .022) (Supplemental Table 3). In the non-mass lesions, ADCmin obtained from the large 2D- and 3D-ROI performed best (AUC .841), but the difference was only significant compared to ADCmean obtained from the 3D-ROIs and ADCmax obtained from the large 2D-ROIs (Supplemental Table 4).

When stratified by size, the AUC difference between the ROI-placement approaches was minimal in lesions ≤1 cm (n = 20, .881–.940). In the lesions >1 cm (n = 129), AUCs differed between the ROI-placement approaches, with results comparable to those of the whole population (.554–.962).

Measurement time

Median measurement times differed significantly (p < .001) between the ROI-placement approaches, with 7 s (range: 3.3–23.7 s) for the small 2D-ROIs, 9.3 s (3.7–31 s) for the large 2D-ROIs, and 34.3 s (3.7–432 s) for the 3D-ROIs. Measurement times correlated significantly with lesion size (p < .001).

Inter- and intra-reader reproducibility analysis

The results of the reproducibility analysis are displayed in Table 2.

Inter- and intra-reader agreement was generally high for ADCmin (ICC .737–.861) and ADCmean (.721–.885), and lower for ADCmax (.301–.78).

ROI size

The size of the small 2D-ROIs (range: 1–149 mm²) showed significant inter- (p < .001) and intra-reader (p = .018) differences, with R1 generally drawing smaller ROIs (mean 4.9 mm² and 4.1 mm²) than R2 (21.2 mm²) and R3 (13.8 mm²).

Concerning the large 2D-ROIs, R1 drew significantly smaller ROIs during his first reading session than R2 (p = .05). Otherwise, there was no significant difference in ROI size between the readers (.054 < p < .32). In the 3D-ROIs, there were no significant differences between the readers (.077 < p < .987).

Discussion

Our study demonstrates that the results of ADC-value measurements in breast tumours are significantly influenced by ROI-placement. ADCmin and ADCmean acquired from 2D-ROIs perform best for the differentiation of benign and malignant breast lesions, and are highly reproducible, with rapid measurement. 3D-ROIs and ADCmax perform significantly worse.

The results of this study show that the acquired ADC of benign and malignant breast lesions is highly dependent on ROI-placement. This reflects the internal structural heterogeneity of breast lesions, which leads to substantial differences in measured ADC-values, depending on whether only small lesion areas with relatively homogeneous ADC, larger lesion areas, or even the whole lesion are included in the ROI. A similar effect has been described in rectal cancer [15]. Recently, it has been reported that ADC-values of breast lesions were significantly different between small and large 2D-ROIs, with the small ROIs generally performing better than large ones [27–29]. Thus, the ROI-placement approach seems to have an influence on the diagnostic performance of DWI for the breast and other tumours.

Furthermore, the selection of ADCmin, ADCmean or ADCmax has a substantial influence on diagnostic performance. Our study shows that ADCmin and ADCmean obtained from 2D-ROIs are the most accurate parameters for the differentiation of benign and malignant breast lesions.

Considering low ADC-values as diagnostic for breast cancer, it may be assumed that ADCmin represents the most malignant part of the tumour. Our study design was not able to demonstrate what causes minimum ADC-values through direct histopathological correlation. However, Guo et al. have shown that there is a general inverse correlation between the ADC and the cellular density of breast lesions [30].

Even though the tumour part with the lowest ADC is always covered when using a 3D-ROI, in our study, ADCmin obtained from the small, as well as the large 2D-ROI, performed better. This could be attributed to one potential pitfall of ADCmin: it is susceptible to partial volume effects. The risk of inadvertently measuring areas of suppressed fat is higher in large ROIs that cover multiple slices. Therefore, the borders of the ROI need to be placed carefully when using this approach.

While ADCmin represents only the most relevant part of the tumour, ADCmean also includes the information from voxels that contain less malignant or even necrotic tumour parts. Based on our results, this effect is negligible in the 2D-ROIs, but becomes more problematic in 3D-ROIs. However, this averaging effect makes ADCmean less susceptible to partial volume effects.

ADCmax is a parameter that has rarely been investigated. In our study, ADCmax performed significantly worse than ADCmin and ADCmean, independent of the ROI-placement approach, which is in accordance with a recent study by Hirano et al. [20]. This again can be attributed to the structural heterogeneity of many breast lesions, leading to the inclusion of less malignant or even necrotic lesion parts, and thus, altering the results of the measured ADCmax.

Therefore, ADCmin and ADCmean are reliable parameters for the differentiation of benign and malignant breast lesions, while ADCmax should not be considered.

Non-mass lesions are a problematic entity in breast imaging. Previous studies have shown that ADC performs worse for the differentiation of benign and malignant breast lesions in non-mass lesions than in mass lesions [31, 32], even though Iima et al. [33] have shown that ADC has the potential to differentiate high-grade from low-grade DCIS.

ADC also performed worse in non-mass than in mass lesions in our study; however, this difference was only significant when using ADCmean (independent of ROI placement approach) and ADCmax obtained from the small 2D-ROIs. The inferiority of ADCmean in non-mass lesions is probably due to partial volume effects: nearly half the malignant non-mass lesions in our study were DCIS, which frequently presents as a non-mass lesion. This tumour type stems from the ductal epithelium of the breast and consists of neoplastic cells, but does not infiltrate through the basal membrane. Thus, the tissue between ducts is less affected by tumour growth, leading to an increased ADCmean, while ADCmin remains low.

Based on our results, it was not possible to find an optimal combination of ROI-placement approach and ADC-parameter for the differentiation of benign and malignant non-mass lesions. While ADCmin obtained from the large 2D- and the 3D-ROI performed best, their performance was not significantly better than most of the other possible combinations. This can possibly be attributed to the relatively low number of non-mass lesions (n = 29) in our patient cohort.

One crucial factor for the usefulness of a quantitative imaging method is its reproducibility [34]. In our study, ADCmin and ADCmean generally showed high inter- and intra-reader agreement, independent of the ROI-placement approach. This is in agreement with other studies, which reported that the reproducibility of ADC measurements is high. Bogner et al. [35] showed good inter-reader agreement when using small 2D-ROIs and ADCmean. Dijkstra et al. [36] found no significant difference between the ADC-values of breast lesions, as assessed by three different readers, using whole tumour ROIs. However, the authors did not report whether the whole lesion was measured two- or three-dimensionally. Two-dimensional ROIs covering the whole lesion on one slice have been reported to provide the best reproducibility when measuring the ADCmean of breast lesions [27, 37], which is in agreement with our results. To our knowledge, only Nogueira et al. have reported better inter-reader agreement for small 2D-ROIs [28]. However, none of these studies have reported significance levels for the differences in reproducibility.

In rectal cancer, Lambregts et al. [15] reported the best reproducibility when using 3D-ROIs that covered the whole lesion (ICC .91), compared to large- (ICC .53) and small 2D-ROIs (ICC .63). This can probably be attributed to the different tumour morphology of rectal cancer compared to breast lesions. While rectal cancer often is a rather clearly demarcated tumour growing into the lumen of the rectum, breast lesions may present as mass or non-mass lesions, the latter being much harder to exactly demarcate. Furthermore, in that study, the small 2D-ROI was not placed in the region with the lowest ADC, but in the most solid tumour part based on the T2w-images. In our study, agreement was only weak-to-moderate when measuring ADCmax, further limiting the usefulness of this parameter.

Measurement time is another important factor that limits the applicability of an imaging method. In our study, median measurement times were lowest for the placement of the small 2D-ROIs. Placement of the large 2D-ROIs took significantly longer; however, in absolute values, the difference was only 2 s, which can probably be considered negligible. Placement of the 3D-ROIs often took up to several minutes, especially when dealing with large, heterogeneous tumours, substantially reducing the feasibility of 3D-ROIs in a clinical setting. Measurement times correlated significantly with tumour size for all ROI-placement approaches: in large tumours, it took the readers time to identify the region with the lowest ADC-values to place the small ROI, and it was often necessary to draw complex ROIs around necrotic lesion parts to place the large 2D-ROI and 3D-ROI. Thus, measurement time also contributes to the fact that 2D-ROIs are more usable than 3D-ROIs.

In the small 2D-ROIs, ROI sizes showed significant inter- and intra-reader differences. The readers were not given any specifications about minimum or maximum ROI size and the ROI was to be placed to cover the region with the lowest ADC identified by visual assessment. Especially in large, heterogeneous tumours, it is quite probable that the readers did not always identify the same region as the one with the lowest ADC, thus leading to differently sized small 2D-ROIs. Furthermore, the readers were free to adjust the windowing levels to their convenience, which may have influenced the readers’ perception of the extent of the region with the lowest ADC. Finally, since lesion borders were to be avoided to minimize partial volume effects, different extents of this “safety margin” may have altered ROI size.

In the large 2D-ROIs, ROI size was significantly different only between one pair of readers. This can also be attributed to the effects mentioned above, since this ROI was placed on the slice with the lowest overall ADC, again identified by visual assessment. No significant differences were found between the readers in the 3D-ROIs, since no visual identification of a specific tumour part was necessary and the often-large extent of the ROI reduced the effect of the aforementioned safety margins.

We performed DWI using two b-values (50 and 850 s/mm²). Despite some discussion about whether the use of more b-values could better represent the signal decline in DWI with increasing b-values, and lead to more accurate ADC-values, it has been shown [12, 38] that this does not alter the capability of ADC to differentiate between benign and malignant breast lesions. A recent meta-analysis by Dorrius et al. [9] has, however, shown that at 1.5 T, the b-values used significantly influence the measured ADC, which is another factor that has to be taken into account when trying to compare quantitative results in breast DWI.

One limitation of this study is its retrospective nature. This implies that the actual ADC measurements were not used in clinical decision-making. Consequently, our study does not provide any data about whether patient management would be altered by different ROI-measurement approaches. In addition, a spatial correlation of areas with specific (e.g. low or minimum) ADC-values and histopathological findings, such as cellularity, fibrosis or vascularization, could not be performed. This, however, is difficult to realize in clinical practice, as there is, as yet, no accurate DWI-guided core needle biopsy designed to subsample specific regions in suspicious breast lesions. Rather, MRI-only lesions are biopsied using 9-G vacuum-assisted biopsy, which is targeted based on contrast-enhanced images [23], as most DWI sequences show some amount of spatial distortion [35, 39]. A systematic approach in the defined group of patients, as employed in this study, allowed us to investigate and compare different ROI measurements and identify those most promising for clinical application. A prospective approach would therefore not have led to different results given the rationale and methods of this study.

Another limitation is that we did not evaluate the performance of non-manually drawn, geometric ROIs, as has been performed in some previous studies [16, 40, 41]. While these ROIs might have an advantage with regard to the speed of ROI placement, it would seem that placing a strictly geometric ROI in an irregularly shaped lesion would increase the pitfall of partial volume effects, thus leading to altered measurement results.

Furthermore, our patient cohort contains a relatively high proportion of malignant lesions compared to benign lesions. This is due to the fact that our institution is a major breast care centre and many patients are referred either because of symptomatic breast tumours or for second-look examinations after imaging abnormalities were detected in other institutions. Thus, our study population was not recruited from a screening population, leading to an increased fraction of malignant lesions.

In conclusion, the results of ADC-value measurements in breast tumours are significantly influenced by ROI-placement. ADCmin and ADCmean acquired from 2D-ROIs performed best for the differentiation of benign and malignant breast lesions, and are highly reproducible, with rapid measurement. 3D-ROIs and ADCmax perform significantly worse.