Introduction

Dynamic magnetic resonance imaging (MRI) of the breast has been reported to be a highly sensitive method for the detection of invasive breast cancer. However, the method yields a large number of false positive results [1]. Therefore, the signal-intensity (SI) time course after injection of contrast agent has been used to evaluate the perfusion characteristics of enhancing breast lesions. Previous studies reported that certain types of SI time courses are correlated with benign or malignant histology [2–5]. These authors described absent or minor contrast enhancement with a continuous signal increase predominantly in benign breast lesions, whereas malignant lesions typically showed rapid and intense contrast enhancement followed by a wash-out phase. Differentiating benign from malignant lesions proved difficult when a rapid contrast increase was followed by a plateau phase.

Neural networks can be used to analyze complex patterns of SI time series in dynamic MRI [6–9]. The present study investigated the application of neural networks in dynamic breast MRI. Since one lesion might consist of different cell populations, a subdifferentiation of SI time courses within the lesion might improve the accuracy of the method. Hierarchical vector quantization (VQ) is a method to create subpopulations of pixels with similar SI time courses. With this method, several clusters of prototypical time series can be calculated. The purpose of the study was to evaluate whether a subclassification of the different time series within one lesion could improve the accuracy of dynamic breast MRI in indeterminate breast lesions.

Materials and methods

Patients

From 1997 to 2003, we performed dynamic breast MRI in 399 patients with indeterminate mammographic breast lesions. All patients were consecutively selected after clinical examination, mammography in standard projections (craniocaudal and oblique mediolateral projections), and ultrasound. Only lesions classified as Breast Imaging Reporting and Data System (BIRADS) III were selected for dynamic breast MRI. In addition, at least one of the following criteria had to be present: nonpalpable lesion, previous surgery with intense scarring, or location difficult for biopsy (e.g., close to the chest wall). Patients with contraindications for MRI (e.g., pacemaker, metallic implants) were excluded from the study. All patients gave their informed consent for dynamic breast MRI.

The acquired MRI studies were evaluated in a retrospective analysis. A total of 273 lesions showed no or minor contrast enhancement in dynamic MRI. In two of these lesions, a ductal carcinoma in situ (DCIS) was confirmed. All other lesions could be verified as benign in follow-up examinations or biopsies. These lesions were not evaluated with neural networks. We only included lesions with an initial signal increase of at least 50% for VQ. MRI data sets with major movement artefacts or missing follow-up data (n = 34) were excluded.

MR data sets of 92 breast lesions in 88 patients were further evaluated with VQ. The mean age of these patients was 53 (median 52) years at the time of data acquisition. According to the study protocol, histological evaluation of the lesions was required for data correlation. In 13 patients, follow-up examinations of at least 24 months documented the benign nature of the lesion.

Magnetic resonance imaging

MRI was performed with a 1.5 Tesla system (Magnetom Vision, Siemens, Germany) equipped with a dedicated surface coil to enable simultaneous imaging of both breasts. The patients were placed in a prone position. First, transversal images were acquired with a short-TI inversion recovery (STIR) sequence (TR = 5,600 ms, TE = 60 ms, FA = 180°, TI = 150 ms, matrix size 256×256 pixels, slice thickness 4 mm). Then a dynamic T1-weighted gradient echo sequence (3-D fast low-angle shot sequence; TR = 12 ms, TE = 5 ms, FA = 25°, matrix size 256×256 pixels, FOV 350 mm, effective slice thickness 4 mm) was performed in transversal slice orientation. The direction of phase encoding was selected right to left in order to avoid pulsation artefacts. The dynamic study consisted of six measurements, each with a measurement time of 83 s and an interval of 110 s. The first frame was acquired before injection of paramagnetic contrast agent (gadopentetate dimeglumine, 0.1 mmol/kg body weight, Magnevist, Schering, Germany) and was immediately followed by the five other measurements. The initial localization of suspicious breast lesions was performed by computing subtraction images, i.e., subtracting the image data of the first acquisition from those of the fourth.

Conventional data analysis

As a clinical standard method to analyze dynamic MRI of the breast, a region of interest (ROI) surrounding the contrast enhancing lesion was defined. For all voxels belonging to this ROI, an average SI time curve was computed. Kuhl et al. reported in 1999 different patterns of SI time courses and correlated them with the histological diagnoses of the respective lesions [5]. They defined three different types of SI time curves. Type I represented a linear (Ia) or asymptotic (Ib) function, where the SI increases during total data acquisition. Type II showed a rapid initial signal increase with a postinitial plateau phase. Type III was characterized by a wash-out phenomenon during the intermediate or late postcontrast phase. The correlation of these different types with the histological findings showed a predominance of type I in the group of benign lesions whereas type II and type III lesions were predominantly correlated with malignant lesions.

In the present study, we selected only lesions with an initial contrast enhancement ≥50% for comparative analysis between the standard evaluation method and VQ. Therefore, we used a semiautomatic segmentation method to define an ROI including all voxels of a lesion with an initial contrast enhancement of ≥50%. The initial contrast enhancement was calculated according to the following equation:

$$\frac{\text{SI}_{\text{2nd frame post-contrast}} - \text{SI}_{\text{pre-contrast}}}{\text{SI}_{\text{pre-contrast}}} \times 100\ [\%]$$

The center of the lesion was interactively marked on one slice of the subtraction images. In a second step, a region-growing algorithm included all adjacent contrast-enhancing voxels. For conventional data analysis, we calculated the mean initial signal increase and the mean postinitial signal course of all voxels included in the ROI. The postinitial signal course was calculated according to the following equation:

$$\frac{\text{SI}_{\text{5th frame post-contrast}} - \text{SI}_{\text{maximum 1st to 2nd frame post-contrast}}}{\text{SI}_{\text{maximum 1st to 2nd frame post-contrast}}} \times 100\ [\%]$$

Subsequently, we evaluated the curves according to the classification described above. For further statistical evaluation, we defined two different thresholds of postinitial signal change (±5% and ±10%) to classify a plateau.
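The two equations and the plateau thresholds above can be illustrated with a minimal Python sketch. It is not the software used in the study. The function names are hypothetical, the six-frame series is assumed to be stored with the precontrast frame at index 0 and the five postcontrast frames at indices 1 to 5, and the rule that a postinitial change within the plateau range (inclusive) counts as type II is an assumption.

```python
def initial_enhancement(si):
    """Initial signal increase in percent, relative to the precontrast frame."""
    pre, second_post = si[0], si[2]  # assumed layout: index 0 = precontrast
    return (second_post - pre) / pre * 100.0


def postinitial_change(si):
    """Postinitial signal course in percent, relative to the early peak."""
    early_peak = max(si[1], si[2])  # maximum of 1st and 2nd postcontrast frame
    return (si[5] - early_peak) / early_peak * 100.0


def classify_curve(si, plateau=10.0):
    """Return 'I' (continuous increase), 'II' (plateau), or 'III' (wash out).

    `plateau` is the half-width of the plateau range in percent (5 or 10 in
    the text); treating the boundary as part of the plateau is an assumption.
    """
    change = postinitial_change(si)
    if change > plateau:
        return "I"
    if change < -plateau:
        return "III"
    return "II"


if __name__ == "__main__":
    curve = [100.0, 180.0, 200.0, 205.0, 198.0, 184.0]  # made-up example values
    print(initial_enhancement(curve), postinitial_change(curve))
    print(classify_curve(curve, plateau=10.0), classify_curve(curve, plateau=5.0))
```

With the made-up curve in the example, the same lesion is read as a plateau under the ±10% threshold and as a wash out under the ±5% threshold, which is why both thresholds are evaluated separately.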

Algorithm of vector quantization

VQ represents a fast clustering technique grouping image pixels together based on the similarity of their intensity profile in time. In the clustering process, a time course with n points is represented by one point in an n-dimensional Euclidean space, which is subsequently partitioned into clusters based on the proximity of the input data. These groups or clusters are represented by prototypical time series called codebook vectors (CV) located at the center of the corresponding clusters [6, 7, 10–12]. VQ approaches determine the cluster centers \(\overrightarrow{w}_{i}\) by an iterative adaptive update based on the following equation:

$${\overrightarrow{w}} _{i} {\left( {t + 1} \right)}\, = \,{\overrightarrow{w}} _{i} {\left( t \right)} + \varepsilon {\left( t \right)}a_{i} {\left( {{\overrightarrow{x}} {\left( t \right)},C{\left( t \right)},\kappa } \right)}{\left( {{\overrightarrow{x}} {\left( t \right)} - {\overrightarrow{w}} _{i} {\left( t \right)}} \right)}$$
(1)

where \(\varepsilon(t)\) represents the learning parameter, \(a_{i}\) a codebook \(C(t)\)-dependent cooperativity function, \(\kappa\) a cooperativity parameter, and \(\overrightarrow{x}\) a randomly chosen feature vector.

The fuzzy clustering technique based on deterministic annealing represents a powerful tool for the analysis of MRI SI time courses. The update equation for the CVs based on this VQ approach can be derived from Eq. (1). The cooperativity function \(a_{i}\) is given by:

$$a_{i} {\left( {{\overrightarrow{x}} } \right)} = \frac{{e^{{ - \frac{{E_{i} {\left( {{\overrightarrow{x}} {\left( t \right)}} \right)}}}{{2\rho ^{2} {\left( t \right)}}}}} }}{{{\sum\limits_{i = 1}^N {e^{{ - \frac{{E_{i} {\left( {{\overrightarrow{x}} {\left( t \right)}} \right)}}}{{2\rho ^{2} {\left( t \right)}}}}} } }}}$$
(2)

Here, \(\rho\) is the “fuzzy range” of the model. It defines a length scale in data space and is annealed to successively smaller values in the VQ approach. In the parlance of statistical mechanics, \(\rho\) represents the temperature \(T\) of a multiparticle system by \(T = 2\rho^{2}\). The cooperativity function \(a_{i}\) is the so-called softmax activation function; accordingly, its outputs lie in the interval [0, 1] and sum to one. The resulting learning rule for fuzzy clustering based on deterministic annealing is given below.

$${\overrightarrow{w}} _{i} {\left( {t + 1} \right)} = {\overrightarrow{w}} _{i} {\left( t \right)} + \varepsilon {\left( t \right)}\frac{{e^{{ - \frac{{E_{i} {\left( {{\overrightarrow{x}} {\left( t \right)}} \right)}}}{{2\rho ^{2} {\left( t \right)}}}}} }}{{{\sum\limits_{i = 1}^N {e^{{ - \frac{{E_{i} {\left( {{\overrightarrow{x}} {\left( t \right)}} \right)}}}{{2\rho ^{2} {\left( t \right)}}}}} } }}}{\left( {{\overrightarrow{x}} {\left( t \right)} - {\overrightarrow{w}} _{i} {\left( t \right)}} \right)}$$
(3)

This learning rule describes a stochastic gradient descent on an error function that is a free energy in a mean-field approximation. The algorithm starts with one cluster representing the center of the whole data set. Gradually, the large clusters split up into smaller ones representing smaller regions in the feature space. This represents a major advantage over fuzzy c-means clustering, since deterministic annealing does not employ prespecified cluster centers.
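The learning rule of Eq. (3) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the C implementation used in the study: the error term \(E_{i}\) is taken to be the squared Euclidean distance between the feature vector and the codebook vector, \(\rho\) follows an exponential cooling schedule, \(\varepsilon\) is kept constant, and all codebook vectors start at the data mean with a tiny random jitter so that they can split as \(\rho\) decreases. All function and parameter names are hypothetical.

```python
import math
import random


def softmax_cooperativity(x, codebook, rho):
    """Eq. (2): softmax over -E_i / (2 rho^2), with E_i assumed to be the
    squared Euclidean distance between x and codebook vector w_i."""
    energies = [sum((xk - wk) ** 2 for xk, wk in zip(x, w)) for w in codebook]
    e_min = min(energies)  # constant shift for numerical stability; cancels out
    weights = [math.exp(-(e - e_min) / (2.0 * rho ** 2)) for e in energies]
    total = sum(weights)
    return [wt / total for wt in weights]


def anneal_vq(data, n_codebook=4, rho_start=1.0, rho_end=0.05,
              epsilon=0.05, steps=20000, seed=0):
    """Stochastic update of Eq. (3) while rho is lowered (assumed schedule)."""
    rng = random.Random(seed)
    dim = len(data[0])
    mean = [sum(x[k] for x in data) / len(data) for k in range(dim)]
    # All codebook vectors start at the center of the data set (plus jitter).
    codebook = [[m + rng.gauss(0.0, 1e-3) for m in mean]
                for _ in range(n_codebook)]
    for t in range(steps):
        frac = t / max(steps - 1, 1)
        rho = rho_start * (rho_end / rho_start) ** frac  # exponential cooling
        x = rng.choice(data)  # randomly chosen feature vector
        a = softmax_cooperativity(x, codebook, rho)
        for i, w in enumerate(codebook):
            for k in range(dim):
                w[k] += epsilon * a[i] * (x[k] - w[k])
    return codebook


if __name__ == "__main__":
    toy = [(0.0, 0.0)] * 20 + [(1.0, 1.0)] * 20  # made-up two-cluster data
    for w in anneal_vq(toy, n_codebook=2):
        print([round(v, 3) for v in w])
```

At large \(\rho\) every codebook vector receives nearly the same weight and stays close to the data mean; as \(\rho\) shrinks, the softmax sharpens toward a winner-takes-all assignment and the vectors separate.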

Data processing

The Digital Imaging and Communications in Medicine (DICOM) images were converted into a binary data format. Further processing was performed with a software application based on IDL (Research Systems, Inc., Boulder, CO, USA). In a first step, the respective lesion had to be segmented from the total image information. In an automated step, all pixels with an initial signal increase of at least 50% were marked. Subsequently, the respective lesion was extracted with a region-growing algorithm. With this method, a voxel within the lesion was marked by mouse click, and all connected voxels, including those from neighboring slices, were selected (Fig. 1a,b). Thus, the 3-D extension of the lesion was taken into account. Limitations became apparent when the lesion was connected with diffuse contrast enhancement, as is sometimes present in mastopathic tissue. In such cases, interactive ROI definition was performed.
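The two segmentation steps described above (threshold marking followed by region growing from a seed voxel) can be sketched as follows. This is a simplified illustration, not the IDL application used in the study. Voxels are assumed to be addressed by (slice, row, column) tuples, the second postcontrast frame is assumed to sit at index 2 of each time series, and 6-connectivity is an assumption, since the text only states that connected voxels in neighboring slices were included. All names are hypothetical.

```python
from collections import deque


def mark_enhancing(series_by_voxel, threshold=50.0):
    """Return the set of voxels whose initial signal increase is >= threshold (%)."""
    marked = set()
    for voxel, si in series_by_voxel.items():
        increase = (si[2] - si[0]) / si[0] * 100.0  # index 0 = precontrast
        if increase >= threshold:
            marked.add(voxel)
    return marked


def region_grow(enhancing, seed):
    """Collect all marked voxels connected to the seed (assumed 6-connectivity)."""
    if seed not in enhancing:
        return set()
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    lesion = {seed}
    queue = deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in neighbours:
            nb = (z + dz, y + dy, x + dx)
            if nb in enhancing and nb not in lesion:
                lesion.add(nb)
                queue.append(nb)
    return lesion


if __name__ == "__main__":
    series = {  # made-up example values
        (0, 0, 0): [100, 150, 160, 160, 160, 160],
        (1, 0, 0): [100, 150, 155, 155, 155, 155],  # neighboring slice
        (1, 0, 1): [100, 120, 130, 130, 130, 130],  # below threshold
        (1, 0, 2): [100, 180, 200, 200, 200, 200],  # enhancing but not connected
    }
    print(sorted(region_grow(mark_enhancing(series), (0, 0, 0))))
```

The number of voxels in the returned set corresponds to the lesion size as it is used later in the quantitative analysis.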

Fig. 1 Semiautomatic region of interest (ROI) definition: a all pixels with a signal increase ≥50% were marked. b The respective lesion was extracted with a region-growing algorithm

The algorithm of VQ was applied to the selected voxels within the respective lesion, each voxel representing a time series of relative SI change according to

$$x{\left( \tau \right)} = \frac{{S{\left( \tau \right)} - S{\left( {\tau = 1} \right)}}} {{S{\left( {\tau = 1} \right)}}}$$

where \(S{\left( \tau \right)}\), \(\tau \in {\left\{ {1, \cdots ,6} \right\}}\), represents the measured raw SI time series; that is, the signal intensity of the precontrast scan at \(\tau = 1\) serves as a reference in the sense of an implicit normalization. This may make our approach less sensitive to differences between MR scanners and/or protocols; nevertheless, a residual dependence on contrast agent dosage may be observed.

Minimal free-energy VQ was implemented in the computer language C [11]. The details of the method were described previously [7]. To retain information about the signal amplitude, we did not apply additional normalization to the SI time curves. The computer application assigns the SI time courses of all voxels to a number of prototypical time series (CVs; Fig. 2) using the method of minimal distance. As the mean lesion diameter was 14 mm, corresponding to a median of 81 voxels, we set the number of CVs to n=4. Those voxels assigned to one CV were superimposed on the morphological images (cluster assignment maps, CAMs; Fig. 3).
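The normalization to relative SI change and the assignment by minimal distance can be sketched as follows. The sketch assumes Euclidean distance as the distance measure and uses hypothetical function names; the codebook vectors are taken as given, for example as the result of the clustering step.

```python
def relative_series(raw):
    """x(tau) = (S(tau) - S(1)) / S(1), with the precontrast frame as reference."""
    base = raw[0]
    return [(s - base) / base for s in raw]


def nearest_codebook_index(series, codebook):
    """Index of the codebook vector with minimal (assumed Euclidean) distance."""
    def sq_dist(w):
        return sum((a - b) ** 2 for a, b in zip(series, w))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def cluster_assignment_map(raw_by_voxel, codebook):
    """Map each voxel to the index of its closest codebook vector."""
    return {voxel: nearest_codebook_index(relative_series(raw), codebook)
            for voxel, raw in raw_by_voxel.items()}


if __name__ == "__main__":
    codebook = [  # made-up prototypical time series of relative SI change
        [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],  # continuous increase
        [0.0, 0.5, 1.0, 1.0, 1.0, 1.0],  # rapid increase with plateau
    ]
    voxels = {
        (0, 0, 0): [100, 150, 200, 200, 200, 200],
        (0, 0, 1): [100, 110, 120, 130, 140, 150],
    }
    print(cluster_assignment_map(voxels, codebook))
```

Because only the precontrast reference is divided out, two voxels with the same curve shape but different enhancement amplitudes can still end up in different clusters, which matches the stated intention to keep amplitude information.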

Fig. 2 Example of ductal cancer: four prototypical signal intensity (SI) time series [codebook vectors (CV)]

Fig. 3 Example of ductal cancer: corresponding cluster assignment maps (CAM) of the voxels assigned to the four codebook vectors (CV) (displayed in Fig. 2) in one representative slice

Quantitative analysis and statistics

All data were summarized in a PostScript file. Lesion size was described as the number of voxels with an initial SI increase of ≥50%. The interobserver variability of the semiautomatic segmentation procedure was assessed in a group of 20 lesions. For this purpose, ten benign and ten malignant randomly selected lesions were segmented by two unbiased observers (MS, TS). The resulting lesion sizes were compared.

Four CVs were calculated for each lesion. For each CV, the initial signal increase and the postinitial signal course were calculated. Subsequently, the CVs were classified according to their signal course: type I (continuous increase), type II (plateau), or type III (wash out). The CV of each lesion with the highest classification level (most typical for malignancy) was used for further statistical evaluation.
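Selecting the CV with the highest classification level can be written as a simple maximum over the four CVs of a lesion. In the sketch below, each CV is represented by a hypothetical (initial increase in percent, curve type) pair; breaking ties between CVs of the same type by the larger initial increase is an assumption that the text does not specify.

```python
TYPE_RANK = {"I": 1, "II": 2, "III": 3}  # higher rank = more typical for malignancy


def most_suspicious_cv(cvs):
    """Return the (initial_increase, curve_type) pair with the highest type.

    Ties between CVs of equal type are resolved by the larger initial
    increase; this tie-break is an assumption.
    """
    return max(cvs, key=lambda cv: (TYPE_RANK[cv[1]], cv[0]))


if __name__ == "__main__":
    lesion_cvs = [(60.0, "I"), (95.0, "II"), (140.0, "III"), (80.0, "II")]  # made up
    print(most_suspicious_cv(lesion_cvs))
```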

The results of VQ were compared with conventional analysis. For this purpose, the initial signal increase was varied in three steps (≥50%, ≥75%, ≥100%) as a threshold for malignancy (Fig. 4). We defined two different thresholds of the postinitial signal change (±5% and ±10%) to classify a plateau. Then two thresholds of malignancy were defined according to the classification of the SI time course (≥type II or type III only). Sensitivity, specificity, and accuracy were calculated for the conventional method and VQ, taking all combinations of the thresholds described above into consideration.
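The threshold-based decision and the three performance measures can be sketched as follows. The function names and the toy data are hypothetical and serve only to show how one combination of thresholds is evaluated; they do not reproduce the study results.

```python
TYPE_RANK = {"I": 1, "II": 2, "III": 3}


def call_malignant(initial_increase, curve_type, min_increase=75.0, min_type="II"):
    """Call a lesion malignant if both thresholds of malignancy are reached."""
    return (initial_increase >= min_increase
            and TYPE_RANK[curve_type] >= TYPE_RANK[min_type])


def diagnostic_metrics(calls, truth):
    """Sensitivity, specificity, and accuracy from paired boolean lists.

    Assumes at least one malignant and one benign lesion in `truth`.
    """
    tp = sum(1 for c, t in zip(calls, truth) if c and t)
    tn = sum(1 for c, t in zip(calls, truth) if not c and not t)
    fp = sum(1 for c, t in zip(calls, truth) if c and not t)
    fn = sum(1 for c, t in zip(calls, truth) if not c and t)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(truth),
    }


if __name__ == "__main__":
    # (initial increase in %, curve type, malignant histology) -- made-up data
    lesions = [(120.0, "III", True), (80.0, "II", True), (60.0, "II", True),
               (90.0, "I", False), (110.0, "II", False)]
    calls = [call_malignant(inc, typ) for inc, typ, _ in lesions]
    truth = [mal for _, _, mal in lesions]
    print(diagnostic_metrics(calls, truth))
```

Repeating the evaluation for every combination of `min_increase` (50, 75, 100), plateau range, and `min_type` ("II" or "III") yields the kind of grid of results that the threshold comparison calls for.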

Fig. 4 Classification of signal intensity (SI) time curves

Results

All lesions (n=92) with an initial signal increase ≥50% after contrast injection were included in the comparative analysis of the conventional method and cluster analysis. Histological findings were malignant in 51 and benign in 28 lesions. Another 13 lesions were classified as benign by follow-up examinations. The size of each lesion was measured on the mammographic images. The mean size of all 92 lesions was 1.4 cm (benign 1.2±0.7 cm, malignant 1.5±1.2 cm). On MRI, the benign lesions showed a median number of 61 voxels compared with 97 voxels in the malignant lesions. According to the Tumor, Node, Metastases (TNM) classification, the malignant lesions were subdivided into 7 Tis, 2 T1a, 18 T1b, 18 T1c, 3 T2, 2 T3, and 1 T4 tumors.

In this study, lesion segmentation was performed semiautomatically, and the complete study was evaluated by a single observer. To test the stability of lesion segmentation, we analyzed the interobserver variability in a sample of 20 lesions. Fig. 5 shows a plot of the lesion sizes (number of voxels included in each lesion) determined by two unbiased observers. A high correlation (\(R^{2} = 0.98\)) was found.
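A coefficient of this kind can be computed from the paired lesion sizes of the two observers. The sketch below assumes that the reported value is the squared Pearson correlation coefficient, which the text does not state explicitly; the function name and the example numbers are hypothetical.

```python
def r_squared(sizes_a, sizes_b):
    """Squared Pearson correlation between two paired lists of lesion sizes."""
    n = len(sizes_a)
    mean_a = sum(sizes_a) / n
    mean_b = sum(sizes_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(sizes_a, sizes_b))
    var_a = sum((a - mean_a) ** 2 for a in sizes_a)
    var_b = sum((b - mean_b) ** 2 for b in sizes_b)
    return cov * cov / (var_a * var_b)


if __name__ == "__main__":
    observer_1 = [45, 61, 80, 97, 150]  # made-up voxel counts
    observer_2 = [47, 58, 83, 101, 144]
    print(round(r_squared(observer_1, observer_2), 3))
```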

Fig. 5 Correlation of lesion size defined by two unbiased observers with semiautomatic lesion segmentation

The results of the conventional method in comparison with VQ were correlated with the histological findings (Tables 1 and 2). The analyses shown in these tables represent a threshold of malignancy ≥type II, including all lesions with an initial signal increase ≥50% and a plateau phase defined with a postinitial signal change of ±10%. Sensitivity increased with VQ in DCIS, in invasive ductal, and especially in lobular carcinoma. Specificity slightly decreased with VQ in mastopathic epithelial proliferations, fibroadenomas, and scars.

Table 1 Comparison of the conventional method and vector quantization (VQ) in the detection of malignant lesions
Table 2 Comparison of the conventional method and vector quantization (VQ) in the evaluation of benign lesions

In a second step, we tried to optimize the accuracy of both methods. The results for different thresholds of malignancy are summarized in Tables 3 and 4.

Table 3 Results of the conventional analysis method for detection of breast cancer
Table 4 Results of the vector quantization (VQ) method for detection of breast cancer

With the conventional method, highest accuracy in detecting breast cancer was achieved when the threshold of the initial signal increase was ≥75%, and when the plateau phase (type II) in the postinitial course was within a range of ±10%. Under these conditions, accuracy was 71% when the threshold of malignancy was defined ≥type II. With VQ, a maximum accuracy of 75% was achieved when the threshold of the initial signal increase was ≥75%, when the plateau phase (type II) in the postinitial course was within a range of ±5%, and when the threshold of malignancy was defined ≥type II. The increase of accuracy with VQ resulted predominantly from an increase of sensitivity.

Discussion

MR mammography is a highly sensitive method for the detection of invasive breast cancer but is limited by a high rate of false positive findings [4, 13–17]. Especially in premenopausal breast parenchyma, a dependency of contrast enhancement on the phase of the menstrual cycle was observed [18, 19]. Intraepithelial mastopathic proliferations can show increased contrast enhancement, which can result in false positive findings. On the other hand, malignant lesions could be masked by diffuse contrast enhancement of the surrounding parenchyma [20, 21]. The detection and characterization of focal lesions in dynamic breast MRI depends on the experience of the observer. The goal of our study was to develop a computer-aided method for the evaluation of focal contrast-enhancing lesions. Therefore, we compared an established classification method with semiautomatic analysis of SI time series based on VQ.

Conventional evaluation of SI time series of breast lesions requires the elimination of movement artefacts. With cluster analysis, reduction of movement artefacts becomes even more important, as VQ is based on the analysis of pixel time series. Therefore, patients received detailed information prior to the MRI examination and were carefully positioned into the surface coil. Programs for movement correction are limited by the deformability of the breast tissue and were not used in this study [22, 23].

For VQ, hierarchical methods such as minimal-free-energy VQ and nonhierarchical methods such as self-organizing maps (SOMs) or fuzzy clustering are available. Deterministic annealing by the minimal-free-energy VQ algorithm can serve as a useful method to unveil the properties of image sequences by gradually resolving the fine-grained structure of the data set in an unsupervised repetitive cluster-splitting process. Potential advantages of the minimal-free-energy VQ compared with nonhierarchical methods have been reported [7]. Deterministic annealing allows data exploration on different scales of clustering resolution by gradually increasing the number of cluster centers in a self-organized, solely data-driven procedure. A further advantage of deterministic annealing seems to be an increase of result quality concerning cost function criteria [24, 25]. Finally, minimal-free-energy VQ offers a high reproducibility, meaning that the results are less dependent on the selection of starting conditions, compared with nonhierarchical methods.

Prior to the clustering process, the number of CVs has to be defined. If the number of CVs is chosen too small, interesting phenomena involving only a small subset of pixels cannot be detected in a suitable clustering resolution. If it is chosen too large, an overly detailed subclassification can mask the major properties of the results. There is no general answer to this problem of so-called “cluster validity.” In our study, we evaluated groups of voxels that were preselected by segmentation using the initial signal increase. The lesions mainly included small numbers of voxels. Therefore, we defined a number of four CVs for subdifferentiation.

Lesion segmentation was performed semiautomatically. User-dependent steps during lesion segmentation included the selection of the lesion, which had to take previously obtained clinical, mammographic, and sonographic findings into account. In general, the margins of the lesions were automatically defined. Limitations of this method were observed in some lesions that were surrounded by diffuse contrast-enhancing tissue. In these cases, voxel connections between the lesion and the surrounding tissue had to be interactively removed. Nevertheless, the lesion sizes determined by different observers were highly correlated.

The conventional method was compared with the results of VQ. With the conventional method, the best accuracy (71%) in detecting breast cancer was achieved when a threshold of the initial signal increase ≥75% was defined for malignancy. In addition, lesions were classified as malignant by an SI course ≥type II (plateau or wash out). The plateau was defined as a postinitial signal change of ±10%. With VQ, maximum accuracy (75%) was observed when the threshold of the initial signal increase was ≥75%, the SI course was classified ≥type II, and the plateau was defined as a postinitial signal change of ±5%. The slight improvement using VQ was mainly achieved by an increase of sensitivity, especially in invasive lobular carcinoma and DCIS. However, specificity slightly decreased.

For the detection of invasive carcinomas, previous studies reported sensitivities from 90% to almost 100%, whereas specificity was between 75% and 85% [26–28]. In DCIS, detection rates of 70–80% were described [29–31]. In our investigation, sensitivity was lower compared with other studies. When comparing different results, one has to distinguish between studies that include predominantly highly suspicious lesions (BIRADS 4 to 5) and others that include subtle indeterminate lesions (BIRADS 3). In our study population (n=399), there was a high number of true negative lesions, which showed absent or minor contrast enhancement. For further evaluation with VQ, we included only lesions with an initial signal increase of ≥50% (n=92). This subpopulation mainly contained small lesions.

In the present study, only the SI time series were used for lesion classification. The evaluation of SI time curves is important for differentiating benign from malignant lesions in breast MRI. Morphologic information including shape, borders, internal structure, presence of septation, or rim enhancement can improve the diagnostic accuracy [2]. In particular, a segmental or linear enhancement pattern was found to be highly specific for intraductal neoplastic changes [32]. Szabó et al. applied an artificial neural network that included signal kinetics and morphological characteristics. They found that margin type, time-to-peak enhancement, and wash-out ratio showed the highest discriminative ability among diagnostic criteria [33]. In the present study, we investigated predominantly small lesions in which morphologic criteria can hardly be evaluated. Therefore, kinetic information seems to be more important for evaluation. Further studies should evaluate whether additional information could be obtained by analysis of the CAMs and morphological criteria.

A higher number of data points in the SI time course could be another approach to further improve the classification of dynamic MR mammography by neural networks. Other authors investigated dynamic MRI with a repetition every 23 s over 32 cycles [34]. For discrimination between malignant and benign lesions, the network with the highest number of input nodes performed best, but even with a three-input model, nearly equivalent classification results were found.

Conclusion

The aim of the study was to analyze the use of VQ for the evaluation of dynamic MR mammography. Measurements were performed in a population with mainly small indeterminate mammographic lesions. A slight improvement using VQ was found compared with the conventional evaluation method. Therefore, VQ could be used as a basis for computer-aided diagnosis of breast cancer with MR mammography.