Introduction

Breast cancer is the leading cause of cancer death in women, accounting for approximately 15% of all cancer deaths among women. Worldwide, 570,000 women died of breast cancer in 2015 [1].

Numerous studies have established that early detection of cancer eases treatment, reduces risks, and lowers mortality by 25% [2]. For early detection, mammography is a high-sensitivity imaging tool, and it is the one most recommended by the guidelines of the World Health Organization [3, 4].

A mammogram is the best diagnostic tool for finding a cluster of microcalcifications (MCs) in a glandular duct. MCs appear as white specks [5], and they are early signs of breast cancer. Microcalcification lesions are difficult to detect by eye because their size is between 0.5 and 2 mm. For this reason, the false positive rate of a radiologist is reported at 15% [6] and the false negative rate at 20% [7].

These arguments encourage the implementation of autonomous diagnostic tools that detect early risks of breast cancer by finding MCs on mammograms, so that patients can follow specialized treatment. This work proposes an algorithm that detects microcalcifications on mammograms based on morphological processing, machine learning, and a very small set of features.

Related work

The work in [8] proposed a microcalcification detection system that applies a Swarm Optimization Neural Network (SONN). The features fed to this classifier were extracted using texture energy measures obtained through a convolutional kernel. The method in [9] proposed a biological adaptive model of contrast detection. This model is based on the human visual system (HVS) and adapts the contrast according to the HVS model. Before this model is applied, the image is filtered based on anisotropic diffusion and curvilinear structure, using local energy and phase congruency. The aim is to reduce false positives due to shot noise or curvilinear structures. In another method [10], microcalcification segmentation is based on the geodesic active contours (GAC) technique combined with anisotropic texture filtering. The authors of [11] proposed microcalcification detection using the two-dimensional discrete wavelet transform. Before the region of interest is segmented, an enhancement step is applied using a logarithmic transformation for dynamic range manipulation. To extract ROIs, a binarization step with an automatic threshold and morphological operations are applied, followed by unsharp masking to enhance the ROI. As a final step, a discrete wavelet transform is applied to detect microcalcifications.

Previous works are characterized by the classifier and the database used to test the proposed approach. Well-known public databases are the Mammographic Image Analysis Society (MIAS) Digital Mammogram Database [12] and the Digital Database for Screening Mammography (DDSM) [13]. Works that used only one of these databases are [6, 8, 9, 11, 14]. Works that used both databases are [17, 19, 20, 21]. Some works, such as [10, 15, 16], did not use a public database. Methods based on KNN classifiers are found in [16, 17, 18], and those based on SVM are found in [8, 16, 19]. Both classifiers are used in [16]. Other classifiers used for MC detection are Fuzzy C-means with Features (FCM-WF) [15] and AdaBoost [6].

This analysis shows significant differences in the databases and the number of images used by each method. One difficulty is that many works do not specify which images of the database were used, so it is not possible to tell whether hard images were left out of the testing set, which makes any comparison unfair. Six of the reviewed works in Table 3 do not specify which images were left out of their analysis.

This work improves on the current state of the art by significantly reducing false positives on dense mammograms. It uses an annulus model and a small set of features, which together lead to an overall improved method for detecting microcalcifications on mammograms compared with previous work.

Material and methods

Mammogram databases

We use the two most popular public databases, MIAS [12] and DDSM [13]. The first database contains 322 mediolateral oblique (MLO) mammograms at a spatial resolution of 50 μm/pixel and 8 bits/pixel. Mammograms are also classified by breast density type: fatty, fatty-glandular, and dense. As shown in Table 1, this database contains 207 normal images (without microcalcification clusters) and 20 images with clusters of microcalcifications (5 fatty, 6 fatty-glandular, 9 dense). Images with microcalcifications are provided along with their corresponding Ground Truth (GT), while normal images have no GT. The ground truth specifies the ROIs, which are clusters of microcalcifications. Those 20 images with specified ground truth contain 25 regions of interest that include microcalcification clusters. From these 25 ROIs, 268 individual microcalcifications were extracted, as explained in the section “Microcalcification Extraction”. Three images (1 fatty, 1 fatty-glandular, and 1 dense) contain only distributed, isolated microcalcifications rather than clusters; they have no Ground Truth specified and are therefore not considered for analysis. The remaining images in the database show other types of lesions without microcalcifications.

Table 1 MIAS breast density

The DDSM database was digitized by four different scanners. Table 2 shows information related to this database. This table does not contain the same type of detailed information as Table 1 because the DDSM images do not include breast density type.

Table 2 DDSM for abnormality

Generation of candidates without microcalcifications

The aim of this part of the method is to generate a training set of 21 × 21-pixel image patches, which correspond to candidates. The number of patches with microcalcifications (candidates to be detected as True Positives) is the same as the number of patches without microcalcifications (candidates to be detected as True Negatives).

To generate patches without microcalcifications, normal images are randomly selected, and one patch is randomly extracted from each selected image. Each normal image is tagged with a number, and to generate a set of randomly selected normal images, n random image tags are drawn from a discrete uniform probability mass function, f(x) = 1/m, where x ∈ {1, 2, …, m} is an image tag number and m is the total number of normal images of the given mammogram density. The parameters (n, m) are (n = 121, m = 76), (n = 97, m = 65), and (n = 50, m = 66) for dense, fatty-glandular, and fatty mammograms, respectively. For each mammogram density, n is chosen so that the number of candidates that are MCs equals the number of candidates that are not. According to Table 1, there are 121 MCs in 9 dense mammograms, 97 MCs in 6 fatty-glandular mammograms, and 50 MCs in 5 fatty mammograms.

Similarly, one pair of random numbers, (r, c), is generated for each randomly selected normal image. Parameters r and c are coordinates of the center of a randomly chosen 21 × 21-pixel patch on the given mammogram.
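The sampling procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image size (1024 × 1024), the fixed seed, and the rule that keeps the whole patch inside the image are not specified in the text, and the function name is hypothetical.

```python
import random

def sample_normal_patch_centers(n, m, height, width, patch=21, seed=0):
    """Draw n (tag, r, c) triples: tag ~ uniform on {1..m}, (r, c) a patch center."""
    rng = random.Random(seed)   # fixed seed only for reproducibility (assumption)
    half = patch // 2           # 10 pixels on each side of the center for 21 x 21
    samples = []
    for _ in range(n):
        tag = rng.randint(1, m)                    # f(x) = 1/m on {1, ..., m}
        r = rng.randint(half, height - 1 - half)   # assumption: patch stays inside
        c = rng.randint(half, width - 1 - half)    # the image borders
        samples.append((tag, r, c))
    return samples

# Dense mammograms: n = 121 patches drawn from m = 76 normal images.
# The 1024 x 1024 image size is a placeholder, not a value from the paper.
dense = sample_normal_patch_centers(n=121, m=76, height=1024, width=1024)
```

Because n can exceed m (121 tags from 76 images), tags are drawn with replacement, so the same normal image may contribute more than one patch.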

Ground truth region extraction

The proposed approach to detecting microcalcifications (MCs) is separated into three main stages: extraction of abnormal clusters or regions of interest (ROIs), extraction of individual candidates from abnormal clusters, and classification of candidates.

Abnormal regions of interest are extracted from the GT information of the database, which was specified by a radiologist by giving (1) the coordinates (x, y) of the center of each cluster of interest, and (2) an approximate radius, in pixels, of a circle enclosing the abnormal cluster, as shown in Fig. 1. Rather than enclosing a cluster by a circle (Fig. 1b), a square is used (Fig. 1c).

Fig. 1 (a) Digital mammogram from MIAS; (b) Ground Truth with a white circle enclosing a cluster of microcalcifications; (c) Ground Truth with a red square enclosing a cluster of microcalcifications
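A minimal sketch of the square ROI extraction follows. It assumes that the image is stored as a list of rows, that x indexes columns and y indexes rows from the top-left corner, and that the square circumscribes the GT circle; the paper does not state these conventions, and the function name is hypothetical.

```python
def square_roi(image, cx, cy, radius):
    """Crop the axis-aligned square circumscribing the GT circle, clipped to the image.

    Assumption: (cx, cy) = (column, row) with the origin at the top-left corner.
    """
    rows, cols = len(image), len(image[0])
    top, bottom = max(0, cy - radius), min(rows, cy + radius + 1)
    left, right = max(0, cx - radius), min(cols, cx + radius + 1)
    return [row[left:right] for row in image[top:bottom]]

# Toy 10 x 10 image whose pixel value encodes its position (row * 10 + column).
toy = [[r * 10 + c for c in range(10)] for r in range(10)]
roi = square_roi(toy, cx=5, cy=4, radius=2)   # 5 x 5 square centered on row 4, column 5
```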

Extraction of microcalcifications candidates from abnormal clusters

Extraction of microcalcifications from an abnormal cluster is separated into four stages: segmentation (Beucher Gradient and enhancement of the gradient image), binarization, feature extraction, and classification. Fig. 2 shows a mammogram region after the different operations are applied to detect microcalcifications.

Fig. 2 Microcalcification cluster segmentation, binarization, feature extraction, and classification

Segmentation

Microcalcification clusters are obtained from the GT, where regions with microcalcifications are specified. These clusters are regions of different sizes. The first two blocks in Fig. 2 (Original image and Ground Truth region extraction) correspond to the mammogram along with an MC cluster specified by the GT. The purpose of the segmentation block is to localize the most rapidly changing borders. This is accomplished in two steps: application of the Beucher Gradient and enhancement of the gradient image.

Beucher gradient

Dilation of a gray-level image enhances bright regions and suppresses dark regions, while erosion enhances dark regions and suppresses bright regions, provided that the area of the suppressed region is smaller than that of the structuring element b(r, c). Both operations are combined in a high-pass filter, the Beucher Gradient [22]. The erosion of a gray-level image f(r, c) by a structuring element b(r, c) at location (r, c) is obtained by selecting the minimum value of f − b inside the region of intersection over which both functions f and b are defined, according to

$$ \left[f\circleddash b\right]\left(r,c\right)=\underset{\left(x,y\right)\in b}{\min}\left\{f\left(r-x,c-y\right)-b\left(x,y\right)\right\} $$
(1)

The dilation of a gray-level image f(r, c) by a structuring element b(r, c) at location (r, c) is defined by finding the maximum value of f + b inside the common region between both, function f and structuring element b, according to

$$ \left[f\oplus b\right]\left(r,c\right)=\underset{\left(x,y\right)\in b}{\max}\left\{f\left(r-x,c-y\right)+b\left(x,y\right)\right\} $$
(2)

For flat structuring elements with zero entries, eroding or dilating a gray-level image with a structuring element consists of finding the minimum or maximum value of the image inside the region bounded by the intersection of the image and the structuring element.

The morphological gradient, Beucher Gradient, is the arithmetic difference between the dilated and the eroded version of the gray level image of interest f(r, c), by a structuring element b(r, c),

$$ g\left(f\left(r,c\right)\right)=\left[f\oplus b\right]\left(r,c\right)-\left[f\circleddash b\right]\left(r,c\right) $$
(3)

The result of applying Beucher Gradient on a mammogram is shown in the upper right part of Fig. 2.
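Equations (1) to (3) can be illustrated with a flat structuring element, for which erosion and dilation reduce to a local minimum and maximum. The sketch below assumes a square (2·half + 1)-pixel flat structuring element and clips the window at the image border; the paper does not report the shape or size of the structuring element it used, so both are placeholders.

```python
def _window(image, r, c, half):
    """Pixels under a square flat structuring element, clipped to the image."""
    rows, cols = len(image), len(image[0])
    return [image[i][j]
            for i in range(max(0, r - half), min(rows, r + half + 1))
            for j in range(max(0, c - half), min(cols, c + half + 1))]

def dilate(image, half=1):
    """Gray-level dilation with a flat element: local maximum (Eq. 2 with b = 0)."""
    return [[max(_window(image, r, c, half)) for c in range(len(image[0]))]
            for r in range(len(image))]

def erode(image, half=1):
    """Gray-level erosion with a flat element: local minimum (Eq. 1 with b = 0)."""
    return [[min(_window(image, r, c, half)) for c in range(len(image[0]))]
            for r in range(len(image))]

def beucher_gradient(image, half=1):
    """Morphological (Beucher) gradient: dilation minus erosion (Eq. 3)."""
    dil, ero = dilate(image, half), erode(image, half)
    return [[d - e for d, e in zip(d_row, e_row)] for d_row, e_row in zip(dil, ero)]

# A single bright pixel on a dark background produces a 3 x 3 response.
spot = [[0] * 5 for _ in range(5)]
spot[2][2] = 9
gradient = beucher_gradient(spot)
```

In this toy case the gradient equals 9 over the 3 × 3 neighborhood of the bright pixel and 0 elsewhere, which shows how the operator highlights abrupt intensity changes.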

Enhancement of gradient image

To improve the quality of the filtered image, a 3×3 median filter is applied; this non-linear filtering technique removes noise while preserving edges. To enhance edges, unsharp masking is applied: a smoothed version of the image, fsmooth(r, c), is subtracted from the original image, which removes the low-frequency components of the signal and yields the high-frequency content,

$$ {f}_{\text{high-pass}}\left(r,c\right)=f\left(r,c\right)-{f}_{smooth}\left(r,c\right) $$
(4)

where the high-pass image component can be used for sharpening by adding it to the original image. Thus, the complete unsharp masking operator is given by

$$ {f}_{sharpen}\left(r,c\right)=f\left(r,c\right)+A\times {f}_{\text{high-pass}}\left(r,c\right) $$
(5)

where A is a scaling constant, set to 0.7. The result of applying median filtering, followed by unsharp masking is shown in the lower right part of Fig. 2.
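The enhancement step can be sketched as a 3×3 median filter followed by Eqs. (4) and (5) with A = 0.7. The smoothing filter used to obtain fsmooth is not specified in the text, so a 3×3 mean filter is assumed here; border pixels use a window clipped to the image, which is also an assumption.

```python
from statistics import median

def _window3(image, r, c):
    """3 x 3 neighborhood of (r, c), clipped at the image border (assumption)."""
    rows, cols = len(image), len(image[0])
    return [image[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]

def _filter3(image, reducer):
    return [[reducer(_window3(image, r, c)) for c in range(len(image[0]))]
            for r in range(len(image))]

def median3(image):
    """3 x 3 median filter."""
    return _filter3(image, median)

def mean3(image):
    """3 x 3 mean filter, used here as the assumed smoothing filter."""
    return _filter3(image, lambda w: sum(w) / len(w))

def unsharp_mask(image, A=0.7):
    """f_sharpen = f + A * (f - f_smooth), i.e. Eqs. (4) and (5)."""
    smooth = mean3(image)
    return [[f + A * (f - s) for f, s in zip(f_row, s_row)]
            for f_row, s_row in zip(image, smooth)]

def enhance(gradient, A=0.7):
    """Median filtering followed by unsharp masking."""
    return unsharp_mask(median3(gradient), A)

# A vertical step edge: unsharp masking overshoots on both sides of the edge.
step = [[0, 0, 10, 10] for _ in range(4)]
sharpened = unsharp_mask(step)
```

On the step edge, the dark side next to the edge drops below 0 and the bright side rises above 10, which is the expected edge-emphasis effect of the operator.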

Binarization

Thresholding is applied to generate a binary image, as depicted in the binarization block of Fig. 2. MCs are characterized by abrupt border changes, and the enhanced gradient image from the segmentation block represents these changes; thus, the top 10% of the gradient values are the most likely to represent MC borders. Therefore, the threshold value for binarization is set at 90% of the highest gray value of the enhanced GT region of interest.
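The thresholding rule can be written directly from this description. In the sketch below, whether a pixel exactly at the threshold is kept (≥ rather than >) is an assumption, and the function name is hypothetical.

```python
def binarize(enhanced, fraction=0.9):
    """Keep pixels at or above `fraction` of the maximum gray value of the region."""
    peak = max(max(row) for row in enhanced)
    threshold = fraction * peak
    # Assumption: pixels exactly at the threshold count as foreground.
    return [[1 if value >= threshold else 0 for value in row] for row in enhanced]

binary = binarize([[10, 100],
                   [91, 89]])
```

With a peak value of 100 the threshold is 90, so only the pixels with values 100 and 91 survive.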

One drawback of binarization is that remaining noise might be misclassified as a microcalcification candidate. To reduce the likelihood of such misclassifications, ROIs with radii smaller than 0.1 mm are eliminated using opening with a disk-shaped structuring element of 0.2-mm diameter. This diameter is chosen because the diameter of the smallest microcalcification is 0.2 mm. The opening of a binary image f(r, c) by a structuring element b(x, y) is given by f ∘ b = (f ⊝ b) ⊕ b, and it eliminates objects smaller than the structuring element. The resolution of the digital mammograms, for both databases, is 50 μm per pixel. Thus, the size of the structuring element, in pixels, is \( \frac{0.2\ \mathrm{mm/diameter}}{50\ \mu \mathrm{m/pixel}}=4\ \mathrm{pixels/diameter} \).
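A small sketch of binary opening with a disk-like structuring element is given below. It approximates the 4-pixel-diameter disk by all offsets within a radius of 2 pixels of the center and treats pixels outside the image as background; both choices are assumptions, since the exact discretization is not described.

```python
def disk_offsets(radius):
    """Offsets (dr, dc) of a discrete disk; an approximation of the 0.2-mm element."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]

def _get(img, r, c):
    """Pixel value, treating everything outside the image as background (0)."""
    return img[r][c] if 0 <= r < len(img) and 0 <= c < len(img[0]) else 0

def binary_erode(img, offsets):
    return [[int(all(_get(img, r + dr, c + dc) for dr, dc in offsets))
             for c in range(len(img[0]))] for r in range(len(img))]

def binary_dilate(img, offsets):
    return [[int(any(_get(img, r - dr, c - dc) for dr, dc in offsets))
             for c in range(len(img[0]))] for r in range(len(img))]

def opening(img, radius=2):
    """Opening = erosion followed by dilation with the same structuring element."""
    offsets = disk_offsets(radius)
    return binary_dilate(binary_erode(img, offsets), offsets)

# One isolated noise pixel plus a 5 x 5 blob: only the blob survives the opening.
noisy = [[0] * 9 for _ in range(9)]
noisy[1][1] = 1
for r in range(3, 8):
    for c in range(3, 8):
        noisy[r][c] = 1
opened = opening(noisy, radius=2)
```

The isolated pixel is removed, while the blob is kept but takes the shape of the disk, which is why the original candidate shapes are recovered afterwards.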

Another consideration is that the diameter of the largest microcalcification is 1 mm. Thus, the diameter of the circle that encloses a microcalcification candidate is \( \frac{1\ \mathrm{mm/diameter}}{50\ \mu \mathrm{m/pixel}}=20\ \mathrm{pixels/diameter} \), and the size of the corresponding square is chosen as 21×21 pixels. Each MC candidate lies in a 21×21 image patch whose center is placed at the position of the highest gray-level value.

To recover the complete shape of the candidates at all locations of interest, an algorithm for extraction of connected components is used. Another motivation for extracting connected components is to assign a label to each region of interest for subsequent automatic extraction of properties from each labeled ROI, mainly the position of the highest gray-level value inside the region.
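Connected-component labeling and the per-region peak search can be sketched as follows. The use of 8-connectivity and a breadth-first flood fill are assumptions; the text does not say which labeling algorithm or connectivity was used.

```python
from collections import deque

def label_components(binary):
    """Label 8-connected foreground regions; returns (label image, region count)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                      # breadth-first flood fill
                    i, j = queue.popleft()
                    for di in (-1, 0, 1):
                        for dj in (-1, 0, 1):
                            ni, nj = i + di, j + dj
                            if (0 <= ni < rows and 0 <= nj < cols
                                    and binary[ni][nj] and not labels[ni][nj]):
                                labels[ni][nj] = count
                                queue.append((ni, nj))
    return labels, count

def peak_positions(labels, gray):
    """Position of the highest gray-level value inside each labeled region."""
    best = {}
    for r, row in enumerate(labels):
        for c, lab in enumerate(row):
            if lab and (lab not in best
                        or gray[r][c] > gray[best[lab][0]][best[lab][1]]):
                best[lab] = (r, c)
    return best

mask = [[1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 0, 1]]
gray = [[3, 7, 0, 0],
        [0, 0, 0, 5],
        [0, 0, 0, 9]]
labels, count = label_components(mask)
peaks = peak_positions(labels, gray)
```

Each peak position would then serve as the center of a 21 × 21 candidate patch.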

Feature extraction

The feature extraction block of Fig. 2 shows the extraction of features from a microcalcification candidate. It is useful to visualize a microcalcification in three-dimensional space, as a gray-level function of the coordinates (x, y), as shown in Fig. 3. This three-dimensional reconstruction approximates the projection of an actual microcalcification into a set of intensity values on a digital mammogram. It consists of a prominent peak relative to the local surroundings on the mammogram. Thus, it is feasible to model a microcalcification based on a set of surface levels.

Fig. 3 Visualization of one microcalcification

To detect real microcalcifications, four features are extracted from each candidate. Information is obtained from three different surface levels assigned to each ROI by using a mask that contains the distribution of these surface levels. Fig. 4a shows a ROI of 21 × 21 pixels centered at the maximum intensity value. Information for each surface level of the ROI is extracted by overlapping the ROI with the mask. Fig. 4b shows the mask along with the distribution of each surface level. The surface level distribution consists of three concentric annuli with respective radii R, R + 2, and R + 4. This work uses R = 3, considering the known sizes of microcalcifications. Each annular region, Aannulus, provides information of interest about one surface level and is labeled by an integer in {1, 2, 3}.

Fig. 4 (a) Region of interest; (b) corresponding annulus mask

After overlapping the mask with one ROI, information from the three annular regions is used to extract a four-entry feature vector, \( \mathbf{f}={\left[{f}_1,{f}_2,{f}_3,{f}_4\right]}^T \), according to

$$ {f}_1=\max \left({A}_{annulus1}\right)-\max \left({A}_{annulus2}\right) $$
(6a)
$$ {f}_2=\max \left({A}_{annulus1}\right)-\max \left({A}_{annulus3}\right) $$
(6b)
$$ {f}_3=\operatorname{mean}\left({A}_{annulus1}\right)-\operatorname{mean}\left({A}_{annulus2}\right) $$
(6c)
$$ {f}_4=\operatorname{entropy}\left({A}_{annulus1}\right)-\operatorname{entropy}\left({A}_{annulus2}\right) $$
(6d)

where functions max(), mean(), and entropy() are the maximum, mean and entropy values, respectively, of the corresponding annular region intensity values.

The first and second features, f1 and f2, represent the difference between the peak intensity value in the first annular region and the peak in the second and third annular regions, respectively. The third feature, f3, is the difference between the mean values of the first and second annular regions. The fourth feature, f4, is the difference between the entropy values of the first and second annular regions.
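The annulus model and Eqs. (6a) to (6d) can be sketched as follows. Several details are assumptions made for illustration: region 1 is taken as the disk of radius R around the patch center, regions 2 and 3 as the rings out to R + 2 and R + 4, distances are Euclidean, and entropy is the Shannon entropy (in bits) of the gray-level histogram of each region.

```python
import math
from collections import Counter

def annulus_mask(size=21, R=3):
    """Label mask: 1 for d <= R, 2 for R < d <= R+2, 3 for R+2 < d <= R+4, else 0."""
    center = size // 2
    mask = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            d = math.hypot(r - center, c - center)   # assumption: Euclidean distance
            if d <= R:
                mask[r][c] = 1
            elif d <= R + 2:
                mask[r][c] = 2
            elif d <= R + 4:
                mask[r][c] = 3
    return mask

def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical gray-level distribution (assumption)."""
    total = len(values)
    return -sum((k / total) * math.log2(k / total) for k in Counter(values).values())

def annulus_features(patch, R=3):
    """Four-entry feature vector [f1, f2, f3, f4] of Eqs. (6a)-(6d)."""
    mask = annulus_mask(len(patch), R)
    regions = {1: [], 2: [], 3: []}
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            if mask[r][c]:
                regions[mask[r][c]].append(value)
    a1, a2, a3 = regions[1], regions[2], regions[3]
    f1 = max(a1) - max(a2)
    f2 = max(a1) - max(a3)
    f3 = sum(a1) / len(a1) - sum(a2) / len(a2)
    f4 = shannon_entropy(a1) - shannon_entropy(a2)
    return [f1, f2, f3, f4]

# Synthetic peak: 200 inside radius 3, 100 in the next ring, 50 elsewhere.
patch = [[200 if math.hypot(r - 10, c - 10) <= 3
          else 100 if math.hypot(r - 10, c - 10) <= 5
          else 50
          for c in range(21)] for r in range(21)]
features = annulus_features(patch)
```

For this synthetic peak the peak and mean differences are large and positive, while the entropy difference is zero because each region is constant; a flat background patch would give values near zero for all four features.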

Classification

The classification of true microcalcifications is depicted in the classification block of Fig. 2. To decide whether a ROI is a microcalcification or not, the classifier is implemented by using KNN and SVM.

KNN classifier

KNN is a non-linear classifier. To assign a class to an unknown feature vector x, the K feature vectors nearest to x are identified among a set of N training feature vectors {xi; i = 1, …, N}. Each of the K nearest neighbors belongs to one of two classes, \( {\mathcal{C}}_1 \) or \( {\mathcal{C}}_2 \) (normal and abnormal). Among the K nearest neighbors of x, the number of neighbors, ki, that belong to class \( {\mathcal{C}}_i \) (i = 1, 2) is counted, where K = k1 + k2. The class assigned to x is the one with the largest ki.
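The voting rule can be sketched in a few lines. The Euclidean distance and the value K = 3 are assumptions chosen for illustration, since neither the metric nor K is reported in this section, and the training vectors below are made up.

```python
import math
from collections import Counter

def knn_predict(x, train_x, train_y, K=3):
    """Assign x to the class with the most members among its K nearest neighbors."""
    # Assumption: Euclidean distance between four-entry feature vectors.
    neighbors = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:K]
    votes = Counter(label for _, label in neighbors)   # k_1 and k_2 in the text
    return votes.most_common(1)[0][0]

# Made-up training data: class 0 (normal) near the origin, class 1 (abnormal) far away.
train_x = [[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0],
           [10, 10, 10, 10], [11, 10, 10, 10], [10, 11, 10, 10]]
train_y = [0, 0, 0, 1, 1, 1]
normal_guess = knn_predict([0.5, 0.5, 0, 0], train_x, train_y)
abnormal_guess = knn_predict([9, 9, 9, 9], train_x, train_y)
```

Choosing an odd K avoids ties in the two-class case.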

SVM classifier

An SVM is a maximum-margin classifier, geometrically represented by the separating hyperplane that lies furthest from the training samples of each class after the classifier is trained with labeled data. The SVM in this work uses a Gaussian kernel function, with (1) one output, which provides two possible outcomes corresponding to the two classes (microcalcification or abnormal region, and normal region), and (2) four inputs, matching the size of the feature vector.
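For reference, the standard form of a Gaussian-kernel SVM decision rule is given below. The notation is generic and assumed here for illustration (kernel width γ, multipliers αi, training labels yi ∈ {−1, +1}, bias b, and support vectors xi); the paper does not report the values of these parameters.

$$ K\left({\mathbf{x}}_i,\mathbf{x}\right)=\exp \left(-\gamma {\left\Vert {\mathbf{x}}_i-\mathbf{x}\right\Vert}^2\right),\qquad \hat{y}\left(\mathbf{x}\right)=\operatorname{sign}\left(\sum_i{\alpha}_i{y}_i K\left({\mathbf{x}}_i,\mathbf{x}\right)+b\right) $$

Here x is the four-entry feature vector of a candidate, and the sign of the sum selects one of the two classes.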

Performance evaluation

To compare works that detect microcalcifications, it is essential to use common performance measures. To evaluate the performance of the proposed method, True Positive Rate (TPR) or sensitivity, False Positive Rate (FPR), specificity, and accuracy are used as figures of merit. TPR, also known as sensitivity, recall, or probability of detection, is the probability that the outcome of a diagnosis is positive given that the patient presents breast cancer, and it is given as

$$ TPR=\frac{TP}{TP+ FN} $$
(7)

where true positives (TP) are those microcalcifications correctly identified and false negatives (FN) are those microcalcifications incorrectly rejected. False Positive Rate (FPR), also known as false alarm, is defined as the probability that the outcome of a breast cancer diagnosis is positive given that the patient is healthy according to

$$ FPR=\frac{FP}{TN+ FP} $$
(8)

where true negatives (TN) are those cases correctly rejected and false positives (FP) are those artifacts incorrectly detected as microcalcifications. Specificity is defined as 1 − FPR.

Accuracy specifies the fraction of breast cancer diagnoses that are correct,

$$ Accuracy=\frac{TP+ TN}{TP+ TN+ FP+ FN} $$
(9)
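Equations (7) to (9) translate directly into code. The counts in the example are illustrative only and are not taken from the paper's tables.

```python
def detection_metrics(tp, fn, fp, tn):
    """TPR (Eq. 7), FPR (Eq. 8), specificity (1 - FPR), and accuracy (Eq. 9)."""
    tpr = tp / (tp + fn)
    fpr = fp / (tn + fp)
    return {
        "TPR": tpr,
        "FPR": fpr,
        "specificity": 1 - fpr,
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Illustrative counts for a balanced set of 121 MC and 121 normal candidates.
metrics = detection_metrics(tp=118, fn=3, fp=0, tn=121)
```

With these counts, TPR = 118/121 ≈ 0.9752, FPR = 0, and accuracy = 239/242 ≈ 0.9876.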

The receiver operating characteristic (ROC) curve plots TPR against FPR at different decision-threshold settings. The area under the curve (AUC) equals the probability that a classifier ranks a randomly chosen positive instance higher than a randomly chosen negative one.
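The ranking interpretation of the AUC suggests a direct way to compute it: compare every positive score with every negative score. The sketch below counts ties as one half, a common convention that is assumed here; the scores are made up.

```python
def auc_by_ranking(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# Made-up classifier scores: 8 of the 9 pairs are ordered correctly.
auc = auc_by_ranking([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

This pairwise count is quadratic in the number of samples, which is acceptable for a few hundred candidates.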

The validation scheme is k-fold cross-validation (k-fold CV), in which the training set is randomly divided into k subsets, or folds, of equal size. One fold is used as the validation set while the remaining k − 1 folds are used for training. This process is repeated k times, once per fold, and all performance parameters are estimated each time. An overall performance parameter (specificity, sensitivity, accuracy) is computed by averaging the k estimates of the parameter of interest.
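A minimal sketch of the k-fold procedure follows. The `evaluate` callback is a hypothetical placeholder for training a classifier on the training indices and scoring it on the validation indices; when the number of samples is not divisible by k, the folds here differ in size by at most one, which is an assumption about how the remainder is handled.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # random assignment to folds
    folds = [indices[i::k] for i in range(k)]     # fold sizes differ by at most one
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average a performance estimate over the k folds.

    `evaluate(train, validation)` is a hypothetical callback returning one score.
    """
    scores = [evaluate(train, val) for train, val in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)

# 242 candidates, e.g. 121 with and 121 without MCs, split into 10 folds.
splits = list(k_fold_indices(242, k=10))
```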

Experimental results

Experimental set up and efficiency

Experimental analyses were carried out to evaluate the proposed method on the public databases MIAS [12] and DDSM [13] with a MATLAB R2016a implementation. Experiments were executed on a laptop computer with an AMD A10-4600M processor at 2.3 GHz and 8 GB of RAM. Cross-validation is 10-fold CV.

Comparison with other methods

Table 3 shows the performance of different methods, including the proposed one, in terms of TPR or sensitivity, FPR, accuracy and AUC where different public databases are used. The purpose of Table 3 is to show the different databases, and performance measures, used by the scientific community, working on MC detection. Some methods do not report some performance measurements. Our approach achieves the highest metric values in terms of TPR, accuracy and AUC, and it also reaches the lowest FPR values.

Table 3 Performance of different methods for microcalcification detection

Conclusions

Important aspects of the proposed solution are the small number of features (four), the low computational cost, the use of a microcalcification model based on annular regions, features that are independent of image resolution, and high performance. The proposed method is promising because it is simple to implement and needs only a small number of features.

The proposed method uses all available mammograms, with MCs, from each database. It also analyzes all the available microcalcifications. To account for FP and TN, normal candidates are randomly generated from the set of MIAS normal images so that the number of ROIs with MC and that of ROIs without MC are equal. Another highlight is the achieved false positive rate in different density mammograms.

Detecting microcalcification candidates with the high-pass Beucher Gradient filter allows the proposed method to achieve high performance on dense mammograms, since it locates microcalcifications in areas of low contrast, which is characteristic of dense mammograms. In addition, background noise is considerably reduced in dense mammograms, and this reduction is greater than on mammograms of other density types, which improves feature extraction based on the annulus model.

Compared with other recent methods, our approach achieves the best performance in terms of true positive rate (TPR), false positive rate (FPR), accuracy, and area under the ROC curve, even though the other methods are not applied to all available abnormal images of a database and do not specify how images were selected for their experiments. Existing methods show very low performance for MC detection on dense mammograms; in contrast, our method achieves 0.9752 for TPR, 0 for FPR, 0.9876 for accuracy, and 0.9951 for AUC on dense mammograms. The proposed method outperforms the others because of the benefits of the annulus-based microcalcification model for feature extraction.