Introduction

Breast cancer is the most common cancer in women today, with one in eight women in North America developing breast cancer during her lifetime. After many years of slow increase in the breast cancer rate, the breast cancer death rate is now decreasing. The decline in breast cancer mortality has been attributed to both improvements in treatment and early detection. Detecting invasive breast cancer early is important for reducing mortality. Women with invasive cancers of 1 cm or smaller have a 95 % chance of survival at 10 years, while those with invasive cancers of 1–2 and 2–5 cm have 85 and 60 % chances of survival at 10 years, respectively [1]. Many studies suggest that the goal of early detection should be tumors of 1 cm or smaller. Screening mammography is currently the primary imaging modality available for the early detection of breast cancer.

Cancerous tumors appear in mammograms as several kinds of abnormalities, such as mass lesions. A mass is a space-occupying lesion on a mammogram that tends to have curved, convex boundaries. Masses should be described according to three features: shape, margin, and density [2]. The margin is the interface between a lesion and the surrounding tissue and is the most important factor in determining the significance of a mass. Well-defined, circumscribed masses, in which more than 75 % of the margin forms a sharp and abrupt transition with the surrounding tissue, are almost always benign. The vast majority of breast cancers have an irregular interface, which produces ill-defined or spiculated margins. Diagnosis relies on identifying such abnormal regions and on classifying the type of mass or tumor that caused the abnormality. Detection of masses in mammograms remains a challenging problem. The lack of clear boundaries causes masses to blend with the inhomogeneous tissues of the breast, and the difficulty increases when masses are located in areas of very dense parenchymal tissue. Although the density and gray levels of masses are generally higher than those of the surrounding tissue, some masses have unclear boundaries and low contrast between the mass edges and the background. Moreover, in certain cases, masses may be covered and hidden by background tissue. These phenomena make mass boundaries harder to detect.

The American College of Radiology (ACR) recognizes that breast density has a high impact on mammographic screening. The ACR’s BI-RADS lexicon describes four categories of breast parenchymal density. In ACR1, fibroglandular tissue makes up no more than 25 % of the breast on mammography; ACR2 extends up to 49 %; ACR3 extends up to 75 %; and ACR4 describes a breast in which fibroglandular tissue occupies between 75 and 100 %. It is well known that higher breast density results in lower mammographic sensitivity.

With the aim of improving the accuracy and efficiency of screening programs for detecting early signs of breast cancer, a number of research projects focus on developing computer-aided diagnosis (CAD) methods to assist radiologists in diagnosing breast cancer. One family of methods is based on the fractal approach, which has been used in a large number of studies. The fractal approach is based on identifying geometric structures that exhibit self-similarity at different scales. The concept of fractals has been found suitable for describing naturally occurring shapes, whether complex or simple, such as the geometry of leaves, trees, mountains, clouds, and the cratered face of the moon, as well as biological structures such as cancerous tumors [3]. Fractal analysis is therefore useful for studying the complexity of 2D contours. Considerable research has been carried out on applying fractals to breast cancer detection. Chen et al. [4] described fractal characteristics for differentiating benign from malignant lesions, focusing on texture features. Tourassi et al. [5] calculated the fractal dimension (FD) of mammographic regions of interest (ROIs) using the circular average power spectrum technique. They observed that the presence of architectural distortion disrupts self-similarity and thereby alters the FD of the breast parenchyma. Guo et al. [6] presented a detailed study of fractal-based methods for texture characterization of mammographic mass lesions and architectural distortion. Rangayyan et al. [3] demonstrated the usefulness of fractal analysis, via a frequency-domain approach, applied to contours of breast masses. Raguso et al. [7] computed the FD using four different methods, together with a few shape factors, from the contour of each mass. Rangayyan and Nguyen [8] computed the fractal dimension of the contours of breast masses using four different methods, including the ruler method and the box-counting method applied to 1D and 2D representations of the contours. Abdaheer et al. [9] presented a fractal method based on the Fourier transform to classify different contours; with this FFT-based method, breast tumors are classified according to their shapes. All of these studies improved CAD methods based on self-similarity and fractal analysis. Self-similarity in tissue is one of the important features in the diagnosis of abnormalities. Fractal methods can therefore help improve the detection and diagnosis of abnormalities in support of radiologists.

In this paper, we use fractal methods to discriminate benign from malignant masses on the basis of detected mass contours. These methods are objective, in contrast to the subjective methods used by radiologists. The proposed method is described in Section “Material and Methods”, followed by the detection and classification results in Section “Results and Discussion”. The paper closes with a summary of the final results.

Material and Methods

Fractal methods are very useful for extracting features located at different scales. We therefore used fractals to investigate masses in mammograms for detection and diagnosis. First, for mass detection, we defined new FDs to calculate the threshold that discriminates the mass from background tissue. Second, using the detected mass as the ROI, we extracted features of the mass contour to classify masses as either benign or malignant. The overall block diagram of this study is shown in Fig. 1.

Fig. 1 Block diagram of the proposed method

Fractal

The concept of fractals was first introduced by Mandelbrot to describe objects with irregular shapes that are not represented by the traditional Euclidean dimension [10]. In fractal geometry, figures can be described by an iterative algorithm. They are self-similar and have a fine structure that reveals new details at every level of magnification. There are several definitions of fractal dimension, including the Hausdorff, divider, length-related, and similarity dimensions. One of the most widely used is the box-counting dimension, which is formulated as follows.

Let F be a nonempty, bounded subset of the space R^n, with ρ = {i : 1, 2, 3, …} indexing the covers of the set F. N_δ(F) denotes the number of covers, such that

$$ {N}_{\delta }(F)=\left\{\rho :{d}_i\le \delta \right\} $$

where d_i stands for the diameter of the ith cover. N_δ(F) is the smallest number of subsets, each with diameter d_i not greater than δ, that cover the set F.

The box dimension of set F is defined as follows [11]:

$$ { \dim}_BF=\underset{\delta \to 0}{ \lim}\frac{ \log {N}_{\delta }(F)}{- \log \delta } $$
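As an illustration only, the box dimension can be estimated on a pixel grid by counting occupied boxes at several box sizes and fitting the slope of log N_δ(F) against -log δ. The Python sketch below is not the implementation used in this study; the function names, the choice of box sizes, and the least-squares fit are assumptions made for the example.

```python
import math

def box_count(points, size):
    """Number of size x size grid boxes that contain at least one point."""
    return len({(x // size, y // size) for x, y in points})

def box_dimension(points, sizes):
    """Least-squares slope of log N(size) against -log(size)."""
    xs = [-math.log(s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Two simple, non-fractal test sets on a 64 x 64 grid (hypothetical data).
filled_square = {(x, y) for x in range(64) for y in range(64)}
diagonal_line = {(t, t) for t in range(64)}
sizes = [1, 2, 4, 8, 16]  # box side lengths in pixels

dim_square = box_dimension(filled_square, sizes)
dim_line = box_dimension(diagonal_line, sizes)
```

For these two sets the estimate is close to 2 for the filled square and close to 1 for the straight line, which is the expected behavior for a surface and a curve.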

In this paper, new fractal dimensions are defined and applied to determine the contours of masses, using an accurate threshold to discriminate the mass from other areas.

New Fractal Dimensions in Detection of Mass Boundary

Selection of the threshold level is an important step in fractal-based diagnosis and influences the results considerably. In this paper, a suitable solution for discriminating lesions is presented. The proposed method discriminates lesions from background tissue accurately and automatically, in contrast to the common approach used by radiologists, which is manual and subjective. To determine a suitable threshold, we first define Δ as the part of the image whose intensity is greater than I, where I is the normalized image intensity (in the range 0 to 1). As the parameter I increases, this gives a set Δ_I that depends on I, with I as the independent variable. Then, we have:

$$ {N}_{\delta}\left({\varDelta}_I\right)=\left\{{\rho}_I:{d}_i\le \delta \right\},{\rho}_I=\left\{i:1,2,3,\dots \right\} $$

where δ is the largest diameter of the covers and N_δ(Δ_I) is the smallest number of covers needed to cover the set Δ_I. We can now define dim Δ as the limit of N_δ(Δ_I), as shown in Eq. (1).

$$ \dim \varDelta =\underset{\delta \to 0}{ \lim }{N}_{\delta}\left({\varDelta}_I\right) $$
(1)

In this manner, we can create a set of parameters that are functions of the fractal features, as follows.

$$ \begin{array}{c}\hfill {d}_1(I)=\frac{d\left( \dim \varDelta \right)}{ dI}\hfill \\ {}\hfill {d}_2(I)=\frac{d^2\left( \dim \varDelta \right)}{d{I}^2}\hfill \\ {}\hfill {I}_1=\mathrm{Max}\left({d}_1(I)\right)\hfill \\ {}\hfill {I}_2= \min \left({d}_2(I)\right)\hfill \end{array} $$

where I_1 is the threshold level that discriminates the dense parts of the image from background tissue, and I_2 is the threshold that discriminates lesions from the other parts of the image, such as background tissue and dense regions. The plots of d_1(I) and d_2(I) for one mammogram are shown in Fig. 2, in which I_1 and I_2 are also marked.
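One possible discrete reading of this procedure is sketched below in Python. It is not the implementation used in this study: the limit in Eq. (1) is replaced by a box count at a fixed small box size, the derivatives are replaced by central finite differences on a uniform grid of intensity levels, and the function names, the toy image, and the level grid are all hypothetical.

```python
def box_count(pixels, box):
    """Number of box x box grid cells that contain at least one pixel."""
    return len({(r // box, c // box) for r, c in pixels})

def delta_set(image, level):
    """Pixel coordinates whose normalized intensity is greater than `level`."""
    return {(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value > level}

def threshold_candidates(image, levels, box=1):
    """Return the box counts per level and the two candidate thresholds.

    Assumption: dim(Delta) is approximated by the box count at one small,
    fixed box size, and `levels` is a uniformly spaced intensity grid.
    """
    counts = [box_count(delta_set(image, lv), box) for lv in levels]
    step = levels[1] - levels[0]
    inner = range(1, len(levels) - 1)
    # Central finite differences stand in for d1(I) and d2(I).
    d1 = [(counts[k + 1] - counts[k - 1]) / (2 * step) for k in inner]
    d2 = [(counts[k + 1] - 2 * counts[k] + counts[k - 1]) / step ** 2
          for k in inner]
    i1 = levels[1 + d1.index(max(d1))]  # I1 = argmax of d1(I)
    i2 = levels[1 + d2.index(min(d2))]  # I2 = argmin of d2(I)
    return counts, i1, i2

# Hypothetical 3 x 3 image with normalized intensities in [0, 1].
toy_image = [
    [0.1, 0.1, 0.5],
    [0.1, 0.9, 0.9],
    [0.5, 0.9, 0.9],
]
levels = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
counts, i1, i2 = threshold_candidates(toy_image, levels)
```

On this toy image the second threshold falls at 0.8, which keeps only the four brightest pixels; on real mammograms the level grid would need to be much finer and the curves would likely need smoothing before differentiation.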

Fig. 2 a The curve of d_1(I). b The curve of d_2(I)

New Fractal Features in Diagnosis of Mass Types

Following the detection of the mass lesion as described in the previous section, we define new fractal features for classifying masses based on the shape of their contours. The features are designed to extract the information that radiologists find meaningful when distinguishing different types of lesions. FDs are well suited to extracting such information, for example spiculation and boundary smoothness, because it is located at different scales. FDs can capture global features as well as the fine details noted by radiologists. We therefore extract these characteristics using the new fractal features described below.

N2 Feature

To classify benign and spiculated masses, we used the roughness of the mass boundary as the main differentiating feature. Spiculated masses have rough boundary variation, whereas benign masses are round with smooth variation. The variation of the FDs across scales was therefore used to extract information for classification. N2, as shown in Eq. (2), describes the change in the FDs across scales relative to the maximum FD value, which is measured at the smallest scale. At small scales in particular, this feature captures spiculation with high resolution.

$$ N2=1-\frac{ NB}{ \max (NB)}=1-\frac{ NB}{ NB(1)} $$
(2)

where NB denotes the FDs of the mass boundary.
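A minimal Python sketch of Eq. (2) follows. It assumes that NB is read as the box count of the boundary at each scale, ordered from the smallest scale upward so that the first entry is NB(1); the function names and the square outline used as a contour are hypothetical.

```python
def box_count(points, size):
    """Number of size x size grid boxes that contain at least one point."""
    return len({(x // size, y // size) for x, y in points})

def n2_feature(boundary_counts):
    """N2 per scale: 1 - NB / NB(1), with NB(1) taken at the smallest scale."""
    reference = boundary_counts[0]
    return [1 - nb / reference for nb in boundary_counts]

# Hypothetical contour: the outline of an 8 x 8 pixel square.
outline = {(x, y) for x in range(8) for y in range(8)
           if x in (0, 7) or y in (0, 7)}
scales = [1, 2, 4]  # box sizes in pixels, smallest first
nb = [box_count(outline, s) for s in scales]
n2 = n2_feature(nb)
```

The result is one N2 value per scale, so the feature passed to a classifier is a short vector rather than a single number.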

Spiculation Measure Feature

Another feature that improves the classification of different masses is the spiculation measure feature (SMF), given in Eq. (3). SMF is the ratio of the square of the FDs of the mass boundary to the FDs of the inside surface of the mass. This feature measures the complexity and roughness of spiculated masses or the smoothness of benign ones. In masses with more spiculation, SMF increases sharply, particularly at small scales, and takes large values, whereas smooth masses produce no such large values; this makes the feature well suited to separating the two.

$$ \mathrm{SMF}=\frac{{\mathrm{NB}}^2}{\mathrm{NS}} $$
(3)

where NS denotes the FDs of the inside surface of the mass.
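Eq. (3) can be sketched in Python as follows, assuming that NB and NS are read as box counts of the mass boundary and of the filled mass region at the same set of scales. The function name and the counts in the example are hypothetical; they correspond to the outline and the interior of an 8 x 8 pixel square at box sizes 1, 2, and 4.

```python
def smf_feature(boundary_counts, surface_counts):
    """SMF per scale: NB squared divided by NS."""
    if len(boundary_counts) != len(surface_counts):
        raise ValueError("boundary and surface counts must share the same scales")
    return [nb ** 2 / ns for nb, ns in zip(boundary_counts, surface_counts)]

# Hypothetical box counts for an 8 x 8 pixel square at box sizes 1, 2, 4.
nb = [28, 12, 4]   # outline of the square
ns = [64, 16, 4]   # filled square
smf = smf_feature(nb, ns)
```

A contour with many spicules would raise NB at the small scales much more than NS, so the first entries of the SMF vector would grow accordingly.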

Hybrid of Fractal and Morphology Methods

In this part, fractal and morphological methods were used jointly to classify benign and malignant tumors. The main objective of the morphological filter is to remove the roughness and spiculation present around the contour. In smooth masses, the morphological filter changes the boundary only slightly, whereas in spiculated masses the change is large. The morphological filter was used to modify the mass boundary, and the fractal method was used to measure the resulting change. The parameter A1, shown in Eq. (4), measures the change before and after morphological filtering, where nb and mb are the FDs of the mass boundary before and after applying the filter, respectively.

$$ A1=\frac{\mathrm{nb}-\mathrm{mb}}{\mathrm{nb}} $$
(4)

A morphological filter is built from morphological transforms with different structuring elements. We tested several morphological transforms to improve classification performance and found that opening with a disk-shaped structuring element of order five was the most suitable for our application.

In mathematical morphology, the opening of a set A by a structuring element B is the erosion of A by B followed by dilation by B, defined as follows.

$$ A\circ B=\left(A\ominus B\right)\oplus B $$

where ∘ denotes the opening operation, and ⊕ and ⊖ denote the morphological dilation and erosion operations, respectively.
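The opening and the A1 parameter of Eq. (4) can be sketched on sets of pixel coordinates as shown below. This Python sketch is illustrative rather than the implementation used in this study: the "order" of the disk is interpreted as its radius in pixels, the boundary is taken as the region pixels with a 4-connected neighbor outside the region, nb and mb are read as boundary box counts at a single box size, and all names and test shapes are hypothetical.

```python
def disk(radius):
    """Offsets of a disk-shaped structuring element (radius in pixels)."""
    return {(dx, dy)
            for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if dx * dx + dy * dy <= radius * radius}

def erode(region, se):
    """Pixels at which the whole structuring element fits inside the region."""
    return {(x, y) for x, y in region
            if all((x + dx, y + dy) in region for dx, dy in se)}

def dilate(region, se):
    """Union of the structuring element placed at every pixel of the region."""
    return {(x + dx, y + dy) for x, y in region for dx, dy in se}

def opening(region, se):
    """Erosion followed by dilation with the same structuring element."""
    return dilate(erode(region, se), se)

def boundary(region):
    """Region pixels that have at least one 4-connected neighbor outside."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(x, y) for x, y in region
            if any((x + dx, y + dy) not in region for dx, dy in steps)}

def box_count(points, size):
    """Number of size x size grid boxes that contain at least one point."""
    return len({(x // size, y // size) for x, y in points})

def a1_feature(region, se, box=1):
    """A1 = (nb - mb) / nb, boundary box counts before and after opening."""
    nb = box_count(boundary(region), box)
    mb = box_count(boundary(opening(region, se)), box)
    return (nb - mb) / nb

se = disk(2)
square = {(x, y) for x in range(21) for y in range(21)}
smooth = opening(square, se)                        # already smooth at this scale
spiky = smooth | {(x, 10) for x in range(21, 31)}   # add a thin 10-pixel spicule

a1_smooth = a1_feature(smooth, se)
a1_spiky = a1_feature(spiky, se)
```

In this toy case the opening leaves the smooth shape unchanged, so A1 is zero, while the thin spicule is almost entirely removed from the second shape and A1 becomes positive, which mirrors the behavior described in the text.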

Results and Discussion

Data Sets

In this study, the proposed method was applied to a set of mammographic images carefully selected by a radiologist from the database of a local hospital, with a resolution of 70 μm per pixel and a size of 4,096 × 3,328 pixels. We also applied our algorithm to a set of images from the Mammographic Image Analysis Society (MIAS [12]) database. The images in this database are digitized at a resolution of 50 μm per pixel, with a size of 1,024 × 1,024 pixels and 256 gray levels. All mammograms are MLO views. The images include well-defined (circumscribed benign, CB), spiculated malignant (SM), and ill-defined malignant (IM) mass lesions, all confirmed by biopsy; as shown in Table 1, we grouped them according to ACR breast parenchymal density. Four mammograms with ACR4 density were eliminated from the study because the masses overlapped with tissue. Four data sets were constructed from the database. Data set 1 contains CB and SM masses, and data set 2 contains CB and IM masses. Data sets 3 and 4 contain early masses, to evaluate the performance of our method in early diagnosis. Forty-seven percent of the SM tumors and 43 % of the IM tumors used in this paper are early malignant tumors. Early malignancy was defined as a tumor size of less than 1 cm, as described in Section “Introduction”. Data set 3 was formed by combining CB and early SM masses, and data set 4 by combining CB and early IM masses.

Table 1 Number of mammogram images by mass type and ACR density

Detection of Mass Contours by New Dimensions of Fractal

Using the proposed new FDs, we detected the mass contour by determining the threshold level that discriminates the mass from background tissue. To evaluate the accuracy of the detection results, threshold levels for the same mammograms were also determined by an expert radiologist and used as reference values. The radiologist adjusted the threshold level manually to choose the best level for discriminating the mass from background tissue. The difference between the threshold chosen by the radiologist and the threshold calculated by the new fractal method is taken as the error. The errors were summarized as the mean square error (MSE) and tabulated by image density level and by tumor type, as shown in Tables 2 and 3, respectively.
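The error measure can be written as a short Python function. The sketch assumes that both thresholds are expressed on the same normalized intensity scale; the function name and the threshold values in the example are hypothetical and are not taken from the study.

```python
def mean_square_error(reference, estimated):
    """Mean of squared differences between paired threshold levels."""
    if not reference or len(reference) != len(estimated):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((r - e) ** 2 for r, e in zip(reference, estimated)) / len(reference)

# Hypothetical normalized thresholds for two mammograms.
radiologist_levels = [0.5, 0.75]   # reference values chosen manually
fractal_levels = [0.5, 0.25]       # values returned by an automatic method
mse = mean_square_error(radiologist_levels, fractal_levels)
```

Grouping the image pairs by ACR density or by tumor type before calling the function gives one MSE value per group.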

Table 2 MSE in threshold level as a measure of the accuracy of mass contour detection for different ACR levels
Table 3 MSE in threshold level as a measure of the accuracy of mass contour detection for different tumor types

Reviewing the error values in Tables 2 and 3 for the various mass types, the following observations can be made:

  • As shown in Table 2, for all mass types, MSE increases with increasing image density. This can be explained by the reduced contrast between the tumor and the surrounding tissue. Dense masses may also be covered by dense background tissue, which makes them hard to detect and increases the MSE.

  • Examining masses by tumor type, either separately for each ACR density (Table 2) or over all images (Table 3), the MSE for IM masses is higher than for the other mass types. This is due to the unclear boundaries of IM masses, which lead to a higher detection error than for CB and SM masses, whose boundaries are clearer.

  • Examining the low error levels of the CB and SM masses, we found that the errors arose from the similarity of the masses to the surrounding tissue and their overlap with it, as with the obscured masses in the CB group.

As an illustration, Fig. 3 shows the proposed method applied to lesions with different mass types.

Fig. 3 Detection of some mass contours by the new fractal dimensions (MIAS database): original mass (left), detected mass contour (right). a mdb015, CB mass. b mdb178, SM mass. c mdb184, SM mass. d mdb271, IM mass

Mass Classification

Detection of masses in mammographic images is followed by their classification as malignant or benign using features extracted from the mass contours. These are the new fractal features described in Section “New Fractal Features in Diagnosis of Mass Types”. Classification was performed using an SVM [6, 13] on the four data sets introduced in Section “Data Sets”. We used a C-SVM classifier with a linear kernel and a penalty term C equal to 1. The proposed method was implemented in Matlab. The numbers of training and testing samples are shown in Table 4. Training and testing sets were selected randomly, and the selection was repeated five times in a five-fold manner. The final classification result is the average over the five iterations. Tables 5 to 8 show the classification results on the four data sets for the features N2, SMF, and A1, considered both individually and in their various combinations. We also used receiver operating characteristic (ROC) analysis, as presented in Table 9. The SVM and ROC results show high classification performance. Evaluating the SVM classifier by cross-validation showed no meaningful difference between the cross-validation results and the five-fold results discussed in this paper.
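For reference, the area under the ROC curve can be computed directly from classifier scores as the fraction of malignant-benign pairs that are ranked correctly, with ties counted as one half. The Python sketch below illustrates this standard rank-based estimate; it is not the ROC software used in this study, and the function name and the scores are hypothetical.

```python
def auc_from_scores(malignant_scores, benign_scores):
    """Fraction of (malignant, benign) pairs ranked correctly; ties count 0.5."""
    if not malignant_scores or not benign_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant_scores) * len(benign_scores))

# Hypothetical decision values, higher meaning "more likely malignant".
malignant = [0.9, 0.8, 0.4]
benign = [0.5, 0.3]
auc = auc_from_scores(malignant, benign)
```

Here five of the six malignant-benign pairs are ordered correctly, so the AUC is 5/6; a value of 1 would mean that every malignant case scores higher than every benign case.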

Table 4 Number of training and testing set for each data set
Table 5 Classification rates of benign and malignant for each feature

For the classification of ill-defined tumors, N2 performs better than the other features. As shown in Tables 5, 6, 7, 8, and 9, in the classification of data sets 2 and 4, corresponding to ill-defined and early ill-defined masses, respectively, good performance was achieved using N2 either alone or in combination with other features. Combining other features with N2 did not improve the results compared with N2 alone. In the classification of early SM and benign masses (data set 3), N2 yields better performance than SMF and A1 when the features are used individually. Combining features improved the results, and the best performance was achieved by combining N2 and A1. A1 was found to be very useful for classifying SM masses in data set 1, where this feature and every combination that includes it produced better results than the others. In other words, A1 is well suited to extracting spiculation information in SM and early SM masses, as N2 is to extracting ill-defined boundary information in IM and early IM masses.

Table 6 Classification performance for each feature
Table 7 Classification rates of benign and malignant for various combinations of features
Table 8 Classification performance for various combinations of features
Table 9 The performance of classification to benign and malignant in terms of the AUC

One of the advantages of these fractal features is their ability to classify early malignant tumors, which is important for diagnosing lesions in their early stages. N2, used to classify early SM tumors, achieved an area under the ROC curve (AUC) of 97 %, a better classification performance than for discriminating SM tumors from benign ones. For SMF, the classification of early IM and CB masses (data set 4) is better than the classification of IM and CB masses (data set 2). A1 also improved performance in early malignant cases: classification of early IM and CB masses reached an AUC of 87 %, compared with an AUC of 83 % for IM and CB masses.

Investigating the classification errors for each feature leads to the following observations:

  • One reason for classification error is the position of the tumor. Some malignant cases with a deeply seated mass, where not all of the tumor boundary was imaged in the mammogram, were misclassified. Images in which the full boundary information is not available therefore introduce errors in the results.

  • Another important source of error arises with malignant tumors in which most of the boundary is clear and smooth but a small segment is ill-defined or spiculated. This caused the algorithm to classify the tumor as benign.

  • A main cause of false positives is overlap of the mass with tissue. This caused circumscribed masses to be treated as masses with spiculated boundaries, leading to classification errors. It occurs where the difference in density between the mass and the overlapping tissue is low.

To compare the performance of the proposed method with the results given in [8], we applied our method to the contours used in [8], which form three data sets. These three data sets were classified as benign or malignant using the features N2, SMF, and A1. Table 10 shows the classification results in terms of AUC for comparison with the results in [8]. We applied A1 with disk-shaped structuring elements of different orders to these data sets; results for orders 2, 5, and 10 are presented.

Table 10 Comparison of the AUC using different features on contours that were used in [8]

For data set 1 of [8], the best performance achieved was an AUC of 0.99 with SMF, the same as the best AUC in [8]. For the second data set of [8], we improved the results, obtaining an AUC of 0.89 for feature A1 (with a disk structuring element of order 10) compared with the best AUC of 0.82 given in [8]. For the third data set of [8], classification performance was lower, with a best AUC of 0.91 compared with the best AUC of 0.93 in [8].

In this paper, we classify lesions only in MLO mammogram images. For future work, we recommend using information from both the MLO and CC views to reduce the lack of boundary information noted above. This would help reduce the classification error and lead to better performance.

Conclusion

In this paper, a novel fractal method for the detection and diagnosis of masses in digitized mammograms is presented. The proposed method consists of two steps. First, using new FDs, masses were discriminated from background tissue automatically, in contrast to the common approach practiced manually by most researchers. The study shows that the proposed method gives quite satisfactory detection results across density levels and lesion types. The detection error increases with density level because of unclear lesion contours and the low contrast between the mass edges and the background, which make detection difficult in areas of dense parenchymal tissue. In addition, for IM masses, whose blurry and unclear contours make them more difficult to detect than other mass types, significant results were obtained, with an MSE of 0.0142. Second, the discriminated lesion was taken as the ROI, and new fractal features were defined to extract information from the mass contours. These features were used to quantify the roughness of the mass contours, which is significant and meaningful for radiologists in diagnosing spiculated masses. Good results were achieved in the classification of SM and CB masses, with an AUC of 0.98, and the features were also very efficient for the classification of IM masses. IM and CB masses were classified with high accuracy using N2, either individually or in combination with other features. Furthermore, N2 also performed well in the classification of early IM masses, indicating that N2 is the best feature for extracting information in ill-defined mass cases.

Among the advantages of the defined features are the significant results in classifying malignant tumors in their early stages: SMF and A1 improved the classification of early IM and CB masses compared with that of IM and CB masses. N2 likewise improved the classification of early SM and CB masses compared with SM and CB masses, raising the AUC from 0.94 to 0.97. Other advantages include their simplicity (they use only a subtraction or a division) and their lower computational load compared with the complex features used in some papers, which matters when the data size becomes large.

One source of error in the results was the lack of sufficient mass contour data in some mammograms. In this paper, information is extracted only from the MLO view. We recommend that future studies extract information from both the MLO and CC views to improve the results.