Abstract
In the present work, a hybrid hierarchical framework for classification of breast density using digitized screen film mammograms is proposed. To design the classification framework, 480 MLO view digitized screen film mammographic images are taken from the DDSM dataset. ROIs of fixed size, i.e. 128 × 128 pixels, are cropped from the center area of the breast (i.e. the area where glandular ducts are prominent). A total of 292 texture features based on statistical methods, signal processing based methods and transform domain based methods are computed for each ROI. The computed feature vector is subjected to PCA for dimensionality reduction, and the reduced feature space is fed to the classification module. In this work, 4-class breast density classification is carried out using a hierarchical framework in which the first classifier classifies an unknown test ROI into B-I/other class. If the test ROI is predicted as other class, it is passed to the second classifier for classification into B-II/dense class. If the test ROI is predicted as dense class, it is passed to the third classifier for classification into B-III/B-IV class. Five hierarchical classifier designs, consisting of 3 PCA-kNN, 3 PCA-PNN, 3 PCA-ANN, 3 PCA-NFC and 3 PCA-SVM classifiers respectively, are proposed. Among these, the maximum OCA value of 80.4% is obtained using the PCA-NFC classifiers in the hierarchical approach. Further, the best performing individual classifiers are combined in a hierarchical framework to design a hybrid hierarchical framework for classification of breast density using digitized screen film mammograms. The proposed hybrid hierarchical framework yields an OCA value of 84.1%. This result is promising, and the framework can be used in a clinical environment to differentiate between breast density patterns.
1 Introduction
It has been demonstrated earlier that increased breast density is a prominent indicator for the development of breast cancer, which is the most common life-threatening form of cancer found in women [7, 92, 93]. Even for experienced radiologists, detecting breast abnormalities in dense mammograms is tedious, especially when masses are present in the center area (i.e. the area where glandular ducts are prominent). In routine clinical practice, during screening mammography, a lesion that is not visible in dense tissue may still be present and masked behind the dense tissue. It is therefore highly recommended that cases predicted as dense (B-III or B-IV) be double screened for the presence of masked lesions.
Fundamentally, different breast tissues appear with different intensities, i.e. fatty tissue is represented as a dark region while dense tissue is represented as a brighter region on digitized screen film mammographic images [51, 96]. A brief description of the Breast Imaging-Reporting and Data System (BIRADS) density classes and sample digitized screen film mammogram (SFM) images of each class, randomly taken from the Digital Database for Screening Mammography (DDSM) dataset [36], are shown in Fig. 1.
1.1 Hierarchical classification system
It is worth mentioning that the hierarchical approach has been extensively used in the design of computer-aided diagnosis systems [4, 32, 53, 69, 79, 80] and has yielded prominent results. The hierarchical approach for the design of a 4-class breast density classification system is shown in Fig. 2.
As shown in Fig. 2, classifier-1 classifies the input test ROI into C1/other class. If the test ROI is predicted as other class, it is passed to the second classifier for classification into C2/other class-2. If the test ROI is predicted as belonging to other class-2, it is passed to the third classifier for classification into C3/C4.
The hierarchical classification approach has a few advantages: (a) fewer classifiers are required than in a multiclass scheme built from binary classifiers (for a 4-class classification problem, six binary classifiers are required in the one-against-one (OAO) approach, whereas only three binary classifiers are required in the hierarchical approach), and (b) it is possible to go stepwise from the general classification problem, i.e. fatty (B-I) versus other class, to more particular classification problems, i.e. B-II versus dense and B-III versus B-IV class.
Therefore, in the present work a hierarchical framework is used for the classification of breast density. The first level addresses the general classification problem, i.e. B-I/other class, which identifies the B-I breast density class. In a similar manner, the next level classifies B-II/dense class {B-III, B-IV}, and the final level classifies B-III/B-IV breast density class.
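The classifier counts quoted in advantage (a) follow from simple counting: a one-against-one scheme trains one binary classifier per pair of classes, while a cascade that separates one class per node needs one classifier fewer than the number of classes. A minimal sketch of that arithmetic (the function names are illustrative, not taken from the original work):

```python
def one_against_one_count(n_classes: int) -> int:
    """Binary classifiers needed by a one-against-one (OAO) scheme: one per class pair."""
    return n_classes * (n_classes - 1) // 2


def hierarchical_count(n_classes: int) -> int:
    """Binary classifiers needed by a cascade that separates one class per node."""
    return n_classes - 1


n = 4  # B-I, B-II, B-III, B-IV
print(one_against_one_count(n), hierarchical_count(n))
```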
2 Literature review
From the studies conducted in the past it has been observed that breast density classification systems have been designed for (1) 2-class (fatty tissue/dense tissue) breast density classification, (2) 3-class (fatty tissue/fatty glandular tissue/dense tissue) breast density classification and (3) 4-class BIRADS (fatty tissue/some fibroglandular tissue/heterogeneously dense tissue/extremely dense tissue) breast density classification. The classification of these approaches is shown in Fig. 3.
A broad study of the literature demonstrates that a breast density classification system using SFMs can be designed using (1) segmented tissue based approaches (STBAs) [11, 12, 18, 30, 42, 52, 60, 65, 67, 68, 70] and (2) fixed size region of interest (ROI) based approaches (RBAs) [35, 39, 59]. STBAs require additional steps, viz. eliminating the background and removing the pectoral muscle. Due to these additional steps, STBAs are more time consuming and complex than RBAs.
An in-depth study of the literature shows that most of the studies on 4-class breast density classification using SFMs have been carried out on (a) the Mammographic Image Analysis Society (MIAS) benchmark dataset [11, 18, 65, 67, 68, 70], (b) the DDSM benchmark dataset [11, 12, 52, 67, 68] and (c) mammograms collected by individual research groups [30, 35, 39, 42, 59, 60]. It is worth mentioning that the DDSM dataset contains images which are already labeled by experts according to the BIRADS density standard, whereas for the MIAS dataset and for the datasets collected by authors the images have been labeled according to the BIRADS standard by the participating radiologists.
2.1 Study carried out on benchmark DDSM dataset
An extensive study of the literature shows that most of the studies carried out on the benchmark DDSM dataset use STBAs [11, 12, 67, 68]. A brief description of the studies carried out for 4-class breast density classification on the DDSM dataset is given in Table 1. It is worth observing that only one study is based on RBAs [52]. The maximum classification accuracy obtained for the DDSM dataset is 84.7% using an STBA [11]. The study [11] was carried out on 500 digitized SFMs taken from the DDSM dataset, comprising 125 mammograms of each class. The breast region is segmented using a global thresholding method, and the polynomial approach proposed by Ferrari et al. [30] is used for removal of the pectoral muscle. Appearance based and edge based features are extracted for each segmented mammogram, and a support vector machine is used for classification. In this study a leave-one-out scheme is followed, i.e. 499 samples are used for training and 1 image is tested, repeated 500 times, and a classification accuracy of 84.7% is observed.
In study [52] the authors have attempted 4-class breast density classification using an RBA on 480 digitized SFMs taken from the DDSM dataset. ROIs of fixed size, i.e. 128 × 128 pixels, are cropped from the center location of each breast (i.e. just behind the nipple) and wavelet texture features are computed using the Haar compact support wavelet filter. The study reports an accuracy of 73.7% using an SVM classifier. From the literature study it may be noted that study [52] is the only one directly comparable to the present work, as it has been carried out on the DDSM dataset using an RBA.
2.2 Study carried out on benchmark MIAS dataset
In the literature a few studies [11, 18, 65, 67, 68, 70] have been carried out on the benchmark MIAS dataset for 4-class breast density classification, with the images labeled according to the BIRADS standard by the participating radiologists. It is worth observing that most of these studies use STBAs [11, 18, 67, 68, 70] and only a few use RBAs. The maximum accuracy of 95.4% on the MIAS dataset has been achieved using an STBA [11]. In study [11] 322 SFMs are taken. The whole breast region is segmented using a global thresholding method, and the polynomial approach proposed by Ferrari et al. [30] is used for removal of the pectoral muscle. The extracted features are based on edges and intensity appearance, and the classifier used is a support vector machine.
The maximum accuracy achieved on the MIAS dataset using RBAs is 79.2%, reported in study [65]. In this study ROIs of fixed size, i.e. 512 × 384 pixels, are cropped from 322 digitized SFMs. Seven intensity based and 17 GLCM texture features (for the angles 0°, 45°, 90° and 135° at inter-pixel distances 1, 3, 5 and 7) are extracted for each ROI. A Naïve Bayes probabilistic classifier is used to discriminate between the BIRADS density classes. The summary of studies carried out for 4-class breast density classification on the MIAS dataset is reported in Table 1.
2.3 Study carried out on self collected mammograms by individual research group
A few studies have also been carried out on mammograms collected by individual research groups [35, 39, 42, 59, 60]. It is worth observing that most of these studies are based on STBAs. The maximum classification accuracy obtained on a self collected dataset is 80.0%, on a dataset consisting of 80 mammograms [39]. The summary of studies carried out for 4-class breast density classification on datasets collected by different research groups is reported in Table 1.
In the present work, five hierarchical classifier designs for classification of breast density are proposed, consisting of 3 PCA-kNN, 3 PCA-PNN, 3 PCA-ANN, 3 PCA-NFC and 3 PCA-SVM classifiers respectively. Further, the best performing individual classifiers at each node are combined in a hierarchical framework to design a hybrid hierarchical framework for classification of breast density using digitized screen film mammograms. Various texture parameters, including 11 first-order statistics (FOS) features, 13 GLCMmean features, 5 gray level difference statistics (GLDS) features, 11 gray level run length matrix (GLRLM) features, 30 Laws’ 3 features, 75 Laws’ 5 features, 30 Laws’ 7 features, 75 Laws’ 9 features and 42 2-D Gabor wavelet transform (GWT) features, are computed from each fixed size ROI, i.e. 128 × 128 pixels, extracted from the center area of the breast (i.e. the area where glandular ducts are prominent). Finally, the combined feature set consisting of 292 features is passed to principal component analysis (PCA) for feature space dimensionality reduction. The resultant feature vector is fed to the classification module.
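As a quick consistency check, the per-family feature counts listed above can be tallied to confirm the combined length of 292. The dictionary below simply restates the counts from the text; the key names are shorthand chosen for this sketch.

```python
# Feature counts per family, as listed in the text.
feature_groups = {
    "FOS": 11,
    "GLCM mean": 13,
    "GLDS": 5,
    "GLRLM": 11,
    "Laws 3": 30,
    "Laws 5": 75,
    "Laws 7": 30,
    "Laws 9": 75,
    "2-D GWT": 42,
}

laws_total = sum(v for name, v in feature_groups.items() if name.startswith("Laws"))
total_features = sum(feature_groups.values())
print(laws_total, total_features)
```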
3 Materials and methods
3.1 Experimental work flow for the design of a hierarchical framework for classification of breast density using digitized screen film mammograms
The experimental work flow followed in this work for the design of a hierarchical framework for classification of breast density using digitized screen film mammograms is shown in Fig. 4.
3.2 Description of image dataset
The image dataset used for this work comprises 480 mediolateral oblique (MLO) view digitized screen film mammograms taken from the DDSM dataset such that (1) 120 mammograms belong to B-I class, (2) 120 mammograms belong to B-II class, (3) 120 mammograms belong to B-III class and (4) 120 mammograms belong to B-IV class. The DDSM dataset is a standard benchmark dataset which contains four digitized screen film mammographic images for each case, comprising left/right MLO and left/right cranial-caudal (CC) views. The overlay file of each image contains the expert evaluation of BIRADS breast density [36]. The description of the dataset used for this study and its bifurcation into training and testing datasets is shown in Fig. 5.
3.3 ROI extraction module
The study carried out by Li et al. [57] verified that the textural variations exhibited by the central region of the breast tissue are significant enough to discriminate between different breast density classes. According to the participating radiologist, the center area (i.e. the area where glandular ducts are prominent) is also the region visualized to discriminate between different breast density classes. Therefore, in this study ROIs of size 128 × 128 pixels have been cropped from the center area of the breast. Sample images belonging to each BIRADS class with their respective ROIs are shown in Fig. 6.
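The cropping step can be sketched as below. This is an illustration only: it assumes the center of the breast region has already been located (the text does not describe how the center coordinates are chosen), and the function name `crop_roi` and the synthetic image are hypothetical.

```python
def crop_roi(image, center_row, center_col, size=128):
    """Return a size x size block of `image` (a list of rows) centred on the given pixel."""
    half = size // 2
    top, left = center_row - half, center_col - half
    if top < 0 or left < 0 or top + size > len(image) or left + size > len(image[0]):
        raise ValueError("ROI does not fit inside the image")
    return [row[left:left + size] for row in image[top:top + size]]


# Synthetic 300 x 200 image whose pixel value encodes its (row, column) position.
image = [[r * 1000 + c for c in range(200)] for r in range(300)]
roi = crop_roi(image, center_row=150, center_col=100)
print(len(roi), len(roi[0]))
```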
3.4 Feature extraction module
From previous studies it has been observed that statistical texture features [11, 12, 35, 39, 42, 52, 59, 67], Laws’ texture features [47, 48, 54] and 2-D Gabor wavelet transform features [1, 13, 14, 22, 23, 26, 38, 55, 71, 83, 97] are extensively used for the design of CAD systems. Accordingly, in this work a wide variety of texture features are computed: FOS features, GLCMmean features [5, 15, 33, 37, 46, 49, 57, 61, 63, 64, 77, 87, 88, 91], GLDS features [19, 29, 45, 49, 73, 86], GLRLM features [21, 25, 49, 75], Laws’ texture energy features [47, 48, 54] and 2-D Gabor wavelet transform (GWT) features [1, 13, 14, 22, 23, 26, 38, 55, 71, 83, 97].
FOS features
In this work a total of 11 first-order statistics features i.e. energy, average grey level, third moments, uniformity, mean, entropy, variance, standard deviation, skewness, kurtosis and smoothness are extracted for each ROI [49, 77].
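To make the first-order statistics concrete, the sketch below computes a subset of them from the normalised gray-level histogram of an ROI. The exact definitions used in the original work are not given in the text, so the formulas here follow common textbook conventions and should be read as assumptions (in particular the smoothness normalisation by (levels − 1)² and the base-2 logarithm for entropy).

```python
import math
from collections import Counter


def first_order_stats(roi, levels=256):
    """Histogram-based first-order statistics of an ROI given as a list of rows."""
    pixels = [p for row in roi for p in row]
    n = len(pixels)
    prob = {g: c / n for g, c in Counter(pixels).items()}  # normalised histogram

    mean = sum(g * p for g, p in prob.items())
    variance = sum((g - mean) ** 2 * p for g, p in prob.items())
    std = math.sqrt(variance)
    third_moment = sum((g - mean) ** 3 * p for g, p in prob.items())
    fourth_moment = sum((g - mean) ** 4 * p for g, p in prob.items())
    return {
        "mean": mean,
        "variance": variance,
        "standard_deviation": std,
        "third_moment": third_moment,
        "skewness": third_moment / std ** 3 if std else 0.0,
        "kurtosis": fourth_moment / std ** 4 if std else 0.0,
        "uniformity": sum(p * p for p in prob.values()),
        "entropy": -sum(p * math.log2(p) for p in prob.values()),
        # Assumed convention: variance normalised by (levels - 1)^2.
        "smoothness": 1.0 - 1.0 / (1.0 + variance / (levels - 1) ** 2),
    }


stats = first_order_stats([[0, 0], [2, 2]])
print(stats["mean"], stats["variance"], stats["entropy"])
```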
GLCMmean features
From the exhaustive review of the literature it is observed that the texture features computed using GLCMmean contain significant information to account for variations in texture patterns exhibited by different breast density classes [49, 77]. The GLCMmean for a ROI belonging to a particular breast density class is obtained by using eq. (1).
In a similar manner, $\mathrm{GLCM}_{\mathrm{mean,B\text{-}II}}(d = i)$, $\mathrm{GLCM}_{\mathrm{mean,B\text{-}III}}(d = i)$ and $\mathrm{GLCM}_{\mathrm{mean,B\text{-}IV}}(d = i)$ are computed by varying the inter-pixel distance ‘d’ = ‘i’ from 1 to 15.
In the present work 13 GLCMmean features are computed. One of the GLCMmean features, i.e. entropy ($\mathrm{ENT}_{\mathrm{mean}}$), is computed at inter-pixel distance ‘d’ = 10 by using eq. (2).
In the same manner, the remaining 12 GLCMmean texture features (contrast, variance, angular second moment, correlation, inverse difference moment, information measures of correlation-1, information measures of correlation-2, sum average, sum variance, sum entropy, difference variance and difference entropy) have been computed by varying the inter-pixel distance ‘d’ from 1 to 15. It has been observed that the features extracted at inter-pixel distance d = 10 yield the maximum classification accuracy. Thus the GLCMmean features computed at inter-pixel distance d = 10 are considered for this study.
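Since eq. (1) and eq. (2) are not reproduced in this text, the sketch below relies on a common convention rather than the authors' exact formulas: the GLCM mean is taken as the element-wise average of the symmetric, normalised co-occurrence matrices at 0°, 45°, 90° and 135° for a given distance d, and entropy is computed from that averaged matrix with a base-2 logarithm. All function names are hypothetical.

```python
import math


def glcm(roi, d_row, d_col, levels):
    """Symmetric, normalised co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(roi), len(roi[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[roi[r][c]][roi[r2][c2]] += 1
                counts[roi[r2][c2]][roi[r][c]] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts] if total else counts


def glcm_mean(roi, d, levels):
    """Element-wise average of the GLCMs at 0, 45, 90 and 135 degrees for distance d."""
    offsets = [(0, d), (-d, d), (-d, 0), (-d, -d)]
    mats = [glcm(roi, dr, dc, levels) for dr, dc in offsets]
    return [[sum(m[i][j] for m in mats) / len(mats) for j in range(levels)]
            for i in range(levels)]


def glcm_entropy(p):
    """Entropy of a normalised co-occurrence matrix (zero entries are skipped)."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)


roi = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
p_mean = glcm_mean(roi, d=1, levels=4)
print(glcm_entropy(p_mean))
```

For a 128 × 128 ROI the same functions would be called with d = 10 and the number of gray levels of the image; a vectorised implementation would be preferable at that size.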
GLDS features
In this work a total of 5 GLDS features, i.e. contrast, homogeneity, mean, energy and entropy, are extracted for each ROI [49, 77].
GLRLM features
In this work a total of 11 GLRLM features, i.e. short run emphasis, long run emphasis, low gray level run emphasis, high gray level run emphasis, short run low gray level emphasis, long run low gray level emphasis, short run high gray level emphasis, long run high gray level emphasis, gray level non-uniformity, run length non-uniformity and run percentage, are computed for each ROI [49].
Laws’ texture energy features
In this study, the Laws’ texture energy features [47, 48, 54] have been extracted using 1-D filters of different kernel widths (i.e. 3, 5, 7 and 9). These filters are used to perform local averaging (L), spot detection (S), edge detection (E), ripple detection (R) and wave detection (W) in an ROI image. A brief description of the Laws’ masks and the steps involved in calculating the features are shown in Fig. 7.
In this study a total of 210 Laws’ features i.e. 30 Laws’3 features, 75 Laws’5 features, 30 Laws’ 7 features and 75 Laws’9 features are computed for each ROI.
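The feature counts above are consistent with the usual Laws' construction, sketched here for kernel width 5. The details are in Fig. 7, which is not reproduced in this text, so the following points are assumptions: 2-D masks are formed as outer products of the 1-D kernels, masks that differ only in kernel order (e.g. E5L5 and L5E5) are merged into one rotation-invariant texture image, and five statistics are computed per texture image. With five 1-D kernels this gives 15 merged masks and 15 × 5 = 75 features; with three kernels (widths 3 and 7) it gives 6 × 5 = 30.

```python
from itertools import combinations_with_replacement

# Standard length-5 Laws' kernels: level, edge, spot, wave, ripple.
KERNELS_5 = {
    "L5": [1, 4, 6, 4, 1],
    "E5": [-1, -2, 0, 2, 1],
    "S5": [-1, 0, 2, 0, -1],
    "W5": [-1, 2, 0, -2, 1],
    "R5": [1, -4, 6, -4, 1],
}


def outer(col_kernel, row_kernel):
    """2-D Laws' mask as the outer product of two 1-D kernels."""
    return [[a * b for b in row_kernel] for a in col_kernel]


def rotation_invariant_pairs(kernels):
    """Unordered kernel pairs: masks such as E5L5 and L5E5 are treated as one."""
    return list(combinations_with_replacement(sorted(kernels), 2))


pairs = rotation_invariant_pairs(KERNELS_5)
masks = {a + b: outer(KERNELS_5[a], KERNELS_5[b]) for a, b in pairs}
N_STATS = 5  # assumed number of statistics per texture energy image
n_features = len(pairs) * N_STATS
print(len(pairs), n_features)
```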
2-D GWT features
In this study, 2-D GWT multi-scale decomposition has been carried out using three magnitude values (0, 1 and 2) and seven directions (22.5°, 45°, 67.5°, 90°, 112.5°, 135° and 157.5°), giving a bank of 21 (3 × 7) Gabor wavelet filters. The real part of the Gabor wavelet filter bank is shown in Fig. 8.
Further, a set of 21 filtered images is obtained by convolving the ROI with the real part of the Gabor filter bank. Each filtered image, i.e. feature image, represents the texture information at a certain magnitude and direction. Two statistics, mean and standard deviation, are computed from each of these 21 feature images, resulting in a feature vector of length 42. Thus a 2-D Gabor feature vector of length 42 is used for this study.
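The 3 × 7 filter bank and the resulting 42-element feature vector can be illustrated as follows. The Gabor parameters are not stated in the text, so the wavelength, the Gaussian width, the kernel size and the reading of the three "magnitude" values as scale indices are all assumptions made for this sketch; only the seven directions, the use of the real part, and the mean and standard deviation statistics come from the text.

```python
import math
from statistics import mean, pstdev

SCALES = (0, 1, 2)
ANGLES_DEG = (22.5, 45, 67.5, 90, 112.5, 135, 157.5)


def gabor_real(scale, theta_deg, half=4):
    """Real part of a 2-D Gabor kernel; wavelength and sigma are illustrative choices."""
    wavelength = 4.0 * 2 ** scale
    sigma = 0.5 * wavelength
    theta = math.radians(theta_deg)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def filter_valid(image, kernel):
    """Responses of the kernel over the 'valid' region of the image.

    The real Gabor kernel is point-symmetric, so correlation equals convolution.
    """
    k = len(kernel)
    return [
        sum(kernel[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
        for r in range(len(image) - k + 1)
        for c in range(len(image[0]) - k + 1)
    ]


def gabor_features(roi):
    """Mean and standard deviation of each of the 21 filtered images (42 values)."""
    features = []
    for s in SCALES:
        for a in ANGLES_DEG:
            response = filter_valid(roi, gabor_real(s, a))
            features += [mean(response), pstdev(response)]
    return features


# Small synthetic ROI to keep the pure-Python filtering fast.
roi = [[(r * 7 + c * 3) % 16 for c in range(16)] for r in range(16)]
features = gabor_features(roi)
print(len(features))
```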
3.5 Feature space dimensionality reduction module
The computed texture feature vectors (TFVs) may contain redundant features which are correlated with each other and thus provide no extra information. Using redundant features may degrade the performance of the designed classifier. Therefore, the computed TFVs are passed to a dimensionality reduction stage using PCA [2, 28, 40, 50, 72]. In this study, to find the optimal number of principal components (PCs) to retain for the classification task, reduced texture feature vectors have been computed for all classifiers by varying the number of PCs from 2 to 15. The steps involved in the implementation of the PCA algorithm are given in Fig. 9.
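A minimal NumPy sketch of this reduction stage is given below. Fig. 9 is not reproduced in this text, so the details are assumptions: features are standardised using training statistics, the principal directions are obtained from an SVD of the standardised training matrix, and the test data are projected with the training-derived transform. The matrix sizes and random data are purely illustrative.

```python
import numpy as np


def fit_pca(train, n_components):
    """Learn the PCA projection from training feature vectors (rows = ROIs)."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    z = (train - mean) / std
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return mean, std, vt[:n_components]


def apply_pca(x, model):
    """Project feature vectors onto the retained principal directions."""
    mean, std, components = model
    return ((x - mean) / std) @ components.T


rng = np.random.default_rng(0)
train = rng.normal(size=(240, 292))  # synthetic stand-in: ROIs x 292 texture features
test = rng.normal(size=(240, 292))

# One reduced test set per candidate number of PCs (2 to 15, as in the text).
reduced = {k: apply_pca(test, fit_pca(train, k)) for k in range(2, 16)}
print(reduced[2].shape, reduced[15].shape)
```

Each candidate number of PCs would then be scored with the classifier at the corresponding node, and the value giving the best accuracy retained.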
3.6 Classification module
The classification module consists of three binary classifiers arranged in a hierarchical framework. These three classifiers provide stepwise classification for the generalized 4-class breast density classification problem. The first classifier classifies an unknown test ROI into B-I/other class. If the test ROI is predicted as other class, it is passed to the second classifier for classification into B-II/dense class. If the test ROI is predicted as belonging to dense class, it is passed to the third classifier for classification into B-III/B-IV class. The generalized block diagram of the hierarchical framework for classification of breast density is shown in Fig. 10.
Mapping of the higher dimensional feature space to a lower dimensional feature space using the principal component analysis algorithm is applied individually before designing each binary classifier. Initially, five different hierarchical frameworks for classification of breast density are designed using three PCA-kNN classifiers (shown in Fig. 11), three PCA-PNN classifiers (shown in Fig. 12), three PCA-ANN classifiers (shown in Fig. 13), three PCA-NFC classifiers (shown in Fig. 14) and three PCA-SVM classifiers (shown in Fig. 15). The performance of each binary classifier is evaluated at each node, and the best classifiers (yielding the maximum accuracy) at each node are combined in a hierarchical framework to design the hybrid hierarchical framework (shown in Fig. 16) for classification of breast density.
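The routing logic of the three-node cascade can be sketched as follows. The classifiers are passed in as plain callables so that any of the binary models discussed later could be plugged in; the toy threshold classifiers and the single "density score" feature are hypothetical and exist only to exercise the control flow.

```python
def hierarchical_predict(x, clf1, clf2, clf3):
    """Route one sample through the three binary nodes.

    clf1 returns "B-I" or "other", clf2 returns "B-II" or "dense",
    clf3 returns "B-III" or "B-IV".
    """
    if clf1(x) == "B-I":
        return "B-I"
    if clf2(x) == "B-II":
        return "B-II"
    return clf3(x)


def threshold_classifier(threshold, below, above):
    """Toy binary classifier on a single hypothetical density score."""
    return lambda score: below if score < threshold else above


clf1 = threshold_classifier(0.25, "B-I", "other")
clf2 = threshold_classifier(0.50, "B-II", "dense")
clf3 = threshold_classifier(0.75, "B-III", "B-IV")

labels = [hierarchical_predict(x, clf1, clf2, clf3) for x in (0.1, 0.4, 0.6, 0.9)]
print(labels)
```

Note that an error at an early node cannot be corrected later: a B-III sample predicted as B-I by the first classifier never reaches the third one.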
3.6.1 Hierarchical framework for classification of breast density using PCA-kNN classifiers
The kNN classifier is an instance based classifier in which the class of a testing instance is decided by the majority class among its k nearest neighbors in the training set, where nearness is measured by the Euclidean distance between instances [6, 11, 12, 18, 52, 59, 62, 65]. It assumes that instances of feature vectors lying close to each other in the feature space belong to the same class. The classification performance is affected by the parameter k. In this work, the value of k is optimized by repeated experimentation, varying k from 1 to 10 in steps of 1; if the same performance is achieved for more than one value of k, the minimum value of k is considered.
The block diagram of hierarchical framework for classification of breast density using PCA-kNN classifiers is shown in Fig. 11.
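The selection of k described above (k from 1 to 10, smallest k on ties) can be sketched in pure Python as follows. The text does not state on which data the performance for each k is measured, so the separate evaluation set here is an assumption, and the toy two-cluster data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points nearest to x (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, eval_x, eval_y, k):
    """Fraction of evaluation samples classified correctly for a given k."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(eval_x, eval_y))
    return hits / len(eval_x)


def best_k(train_x, train_y, eval_x, eval_y, k_values=range(1, 11)):
    """Smallest k that reaches the highest accuracy over the candidate values."""
    scores = {k: accuracy(train_x, train_y, eval_x, eval_y, k) for k in k_values}
    top = max(scores.values())
    return min(k for k, s in scores.items() if s == top), scores


train_x = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
train_y = ["B-I"] * 4 + ["other"] * 4
eval_x = [(0.5, 0.5), (5.5, 5.5)]
eval_y = ["B-I", "other"]

k_opt, scores = best_k(train_x, train_y, eval_x, eval_y)
print(k_opt)
```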
3.6.2 Hierarchical framework for classification of breast density using PCA-PNN classifiers
The PNN classifier is a direct continuation of the theory of Bayesian classification and the estimation of probability density functions (PDFs). The architecture of the PNN classifier comprises an input layer, a pattern layer, a summation layer and a decision layer. The PNN classification algorithm defines a PDF and an optimized kernel width parameter for each class on the basis of the training dataset [50]. The width of the radial basis kernel function (RBF) is determined by the spread parameter, denoted Sp. In this work, Sp is optimized by repeated experimentation, stepping through values of Sp ranging from 1 to 10. The PNN classifier trained with the optimum value of Sp is then tested with the reduced instances of the testing dataset [58, 76, 81]. Instances of feature vectors consisting of the optimal number of PCs obtained for the binary classification tasks (i.e. B-I/other class, B-II/dense class and B-III/B-IV) are fed to the input layer of the corresponding binary PNN classifiers.
The block diagram of hierarchical framework for classification of breast density using PCA-PNN classifiers is shown in Fig. 12.
3.6.3 Hierarchical framework for classification of breast density using PCA-ANN classifiers
The architecture of the ANN classifier comprises an input layer, a hidden layer and an output layer. For designing each ANN classifier, the output neuron corresponding to the class label is set to 1 and the other output neurons are set to 0, i.e. the learning of each ANN classifier is supervised. Adaptive learning with the back-propagation algorithm is used to obtain the desired input-output relationship [8, 20, 24, 34, 56, 74, 84, 89, 90, 94, 95, 98]. For the design of an efficient hierarchical ANN classifier, a trial-and-error procedure was used to optimize the number of hidden layer neurons. After extensive experimentation with different numbers of hidden layer neurons, it was observed that 10 neurons in the hidden layer of ANN-1 to ANN-3 gave a reasonable tradeoff between convergence and accuracy.
Instances of feature vectors consisting of the optimal number of PCs obtained for the binary classification tasks (i.e. B-I/other class, B-II/dense class and B-III/B-IV) are fed to the input layer of the corresponding binary ANN (BNN) classifiers. The block diagram of the hierarchical framework for classification of breast density using PCA-ANN classifiers is shown in Fig. 13.
3.6.4 Hierarchical framework for classification of breast density using PCA-NFC classifiers
The neuro fuzzy classifier (NFC) is a multilayer feed-forward network comprising an input layer, membership layer, fuzzification layer, defuzzification layer, normalization layer and output layer [3, 10, 16, 27, 31, 41, 43, 44, 66, 82, 85]. It is worth mentioning that fuzzy inference systems lack learning capability, whereas neural networks have it. The NFC combines a fuzzy inference system with a neural network and thereby overcomes the limitations of each. Thus the NFC is able both to learn and to represent knowledge according to defined rules. In the present study, instances of feature vectors consisting of the optimal number of PCs obtained for the binary classification tasks (i.e. B-I/other class, B-II/dense class and B-III/B-IV) are fed to the input layer of the corresponding binary NFC classifiers. The block diagram of the hierarchical framework for classification of breast density using PCA-NFC classifiers is shown in Fig. 14.
3.6.5 Hierarchical framework for classification of breast density using PCA-SVM classifiers
All three binary SVM classifiers designed for the hierarchical framework for classification of breast density are implemented using the LibSVM library [17]. In the SVM algorithm, the training data is mapped from the lower dimensional input space to a higher dimensional feature space, and kernel functions are used to perform this nonlinear mapping. In this study, the Gaussian radial basis kernel function based SVM classifier (available in the LibSVM library) has been used for the design of the hierarchical framework for classification of breast density.
The 10-fold cross-validation approach is used to optimize the kernel width γ and the regularization parameter C of the Gaussian radial basis function (RBF) kernel, by extensive experiments carried out on the training data for the values $\gamma \in \{2^{-12}, 2^{-11}, \ldots, 2^{4}\}$ and $C \in \{2^{-4}, 2^{-3}, \ldots, 2^{15}\}$ [9, 34, 55, 78]. The block diagram of the hierarchical framework for classification of breast density using PCA-SVM classifiers is shown in Fig. 15.
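The search grid described above can be enumerated directly: 17 values of γ and 20 values of C, i.e. 340 (C, γ) pairs. In the sketch below, `score` is a placeholder for the mean 10-fold cross-validation accuracy of an RBF-kernel SVM trained with the given parameters; it is not a LibSVM call, and the dummy score in the usage line is purely illustrative.

```python
from itertools import product

gammas = [2.0 ** e for e in range(-12, 5)]  # 2^-12 ... 2^4
costs = [2.0 ** e for e in range(-4, 16)]   # 2^-4 ... 2^15
grid = list(product(costs, gammas))


def grid_search(score, grid):
    """Return the (C, gamma) pair with the highest score.

    `score(C, gamma)` stands in for the cross-validated accuracy of the
    SVM trained with those parameters (hypothetical callable).
    """
    return max(grid, key=lambda cg: score(*cg))


# Usage with a dummy score that peaks at C = 8, gamma = 2^-5.
best_c, best_gamma = grid_search(lambda c, g: -abs(c - 8.0) - abs(g - 2.0 ** -5), grid)
print(len(grid), best_c, best_gamma)
```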
4 Experiments and results
For the 4-class breast density classification task, five hierarchical frameworks are designed, one for each classifier type (i.e. PCA-kNN, PCA-PNN, PCA-ANN, PCA-NFC and PCA-SVM).
Brief details of the experiments carried out for the hierarchical framework for classification of breast density using digitized screen film mammograms are reported in Table 2.
The performance of each designed hierarchical framework is evaluated at each node in terms of the accuracy of the binary classifier, expressed as Acc_Bin_Class, the overall classification accuracy, expressed as OCA, and the individual class accuracy, expressed as ICA.
- Experiment 1: The performance of the hierarchical framework for classification of breast density using PCA-kNN classifiers is given in Table 3.
- Experiment 2: The performance of the hierarchical framework for classification of breast density using PCA-PNN classifiers is given in Table 4.
- Experiment 3: The performance of the hierarchical framework for classification of breast density using PCA-ANN classifiers is given in Table 5.
- Experiment 4: The performance of the hierarchical framework for classification of breast density using PCA-NFC classifiers is given in Table 6.
- Experiment 5: The performance of the hierarchical framework for classification of breast density using PCA-SVM classifiers is given in Table 7.
The value of OCA is obtained by adding the number of misclassifications obtained at each stage of the hierarchical framework. The hierarchical framework using PCA-NFC classifiers yields the minimum, i.e. a total of 47 misclassifications, consisting of 8, 6, 6, 10, 14 and 3 misclassifications across the PCA-NFC1, PCA-NFC2 and PCA-NFC3 classifiers. Therefore, the OCA for the hierarchical framework using PCA-NFC classifiers is {(240 − 47) / 240} × 100 = (193 / 240) × 100 = 80.4%.
By examining the performance of the individual binary classifiers of the PCA-kNN, PCA-PNN, PCA-ANN, PCA-NFC and PCA-SVM based hierarchical frameworks (shown in Tables 3, 4, 5, 6 and 7), some interesting facts are observed:
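The OCA arithmetic above can be reproduced in a few lines. The misclassification counts and the number of test instances are those quoted in the text; the function name is illustrative.

```python
def overall_classification_accuracy(misclassified, n_test):
    """Total errors and OCA in percent from per-stage misclassification counts."""
    errors = sum(misclassified)
    return errors, 100.0 * (n_test - errors) / n_test


errors, oca = overall_classification_accuracy([8, 6, 6, 10, 14, 3], n_test=240)
print(errors, round(oca, 1))
```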
(a) For classification between B-I/other class, the maximum accuracy of 96.2% is obtained by using the PCA-SVM1 classifier, in comparison with 85.0%, 84.5%, 80.4% and 94.1% as obtained by using the PCA-kNN1, PCA-PNN1, PCA-ANN1 and PCA-NFC1 classifiers respectively.
(b) For further classification of other class instances into B-II/dense class, the maximum accuracy of 91.1% is obtained by using the PCA-NFC2 classifier, in comparison with 90.5%, 87.7%, 72.7% and 88.3% as obtained by using the PCA-kNN2, PCA-PNN2, PCA-ANN2 and PCA-SVM2 classifiers respectively.
(c) For classification of dense class into B-III/B-IV class, the maximum accuracy of 89.1% is obtained by using the PCA-kNN3 classifier, in comparison with 85.8%, 80.8%, 81.6% and 85.8% as obtained by using the PCA-PNN3, PCA-ANN3, PCA-SVM3 and PCA-NFC3 classifiers respectively.
4.1 Comparative analysis
The comparative performance analysis of designed hierarchical frameworks for classification of breast density using various experiments carried out in this work is reported in Table 8.
From Table 8, it can be observed that the PCA-NFC based hierarchical framework performs better than the PCA-kNN, PCA-PNN, PCA-ANN and PCA-SVM based hierarchical frameworks for 4-class breast density classification. For classification between B-I/other class, PCA-SVM1 performs best, at a PCs value of 9. For classification between B-II/dense, PCA-NFC2 performs best, at a PCs value of 7, and for classification between B-III/B-IV, PCA-kNN3 performs best, at a PCs value of 11.
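The choice of the best classifier at each node can be expressed as a simple maximum over the node-wise accuracies reported in observations (a) to (c). The dictionary below restates those percentages; the variable names are illustrative.

```python
# Node-wise binary classification accuracies (percent), as reported in the text.
node_accuracy = {
    "B-I/other": {"PCA-kNN": 85.0, "PCA-PNN": 84.5, "PCA-ANN": 80.4,
                  "PCA-NFC": 94.1, "PCA-SVM": 96.2},
    "B-II/dense": {"PCA-kNN": 90.5, "PCA-PNN": 87.7, "PCA-ANN": 72.7,
                   "PCA-NFC": 91.1, "PCA-SVM": 88.3},
    "B-III/B-IV": {"PCA-kNN": 89.1, "PCA-PNN": 85.8, "PCA-ANN": 80.8,
                   "PCA-NFC": 85.8, "PCA-SVM": 81.6},
}

# Pick the most accurate classifier type at each node of the hierarchy.
hybrid = {node: max(scores, key=scores.get) for node, scores in node_accuracy.items()}
print(hybrid)
```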
- Experiment 6: Design of the hybrid hierarchical framework for classification of breast density using the best performing individual classifiers at each node of the hierarchical framework
The architecture of the proposed hybrid hierarchical framework for classification of breast density, designed using the best performing individual classifiers at each node of the hierarchical framework, is shown in Fig. 16.
The performance obtained for proposed hybrid hierarchical framework for classification of breast density for digitized screen film mammograms is reported in Table 9.
From Table 9, it is concluded that the proposed hybrid hierarchical framework yields the maximum OCA value of 84.1% with only 38 misclassifications out of 240 test instances. The proposed hybrid hierarchical framework performs best in comparison with the PCA-kNN, PCA-PNN, PCA-ANN, PCA-NFC and PCA-SVM based hierarchical frameworks for classification of breast density using digitized screen film mammograms. The OCA obtained by the hybrid hierarchical framework is 84.1%, in comparison with 72.5%, 68.5%, 50.4%, 78.3% and 80.4% as obtained by the PCA-kNN, PCA-PNN, PCA-ANN, PCA-SVM and PCA-NFC based hierarchical frameworks respectively.
5 Conclusion
In routine clinical screening mammography, experts have observed that breast lesions can be missed in dense mammograms. Thus, in this study extensive experiments have been performed for breast density classification using PCA-kNN, PCA-PNN, PCA-ANN, PCA-NFC and PCA-SVM based hierarchical frameworks. Among these, the PCA-NFC based hierarchical framework yields the highest OCA value of 80.4%, with 47 misclassifications out of 240 test instances. However, the hybrid hierarchical framework designed by combining the best binary classifiers at each node yields an OCA value of 84.1%, with only 38 misclassifications out of 240 test instances. The proposed hybrid hierarchical classification framework performs best in comparison with each of the PCA-kNN, PCA-PNN, PCA-ANN, PCA-NFC and PCA-SVM based hierarchical classification frameworks for breast density classification. The result achieved by the proposed hybrid hierarchical framework for classification of breast density using digitized screen film mammograms is promising and indicates its potential to assist radiologists in adequately scheduling breast lesion treatment in a clinical environment.
Kumar, I., Bhadauria, H.S., Virmani, J. et al. A hybrid hierarchical framework for classification of breast density using digitized film screen mammograms. Multimed Tools Appl 76, 18789–18813 (2017). https://doi.org/10.1007/s11042-016-4340-z
DOI: https://doi.org/10.1007/s11042-016-4340-z