1 Introduction

Breast cancer is currently the most commonly diagnosed disease among women [1]. High breast tissue density has been shown to be associated with a high risk of developing breast cancer [2,3,4,5,6,7,8,9,10]. The mortality rate for breast cancer can be reduced if detection is made at an early stage. Based on its density, breast tissue is broadly classified as fatty, fatty-glandular, or dense-glandular.

Various computer-aided diagnostic (CAD) systems have been developed to discriminate between different density patterns, providing radiologists with a second opinion tool to validate their diagnosis. A few studies have been carried out on the Mammographic Image Analysis Society (MIAS) dataset for classification of breast tissue density patterns into fatty, fatty-glandular, and dense-glandular tissue types [3,4,5,6,7,8,9,10]. Most of these studies have been carried out on the segmented breast tissue (SBT), and only rarely on fixed-size regions of interest (ROIs) [3,4,5,6,7,8,9,10]. Among these studies, Subashini et al. [6] report a maximum accuracy of 95.4% using the SBT approach, and Mustra et al. [9] report a maximum accuracy of 82.0% using the ROI extraction approach.

The experienced participating radiologist (one of the authors of this paper) graded the fatty, fatty-glandular, and dense-glandular images as belonging to typical or atypical categories. The sample images of typical and atypical cases depicting different density patterns are shown in Fig. 1.

Fig. 1 Sample mammographic images from the MIAS database: (a) typical fatty tissue ‘mdb132’, (b) typical fatty-glandular tissue ‘mdb016’, (c) typical dense-glandular tissue ‘mdb216’, (d) atypical fatty tissue ‘mdb096’, (e) atypical fatty-glandular tissue ‘mdb090’, (f) atypical dense-glandular tissue ‘mdb100’

In the present work, a hierarchical classifier with two stages of binary classification is proposed. Each stage uses a support vector machine (SVM) classifier with Laws’ texture features as input: the first stage differentiates between fatty and dense breast tissues, and the second stage differentiates between fatty-glandular and dense-glandular breast tissues.

2 Methodology

2.1 Description of Dataset

The MIAS database consists of a total of 322 mammographic images, of which 106 are fatty, 104 are fatty-glandular, and 112 are dense-glandular [11]. From each image, a fixed-size ROI has been extracted for further processing.

2.2 Selecting Regions of Interest

Based on repeated experiments, it has been asserted that the center area of the breast tissue is the optimal choice for classification of breast density [12]. Accordingly, a fixed-size ROI of 200 × 200 pixels has been extracted from each mammogram, as depicted in Fig. 2.

Fig. 2 ROI extraction protocol for mammographic image ‘mdb216’ with ROI marked
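The cropping step can be illustrated with a minimal sketch. It assumes that the mammogram is held as a 2D list of gray levels and that the center of the breast tissue has already been located; the helper name `extract_roi`, the border-clamping behavior, and the synthetic test image are assumptions of this sketch and are not taken from the original work.

```python
def extract_roi(image, center_row, center_col, size=200):
    """Return a size x size crop of `image` centered near (center_row, center_col).

    `image` is a 2D list (rows of gray levels). The window is shifted inward
    when it would cross the image border (an assumption of this sketch).
    """
    n_rows, n_cols = len(image), len(image[0])
    if size > n_rows or size > n_cols:
        raise ValueError("ROI is larger than the image")
    half = size // 2
    # Clamp the top-left corner so the window stays inside the image.
    top = min(max(center_row - half, 0), n_rows - size)
    left = min(max(center_col - half, 0), n_cols - size)
    return [row[left:left + size] for row in image[top:top + size]]


# Hypothetical 1024 x 1024 image filled with a synthetic gradient.
image = [[(r + c) % 256 for c in range(1024)] for r in range(1024)]
roi = extract_roi(image, center_row=512, center_col=400)
```

Here `roi` is a 200 × 200 block whose top-left pixel corresponds to row 412, column 300 of the synthetic image.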

2.3 Proposed Method

Computer-aided diagnostic systems analyze mammograms by computer and can be used by radiologists as a second opinion tool for validating their diagnosis. These systems tend to improve diagnostic accuracy by detecting lesions that might be missed during subjective analysis [1, 3,4,5,6,7,8,9,10, 13, 14]. The block diagram of the proposed system is shown in Fig. 3.

Fig. 3 Proposed classification system

Feature Extraction Module: The texture descriptor vectors (TDVs) derived from Laws’ texture analysis using Laws’ masks of lengths 3, 5, 7, and 9 have been used in the present work for the design of the SVM-based hierarchical classifier.
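As an illustration of Laws’ texture analysis, the sketch below builds the nine 2D masks of length 3 from the standard 1D kernels L3, E3, and S3 and summarizes each filtered image by its mean absolute response. The choice of statistic, the function names, and the toy ROI are assumptions of this sketch; the exact statistics that make up the TDVs are not specified here and may differ.

```python
from itertools import product

# Standard 1D Laws' kernels of length 3 (Level, Edge, Spot).
KERNELS_3 = {
    "L3": [1, 2, 1],
    "E3": [-1, 0, 1],
    "S3": [-1, 2, -1],
}


def outer(u, v):
    """2D mask as the outer product of a column kernel u and a row kernel v."""
    return [[a * b for b in v] for a in u]


def filter_valid(image, mask):
    """Correlate `image` with `mask`, keeping only fully overlapping positions."""
    k = len(mask)
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(mask[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
            for c in range(cols - k + 1)
        ]
        for r in range(rows - k + 1)
    ]


def laws_energy_features(roi, kernels=KERNELS_3):
    """Mean absolute filter response for every 2D Laws' mask (one value per mask)."""
    features = {}
    for (name_u, u), (name_v, v) in product(kernels.items(), repeat=2):
        response = filter_valid(roi, outer(u, v))
        values = [abs(x) for row in response for x in row]
        features[name_u + name_v] = sum(values) / len(values)
    return features


# Toy 6 x 6 "ROI" with a vertical step edge between columns 2 and 3.
toy_roi = [[0, 0, 0, 10, 10, 10] for _ in range(6)]
features = laws_energy_features(toy_roi)
```

For this toy ROI, the L3E3 mask (smoothing vertically, edge detection horizontally) responds to the vertical edge, whereas E3L3 gives zero because all rows are identical. Masks of length 5, 7, or 9 could be handled the same way by passing a different kernel dictionary.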

Feature Classification Module: The support vector machine classifier has been extensively used for classification of texture patterns in medical images [1, 14,15,16,17,18,19]. In the present work, two binary SVM classifiers arranged in a hierarchical framework have been used for three-class breast tissue density classification. The SVM classifier is implemented using the LibSVM library [20].
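The hierarchical decision logic can be sketched as follows. The two stages are represented by plain callables so that the example stays self-contained; in practice they would be trained binary SVMs. The class name, the threshold rules standing in for the SVMs, and the feature values are all hypothetical.

```python
class HierarchicalClassifier:
    """Two chained binary decisions: fatty vs dense, then
    fatty-glandular vs dense-glandular."""

    def __init__(self, is_dense, is_dense_glandular):
        # Each argument is a callable mapping a feature vector to True/False.
        self.is_dense = is_dense
        self.is_dense_glandular = is_dense_glandular

    def predict(self, stage1_features, stage2_features):
        # Stage 1: fatty cases leave the hierarchy immediately.
        if not self.is_dense(stage1_features):
            return "fatty"
        # Stage 2: only cases judged dense are examined further.
        if self.is_dense_glandular(stage2_features):
            return "dense-glandular"
        return "fatty-glandular"


# Hypothetical threshold rules standing in for the two trained SVMs.
clf = HierarchicalClassifier(
    is_dense=lambda f: sum(f) / len(f) > 0.5,
    is_dense_glandular=lambda f: sum(f) / len(f) > 0.8,
)
labels = [
    clf.predict([0.1, 0.2], [0.1, 0.3]),
    clf.predict([0.6, 0.7], [0.5, 0.6]),
    clf.predict([0.9, 0.9], [0.9, 0.95]),
]
```

`predict` takes a separate feature vector for each stage, so each sub-classifier can be fed features from a different mask length.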

3 Results

Various experiments were conducted to evaluate the classification performance of Laws’ texture features with a hierarchical classifier built from two stages of binary SVM classifiers.

3.1 Classification Performance of Laws’ Texture Features Using Hierarchical Classifier

In this work, the performance of the TDVs derived using Laws’ masks of lengths 3, 5, 7, and 9 (denoted TDV1, TDV2, TDV3, and TDV4, respectively) is evaluated using the SVM-based hierarchical classifier. The results obtained are shown in Table 1.

Table 1 Performance of TDVs derived from Laws’ texture features using hierarchical classifier

From Table 1, it can be observed that an overall classification accuracy (OCA) of 91.3, 93.2, 91.9, and 92.5% is achieved for TDV1, TDV2, TDV3, and TDV4, respectively, using the SVM-1 sub-classifier, and an OCA of 92.5, 84.2, 87.0, and 90.7% is obtained for TDV1, TDV2, TDV3, and TDV4, respectively, using the SVM-2 sub-classifier.

The results in Table 1 show that, for differentiating between fatty and dense breast tissues, the SVM-1 sub-classifier performs best with features extracted using Laws’ masks of length 5 (TDV2). For the further classification of dense tissues into fatty-glandular and dense-glandular classes, the SVM-2 sub-classifier performs best with features derived from Laws’ masks of length 3 (TDV1). This analysis of the hierarchical classifier is shown in Table 2. The OCA of the hierarchical classifier is calculated from the total number of misclassified cases, obtained by adding the misclassified cases at each classification stage.

Table 2 Performance analysis of hierarchical classifier
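One plausible reading of this calculation is sketched below: the misclassified cases of both stages are pooled and subtracted from the total number of test cases. The function name and the counts are hypothetical and are not taken from Table 2.

```python
def overall_accuracy(n_total, misclassified_per_stage):
    """OCA (%) when the misclassified cases of all stages are pooled."""
    errors = sum(misclassified_per_stage)
    if not 0 <= errors <= n_total:
        raise ValueError("error count must lie between 0 and n_total")
    return 100.0 * (n_total - errors) / n_total


# Hypothetical counts: 100 test cases, 5 errors at stage 1 and 7 at stage 2.
oca = overall_accuracy(n_total=100, misclassified_per_stage=[5, 7])
```

With these made-up counts, 12 of 100 cases are misclassified in total, giving an OCA of 88%.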

4 Conclusion

From the exhaustive experiments carried out in the present work, it can be concluded that Laws’ masks of length 5 yield the maximum classification accuracy of 93.2% for differential diagnosis between fatty and dense classes, and Laws’ masks of length 3 yield the maximum classification accuracy of 92.5% for differential diagnosis between fatty-glandular and dense-glandular classes. Further, for the three-class problem, a single multi-class SVM classifier would construct three different binary SVM sub-classifiers, each trained to separate a pair of classes, with the final decision made by majority voting. In the hierarchical framework, the classification can be done using only two binary SVM sub-classifiers.
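The sub-classifier counts can be checked with a short sketch: a pairwise (one-versus-one) scheme needs one binary classifier per pair of classes, whereas the two-stage chain used here needs one per stage. The variable names are illustrative only.

```python
from itertools import combinations

classes = ["fatty", "fatty-glandular", "dense-glandular"]

# One-versus-one: one binary sub-classifier for every unordered pair of classes.
one_vs_one_pairs = list(combinations(classes, 2))
n_one_vs_one = len(one_vs_one_pairs)

# Two-stage chain used here: fatty vs dense, then fatty-glandular vs dense-glandular.
n_hierarchical = len(classes) - 1
```

For three classes this gives three pairwise sub-classifiers against two in the hierarchical arrangement.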