
1 Introduction

Breast cancer is one of the most common cancers in women. As per official reports, 25%–30% of all female-related deaths in India resulted from this cancer [1]. A study showed that in 2018, 162,468 new cases were registered and 87,090 deaths were reported [2]. Major reasons for this are low public awareness, little or no screening, and high testing costs. Figure 1 shows that, compared to 25 years ago, breast cancer cases have increased in the 20–50 age group [3]. Although the exact cause of breast cancer is still unknown, several lifestyle guidelines have been suggested that decrease the chances of developing it, among them maintaining a balanced BMI, regular physical exercise, and breastfeeding [4]. However, not all risk factors can be controlled: early menstruation, late menopause, late marriage, and contraceptive drugs are a few that increase the chances of breast cancer.

Fig. 1. Breast cancer percentage change in India across different age groups

With the advancement of technology in both medicine and computer science, the detection, diagnosis, and treatment of diseases have improved drastically. New methods and techniques that aid the medical process are continually being discovered. For breast cancer detection, many imaging modalities exist. In hospitals, various breast imaging methods are used for early breast cancer detection and screening, including computed tomography (CT) [6] and magnetic resonance imaging (MRI) [5, 7, 8]. However, mammography is gaining popularity for its low complexity and better availability. Figure 2 [9] shows the imaging techniques used for breast cancer detection.

This paper, however, focuses mainly on mammograms and uses the MIAS dataset for its experiments. Mammography is performed through X-ray exposure of the breast. Breast tissue absorbs X-ray radiation when exposed to it, and normal breast tissue and cancer cells produce different signal levels. The task of classifying lumps in a mammogram, though, falls to the radiologist, whose prediction is based on experience [10] and the quality of the mammogram [11]. In addition, breast anomalies can be hidden by the breast tissue structure, which makes them more difficult to detect [12]. A major problem with mammogram images is their low contrast [13], which makes it difficult to detect lumps and has led to high rates of false positives (normal changes classified as cancerous) and false negatives (actual abnormalities not detected) [14,15,16].

Fig. 2. Different imaging modalities for the diagnosis of breast cancer

Image fusion is a process that applies different methods and techniques to combine the information of several images, acquired either from the same platform or from different spectroscopic platforms, into a single output image. The resultant image (known as the fused image) contains more detailed, useful, and predictable information for machine perception or human understanding [17]. Each input image might have a different focus area, and the complete information might not be present in any single image; the fusion process combines them into one image that is more detailed than any individual input. To combine multiple images, all of them should depict the same area, as photos taken from different angles make the fusion process difficult.

Different medical conditions require different treatment processes; image fusion can combine MRI, CT, PET, and SPECT data for better results. Multi-modal medical image fusion algorithms have achieved notable improvements in clinical accuracy [18].

Pixel level [19], feature level, and decision level fusion [20] are the main types of fusion techniques, illustrated in Fig. 3(a). In this paper, we use decision level fusion in our experiment. In this method, fusion is applied after the classification step is complete, generally by combining the outputs of multiple algorithms to obtain the final result. When confidences are used instead of decisions, it is known as soft fusion; otherwise, it is called hard fusion. Methods of decision fusion are shown in Fig. 3(b), and a minimal sketch of the two styles follows.
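To make the distinction concrete, the sketch below (ours, not from any cited work; plain NumPy) contrasts the two styles: hard fusion takes a majority vote over the classifiers' discrete labels, while soft fusion averages their confidence scores before deciding.

```python
import numpy as np

def hard_fusion(labels):
    # labels: (n_classifiers, n_samples) array of non-negative integer class
    # labels; each column (one sample) is resolved by majority vote.
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, labels)

def soft_fusion(confidences):
    # confidences: (n_classifiers, n_samples, n_classes) class probabilities;
    # average across classifiers, then pick the most probable class.
    return np.mean(confidences, axis=0).argmax(axis=1)
```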

Fig. 3. (a) Pixel, feature and decision level fusion; (b) decision level methods

With advancements in algorithms and better computational power, machine learning has helped solve many real-life problems. These algorithms help manage huge amounts of data and find correlations within it that would be impossible to find manually. Deep learning, a subfield of machine learning, is also gaining a lot of popularity. In this paper, we use deep learning for feature extraction and feed the extracted features to SVM and decision tree classifiers. Finally, we use a voting classifier for decision level fusion, combining the results from these two classifiers to predict the output.

2 Literature Review

This section reviews related work by researchers using fusion techniques. Pixel level, feature level, and decision level fusion have been applied to modalities such as MRI, CT, and mammograms [21], and work on other modalities is reviewed as well. Multiple authors have also proposed CADx solutions [22, 23], pipeline structures, and frameworks for the fusion and classification of breast cancer images.

Using the MIAS dataset, the authors of [24] implemented pixel level fusion. They conducted their experiment on three different mammograms from the dataset: normal, benign, and microcalcification X-rays. They tested the simple average and weighted average methods and documented the results in terms of Signal to Noise Ratio (SNR), Peak Signal to Noise Ratio (PSNR), Root Mean Square Error (RMSE), Mutual Information (MI), etc. They concluded that image fusion provides better results than the original image [24].
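As a rough illustration of the two rules they tested, the sketch below (ours, not the authors' code) implements simple and weighted average pixel-level fusion plus the PSNR metric; the weight of 0.7 is purely illustrative.

```python
import numpy as np

def average_fusion(img_a, img_b):
    # Simple average pixel-level fusion of two registered mammograms.
    return (img_a.astype(np.float64) + img_b.astype(np.float64)) / 2.0

def weighted_fusion(img_a, img_b, w=0.7):
    # Weighted average fusion; w weights the first image, 1 - w the second.
    return w * img_a.astype(np.float64) + (1.0 - w) * img_b.astype(np.float64)

def psnr(reference, fused, peak=255.0):
    # Peak Signal to Noise Ratio, one of the metrics reported in [24].
    mse = np.mean((reference.astype(np.float64) - fused) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```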

Using the same MIAS dataset, the authors of [25] presented an image fusion method based on Particle Swarm Optimization (PSO). PSO was used to calculate the optimal fusion weights, and the results were compared with conventional DWT and genetic algorithms. They compared the results on the same fusion parameters as [24] and concluded that genetic-algorithm-based DWT provides better results than the weighted average and traditional DWT [25].

The authors of [26] presented a local entropy maximization based image fusion technique to improve the contrast of mammograms, using the MIAS and TMCH datasets. With the Haar wavelet, they decomposed the original image and its CLAHE-enhanced version into three levels. Using a sliding window of size 5 × 5, they fused the coefficients by choosing those with maximum local entropy. Finally, the fused image was reconstructed from these coefficients, and the outcomes were validated in terms of edge content (EC), edge-based contrast measure (EBCM), feature similarity index measure (FSIM), and absolute mean brightness error (AMBE), achieving an EC of 1.87, EBCM of 120.1, FSIM of 0.97, and AMBE of 2.01. Compared with HE, BBHE, and CLAHE, their method showed better results.
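The sketch below approximates such a scheme with PyWavelets; for brevity it selects coefficients over non-overlapping 5 × 5 blocks rather than a true sliding window, so it illustrates the idea in [26] rather than reproducing it.

```python
import numpy as np
import pywt

def block_entropy(block):
    # Shannon entropy of a coefficient block, from a 16-bin histogram.
    hist, _ = np.histogram(block, bins=16)
    p = hist[hist > 0] / block.size
    return -np.sum(p * np.log2(p))

def fuse_coeffs(a, b, win=5):
    # Keep, block by block, whichever coefficient patch has higher entropy.
    out = a.copy()
    for i in range(0, a.shape[0], win):
        for j in range(0, a.shape[1], win):
            if block_entropy(b[i:i+win, j:j+win]) > block_entropy(a[i:i+win, j:j+win]):
                out[i:i+win, j:j+win] = b[i:i+win, j:j+win]
    return out

def wavelet_fusion(original, enhanced, levels=3):
    # 3-level Haar decomposition of both inputs, entropy-based coefficient
    # selection, then reconstruction of the fused image.
    ca = pywt.wavedec2(original, 'haar', level=levels)
    cb = pywt.wavedec2(enhanced, 'haar', level=levels)
    fused = [fuse_coeffs(ca[0], cb[0])]
    for (ha, va, da), (hb, vb, db) in zip(ca[1:], cb[1:]):
        fused.append((fuse_coeffs(ha, hb), fuse_coeffs(va, vb), fuse_coeffs(da, db)))
    return pywt.waverec2(fused, 'haar')
```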

Using 400 mammogram images collected from hospitals, the authors of [27] proposed a CAD system based on feature fusion. First, they suggested a mass detection method based on CNN deep features and clustering with an Unsupervised Extreme Learning Machine (US-ELM). Second, they built a feature set incorporating deep features along with morphological, texture, and density features. Third, an ELM classifier was trained on the merged feature set to distinguish benign from malignant breast masses.

The authors of paper [28] proposed wavelet fusion combined with CLAHE enhancement, working with multi-modality images. In the first step, they enhanced the image contrast using CLAHE; in the second, they applied a 2D wavelet transform fusion to generate the fused image. Comparing results on parameters such as SNR, they found that their method performs better on various low-contrast medical images.

The authors of [29] presented a CAD system using the DDSM dataset. Their experiment merged features from the MLO and CC views of mammograms for better results, using five feature families, namely GLRLM, GLCM, and others. With an SVM classifier using an RBF kernel as a performance booster, they achieved 97.5% accuracy, 100% sensitivity, 97.2% specificity, 97.1% precision, a 96.23% F1 score, a Matthews Correlation Coefficient of 0.952, and a 98.74% Balanced Classification Rate.

Also using the DDSM dataset, the authors of paper [30] used an ensemble of CNNs for the classification of mammograms. They cleaned the data by contrast fading and removed white strips from the input images; in pre-processing, padding, dilation, and cropping were applied. Since they used CNNs, they applied data augmentation to mitigate overfitting. Finally, they used GoogleNet for the classification step, with decision fusion based on a max ensemble technique. After training their model for 50,000 iterations they achieved a 91.3% recall value in the stand-alone setup, which increased to 97.3% with the ensemble, along with a 94.5% F1 score and 95% precision.
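A max-rule decision fusion of this kind takes, for each sample, the class whose score is highest across all member networks; a minimal sketch (ours, with illustrative shapes) is:

```python
import numpy as np

def max_ensemble(prob_list):
    # prob_list: list of (n_samples, n_classes) softmax outputs, one per CNN.
    stacked = np.stack(prob_list)              # (n_models, n_samples, n_classes)
    return stacked.max(axis=0).argmax(axis=1)  # best-scoring class per sample
```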

The authors of [31], also using decision level fusion, worked with 65 thermography images gathered from [32, 33] and [34]. They proposed a novel texture feature extraction based on a Markov Random Field (MRF) model, and a second texture feature based on LBP was also extracted from the images. Implementing decision fusion based on an HMM, they achieved an 8.3% false negative rate and a 5% false positive rate.

The authors of paper [35], in contrast, implemented deep feature fusion across three different imaging modalities: an FFDM mammogram dataset containing 245 unique images, an ultrasound dataset containing 1125 images, and a DCE-MRI dataset containing 690 images. Using the publicly available VGG19 model as their CNN, they achieved AUC = 0.89 for DCE-MRI, AUC = 0.86 for FFDM, and AUC = 0.90 for ultrasound (Table 1).
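Extracting deep features from a pretrained VGG19, as in [35], can be sketched with tf.keras as follows; the use of global average pooling over the last convolutional block is our assumption, since the exact layer used in [35] is not stated here.

```python
import numpy as np
import tensorflow as tf

# VGG19 pretrained on ImageNet with the classification head removed; global
# average pooling turns the last convolutional block into a 512-d vector.
base = tf.keras.applications.VGG19(weights='imagenet', include_top=False,
                                   pooling='avg')

def deep_features(images):
    # images: (n, 224, 224, 3); grayscale scans must be tiled to 3 channels.
    x = tf.keras.applications.vgg19.preprocess_input(images.astype(np.float32))
    return base.predict(x)  # (n, 512) deep feature vectors
```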

Table 1. Fusion techniques overview.

3 Materials and Methods

3.1 Datasets

In this experiment, we used the MIAS dataset, which consists of 161 pairs of films covering abnormal and normal cases, i.e. 322 mammograms selected from the United Kingdom National Breast Screening Programme. A major factor in selecting the MIAS dataset is that mammograms are cheap, low in complexity, and easily available in most countries. The MIAS dataset is available in two resolutions (50 μm and 200 μm). Other publicly available mammography datasets include DDSM, TMCH, and B-SCREEN [36].

3.2 Pre-processing

A major problem with mammogram images is their low contrast, which makes it difficult to detect lumps and has led to high rates of false positives (normal changes classified as cancerous) and false negatives (actual abnormalities not detected). To address this, we implemented the CLAHE enhancement technique. CLAHE is a variant of AHE in which a threshold (clip limit) is defined at which histogram intensities are clipped before equalization. A clip limit of 0.2 was used for this experiment, implemented in Python.
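One plausible realization of this step, assuming scikit-image (whose clip_limit is normalized to [0, 1], which matches the 0.2 value; the paper states only that the step was coded in Python), is:

```python
from skimage import exposure, io

# Load a MIAS mammogram (the films are distributed as .pgm files) and apply
# contrast-limited adaptive histogram equalization with a 0.2 clip limit.
img = io.imread('mdb001.pgm', as_gray=True)
enhanced = exposure.equalize_adapthist(img, clip_limit=0.2)  # float image in [0, 1]
```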

3.3 Segmentation

Another issue with mammograms is the unimportant area in the film. For our algorithms to achieve better results and to reduce processing time, we must trim the images to the Region of Interest (ROI). Depending on the view of the breast (left or right), a majority of the image consists of zero-valued pixels, which should be removed. In this experiment, we slide a vertical line inwards from the left or the right, depending on the image, trimming until pixels with non-zero values are encountered.
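A minimal version of this trimming step might look as follows (our sketch; it crops empty columns from both sides, which subsumes the left/right choice):

```python
import numpy as np

def trim_to_roi(img, tol=0):
    # Drop leading/trailing columns whose pixels are all background (<= tol),
    # mimicking a vertical line sliding inwards until non-zero pixels appear.
    nonzero_cols = np.where((img > tol).any(axis=0))[0]
    if nonzero_cols.size == 0:
        return img  # entirely background; leave unchanged
    return img[:, nonzero_cols[0]:nonzero_cols[-1] + 1]
```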

3.4 Feature Extraction and Selection

This is one of the most important steps in our workflow. The output of the process depends heavily on the pre-processing techniques used to enhance the images and on the features used for classification; features strongly associated with the output class contribute more than weakly associated ones. In this step, we use the power of deep learning algorithms to find patterns in the input images. A CNN model is created to extract the features, and these features are used by the classification algorithms (SVM and decision tree). The CNN architecture is shown in Fig. 4. A total of 1754 high-level features and 288 low-level features were used by the classification algorithms.
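The exact architecture of Fig. 4 is not reproduced here; the sketch below (tf.keras, with purely illustrative layer sizes and feature counts) shows the general pattern described above: an early activation supplies low-level features, a late activation supplies high-level features, and the two are concatenated for the downstream classifiers.

```python
import tensorflow as tf

inp = tf.keras.Input(shape=(128, 128, 1))                    # illustrative input size
low = tf.keras.layers.Conv2D(8, 3, activation='relu')(inp)   # early, low-level maps
low = tf.keras.layers.MaxPooling2D(4)(low)
x = tf.keras.layers.Conv2D(16, 3, activation='relu')(low)
x = tf.keras.layers.MaxPooling2D(2)(x)
x = tf.keras.layers.Conv2D(32, 3, activation='relu')(x)
high = tf.keras.layers.GlobalAveragePooling2D()(x)           # late, high-level vector
feats = tf.keras.layers.Concatenate()([tf.keras.layers.Flatten()(low), high])
extractor = tf.keras.Model(inp, feats)
# features = extractor.predict(images)  # images: (n, 128, 128, 1) preprocessed scans
```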

Fig. 4. CNN model architecture for feature extraction

3.5 Classification and Decision Fusion

In this step, we implemented the SVM and decision tree classifiers. SVM is a machine learning algorithm that can be used for both classification and regression; a decision tree is likewise implemented for the classification of breast cancer. Finally, a voting classifier makes a decision based on these two input classifiers and generates the final output; a sketch is given below. Figure 5 shows the workflow we used for our experiment.
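One way to realize this decision-level fusion with scikit-learn is sketched below; the soft-voting mode and the RBF kernel are our assumptions, as the paper does not specify them, and X_train/X_test stand for the 2042 CNN features (288 low-level + 1754 high-level) per mammogram.

```python
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import VotingClassifier

svm = SVC(kernel='rbf', probability=True)   # probability=True enables soft voting
tree = DecisionTreeClassifier()
fused = VotingClassifier(estimators=[('svm', svm), ('dt', tree)], voting='soft')

fused.fit(X_train, y_train)                 # X_*: CNN feature vectors, y_*: labels
print(fused.score(X_test, y_test))          # accuracy of the fused decision
```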

Fig. 5. Workflow of the process for breast cancer detection

4 Results

In this experiment, we used the MIAS dataset consisting of 161 pairs of mammograms (322 in total). The CLAHE enhancement technique was used to improve the contrast of the images, and the CNN model was used to extract features. A total of 288 low-level and 1754 high-level features were extracted and used for classification. The standalone SVM achieved 90.3% accuracy, 87.8% sensitivity, and 93% specificity, while the decision tree achieved 92.03% accuracy. After combining both classification techniques with a voting classifier, we achieved 93.4% accuracy.

5 Conclusion

Breast cancer is one of the diseases that most affects women. After reviewing many techniques and methods in this paper, we found that a CAD system appears to be a good solution for real-life use by radiologists. Combined with the radiologist's own expertise, a second, assisting opinion from a CAD system can help improve diagnostic accuracy by enhancing the image and selecting the ROI. Further, feature fusion along with decision fusion can be implemented to improve the results.