Abstract
Breast cancer is one of the leading causes of death among women worldwide. Early detection can reduce the death rate due to breast cancer. The usual symptoms of breast cancer are masses or lumps that feel different from the surrounding tissue. There are two types of masses: benign and malignant. Benign masses are abnormal growths that are not cancerous, whereas malignant masses are cancerous. Various methods have been proposed to detect breast cancer, and a detailed survey of them is presented in this paper. Deep learning techniques, in which features are extracted from the input data using multiple layers, are now widely used for cancer detection. In this paper, transfer learning is used to detect breast cancer with the pretrained models Inception-ResNet-V2, VGG-16, and VGG-19. In transfer learning, a pretrained model is reused to solve a new problem, so that training can be done with a small amount of data. A comparison of these models is also presented. The publicly available BreaKHis dataset is used for the study.
1 Introduction
Breast cancer is one of the deadliest diseases affecting women. It affects the breast cells [1]. Breast cancer affects both men and women, but the rate is higher among women. The main symptoms of breast cancer are masses or lumps that differ from the surrounding tissue. Masses can be either benign or malignant. Benign masses are unusual growths that are not cancerous and not life threatening, and they do not grow outside the breast. Malignant masses are cancerous and life threatening, and they can damage the surrounding tissues.
Detection at an early stage can greatly reduce the mortality rate due to breast cancer. Mammography, breast ultrasound, biopsy, and MRI are the techniques commonly used to diagnose breast cancer. A mammogram is an X-ray image of the breast. Ultrasound uses sound waves to distinguish between tumours and benign cysts. MRI uses radio waves and a magnet to create pictures of the interior of the breast; multiple images are combined to help doctors detect tumours, and MRI is usually performed as a follow-up to mammography and ultrasound [2].
Traditional methods of detecting breast cancer involve manual analysis of breast images by a pathologist, which is time consuming and may not yield accurate results. With technological advancements, various computer-aided diagnosis methods have been proposed for breast cancer detection with increased accuracy, the most prominent of which are machine learning and deep learning techniques. In machine learning, machines learn from data without being explicitly programmed, and the learning can be supervised or unsupervised.
Deep learning is a type of machine learning in which features are extracted using multiple layers. It mimics the working of neurons in the brain, with different layers used to learn from the data. A deep learning model can have any number of layers, and the number of layers determines the depth of the model. Data is given to the input layer, and the output is obtained from the output layer; any number of hidden layers can lie between them. Deep learning requires a large amount of data, which is fed through neural networks [3]. The steps involved in deep learning are shown in Fig. 1.
The rest of the paper is organized as follows. Section 2 reviews deep learning methods for breast cancer detection. Section 3 describes the dataset, and Sect. 4 describes the data augmentation applied to it. Section 5 covers the methods used, and Sect. 6 presents the discussion and conclusion.
2 Related Works
Khan et al. [5] used GoogLeNet, VGGNet, and ResNet to extract different low-level features separately and then combined the extracted features. An accuracy of 97.525% was achieved without training from scratch, which improved the classification efficiency. The deep learning framework used by Khan et al. [5] is shown in Fig. 2.
Wang et al. [6] used a mass detection method to extract the region of interest, from which features were then extracted. Classifiers were trained on the extracted features and labels. The method combined subjective and objective features, that is, the doctor's experience and the mammogram features, respectively, and used an extreme learning machine classifier.
Li et al. [7] used a fully convolutional autoencoder to learn the prominent patterns among normal image patches. Then, the patches that are different from the normal patches are detected and analysed.
Perre et al. [8] used a transfer learning approach. Three pretrained models were used: CNN-F, CNN-M, and Caffe, each pretrained on the ImageNet dataset. Handcrafted features, including intensity features, texture features, and shape features, were also used, resulting in improved classification efficiency.
Ragab et al. [9] used two segmentation approaches, where initially manual ROI is determined, and later, region-based and threshold segmentation were performed. For feature extraction, deep convolutional neural networks were used in which the last layer is replaced with SVM. An accuracy of 79% is achieved for manual segmentation, and for the automated ROI process, an efficiency of 94% is obtained.
A method for detecting invasive ductal carcinoma of the breast was proposed by Brancati et al. [10]. The method identifies the area of the region of interest rather than its exact boundaries. The whole slide images are divided into patches, each labelled as either invasive ductal carcinoma or non-invasive ductal carcinoma. Initially, training was performed in an unsupervised manner to extract features and reconstruct the input images. Training used back propagation with stochastic gradient descent, and a sparsity constraint was imposed on the hidden units to prevent overfitting. Later, supervised classification was performed using a convolutional autoencoder named supervised encoder FusionNet (SEF), in which only the encoding part is trained. This method achieved an accuracy of 97.67% and had a lower standard deviation of accuracy.
Gecer et al. [11] used four fully convolutional neural networks that handle images at different magnifications to remove irrelevant details and localize the region of interest. Another convolutional neural network then classifies the whole slide images into five classes. Finally, the whole slide is labelled pixel-wise, and an overall slide-level classification accuracy of 55% was obtained.
Rakhlin et al. [12] used a deep convolutional feature representation method in which features are extracted without supervision by deep convolutional neural networks trained on ImageNet. The resulting sparse, low-dimensional descriptors are then classified in a supervised manner using gradient boosted trees implemented with LightGBM. The method achieved an accuracy of 93.8 ± 2.3% for two-class classification and 87.2 ± 2.6% for four-class classification.
Vesal et al. [13] used a transfer learning technique with two pretrained convolutional neural networks, namely Inception-V3 and ResNet50. Yap et al. [14] investigated three different methods: U-Net, a patch-based LeNet, and a transfer learning approach with AlexNet. Two datasets were used. LeNet achieved an F-score of 0.91 on both datasets, U-Net achieved F-scores of 0.89 and 0.78, and AlexNet achieved accuracies of 0.92 and 0.88 on the first and second datasets, respectively.
Spanhol et al. [15] extracted features from images and used them as the input to classifiers. The output of a previously trained convolutional neural network is fed into these classifiers, which are trained on problem-specific data. The pretrained BVLC CaffeNet model, trained on the ImageNet dataset, is used. Spanhol et al. [16] extracted non-overlapping grid patches either randomly or using a sliding window. The results for all the patches of an image are combined using the sum, product, and max fusion rules.
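The fusion rules mentioned above can be illustrated with a minimal sketch. It assumes each patch yields a probability per class and that the image-level label is the class with the highest fused score; the function name and the example probabilities are hypothetical and are not taken from [16].

```python
from math import prod


def fuse_patch_probabilities(patch_probs, rule="sum"):
    """Combine per-patch class probabilities into one image-level class index.

    patch_probs: list of rows, one row per patch, one probability per class.
    rule: "sum", "product", or "max".
    """
    combine = {"sum": sum, "product": prod, "max": max}
    if rule not in combine:
        raise ValueError(f"unknown fusion rule: {rule}")
    # Transpose so that each tuple holds one class's probabilities over all patches.
    per_class = list(zip(*patch_probs))
    scores = [combine[rule](column) for column in per_class]
    return scores.index(max(scores))


# Hypothetical probabilities for three patches; columns are (benign, malignant).
patches = [[0.7, 0.3], [0.7, 0.3], [0.2, 0.8]]
for rule in ("sum", "product", "max"):
    print(rule, fuse_patch_probabilities(patches, rule))
```

In this made-up example the sum and product rules select class 0, while the max rule selects class 1 because a single confident patch dominates.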
Sun et al. [17] used both labelled and unlabelled data. The proposed method, shown in Fig. 3, is helpful in cases where labelled data is difficult to obtain, and it achieved an accuracy of 82.43%.
Abdel-Zaher and Eldeib [18] proposed a deep belief path followed by a back propagation path for breast cancer detection. The deep belief path is unsupervised and the back propagation path is supervised. The back propagation neural network is constructed using the Levenberg-Marquardt learning algorithm. An accuracy of 99.68% was obtained.
Ciresan et al. [19] detected breast cancer by detecting the presence of mitosis. The central coordinate of each mitosis is located, and training is performed using that information. A DNN-based pixel classifier is used for detection, and it obtained an F-score of 0.782. The methods used by the different authors are summarized in Table 1.
3 Dataset
The publicly available BreaKHis dataset [20] is used for the experiments. The dataset is composed of 7909 microscopic images, of which 2480 are benign and 5429 are malignant. The images were collected from 82 patients using varying magnification factors. The images are in PNG format with a size of 700 × 460 pixels and have three channels (red, green, and blue), each with a depth of 8 bits. Benign images from the dataset are shown in Fig. 4a, and malignant images are shown in Fig. 4b.
4 Data Augmentation
Images with a magnification factor of 400X are used for the study. The size of the dataset is increased by applying various geometric transformations. The augmentation techniques used are rotation, width shifting, height shifting, shearing, horizontal flipping, and vertical flipping.
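Some of these geometric transformations can be sketched with NumPy as below. This is an illustration only: the paper does not state which library or parameter values were used, the shift amounts are arbitrary, rotation is limited to 180 degrees so that the image shape is preserved, and shearing is omitted for brevity.

```python
import numpy as np


def horizontal_flip(image):
    """Mirror the image left to right."""
    return image[:, ::-1]


def vertical_flip(image):
    """Mirror the image top to bottom."""
    return image[::-1, :]


def shift(image, dx=0, dy=0):
    """Shift by dx columns and dy rows, filling the exposed border with zeros."""
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    src_rows, dst_rows = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_cols, dst_cols = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_rows, dst_cols] = image[src_rows, src_cols]
    return out


def augment(image):
    """Return a list of transformed copies of one image (assumed settings)."""
    return [
        horizontal_flip(image),
        vertical_flip(image),
        np.rot90(image, k=2),      # 180-degree rotation keeps the shape
        shift(image, dx=2, dy=0),  # width shift
        shift(image, dx=0, dy=2),  # height shift
    ]


# A small stand-in array; a BreaKHis image would have shape (460, 700, 3).
sample = np.arange(12).reshape(3, 4)
variants = augment(sample)
```

Each call to `augment` yields five extra images per input here, so the number of variants, not the transformations themselves, is the main design choice to tune.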
5 Method
In this paper, transfer learning is used to detect cancer in histology images. Three pretrained convolutional neural networks, namely VGG-16, VGG-19, and Inception-ResNet-V2, are used. VGG-16 has been used in many classification problems and is easy to implement. VGG-19 is a variant of VGG-16 with 19 layers. Inception-ResNet-V2 uses batch normalization, which also allows a higher learning rate to be used. Transfer learning is a method in which a pretrained network is reused to solve a new problem, so that the knowledge gained by solving one problem is applied to another.
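The idea of reusing a pretrained network can be illustrated with a framework-free toy sketch: a fixed feature extractor is kept frozen and only a new output unit is trained. Everything here is an assumption made for illustration. The random projection `W_frozen` is a hypothetical stand-in for a pretrained network such as VGG-16, the data is synthetic rather than histology images, and the learning rate and epoch count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for pretrained weights; never updated below.
W_frozen = rng.normal(size=(20, 8)) / np.sqrt(20)
W_before = W_frozen.copy()


def extract_features(x):
    """Frozen feature extractor: fixed projection followed by ReLU."""
    return np.maximum(x @ W_frozen, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def binary_cross_entropy(p, y, eps=1e-12):
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def train_head(features, labels, lr=0.1, epochs=200):
    """Gradient descent on the new sigmoid output unit only."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(features @ w + b)
        grad = p - labels  # derivative of the loss with respect to the logit
        w -= lr * features.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b


# Synthetic two-class data, for illustration only.
x = rng.normal(size=(200, 20))
features = extract_features(x)
y = (features[:, 0] > features[:, 1]).astype(float)

loss_before = binary_cross_entropy(sigmoid(features @ np.zeros(8)), y)
w, b = train_head(features, y)
loss_after = binary_cross_entropy(sigmoid(features @ w + b), y)
```

Because only `w` and `b` are updated, far fewer parameters are fitted than when training a full network, which is why a small dataset can suffice.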
5.1 Performance Evaluation
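Accuracy [6] is calculated as

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Sensitivity [6] is calculated as

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

Specificity [6] is calculated as

$$\text{Specificity} = \frac{TN}{TN + FP}$$

where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives.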
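The three measures can be computed from predicted and true labels as in the following sketch. It uses the standard definitions in terms of TP, TN, FP, and FN, assumes malignant is encoded as 1 and benign as 0, and the label lists are made up for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Fraction of actual positives that are detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that are correctly rejected."""
    return tn / (tn + fp)


# Hypothetical labels: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn))  # 0.75
print(sensitivity(tp, fn))       # 0.8
print(specificity(tn, fp))       # 0.666...
```

In a screening setting sensitivity is usually the measure to watch, since a false negative corresponds to a missed malignant case.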
5.2 Convolutional Neural Network
A convolutional neural network is a type of deep neural network. It primarily includes four kinds of layers: convolution, pooling, flattening, and fully connected layers, as shown in Fig. 5.
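The four stages can be traced on a tiny single-channel input with the NumPy sketch below. It is a simplified illustration, not the architecture of Fig. 5: it uses one filter, adds a ReLU after the convolution (an assumption, since the text does not name the activation), and ends in a single sigmoid unit.

```python
import numpy as np


def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation CNN convolution layers compute."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows and columns that do not fit are dropped."""
    h, w = feature_map.shape[0] // size, feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))


def forward(image, kernel, dense_w, dense_b):
    feature_map = np.maximum(conv2d(image, kernel), 0.0)  # convolution + ReLU
    pooled = max_pool(feature_map)                        # pooling
    flat = pooled.reshape(-1)                             # flattening
    logit = flat @ dense_w + dense_b                      # fully connected layer
    return 1.0 / (1.0 + np.exp(-logit))                   # sigmoid output


# A 6x6 input with a 3x3 filter gives a 4x4 map, a 2x2 pooled map, and 4 flat values.
image = np.ones((6, 6))
kernel = np.ones((3, 3))
probability = forward(image, kernel, dense_w=np.zeros(4), dense_b=0.0)
```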
5.3 VGG-16
VGG-16 is a 16-layer deep convolutional neural network trained on the ImageNet dataset. Training is done for 30 epochs with the RMSprop optimizer. The loss function is binary cross entropy, and the sigmoid activation function is used in the fully connected layer. Figure 6 shows the model parameters. The accuracy and loss of the model are shown in Fig. 7.
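For reference, the sigmoid output and the binary cross-entropy loss have the standard forms below. The notation is introduced here and does not appear in the original: $z_i$ is the input to the output unit for image $i$, $\hat{y}_i$ is the predicted probability of the malignant class, $y_i \in \{0, 1\}$ is the true label, and $N$ is the number of images in a batch.

$$\hat{y}_i = \sigma(z_i) = \frac{1}{1 + e^{-z_i}}, \qquad \mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{y}_i + (1 - y_i) \log\left(1 - \hat{y}_i\right) \right]$$

The loss is small when confident predictions match the labels and grows without bound as a confident prediction becomes wrong.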
5.4 VGG-19
VGG-19 is a 19-layer deep convolutional neural network trained on the ImageNet dataset. Training is done for 30 epochs with the RMSprop optimizer. The loss function is binary cross entropy, and the sigmoid activation function is used in the fully connected layer. Figure 8 shows the model parameters. The accuracy and loss of the model are shown in Fig. 9.
5.5 Inception-Resnet-V2
Inception-ResNet-V2 is a 164-layer deep convolutional neural network trained on the ImageNet dataset, which contains millions of images. The model was trained for 15 epochs with the Adam optimizer. Figure 10 shows the model parameters. The accuracy and loss of the model are shown in Fig. 11.
6 Discussions and Conclusion
This paper analysed various methods for detecting breast cancer in histology images using deep learning techniques. Transfer learning is used in this work, with the networks initialized from pretrained models. Three convolutional neural networks are used to analyse the histology images. The highest accuracy, 82.83%, is obtained with VGG-16, while VGG-19 and Inception-ResNet-V2 achieve accuracies of 73.04% and 78.57%, respectively. These models can therefore be used to classify benign and malignant images. Work is ongoing to further improve the accuracy by changing the network architecture and fine-tuning the hyperparameters.
References
DeSantis CE, Ma J, Gaudet MM, Newman LA, Miller KD, Goding Sauer A, Jemal A, Siegel RL (2019) Breast cancer statistics, 2019. CA A Cancer J Clin 69:438–451. https://doi.org/10.3322/caac.21583
Sharon JJ, Anbarasi LJ (2018) Diagnosis of DCM and HCM heart diseases using neural network function. Int J Appl Eng Res 13(10):8664–8668
Anbarasi LJ, Anandha Mala GS, Narendra M (2015) DNA based multi-secret image sharing. Procedia Comput Sci 46:1794–1801
Jimenez-del-Toro O, Otálora S, Andersson M, Eurén K, Hedlund M, Rousson M, Müller H, Atzori M (2017) Chapter 10 Analysis of histopathology images from traditional machine learning to deep learning, biomedical texture analysis. Elsevier
Khan S, Islam N, Jan Z, Ud Din I, Rodrigues JJPC (2019) A novel deep learning based framework for the detection and classification of breast cancer using transfer learning. Pattern Recogn Lett 125:1–6. ISSN: 0167-8655
Wang Z et al (2019) Breast cancer detection using extreme learning machine based on feature fusion with CNN deep features. IEEE Access 7:105146–105158
Li X et al (2019) Discriminative pattern mining for breast cancer histopathology image classification via fully convolutional autoencoder. IEEE Access 7:36433–36445
Perre AC, Alexandre LA, Freire LC (2018) Lesion classification in mammograms using convolutional neural networks and transfer learning. Comput Methods Biomech Biomed Eng: Imaging Vis. https://doi.org/10.1080/21681163.2018.1498392
Ragab DA, Sharkas M, Marshall S, Ren J (2019) Breast cancer detection using deep convolutional neural networks and support vector machines. PeerJ 7:e6201. https://doi.org/10.7717/peerj.6201
Brancati N, De Pietro G, Frucci M, Riccio D (2019) A deep learning approach for breast invasive ductal carcinoma detection and lymphoma multi-classification in histological images. IEEE Access 7:44709–44720. https://doi.org/10.1109/ACCESS.2019.2908724
Gecer B, Aksoy S, Mercan E, Shapiro LG, Weaver DL, Elmore JG (2018) Detection and classification of cancer in whole slide breast histopathology images using deep convolutional networks. Pattern Recogn 84:345–356. ISSN: 0031-3203
Rakhlin A et al (2018) Deep convolutional neural networks for breast cancer histology image analysis. Image Anal Recogn 737–744
Vesal S et al (2018) Classification of breast cancer histology images using transfer learning. In: International conference image analysis and recognition. Springer, Cham
Yap MH et al (2018) Automated breast ultrasound lesions detection using convolutional neural networks. IEEE J Biomed Health Inform 22(4):1218–1226. https://doi.org/10.1109/JBHI.2017.2731873
Spanhol FA, Oliveira LS, Cavalin PR, Petitjean C, Heutte L (2017) Deep features for breast cancer histopathological image classification. In: 2017 IEEE International conference on systems, man, and cybernetics (SMC). Banff, AB, pp 1868–1873. https://doi.org/10.1109/SMC.2017.8122889
Spanhol FA, Oliveira LS, Petitjean C, Heutte L (2016) Breast cancer histopathological image classification using convolutional neural networks. In: 2016 International joint conference on neural networks (IJCNN). Vancouver, BC, pp 2560–2567
Sun W, (Bill) Tseng T-L, Zhang J, Qian W (2017) Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data. Comput Med Imaging Graph 57:4–9. ISSN: 0895-6111
Abdel-Zaher AM, Eldeib AM (2016) Breast cancer classification using deep belief networks. Expert Syst Appl 46:139–144. ISSN: 0957-4174
Cireşan D, Giusti A, Gambardella L, Schmidhuber J (2013) Mitosis detection in breast cancer histology images with deep neural networks. Medical image computing and computer-assisted intervention: MICCAI. In: International conference on medical image computing and computer-assisted intervention, vol 16, pp 411–418. https://doi.org/10.1007/978-3-642-40763-5_51
Spanhol F, Oliveira LS, Petitjean C, Heutte L (2016) A dataset for breast cancer histopathological image classification. IEEE Trans Biomed Eng (TBME) 63(7):1455–1462
© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sara Koshy, S., Jani Anbarasi, L. (2021). Breast Cancer Detection in Histology Images Using Convolutional Neural Network. In: Kannan, R.J., Geetha, S., Sashikumar, S., Diver, C. (eds) International Virtual Conference on Industry 4.0. Lecture Notes in Electrical Engineering, vol 355. Springer, Singapore. https://doi.org/10.1007/978-981-16-1244-2_7
DOI: https://doi.org/10.1007/978-981-16-1244-2_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-16-1243-5
Online ISBN: 978-981-16-1244-2
eBook Packages: Engineering, Engineering (R0)