
1 Introduction

Breast cancer is one of the deadliest diseases affecting women. It originates in the cells of the breast [1]. Breast cancer affects both men and women, but its incidence is far higher among women. The main symptoms of breast cancer are masses or lumps that differ from the surrounding tissue. Masses can be either benign or malignant. Benign masses are abnormal growths that are not cancerous and not life threatening, and they do not spread outside the breast. Malignant masses are cancerous and life threatening, and they can invade and damage the surrounding tissue.

Detection at an earlier stage can greatly reduce the mortality rate due to breast cancer. Mammography, breast ultrasound, biopsy, and MRI are the techniques commonly used to diagnose breast cancer. A mammogram is an X-ray image of the breast. Ultrasound uses sound waves to distinguish between tumour cells and benign cysts. MRI creates pictures of the interior of the breast using radio waves and magnets; different images of the breast are combined to help doctors detect tumours, and MRI is usually performed as a follow-up to mammography and ultrasound [2].

Traditional methods of detecting breast cancer involve a pathologist manually analysing breast images, which is a time-consuming process whose results are often not very accurate. With technological advancements, various computer-aided diagnosis methods have been proposed for breast cancer detection with increased accuracy, the most prominent being machine learning and deep learning techniques. In machine learning, machines learn from data without being programmed explicitly; the learning can be supervised or unsupervised.

Deep learning is a type of machine learning in which features are extracted using multiple layers. It mimics the working of neurons in the brain, where different layers are used to learn from the data. A deep learning model can have any number of layers, and the number of layers determines the depth of the model. Data is given to the input layer, and output is obtained from the output layer; between them, any number of hidden layers can be present. Deep learning requires a large amount of data, which is fed through neural networks [3]. The steps involved in deep learning are shown in Fig. 1.
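
To make the input/hidden/output layer structure concrete, the following minimal Keras sketch stacks an input layer, two hidden layers, and an output layer. It is purely illustrative (the layer sizes are arbitrary assumptions, and this is not one of the networks used later in this paper):

    # A minimal illustration of input, hidden, and output layers in Keras.
    # Layer sizes here are arbitrary and purely illustrative.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(30,)),               # input layer: 30 features per sample
        layers.Dense(16, activation="relu"),    # first hidden layer
        layers.Dense(8, activation="relu"),     # second hidden layer
        layers.Dense(1, activation="sigmoid"),  # output layer: benign/malignant score
    ])
    model.summary()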

Fig. 1 Process flow of deep learning [4]

The paper is organized as follows: various deep learning methods for breast cancer detection are reviewed in Sect. 2. The dataset used is described in Sect. 3, and the data augmentation applied is described in Sect. 4. Section 5 covers the methods used, and Sect. 6 presents the discussion and conclusion.

2 Related Works

Khan et al. [5] used GoogLeNet, VGGNet, and ResNet to extract different low-level features separately and later combined the extracted features. An accuracy of 97.525% was achieved without training from scratch, thus improving classification efficiency. The deep learning framework used by Khan et al. [5] is shown in Fig. 2.

Fig. 2 Deep learning framework used by Khan et al. [5]

Wang et al. [6] used a mass detection method to extract regions of interest, from which features were extracted. Classifiers were trained with the extracted features and labels. The method combined objective features (the mammogram features) with subjective features (the doctor's experience). An extreme learning machine classifier was used.

Li et al. [7] used a fully convolutional autoencoder to learn the prominent patterns among normal image patches. Patches that differ from the normal patches are then detected and analysed.

Perre et al. [8] used a transfer learning approach. Three pretrained models (CNN-F, CNN-M, and Caffe) were used, pretrained on the ImageNet dataset. Handcrafted features, including intensity, texture, and shape features, were also used, resulting in improved classification efficiency.

Ragab et al. [9] used two segmentation approaches: a manually determined ROI, followed by region-based and threshold segmentation. For feature extraction, deep convolutional neural networks were used, with the last layer replaced by an SVM. An accuracy of 79% was achieved for manual segmentation, and an accuracy of 94% for the automated ROI process.

A method for detecting invasive ductal carcinoma of the breast is proposed by Brancati et al. [10]. The area of the region of interest is identified rather than its exact boundaries. The whole slide images are divided into patches, each marked with either an invasive ductal carcinoma label or a non-invasive ductal carcinoma label. Training was initially performed in an unsupervised manner to extract features and reconstruct the input images. A stochastic gradient descent algorithm was used to implement backpropagation, and a sparsity constraint was imposed on the hidden units to prevent overfitting. Supervised classification was then performed using a convolutional autoencoder named supervised encoder FusionNet (SEF), where training is done only on the encoding part. This method achieved an accuracy of 97.67% with a low standard deviation of accuracy.
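
For readers unfamiliar with convolutional autoencoders, the sketch below shows the generic encoder-decoder structure on which such methods build. It is a simplified illustration, not the SEF architecture of Brancati et al.; the patch size, filter counts, and SGD settings are assumptions:

    # Generic convolutional autoencoder: the encoder compresses a patch,
    # the decoder reconstructs it; training minimizes reconstruction error.
    from tensorflow import keras
    from tensorflow.keras import layers

    inp = keras.Input(shape=(64, 64, 3))  # assumed patch size
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
    encoded = layers.MaxPooling2D(2)(x)   # compressed feature maps
    x = layers.Conv2DTranspose(16, 3, strides=2, activation="relu", padding="same")(encoded)
    x = layers.Conv2DTranspose(32, 3, strides=2, activation="relu", padding="same")(x)
    out = layers.Conv2D(3, 3, activation="sigmoid", padding="same")(x)

    autoencoder = keras.Model(inp, out)
    autoencoder.compile(optimizer="sgd", loss="mse")  # SGD, per the description above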

Gecer et al. [11] used four fully convolutional neural networks that handle images at different magnifications to remove irrelevant details and localize the region of interest. Whole slide images are classified into five classes using another convolutional neural network. Labelling of the whole slide is then performed pixel-wise, and an overall slide-level classification accuracy of 55% was obtained.

Rakhlin et al. [12] used a deep convolutional feature representation method in which unsupervised feature extraction is performed with deep convolutional neural networks trained on ImageNet. Sparse, low-dimensional descriptors are obtained, followed by supervised classification using LightGBM, an implementation of gradient boosted trees. Both two-class and four-class classification were performed, achieving accuracies of 93.8 ± 2.3% and 87.2 ± 2.6%, respectively.
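
The general shape of such a pipeline, an ImageNet-pretrained CNN used as a fixed feature extractor followed by a gradient-boosted-tree classifier, can be sketched as follows. This is a generic illustration, not Rakhlin et al.'s exact code; the choice of ResNet50 as the extractor and the placeholder data are assumptions:

    # Deep descriptors from a pretrained CNN, classified with LightGBM.
    import numpy as np
    from tensorflow.keras.applications import ResNet50
    from lightgbm import LGBMClassifier

    # Fixed extractor: global-average-pooled ResNet50 activations.
    extractor = ResNet50(weights="imagenet", include_top=False, pooling="avg")

    # Placeholder arrays standing in for preprocessed image patches and labels
    # (real images should first go through resnet50.preprocess_input).
    images = np.random.rand(8, 224, 224, 3).astype("float32")
    labels = np.array([0, 1, 0, 1, 0, 1, 0, 1])  # 0 = benign, 1 = malignant

    features = extractor.predict(images)     # (8, 2048) descriptors
    clf = LGBMClassifier(n_estimators=200)   # gradient boosted trees
    clf.fit(features, labels)
    print(clf.predict(features[:2]))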

Vesal et al. [13] used a transfer learning technique in which two pretrained convolutional neural networks, namely Inception-V3 and ResNet50, are used. Yap et al. [14] investigated three different methods: U-Net, a patch-based LeNet, and a transfer learning approach with AlexNet. Two datasets were used: LeNet achieved an F-score of 0.91 on both datasets, U-Net achieved F-scores of 0.89 and 0.78, and AlexNet achieved accuracies of 0.92 and 0.88 on the first and second datasets, respectively.

Spanhol et al. [15] extracted features from images and used them as input to classifiers. The output of a previously trained convolutional neural network is fed into classifiers trained on problem-specific data; the pretrained BVLC CaffeNet model, trained on the ImageNet dataset, was used. Spanhol et al. [16] extracted non-overlapping grid patches either randomly or using a sliding window. The results of all patches of an image are combined using fusion rules: sum, product, and max.

Sun et al. [17] used both labelled and unlabelled data. The proposed method, shown in Fig. 3, is helpful in cases where labelled data are difficult to obtain; an accuracy of 82.43% was obtained.

Fig. 3 Method used by Sun et al. [17]

Abdel-Zaher [18] proposed a deep belief path followed by a back propagation path for breast cancer detection. The deep belief path is unsupervised and the back propagation path is supervised; the back propagation neural network is constructed using the Levenberg-Marquardt learning algorithm. An accuracy of 99.68% was obtained.

Ciresan et al. [19] detected breast cancer by detecting the presence of mitosis. The central coordinates of individual mitoses are found, and training is performed using that information. A DNN-based pixel classifier is used for detection, obtaining an F-score of 0.782. The methods used by different authors are summarized in Table 1.

Table 1 Methods used by various researchers

3 Dataset

The publicly available dataset named BreakHis [20] is used for the experiments. The BreakHis dataset is composed of 7909 microscopic images, of which 2480 are benign and 5429 are malignant. Images were collected from 82 patients using varying magnification factors. Images are in PNG format with a size of 700 × 460 pixels and consist of three channels (red, green, and blue), each with 8-bit depth. Benign images from the dataset are shown in Fig. 4a, and malignant images in Fig. 4b.

Fig. 4 a Benign images, b malignant images

4 Data Augmentation

Images with a magnification factor of 400X are used for the study. The size of the dataset is increased by applying various geometric transformations. The augmentation techniques used are rotation, width shifting, height shifting, shearing, and horizontal and vertical flipping, as sketched below.
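
A minimal sketch of these transformations with the Keras ImageDataGenerator follows. The numeric ranges and the directory layout (benign/ and malignant/ subfolders of 400X images) are assumptions, since the paper does not report the exact settings:

    # Geometric augmentations matching those listed above; the numeric
    # ranges are illustrative assumptions, not the paper's exact values.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(
        rotation_range=40,       # rotation
        width_shift_range=0.2,   # width shifting
        height_shift_range=0.2,  # height shifting
        shear_range=0.2,         # shearing
        horizontal_flip=True,    # horizontal flipping
        vertical_flip=True,      # vertical flipping
        rescale=1.0 / 255,
    )

    # Stream augmented 400X images from an assumed folder layout with
    # benign/ and malignant/ subdirectories.
    train_gen = datagen.flow_from_directory(
        "breakhis_400x", target_size=(224, 224),
        batch_size=32, class_mode="binary",
    )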

5 Method

In this paper, the technique of transfer learning has been used to detect cancer in histology images. Transfer learning is a method in which pretrained networks are used to solve new problems: the knowledge gained by solving one problem is applied to another. Three pretrained convolutional neural networks, namely VGG-16, VGG-19, and Inception-ResNet-V2, have been used. VGG-16 has been used in many classification problems and is easy to implement. VGG-19 is a variant of VGG-16 with 19 layers. Inception-ResNet-V2 combines the Inception architecture with residual connections and makes extensive use of batch normalization, which allows a higher learning rate to be used.
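
The general transfer-learning recipe, an ImageNet-pretrained backbone with a new binary classification head, can be sketched as follows. The frozen-base strategy and the head size are assumptions, since the paper does not detail its fine-tuning scheme:

    # Transfer learning: reuse an ImageNet-pretrained VGG-16 as a feature
    # extractor and train only a new binary classification head.
    from tensorflow import keras
    from tensorflow.keras import layers
    from tensorflow.keras.applications import VGG16

    base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # keep the pretrained convolutional filters fixed

    model = keras.Sequential([
        base,
        layers.Flatten(),
        layers.Dense(256, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # benign vs. malignant
    ])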

5.1 Performance Evaluation

Accuracy [6] is calculated as,

$${\text{Accuracy}} = \frac{{{\text{TP}} + {\text{TN}}}}{{{\text{TP}} + {\text{TN}} + {\text{FP}} + {\text{FN}}}}$$
(1)

Sensitivity [6] is calculated as,

$${\text{Sensitivity}} = \frac{{{\text{TP}}}}{{{\text{TP}} + {\text{FN}}}}$$
(2)

Specificity [6] is calculated as,

$${\text{Specificity}} = \frac{{{\text{TN}}}}{{{\text{TN}} + {\text{FP}}}}$$
(3)

where TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.
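
Given the confusion-matrix counts, these metrics reduce to a few lines. The sketch below is a plain implementation of Eqs. (1) to (3), with made-up counts in the example call:

    # Accuracy, sensitivity, and specificity from confusion-matrix counts,
    # implementing Eqs. (1) to (3) directly.
    def evaluate(tp, tn, fp, fn):
        accuracy = (tp + tn) / (tp + tn + fp + fn)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        return accuracy, sensitivity, specificity

    # Example with illustrative counts:
    print(evaluate(tp=50, tn=40, fp=5, fn=5))  # (0.9, 0.909..., 0.888...)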

5.2 Convolutional Neural Network

A convolutional neural network is a type of deep neural network. It primarily comprises four types of layers: convolution, pooling, flattening, and fully connected layers, as shown in Fig. 5. A minimal example follows the figure.

Fig. 5 Convolutional neural network
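
The sketch below shows this four-stage structure in Keras; the filter count and input size are illustrative assumptions:

    # The four building blocks of a CNN: convolution, pooling,
    # flattening, and a fully connected classifier.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(224, 224, 3)),
        layers.Conv2D(32, 3, activation="relu"),  # convolution
        layers.MaxPooling2D(2),                   # pooling
        layers.Flatten(),                         # flattening
        layers.Dense(1, activation="sigmoid"),    # fully connected layer
    ])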

5.3 VGG-16

VGG-16 is a 16-layer deep neural network trained on the ImageNet dataset. The training is done for 30 epochs, and the optimizer used is RMSprop. The loss function used is binary cross entropy, and a sigmoid activation function is used in the final fully connected layer. Figure 6 shows the model parameters, and the accuracy and loss of the model are shown in Fig. 7.
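
A minimal training sketch in the spirit of this configuration is shown below. It continues the transfer-learning and augmentation sketches above; val_gen is a placeholder for a validation generator built the same way as train_gen:

    # Training setup described above: RMSprop optimizer, binary
    # cross-entropy loss, sigmoid output, 30 epochs.
    model.compile(
        optimizer="rmsprop",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    history = model.fit(
        train_gen,                # augmented training images (see Sect. 4)
        validation_data=val_gen,  # placeholder validation generator
        epochs=30,
    )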

Fig. 6 Model parameters of VGG-16

Fig. 7 a Training and validation accuracy of VGG-16, b training and validation loss of VGG-16

5.4 VGG-19

VGG-19 is a 19-layer deep convolutional neural network trained on the ImageNet dataset. As with VGG-16, the training is done for 30 epochs with the RMSprop optimizer, binary cross entropy loss, and a sigmoid activation function in the final fully connected layer. Figure 8 shows the model parameters, and the accuracy and loss of the model are shown in Fig. 9.

Fig. 8 Model parameters of VGG-19

Fig. 9 a Training and validation accuracy of VGG-19, b training and validation loss of VGG-19

5.5 Inception-Resnet-V2

Inception-ResNet-V2 is a 164-layer deep convolutional neural network trained on the ImageNet dataset, which contains millions of images. The model was trained for 15 epochs, and the optimizer used is Adam. Figure 10 shows the model parameters, and the accuracy and loss of the model are shown in Fig. 11.
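
Swapping in this backbone and its training configuration (Adam, 15 epochs) requires only small changes to the earlier sketch; the pooling head and the 299 × 299 input size (the default for this network) are again assumptions:

    # Same transfer-learning recipe with an Inception-ResNet-V2 backbone
    # and the Adam optimizer for 15 epochs.
    from tensorflow import keras
    from tensorflow.keras import layers
    from tensorflow.keras.applications import InceptionResNetV2

    base = InceptionResNetV2(weights="imagenet", include_top=False,
                             input_shape=(299, 299, 3))
    base.trainable = False

    model = keras.Sequential([
        base,
        layers.GlobalAveragePooling2D(),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    # model.fit(train_gen, validation_data=val_gen, epochs=15)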

Fig. 10 Model parameters of Inception-ResNet-V2

Fig. 11 a Training and validation accuracy of Inception-ResNet-V2, b training and validation loss of Inception-ResNet-V2

6 Discussions and Conclusion

This paper analysed various methods for detecting breast cancer in histology images using deep learning techniques. A transfer learning technique is used in the present work, where the network is trained using pretrained models. Three convolutional neural networks are used to analyse the histology images. The highest accuracy, 82.83%, is obtained for VGG-16; VGG-19 and Inception-ResNet-V2 obtain accuracies of 73.04% and 78.57%, respectively. These models can thus be used to classify benign and malignant images. Work is ongoing to further improve the accuracy by changing the network architecture and fine-tuning the hyperparameters.