Abstract
Convolutional Neural Networks (ConvNets) are increasingly used in medical image diagnosis. In this paper, we compare two transfer learning approaches, deep feature classification and fine-tuning of ConvNets, for diagnosing breast cancer malignancy. We benchmark both approaches on the BreaKHis dataset using pre-trained ResNet-50, InceptionV2 and DenseNet-169 models. Deep feature classification accuracy ranges from 81% to 95% using Logistic Regression, LightGBM and Random Forest classifiers. The fine-tuned DenseNet-169 model outperformed all other classification models, with an accuracy of 99.25 ± 0.4%.
Keywords
- Breast cancer
- Medical image diagnosis
- Histopathology
- Convolutional Neural Network
- Transfer learning
- Deep feature classification
- Fine-tuning
1 Introduction
In 2015, out of 2.4 million cases of breast cancer in the US, 523,000 deaths were reported. In the US, approximately 260,000 new cases of invasive breast cancer are estimated to be diagnosed in 2018 [1, 2], with about 40,920 deaths among women. Worldwide, breast cancer has the highest mortality rate of all cancers affecting women.
Early screening and diagnosis can improve treatment and survival rates [47]. Initial screening is generally done by breast palpation and regular check-ups using mammography or ultrasound imaging, followed by detailed diagnosis with breast tissue biopsy, histopathology analysis and clinical screening. Hematoxylin and eosin (H&E) stained biopsy tissues are analyzed under the microscope for parameters such as nuclear atypia, tubules, and mitotic counts. Visual identification using H&E stained biopsies is non-trivial, tedious and can be highly subjective, with an average diagnostic concordance between pathologists of approximately 75% [3]. Whole slide imaging (WSI) scanners are increasingly used to digitize histopathology slides, enabling automated image processing and machine learning methods for image enhancement, normalization, localization of the tissue, segmentation, quantitative analysis, detection, and diagnosis.
Convolutional neural networks [4,5,6,7,8, 49] are the de facto choice for researchers in this field and have outperformed conventional machine learning algorithms in many other medical imaging applications [9,10,11,12], including diabetic retinopathy, bone disease detection [44], bone fracture detection [45, 46] and pneumonia detection. However, deep networks require large amounts of training data to generalize, and publicly available annotated breast cancer datasets are small, so special methods are needed to make them viable. Data augmentation techniques such as flipping, rotation and patching, together with transfer learning approaches, are promising. Conventional machine learning with handcrafted features [13,14,15,16,17] for medical imaging diagnosis does not generalize in the real world because variations in tissue preparation, staining and slide digitization have a significant impact on tissue and image appearance. Pre-trained deep networks [18] have been used as feature extractors in many real-world applications, including diabetic retinopathy [19], handwritten digit recognition [20, 21], image retrieval [22, 23], remote sensing [24, 42] and mammography breast cancer image classification [25, 26].
In the ICIAR 2018 Grand Challenge [27], 400 microscopy and whole-slide images from the BreAst Cancer Histology (BACH) extended dataset were classified into normal, benign, in-situ carcinoma and invasive carcinoma. Rakhlin et al. [28] report deep feature classification with multiple pre-trained deep networks, with a best accuracy of 93.8% on this dataset. They also report that deep feature classification outperforms the fine-tuning approach on the ICIAR 2018 Grand Challenge dataset.
Habibzadeh et al. [29] use fine-tuning on pre-trained Inception (V1, V2, V3, and V4) and ResNet (V1 50, V1 101, and V1 152) networks to classify H&E stained microscopy images from the BreaKHis dataset as benign or malignant. Their best reported result for benign/malignant classification is 98.4% accuracy, obtained with ResNet V1 101 by fine-tuning all layers. Although many studies are available on transfer learning and fine-tuning ConvNets [30], to the best of our knowledge there is no literature evaluating or comparing these two approaches, pre-trained deep feature classification and fine-tuning ConvNets, on the same breast cancer dataset. In this paper, we evaluate these two approaches using the BreaKHis dataset [31].
2 Dataset
We use the Breast Cancer Histopathological Database (BreaKHis) [31], which contains 7,909 microscopic images of breast biopsy tissue collected from 82 patients at multiple magnification factors (40x, 100x, 200x, and 400x). The dataset has two classes, with 2,480 benign and 5,429 malignant images. Each image is 700 × 460 pixels (width × height), 3-channel RGB with 8-bit depth per channel, in PNG format. The dataset was provided to us by Spanhol et al. [31] from the P&D Laboratory, Parana, Brazil (Table 1).
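The class counts above imply a noticeable imbalance, which matters when reading accuracy figures. The following minimal sketch only redoes that arithmetic from the numbers stated in the text; the variable names are illustrative.

```python
# Class counts as stated in the dataset description.
benign = 2480
malignant = 5429
total = benign + malignant

# Share of each class. A classifier that always answers "malignant"
# would reach the malignant share as its accuracy, so reported
# accuracies are best read against that baseline.
malignant_fraction = malignant / total
benign_fraction = benign / total

print(f"total images: {total}")
print(f"malignant: {malignant_fraction:.1%}, benign: {benign_fraction:.1%}")
```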
3 Data Augmentation and Pre-processing
Data augmentation is an important step for creating a diverse, enlarged training set from a small dataset when training deep networks. The training images are augmented by flipping them along their horizontal and vertical axes and by rotating them by 90°, 180° and 270°. In the pre-processing step, a mean image is calculated by averaging the images, and this mean image is subtracted from all training and test images for brightness normalization. After mean subtraction, all images are resized to 224 × 224 × 3, the recommended input size for the InceptionV2, ResNet-50, and DenseNet-169 architectures.
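The augmentation and mean-subtraction steps described above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the mean image is computed from the training images only (the text does not say which images are averaged), random arrays stand in for real slides, and the final resize to 224 × 224 is omitted.

```python
import numpy as np

def augment(image):
    """Return the original image plus its two flips and three rotations."""
    return [
        image,
        np.fliplr(image),      # mirror left-right
        np.flipud(image),      # mirror up-down
        np.rot90(image, k=1),  # rotate by 90 degrees
        np.rot90(image, k=2),  # rotate by 180 degrees
        np.rot90(image, k=3),  # rotate by 270 degrees
    ]

def subtract_mean(train_images, test_images):
    """Subtract the per-pixel mean image from both sets.

    Assumption: the mean is taken over the training images only.
    """
    mean_image = train_images.mean(axis=0)
    return train_images - mean_image, test_images - mean_image

# Small random stand-ins for (height, width, channel) RGB images.
rng = np.random.default_rng(0)
train = rng.integers(0, 256, size=(4, 46, 70, 3)).astype(np.float32)
test = rng.integers(0, 256, size=(2, 46, 70, 3)).astype(np.float32)

variants = augment(train[0])
train_centered, test_centered = subtract_mean(train, test)
print(len(variants), variants[3].shape, train_centered.shape)
```

Note that the 90° and 270° rotations swap height and width for non-square images; the subsequent resize to a square input makes all variants the same shape again.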
4 Methods
4.1 Deep Feature Extraction and Classification
We use deep networks pre-trained on ImageNet [32], an object recognition dataset with 1000 object classes and 1.2 million training images. These pre-trained ConvNet models serve as generalized feature extractors, since their early layers extract discriminative features such as edges and textures. We remove the last fully connected output layer from the pre-trained network and extract feature vectors, called deep features, from the truncated network. A similar approach was used in [48]. These deep features are then used as input to standard classifiers such as Random Forest and Logistic Regression. This procedure is known as deep feature extraction and classification.
We use the standard pre-trained DenseNet-169 [33], ResNet-50 [34] and InceptionV2 [35] networks from the Keras distribution [36], trained on ImageNet. These networks are used as fixed deep feature extractors for the breast cancer dataset by removing the last fully connected layer and the softmax classifier layer; the output of the truncated network gives the bottleneck features.
The extracted deep feature vectors (CNN codes), of size 1 × 38400 for InceptionV2, 1 × 2048 for ResNet-50 and 1 × 94080 for DenseNet-169, are then classified by traditional machine learning classifiers. We split the dataset into 70% for training and 30% for testing. We build three different machine learning models to classify the deep features, using Logistic Regression [37], LightGBM [38] and Random Forest [39]. The models were trained on an NVIDIA Quadro K630 GPU [43] (Fig. 1 and Table 2).
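The classification stage can be illustrated with a small self-contained sketch: a 70/30 split followed by a logistic regression fit on fixed feature vectors. Everything here is an assumption for illustration. The features are synthetic Gaussian vectors standing in for real CNN codes, the logistic regression is a plain NumPy gradient-descent version rather than the library classifiers the paper refers to, and all function names are hypothetical.

```python
import numpy as np

def split_train_test(features, labels, test_fraction=0.3, seed=0):
    """Shuffle once, then hold out `test_fraction` of the samples for testing."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(round(test_fraction * len(features)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return features[train_idx], features[test_idx], labels[train_idx], labels[test_idx]

def fit_logistic_regression(x, y, learning_rate=0.1, steps=500):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    weights = np.zeros(x.shape[1])
    bias = 0.0
    for _ in range(steps):
        prob = 1.0 / (1.0 + np.exp(-(x @ weights + bias)))
        error = prob - y  # gradient of the log loss with respect to the logit
        weights -= learning_rate * (x.T @ error) / len(y)
        bias -= learning_rate * error.mean()
    return weights, bias

def predict(x, weights, bias):
    """Label 1 (malignant) when the logit is positive, else 0 (benign)."""
    return (x @ weights + bias > 0).astype(int)

# Synthetic stand-in for extracted deep features: 600 vectors of length 32,
# with the first 4 dimensions shifted for class 1. These are NOT real CNN codes.
rng = np.random.default_rng(42)
labels = rng.integers(0, 2, size=600)
features = rng.normal(size=(600, 32))
features[:, :4] += 3.0 * labels[:, None]

x_train, x_test, y_train, y_test = split_train_test(features, labels)
weights, bias = fit_logistic_regression(x_train, y_train)
test_accuracy = float((predict(x_test, weights, bias) == y_test).mean())
print(f"train: {len(x_train)}, test: {len(x_test)}, test accuracy: {test_accuracy:.3f}")
```

In practice the `features` array would hold one row per image, produced by the truncated pre-trained network, and the hand-written fit would be replaced by a library classifier.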
4.2 DenseNet-169 Fine Tuning
Fine-tuning is another promising transfer learning technique for medical image classification. Habibzadeh et al. [29] report a fine-tuned ResNet classification accuracy of 98.7%, and Spanhol et al. report 90.0% accuracy using AlexNet fine-tuning. Continuing this line of work, we select DenseNet-169 [33] for fine-tuning, since DenseNets are easier and faster to train with no loss of accuracy, owing to improved gradient flow compared to other networks [40, 41]. We take DenseNet-169 pre-trained on ImageNet, freeze the early layers because they capture universal features, remove the last softmax layer and replace it with a sigmoid output layer (binary classification). We fine-tune the last layer with a small learning rate on the cancer images, as shown in Fig. 2. The dataset is divided into three parts: training (60%), validation (20%) and testing (20%). In the training phase, data augmentation is applied to increase the number of training images. We use the Stochastic Gradient Descent (SGD) optimizer with learning rate = 0.0005, decay = \( 10^{-6} \) and momentum = 0.9. Training uses batches of 16 images randomly sampled from the training set, and the network is trained for 12 epochs. The models are trained on an NVIDIA Quadro K630 GPU [43].
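To make the optimizer settings concrete, the sketch below applies one common form of SGD with momentum and time-based learning-rate decay, using the hyperparameters quoted above. The exact update rule is an assumption: it follows the form used by the classic Keras SGD optimizer, which the text does not explicitly confirm, and the function name and toy objective are hypothetical.

```python
def sgd_momentum_step(param, grad, velocity, iteration,
                      base_lr=0.0005, decay=1e-6, momentum=0.9):
    """Apply one update; `iteration` counts the updates already applied."""
    lr = base_lr / (1.0 + decay * iteration)    # time-based learning-rate decay
    velocity = momentum * velocity - lr * grad  # running update direction
    return param + velocity, velocity, lr

# Toy objective f(w) = w**2 with gradient 2*w, used only to exercise the rule.
w, v = 1.0, 0.0
for t in range(1000):
    w, v, lr = sgd_momentum_step(w, 2.0 * w, v, t)

print(f"w after 1000 updates: {w:.2e}, final learning rate: {lr:.8f}")
```

With a decay this small, the learning rate changes by well under one percent over a thousand updates, so the momentum term dominates the training dynamics.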
5 Results
We report standard classification metrics: classification accuracy, F1 score, sensitivity (SN) and specificity (SP). Sensitivity, also called the true positive rate, measures the proportion of actual positives (malignant) that are correctly identified as such, and represents the model's ability not to overlook actual positives (Tables 3 and 4).
Specificity, also called the true negative rate, measures the proportion of actual negatives (benign) that are correctly identified as such, and represents the model's ability not to overlook actual negatives. ResNet-50 with the Logistic Regression classifier consistently outperforms the other deep feature classification models across all magnification factors. Deep feature classification performs worse at higher magnification factors. Fine-tuned DenseNet-169 with last-layer tuning achieved the best accuracy among all models, 99.25 ± 0.4% (Figs. 3, 4 and Tables 5, 6).
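The metrics defined above follow directly from the four confusion-matrix counts, with malignant as the positive class. The sketch below uses hypothetical counts chosen for easy arithmetic; they are not results from this paper.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical counts: 100 malignant images (90 detected, 10 missed)
# and 50 benign images (40 correctly cleared, 10 flagged as malignant).
metrics = binary_metrics(tp=90, fn=10, tn=40, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```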
6 Conclusion
In this paper, we benchmark two transfer learning approaches using popular pre-trained networks, namely ResNet, Inception and DenseNet, for benign/malignant breast cancer classification. Deep features extracted from pre-trained ResNet-50 combined with a logistic regression classifier perform best among all the deep feature classification models, with an accuracy of 94 ± 1%. In another experiment, continuing the work in [29], we fine-tuned DenseNet-169 with strong data augmentation. The average accuracy of the fine-tuned DenseNet-169 model is 99.3%, which is 3% to 5% higher than deep feature classification and better than other proposals in the literature.
According to [28], deep feature classification performs better when the dataset is small. Our experiments show that fine-tuning with strong augmentation outperforms deep feature classification when the dataset size is moderate or large. In future work, we expect to evaluate these outcomes more comprehensively by using the fine-tuned DenseNet-169 model for semantic segmentation on whole-slide histopathology images.
References
Veta, M., Pluim, J., van Diest, P.J., Viergever, M.A.: Breast cancer histopathology image analysis: a review. IEEE Trans. Biomed. Eng. 61(5), 1400–1411 (2014). https://doi.org/10.1109/TBME.2014.2303852
U.S. Breast Cancer statistics. http://www.breastcancer.org/symptoms/understand_bc/statistics
Elmore, J.G., Longton, G.M., Carney, P.A., Geller, B.M., Onega, T., Tosteson, A.N.A., et al.: Diagnostic concordance among pathologists interpreting breast biopsy specimens. J. Am. Med. Assoc. 313, 1122–1132 (2015). https://doi.org/10.1001/jama.2015.1405
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of 26th Annual Conference on Neural Information Processing Systems (NIPS), December 2012, pp. 1106–1114 (2012)
Spanhol, F.A., Oliveira, L.S., et al.: Breast cancer histopathological image classification using convolutional neural networks. In: Proceedings of 2016 IEEE International Joint Conference on Neural Network, July 2016, pp. 2560–2567. IEEE (2016). https://doi.org/10.1109/ijcnn.2016.7727519
Abu Samah, A., Fauzi, M.F.A., Mansor, S.: Classification of benign and malignant tumors in histopathology images. In: Proceedings of 2017 IEEE International Conference on Signal and Image Processing Application, September 2017, pp. 43–48 (2017). https://doi.org/10.1109/icsipa.2017.8120587
Sun, J., Binder, A.: Comparison of deep learning architectures for H&E histopathology images. In: Proceedings of 2017 IEEE Conference on Big Data and Analytics, November 2017. https://doi.org/10.1109/icbdaa.2017.8284105
Bayramoglu, N., Kannala, J., Heikkila, J.: Deep learning for magnification independent breast cancer histopathology image classification. In: 23rd International Conference on Pattern Recognition, pp. 2440–2445 (2016). https://doi.org/10.1109/icpr.2016.7900002
Santosh, K.C., Vajda, S., Antani, S.: Edge map analysis in chest X-rays for automatic pulmonary abnormality screening. Int. J. Comput. Assist. Radiol. Surg. 11, 1637 (2016). https://doi.org/10.1007/s11548-016-1359-6
Dhandra, B.V., Hegadi, R.: Classification of abnormal endoscopic images using morphological watershed segmentation. In: International Conference on Cognition and Recognition, (ICCR-2005), Mysore, India, 22–23 December 2005, pp. 695–700 (2005). ISBN 81-7764-952-3
Dhandra, B.V., Hegadi, R.: Active contours without edges and curvature analysis for endoscopic image classification. Int. J. Comput. Sci. Secur. 1(1), 19–32 (2007)
Hiremath, P.S., Dhandra, B.V., Humnabad, I., Hegadi, R., Rajput, G.G.: Detection of esophageal Cancer (Necrosis) in the Endoscopic images using color image segmentation. In: Second National Conference on Document Analysis and Recognition (NCDAR-2003), Mandya, India, pp. 417–422 (2003)
Nahid, A., Kong, Y.: Local and global feature utilization for breast image classification by convolutional neural network. In: 2017 International Conference on Digital Image Computing: Techniques and Applications, NSW, pp. 1–6 (2017). https://doi.org/10.1109/dicta.2017.8227460
Wan, S., Huang, X., Lee, H.C., Fujimoto, J.G., Zhou, C.: Spoke-LBP and ring-LBP: new texture features for tissue classification. In: 2015 IEEE 12th International Symposium on Biomedical Imaging (ISBI), pp. 195–199. IEEE (2015). https://doi.org/10.1109/isbi.2015.7163848
Vajda, S., Karargyris, A., Jaeger, S., et al.: Feature selection for automatic tuberculosis screening in frontal chest radiograph. J. Med. Syst. 42, 146 (2018). https://doi.org/10.1007/s10916-018-0991-9
Santosh, K.C., Antani, S.: Automated chest x-ray screening: can lung region symmetry help detect pulmonary abnormalities. IEEE Trans. Med. Imaging 37(5), 1168–1177 (2018). https://doi.org/10.1109/TMI.2017.2775636
Karargyris, A., Siegelman, J., Tzortzis, D., Jaeger, S., et al.: Combination of texture and shape features to detect pulmonary abnormalities in digital chest X-rays. Int. J. Comput. Assist. Radiol. Surg. 11, 99 (2016). https://doi.org/10.1007/s11548-015-1242-x
Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A.: Return of the devil in the details: delving deep into convolutional nets. arXiv preprint arXiv:1405.3531 (2014)
Li, X., Pang, T., Xiong, B., Liu, W., Liang, P., Wang, T.: Convolutional neural networks based transfer learning for diabetic retinopathy fundus image classification. In: 2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Shanghai, pp. 1–11 (2017). https://doi.org/10.1109/cisp-bmei.2017.8301998
Niu, X.-X., Suen, C.Y.: A novel hybrid CNN–SVM classifier for recognizing handwritten digits. Pattern Recogn. 45(4), 1318–1325 (2012). https://doi.org/10.1016/j.patcog.2011.09.021
Ukil, S., Ghosh, S., Obaidullah, S.M., Santosh, K.C., et al.: Deep learning for word-level handwritten Indic script identification. arXiv preprint arXiv:1801.01627 (2018)
Babenko, A., Lempitsky, V.: Aggregating local deep features for image retrieval. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1269–1277 (2015). https://doi.org/10.1109/iccv.2015.150
Alzu’bi, A., Amira, A., Ramzan, N.: Content-based image retrieval with compact deep convolutional features. Neurocomputing 249(2), 95–105 (2017). https://doi.org/10.1016/j.neucom.2017.03.072
Penatti, O.A, Nogueira, K., dos Santos, J.A.: Do deep features generalize from everyday objects to remote sensing and aerial scenes domains. In: Computer Vision and Pattern Recognition Workshop. IEEE (2015). https://doi.org/10.1109/cvprw.2015.7301382
Charan, S., Khan, M.J., Khurshid, K.: Breast cancer detection in mammograms using convolutional neural network. In: 2018 International Conference on Computing, Mathematics and Engineering Technologies (iCoMET), Sukkur, pp. 1–5 (2018). https://doi.org/10.1109/icomet.2018.8346384
Jiao, Z., Gao, X., Wang, Y., Li, J.: A deep feature based framework for breast masses classification. Neurocomputing 12, 221–231 (2016). https://doi.org/10.1016/j.neucom.2016.02.060
ICIAR 2018 Grand Challenge on Breast Cancer Histology (BACH) images. https://iciar2018-challenge.grand-challenge.org/
Rakhlin, A., Shvets, A., Iglovikov, V., Kalinin, A.A.: Deep convolutional neural networks for breast cancer histology image analysis. In: Campilho, A., Karray, F., ter Haar Romeny, B. (eds.) ICIAR 2018. LNCS, vol. 10882, pp. 737–744. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-93000-8_83
Habibzadeh, M., Motlagh, N.H., Jannesary, M., Aboulkheyr, H., Khosravi, P.: Breast cancer histopathological image classification: a deep learning approach. bioRxiv 242818 (2018). https://doi.org/10.1101/242818
Han, Z., Wei, B., Zheng, Y., Yin, Y., Li, K., Li, S.: Breast cancer multi-classification from histopathological images with structured deep learning model. Sci. Rep. 7, 4172 (2017). https://doi.org/10.1038/s41598-017-04075-z
Spanhol, F., Oliveira, L.S., Petitjean, C., Heutte, L.: A dataset for breast cancer histopathological image classification. IEEE Trans. Biomed. Eng. 63(7), 1455–1462 (2016). https://doi.org/10.1109/tbme.2015.2496264
Guo, Y., Liu, Y., Oerlemans, A., Lao, S., Wu, S., Lew, M.S.: Deep learning for visual understanding: a review. Neurocomputing 187, 27–48 (2016). https://doi.org/10.1016/j.neucom.2015.09.116
Huang, G., Liu, Z., van der Maaten, L.: Densely connected convolutional networks. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2261–2269 (2017). https://doi.org/10.1109/cvpr.2017.243
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016). https://doi.org/10.1109/cvpr.2016.90
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826 (2016)
Keras: The Python Deep Learning Library. https://keras.io/
Haifley, T.: Linear logistic regression: an introduction. In: IEEE International Integrated Reliability Workshop Final Report, pp. 184–187 (2002). https://doi.org/10.1109/irws.2002.1194264
Ke, G., et al.: LightGBM: a highly efficient gradient boosting decision tree. In: Advances in Neural Information Processing Systems, pp. 3149–3157 (2017)
Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). https://doi.org/10.1023/a:1010933404324
Yosinski, J., Clune, J., Bengio, Y., Lipson, H.: How transferable are features in deep neural networks. In: Advances in Neural Information Processing Systems, pp. 3320–3328 (2014)
Tajbakhsh, N., et al.: Convolutional neural networks for medical image analysis: full training or fine tuning. IEEE Trans. Med. Imaging 35(5), 1299–1312 (2016). https://doi.org/10.1109/tmi.2016.2535302
Hu, F., Xia, G.-S., Hu, J., Zhang, L.: Transferring deep convolutional neural networks for the scene classification of high-resolution remote sensing imagery. MDPI Remote Sens. 7(11), 14680–14707 (2015). https://doi.org/10.3390/rs71114680
NVIDIA Corporation: Nvidia tesla product literature (2018). https://www.nvidia.com/en-us/design-visualization/quadro-desktop-gpus/
Hegadi, R.S., Navale, D.I., Pawar, T.D., Ruikar, D.D.: Multi feature-based classification of osteoarthritis in knee joint X-ray images, Chap. 5. In: Medical Imaging: Artificial Intelligence, Image Recognition, and Machine Learning Techniques. CRC Press (2019). ISBN 9780367139612
Ruikar, D.D., Santosh, K.C., Hegadi, R.S.: Automated fractured bone segmentation and labeling from CT images. J. Med. Syst. 43(3), 60 (2019). https://doi.org/10.1007/s10916-019-1176-x
Ruikar, D.D., Santosh, K.C., Hegadi, R.S.: Segmentation and analysis of CT images for bone fracture detection and labeling, Chap. 7. In: Medical Imaging: Artificial Intelligence, Image Recognition, and Machine Learning Techniques. CRC Press (2019). ISBN 9780367139612
Ruikar, D.D., Hegadi, R.S., Santosh, K.C.: A systematic review on orthopedic simulators for psycho-motor skill and surgical procedure training. J. Med. Syst. 42(9), 168 (2018)
Sawat, D.D., Hegadi, R.S.: Unconstrained face detection: a Deep learning and Machine learning combined approach. CSI Trans. ICT 5(2), 195–199 (2017)
Jagtap, A.B., Hegadi, R.S.: Feature learning for offline handwritten signature verification using Convolution Neural Network. Int. J. Technol. Hum. Interact. (IJTHI). ISSN 1548-3908
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Sabari Nathan, D., Saravanan, R., Anbazhagan, J., Koduganty, P. (2019). Comparison of Deep Feature Classification and Fine Tuning for Breast Cancer Histopathology Image Classification. In: Santosh, K., Hegadi, R. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018. Communications in Computer and Information Science, vol 1036. Springer, Singapore. https://doi.org/10.1007/978-981-13-9184-2_5
DOI: https://doi.org/10.1007/978-981-13-9184-2_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9183-5
Online ISBN: 978-981-13-9184-2
eBook Packages: Computer Science, Computer Science (R0)