
1 Introduction

Cancer is a disease that affects cells, and detecting it at an early stage increases the chances of recovery. Breast cancer is one of the most common cancers among women, affecting about 10% of women at some point in their lives, and it shows high incidence and mortality rates. It is the second leading cause of cancer-related death among females, after lung cancer [1]. The World Health Organization’s International Agency for Research on Cancer (IARC) reported an anticipated increase in the number of breast cancer cases to 1.1 million by 2030, with the gap between developed and developing nations expected to widen [2]. Cancer can be described as the uncontrolled proliferation of abnormal cells that form masses. These tumors can be benign or malignant: benign tumors remain localized and grow slowly, whereas malignant tumors invade nearby structures and may destroy other parts of the body [3]. Mammography is an effective imaging technique for the detection of breast cancer and the one most used in screening programs [4]. It helps detect suspicious lesions such as masses and microcalcifications. However, because mammography projects the breast onto a 2D image, tissue overlap can mask a lesion or create a false one, producing false-positive and false-negative results. In addition, mammography is known to be less sensitive in dense breasts (30–64%) than in fatty breasts (76–98%), while women with dense breasts have been shown to be more susceptible to breast cancer.

In the last decade, several works based on artificial intelligence tools have been developed to enhance computer-aided diagnosis (CAD) of breast cancer. These approaches have shown their ability to detect abnormal lumps and calcifications in the breast and to predict their growth. They help radiologists and oncologists diagnose breast cancer by providing a second opinion. In this work, we propose a new breast cancer detection system using a pre-trained DenseNet with an integrated attention mechanism. The combination of these two models has proved effective in improving detection performance: the attention mechanism increases the weights of the relevant features and decreases those of the others, leading to better decisions. Furthermore, we applied a data augmentation technique to increase the number of training images and improve the model's generalization.

The rest of this paper is organized as follows. Section 2 presents related works in the breast cancer detection field. Section 3 describes the different parts of the proposed methodology. Section 4 presents the experimental results and a comparison with similar works. We conclude the paper with a summary and some prospects.

2 Related Works

In the literature, numerous approaches have been proposed for breast cancer detection based on mammography images. Samee et al. [5] proposed a breast cancer detection system based on several deep learning architectures: features were extracted automatically from the AlexNet, VGG, and GoogleNet models and then fused for the prediction task. This system was evaluated on the INbreast database and achieved an accuracy of 98.50%. In another work, Jiang et al. [6] introduced a new dataset of breast mammograms named Film Mammography dataset number 3 (BCDR-F03). They applied the GoogLeNet and AlexNet models to classify segmented tumors found on mammograms, and obtained accuracies of 88% and 83%, respectively. Ribli et al. [7] used a Region-based CNN (R-CNN) model to detect and classify breast lesions in mammograms, obtaining an accuracy of 95% on the INbreast dataset. Alruwaili et al. [8] used the ResNet50 model to distinguish between malignant and benign breast cancer. Data augmentation was applied to increase the number of training images and prevent over-fitting; the model was assessed on the MIAS dataset and achieved an accuracy of 89.5%. Kaur et al. [9] proposed a hybrid model combining deep neural networks (DNN) and Support Vector Machines (SVM), where the SVM replaced the regular dense layers after the DNN feature extraction part. The results showed that the SVM improves the recognition rate from 70% to 96.9% on the multi-class MIAS dataset. Mohapatra et al. [10] evaluated several pre-trained deep learning models, such as AlexNet, VGG16, and ResNet50, on mammogram images. Due to the limited number of training images, they applied data augmentation to address over-fitting; the experiments on the Mini-DDSM dataset reached an accuracy of 65% with ResNet50. Muduli et al. [11] proposed a deep convolutional neural network (CNN) model for breast cancer classification using mammograms and ultrasound images. This model extracts prominent features from the images with only a few tunable parameters. They applied data augmentation to increase the number of training images and prevent over-fitting; the model was evaluated on the MIAS and INbreast datasets and achieved accuracies of 90.68% and 91.28%, respectively. Rouhi et al. [12] proposed a model for primary breast cancer detection using a region-growing method. Their model hybridizes a cellular neural network with a genetic algorithm and achieved accuracies of 96.47% and 95.13% on the MIAS and DDSM databases, respectively. In [13], transfer learning was applied with the pre-trained deep neural networks Inception V3, ResNet50, VGG16, and Inception-ResNet. The best result was obtained with the VGG16 model, which achieved an accuracy of 98.96% on the MIAS database. Punithavathi et al. [14] proposed a hybrid model based on SVM and KNN classifiers: multiple categories of images were fed to the SVM, and the final decision was made by the KNN algorithm. This model achieved an accuracy of 99.34% on the MIAS dataset. Pillai et al. [15] evaluated several pre-trained deep learning models, such as EfficientNet, AlexNet, VGG16, and GoogleNet, on the MIAS database, applying data augmentation to increase the number of training images and prevent over-fitting. The best performance was obtained with the VGG16 model, which achieved an average accuracy of 75.46%. Chougrad et al. [16] applied a fine-tuned Inception-v3 model to the MIAS database to classify breast lesions and obtained an accuracy of 98.23%. Selvathi et al. [17] proposed a new system for breast cancer detection based on stacked autoencoder architectures with a softmax classifier. They also preprocessed the MIAS images to remove noise, background, and pectoral muscle, and obtained an accuracy of 98.5%.

3 Proposed Methodology

In this section, we present our system for multi-class breast cancer detection based on mammogram images. The proposed methodology employs the pre-trained DenseNet121 model truncated at the feature extraction part, followed by an attention model that gives more importance to the relevant features of the Region of Interest (ROI). Thereafter, the convolution and attention modules are combined to fuse the high-level information with the relevant semantic information. The resulting features are fed into a Global Average Pooling (GAP) layer to reduce the feature map dimensions and preserve pertinent features for the classification part.

3.1 DenseNet121 Architecture

Dense Convolutional Network (DenseNet) is a modern CNN architecture designed for visual object recognition with relatively few parameters [18]. It achieved state-of-the-art results on several image classification datasets, such as CIFAR-10, SVHN, and ImageNet [19]. The basic structure of the network comprises two component modules: dense and transition blocks (Fig. 1). DenseNet-121 contains a total of 4 dense blocks and 3 transition blocks. Each layer in a dense block is densely connected to all subsequent layers [20]. Moreover, each dense block is composed of a stack of two convolution layers with kernel sizes of \((1\times 1)\) and \((3\times 3)\), respectively. Each transition block performs a \((1\times 1)\) convolution and a \((2\times 2)\) average pooling operation. Table 1 shows the overall architecture of the DenseNet121 model: dense and transition blocks alternate, and the pair of convolution layers within the dense blocks is repeated 6, 12, 24, and 16 times, respectively.
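As an illustration, the truncated feature-extraction part of DenseNet121 is available in standard deep learning libraries. The following is a minimal sketch using TensorFlow/Keras; the library choice and the 256×256×3 input size (matching the one used in Sect. 4.5) are our assumptions rather than details prescribed by the architecture itself.

```python
# Minimal sketch: loading the DenseNet121 feature-extraction part with
# ImageNet weights (transfer learning), truncated before the original
# classification head (include_top=False).
from tensorflow.keras.applications import DenseNet121

backbone = DenseNet121(
    include_top=False,          # keep only the dense and transition blocks
    weights="imagenet",         # pre-trained weights
    input_shape=(256, 256, 3),  # image size used in our experiments
)
backbone.summary()              # shows the 4 dense blocks and 3 transition blocks
```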

Fig. 1.

DenseNet121 model concept [21]

Table 1. DenseNet121 structure

3.2 Self Attention Model

After the global average pooling layer, we implemented a Multi-Head Self-Attention (MHSA) model to improve the model's effectiveness (Fig. 2). MHSA is a mechanism that provides additional focus on specific components of the data: it enables the network to concentrate on a few aspects at a time and ignore the rest [22]. Instead of performing a single attention function, MHSA consists of several attention layers running in parallel. The input consists of queries and keys of dimension \(d_{k}\) (Q and K, respectively) and values of dimension \(d_{v}\) (V). The output of the attention model is obtained by computing the scaled dot product of the queries with all keys and applying a SoftMax function to obtain the weights on the values V (Eq. 1). The queries, keys, and values are linearly projected h times with different learned weight matrices (\(W_{Q}\), \(W_{K}\), \(W_{V}\)), and the resulting representation subspaces are concatenated to form the final output (Eq. 2). We applied a particular version of the attention model called self-attention, in which the query, key, and value inputs are the same. The calculation proceeds as follows: first, we compute the dot product (MatMul) of the query and key tensors and scale the obtained scores; next, we apply a SoftMax function to these scores to obtain attention probabilities; finally, we take a linear combination of these distributions with the value tensors and concatenate the heads into one output.

Fig. 2.

Attention model architecture

$$\begin{aligned} Attention(Q, K, V) = Softmax\left( \frac{Q\times K^T}{\sqrt{d_{k}}}\right) \times V \end{aligned}$$
(1)
$$\begin{aligned} {\left\{ \begin{array}{ll} \text {MHA}(Q, K, V) = \text {concat}(\text {head}_{1}, \ldots , \text {head}_{h}) \\ \text {head}_{i} = \text {Attention}(QW_{Q}^{i}, KW_{K}^{i}, VW_{V}^{i}), \quad i = 1, \ldots , h \end{array}\right. } \end{aligned}$$
(2)
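For illustration, Eqs. 1 and 2 can be written as a short sketch in TensorFlow/Keras; the tensor shapes are arbitrary examples, and the 8 heads with 64-unit projections follow the setting reported in Sect. 4.3.

```python
# Sketch of Eq. 1 (scaled dot-product attention) and Eq. 2 (multi-head
# self-attention); shapes are illustrative assumptions.
import tensorflow as tf

def scaled_dot_product_attention(q, k, v):
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)  # Q.K^T / sqrt(d_k)
    weights = tf.nn.softmax(scores, axis=-1)                   # SoftMax over the keys
    return tf.matmul(weights, v)                               # weighted sum of the values

# Self-attention: query, key and value are the same tensor.
x = tf.random.normal((1, 64, 1024))          # (batch, positions, channels)
out = scaled_dot_product_attention(x, x, x)  # Eq. 1

# Multi-head version (Eq. 2): Keras handles the learned projections
# W_Q, W_K, W_V and the concatenation of the h heads.
mhsa = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)
out_mh = mhsa(query=x, value=x, key=x)
```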

The proposed methodology takes the dot product of the DenseNet121 and self-attention model outputs. Thereafter, Global Average Pooling is applied to both the attention model output and the resulting dot-product tensor. The classification part is composed of two dense layers with dropout to prevent over-fitting. Figure 3 illustrates the different parts of the proposed breast cancer detection system.
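The following is a minimal sketch of how such an assembly could be wired in Keras, following the description above. The reshaping of the DenseNet feature maps into a sequence for the attention layer, the interpretation of the dot product as an element-wise product, and the dense-layer sizes are our assumptions for illustration, not values prescribed by the text.

```python
# Minimal sketch of the proposed pipeline (TensorFlow/Keras); layer sizes
# and the fusion operator are assumptions made for illustration.
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import DenseNet121

inputs = layers.Input(shape=(256, 256, 3))
backbone = DenseNet121(include_top=False, weights="imagenet", input_tensor=inputs)
feats = backbone.output                                   # (8, 8, 1024) feature maps

seq = layers.Reshape((8 * 8, 1024))(feats)                # flatten the spatial grid into a sequence
attn = layers.MultiHeadAttention(num_heads=8, key_dim=64)(seq, seq)  # self-attention

fused = layers.Multiply()([seq, attn])                    # combine backbone and attention features
gap_attn = layers.GlobalAveragePooling1D()(attn)          # GAP on the attention output
gap_fused = layers.GlobalAveragePooling1D()(fused)        # GAP on the fused features

x = layers.Concatenate()([gap_attn, gap_fused])
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(3, activation="softmax")(x)        # normal / benign / malignant

model = Model(inputs, outputs)
```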

Fig. 3.

Proposed methodology architecture

4 Experimentation and Results

4.1 Database Description

The proposed methodology was evaluated on the multi-class MIAS database, which contains normal, benign, and malignant breast images [23]. The database consists of 322 mammogram images of size \((1024\times 1024)\) pixels, stored in Portable Gray Map (PGM) format. The images belong to three tissue types: dense glandular, fatty, and fatty glandular, and each type is divided into three categories: normal, benign, and malignant. The dataset also contains radiologists' annotations of the locations of the abnormalities (benign, malignant), with an approximate radius around the center of each anomaly. In this work, we use all the images in the dataset: 207 normal, 64 benign, and 51 malignant images. Figure 4 shows three images from the MIAS database representing the three categories (Normal, Benign, and Malignant).
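As an illustrative sketch, a MIAS mammogram can be read and prepared for the network as follows; the file path is hypothetical, and the resizing to 256×256 with channel replication is an assumption consistent with the input shape used in Sect. 4.5.

```python
# Sketch of loading a MIAS mammogram (PGM format); path and label mapping
# are assumptions for illustration.
import numpy as np
from PIL import Image

CLASSES = ["normal", "benign", "malignant"]

def load_mias_image(path, size=(256, 256)):
    """Read a 1024x1024 grayscale PGM image, resize it, and replicate the
    channel to obtain the (256, 256, 3) shape expected by the model."""
    img = Image.open(path).convert("L")     # PGM images are grayscale
    img = img.resize(size)
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return np.stack([arr] * 3, axis=-1)     # (256, 256, 3)

x = load_mias_image("all-mias/mdb001.pgm")  # hypothetical file path
```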

Fig. 4.

MIAS database samples. (a) Normal (b) Benign (c) Malignant

4.2 Data Augmentation

Since the MIAS dataset contains only 322 images, the proposed model may not generalize well. For this reason, we applied data augmentation to increase the number of training samples in each class and prevent over-fitting. In this work, data augmentation is mainly based on geometric transformations, including rotation, flipping, and shifting. We thus obtained a new dataset of 1836 breast cancer images evenly distributed over the three classes (612 images per class). Figure 5 shows an example of data augmentation in which horizontal and vertical flips were applied to the original image.
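A minimal sketch of such geometric augmentation with Keras' ImageDataGenerator is given below; the exact transformation ranges are not specified in the text and are therefore assumptions.

```python
# Sketch of geometric data augmentation (rotation, flipping, shifting);
# the ranges below are illustrative assumptions.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=30,       # random rotations
    width_shift_range=0.1,   # horizontal shifting
    height_shift_range=0.1,  # vertical shifting
    horizontal_flip=True,    # horizontal flip (Fig. 5b)
    vertical_flip=True,      # vertical flip (Fig. 5c)
)

# x_train: training images, y_train: one-hot labels (assumed variable names)
# train_generator = augmenter.flow(x_train, y_train, batch_size=32)
```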

Fig. 5.

Data augmentation samples. (a) original (b) horizontal flip (c) vertical flip

4.3 Experimental Setup

During the experiments, the training set was divided into batches of size 32, with shuffling enabled so that the mini-batch samples differ in each epoch. In each iteration, the categorical cross-entropy loss was computed between the desired and predicted outputs. The model was trained with the Adam (Adaptive Moment Estimation) optimizer and an initial learning rate of 0.001, which is reduced by a factor of 0.5 once learning stagnates. Moreover, early stopping is applied as a regularization method: it stops the training process before the model over-fits the training data. In the multi-head self-attention model, we employed 8 parallel attention layers (heads), each with 64 units in the linear projections of the query, key, and value matrices (Table 2).
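A minimal sketch of this training configuration in Keras is shown below, assuming the `model` assembled in Sect. 3; the patience values and the monitored quantity are assumptions, as they are not reported in the text.

```python
# Sketch of the training setup: Adam with lr = 0.001, categorical
# cross-entropy, learning-rate reduction on plateau, and early stopping.
import tensorflow as tf

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # initial learning rate 0.001
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

callbacks = [
    # halve the learning rate once learning stagnates
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),
    # stop training before the model over-fits the training data
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True),
]

# history = model.fit(x_train, y_train, batch_size=32, shuffle=True,
#                     validation_data=(x_val, y_val), epochs=100,
#                     callbacks=callbacks)
```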

Table 2. Hyperparameters setting

4.4 Evaluation Metrics

To assess the performance of the proposed model, the confusion matrix and several metrics were calculated: Accuracy, Recall, Precision, and F1-score (Eqs. 3–6). They are all based on the counts of True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN). TP denotes images correctly predicted as containing breast cancer, and TN relates to normal images correctly predicted as healthy. FP concerns normal images predicted as breast cancer, and FN refers to breast cancer images incorrectly predicted as normal.

$$\begin{aligned} Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \end{aligned}$$
(3)
$$\begin{aligned} Recall = \frac{TP}{TP + FN} \end{aligned}$$
(4)
$$\begin{aligned} Precision = \frac{TP}{TP + FP} \end{aligned}$$
(5)
$$\begin{aligned} \text {F1-score} = 2\times \frac{Precision \times Recall}{Precision + Recall} \end{aligned}$$
(6)
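For reference, these metrics and the confusion matrix can be computed with scikit-learn as in the sketch below; `y_true` and `y_pred` are hypothetical arrays of class indices, and the macro averaging over the three classes is our assumption.

```python
# Sketch: confusion matrix and the metrics of Eqs. 3-6 with scikit-learn.
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score, f1_score)

y_true = [0, 1, 2, 2, 0]  # example ground-truth labels (0: normal, 1: benign, 2: malignant)
y_pred = [0, 1, 2, 1, 0]  # example predictions

cm = confusion_matrix(y_true, y_pred)                    # rows: true classes, columns: predictions
acc = accuracy_score(y_true, y_pred)                     # Eq. 3
rec = recall_score(y_true, y_pred, average="macro")      # Eq. 4, averaged over classes
prec = precision_score(y_true, y_pred, average="macro")  # Eq. 5
f1 = f1_score(y_true, y_pred, average="macro")           # Eq. 6
```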

4.5 Experimental Results

In the experiments, the image shape was fixed to \((256\times 256\times 3)\). Several models were studied with different split ratios and optimizers, all initialized with pre-trained weights. First, we evaluated the models' performance without the self-attention mechanism; the best result was obtained with DenseNet-121 (Table 3). When applying the multi-head self-attention mechanism, the DenseNet-121 accuracy improved by 6%, reaching 0.9939 with a 90% training split. Several other metrics were also evaluated, such as recall, precision, and AUC (Table 4); for all of them, the best results were obtained using the DenseNet-121 model with the Adam optimizer. Figures 6 and 7 show the confusion matrices and classification reports for the different split ratios. We observe that the model's performance improves when using the multi-head self-attention mechanism. Moreover, the proposed model discriminates well between benign and malignant samples, but it confuses the normal and benign classes (Table 5).

Table 3. Models accuracies without and with attention
Table 4. DenseNet model performance with different split ratio
Table 5. Performance results with optimizers
Fig. 6.

Confusion matrices without self attention. (a) split ratio (70:30) (b) split ratio (80:20) (c) split ratio (90:10)

Fig. 7.

Confusion matrices with self attention. (a) split ratio (70:30) (b) split ratio (80:20) (c) split ratio (90:10)

4.6 Comparative Study and Discussion

Table 6 summarizes several works evaluated on the multi-class MIAS dataset. With a 90% training split and the multi-head self-attention mechanism, the proposed model achieves state-of-the-art performance on the MIAS dataset and outperforms the models based on ADL-BCD and ResNet50. With an 80% split, the proposed approach outperforms the DenseNet-201 model but is slightly less accurate than the VGG16 and OMLTS-DLCN approaches. Furthermore, the proposed work is, to our knowledge, the only one to combine the multi-head self-attention mechanism with the pre-trained deep neural network DenseNet-121, a combination that has led to a significant improvement in classification rates. The attention model has most often been applied to sequential data; in this work, we adapted it to the image classification task to assign high attention weights to the parts of the images carrying relevant features.

Table 6. Results comparison on MIAS database

5 Conclusion

In this paper, we proposed a deep architecture for breast cancer classification based on mammographic images to help medical doctors in breast cancer detection and diagnosis. The approach classifies breast images into normal, benign, and malignant categories. The strength of our method is the combination of the pre-trained deep convolutional neural network DenseNet121 with a self-attention model. Moreover, data augmentation was applied to increase the number of images and prevent the model from overfitting. During the experiments, several hyper-parameters, such as the optimizer and learning rate, were tuned to boost diagnostic efficiency. The proposed methodology achieved accuracies of 92.64% and 99.39% for split ratios of 80% and 90%, respectively. Finally, it can be concluded that integrating a CNN trained with transfer learning with the attention mechanism yields a clear improvement over other existing approaches. The results presented in this study open new windows for the use of self-attention-based architectures and vision transformer technology for breast cancer classification, toward high-performance CAD schemes with better results.