1 Introduction

Early-stage diagnosis plays a major role in increasing the chance of recovery from breast cancer. The World Health Organization (WHO) estimates that cancer incidence may rise to 27.5 million cases by 2040, with 16.3 million deaths due to cancer [1]. Currently, the average risk of a woman in the USA developing breast cancer sometime in her life is about 13%. According to the American Cancer Society, about 281,550 new cases of invasive breast cancer were expected in the USA in 2021, and about 43,600 women were estimated to die of breast cancer [2]. In metropolitan cities in India such as Mumbai, Chennai, Delhi, Bangalore, Ahmadabad and Bhopal, noninvasive breast cancer has affected 28% to 35% of the women population [3]. Thus, breast cancer has become a serious health issue around the globe, and early detection is essential in reducing fatalities [4]. Early detection can be done using various scanning methods such as magnetic resonance imaging, ultrasound imaging, self-examination, mammography and biopsies [5]. Traditional breast cancer diagnosis methods are time-consuming, and they fail because of inaccurate diagnoses caused by human error. Automated computer-based diagnosis schemes overcome the demerits of manual diagnosis, and hence, unnecessary surgeries and biopsies can be avoided [6]. Expert systems based on ANN have strong predictive capabilities, which makes them suitable for building medical diagnosis systems [7]. ANN-based decision-making systems have outperformed the traditional techniques used for classifying patterns.

Metaheuristic swarm intelligence approaches are used for solving real-time optimization problems [8,9,10]. Commonly used swarm intelligence approaches are ant colony optimization (ACO) [11] and particle swarm optimization (PSO) [12], inspired by the foraging behavior of ants and the social behavior of birds, respectively. The echolocation capability of microbats found in nature forms the basis of the bat algorithm (BA) [13]. A population-based swarm technique inspired by the foraging behavior of honey bees has also been introduced [14]. The dynamic and static behavior of dragonflies forms the basis of another metaheuristic called the dragonfly algorithm (DA) [15]. Based on the herding behavior of krill, a further swarm technique called the krill herd (KH) algorithm has been proposed [16].

A technique based on migration behavior, called monarch butterfly optimization (MBO), has been introduced [17]. The foraging behavior of E. coli bacteria for survival forms the basis of bacterial foraging optimization (BFO) [18]. Another swarm technique, the artificial immune system (AIS), is inspired by the biological immune system of the human body [19]. An algorithm for global optimization based on interior design and decoration has also been proposed [20]. The salp swarm algorithm (SSA), based on the swarming behavior of salps in the ocean, was introduced to solve multidimensional optimization problems [21]. Based on the Brownian and Levy movements of predators during their foraging process, another swarm technique called the marine predators algorithm (MPA) was proposed [22]. This paper focuses on hybridizing artificial bee colony optimization with the whale optimization algorithm to introduce the HAW algorithm. The proposed HAW algorithm integrates the employee bee phase of ABC with the encircling prey/bubble net attacking method of WOA to form an enhanced exploitative phase called the employee attacking phase. In the employee attacking phase, the bees follow the bubble net attacking method of the whales to find better food sources. The explorative phase of HAW is driven by a mutative initialization phase of the standard ABC algorithm.

Appropriate selection of ANN topology design parameters such as the number of hidden layers, the number of hidden nodes, the initial weight values between the connections, the learning rate and the learning algorithm plays a vital role in building a successful ANN model [23]. The convergence of the backpropagation learning process can be affected by improper selection of weights, causing the learning process to become trapped in local optima [24, 25]. Improper selection of the hidden nodes may cause the ANN classifier to suffer from underfitting and overfitting. If the number of hidden nodes in an ANN model is not appropriate to the amount of learning required for accurate diagnosis, the ANN classifier may be either overtrained, where the model gives accurate results during training but fails with inaccurate results during testing, or undertrained, where the prediction rate decreases. Based on the above discussion, this paper focuses on the optimal selection of the initial weight values and the hidden node size of an ANN model using the HAW algorithm within a wrapper architecture. The work aims at improving the learning performance of the ANN, avoiding the problems of overfitting and underfitting, with increased predictive capability.

FS deals with the deletion of irrelevant, redundant and noisy features present in the input dataset of a classifier. FS improves the generalization of an ANN classifier system with reduced computational time, as demonstrated in [26, 27]. Hence, simultaneously optimizing the input features and the ANN design parameters, such as the initial weights and hidden node size, can increase the predictability of the ANN classifier. Swarm-based intelligent systems are used for such coupled optimization of input features and ANN design parameters [28]. Given the importance of simultaneous optimization of ANN design parameters and the FS process, ANN topology optimization is coupled here with ABC optimization, whose powerful local and global search capabilities help in finding global optimal solutions.

This paper focuses on the following objectives:

  (i) A hybrid ABC-WOA optimization (HAW) that integrates the encircling prey and the bubble net attacking method of WOA with the employee bee phase of standard ABC to form an employee attacking phase.

  (ii) To enable HAW to escape from local optima, the proposed employee attacking phase uses the simulated annealing technique.

  (iii) To obtain a diversified set of solutions, the exploration of HAW is enhanced using the proposed mutative exploration phase of ABC.

  (iv) HAW is implemented for optimal feature subset selection and ANN parameter optimization using the Wisconsin breast cancer dataset. The HAW-optimized ANN model is evaluated in terms of accuracy, complexity and computational time.

1.1 Artificial bee colony (ABC) optimization

A metaheuristic swarm-based search mechanism called ABC was introduced by Karaboga in 2005. It is a population-based approach, inspired by the foraging nature of honey bees, that solves multidimensional and multimodal real-time optimization problems for different applications, as demonstrated in [29]. ABC is based on a stochastic process that is robust and highly flexible, with a small number of control parameters that keep it simple. The algorithmic steps of the ABC optimization process are described in Algorithm (1):


Algorithm 1: Artificial bee colony algorithm

  • Step 1: Initialization:

  • Food sources are randomly produced using Equation (1).

    $$A_{k}^{l} = A_{\min }^{l} + random\left( {0,1} \right)*\left( {A_{{\max }}^{l} - A_{{\min }}^{l} } \right)$$
    (1)
  • \(A_{k}^{l}\) represents the kth food source with its lth parameter, where k = 1, 2, …, N, in which N represents the maximum number of food sources, and l = 1, 2, …, dim, in which ‘dim’ represents the dimension, i.e., the number of parameters in the optimization problem. \(A_{max}^{l}\) and \(A_{min}^{l}\) are the maximum and minimum bounds of the lth parameter of the optimization problem, respectively.

  • Step 2: Quality Evaluation of food source:

  • The fitness values are identified for each food source \(A_{k}\).

  • Step 3: Employed bee Phase:

  • Food sources are assigned to employee bees or worker bees. The employee bees use Eq. (2) to search neighborhood food sources surrounding the current food sources \(A_{k}^{l}\).

    $$E_{k}^{l} = A_{k}^{l} + random\left[ { - 1,1} \right]{ }*\left( {A_{k}^{l} - A_{d}^{l} } \right)$$
    (2)
  • \(A_{d}\) is a randomly selected food source, where d ∈ {1, 2…, N} and ‘d’ should not be equal to ‘k’ for proper exploitation. ‘l’ is a random integer in {1, 2…, dim}. If the quality of \(E_{k}^{l}\) is better than that of \(A_{k}^{l}\), then the bee discards \(A_{k}^{l}\) and saves \(E_{k}^{l}\), or vice versa.

  • Step 4: Onlooker Bee Phase:

  • Information regarding the selected food sources is shared with the onlooker bees. The probability value \(Z_{k}\) of each food source received from the employee bee is calculated using Eq. (3).

    $$Z_{k} = \frac{{fitness\left( {A_{k} } \right)}}{{\mathop \sum \nolimits_{k = 1}^{N} fitness\left( {A_{k} } \right)}}$$
    (3)
  • The quality of the food source \(A_{k}\) is represented as \(fitness\left( {A_{k} } \right)\). The value \(Z_{k}\) of food source is compared with a \(random\left( {0,1} \right)\). Food sources with a \(Z_{k}\) value greater than \(random\left( {0,1} \right)\) are selected by the onlooker bees.

  • Step 5: Food source memorization:

  • The food source with the highest \(fitness\left( {A_{k} } \right)\) is selected and memorized.

  • Step 6: Scout bee phase:

  • In the scout bee phase, unimproved food sources are identified based on a counter value and they are replaced by a randomly generated food source according to Equation (1).
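Steps 1–6 can be condensed into a short, hedged Python sketch of one ABC run; the objective function, bounds and all parameter values below are illustrative assumptions rather than the settings used in this paper.

```python
import numpy as np

def abc_maximize(fitness, dim, n_sources=20, bounds=(-5.0, 5.0),
                 limit=10, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    # Step 1: random food sources, one row per source (Eq. 1)
    A = lo + rng.random((n_sources, dim)) * (hi - lo)
    fit = np.array([fitness(a) for a in A])          # Step 2: quality evaluation
    trials = np.zeros(n_sources, dtype=int)
    best = A[fit.argmax()].copy()

    def neighbour(k):
        # Eq. 2: perturb one random parameter toward/away from a random partner d != k
        l = rng.integers(dim)
        d = rng.choice([i for i in range(n_sources) if i != k])
        E = A[k].copy()
        E[l] = np.clip(A[k, l] + rng.uniform(-1, 1) * (A[k, l] - A[d, l]), lo, hi)
        return E

    def try_improve(k):
        E = neighbour(k)
        fE = fitness(E)
        if fE > fit[k]:                              # greedy selection
            A[k], fit[k], trials[k] = E, fE, 0
        else:
            trials[k] += 1

    for _ in range(max_iter):
        for k in range(n_sources):                   # Step 3: employed bee phase
            try_improve(k)
        Z = fit / fit.sum()                          # Eq. 3 (assumes positive fitness)
        for k in range(n_sources):                   # Step 4: onlooker bee phase
            if rng.random() < Z[k]:
                try_improve(k)
        if fit.max() > fitness(best):                # Step 5: memorize the best source
            best = A[fit.argmax()].copy()
        for k in np.where(trials > limit)[0]:        # Step 6: scouts replace stale sources
            A[k] = lo + rng.random(dim) * (hi - lo)
            fit[k], trials[k] = fitness(A[k]), 0
    return best

# Example: maximize 1/(1 + ||x||^2); the optimum lies at the origin
print(abc_maximize(lambda x: 1.0 / (1.0 + np.dot(x, x)), dim=4))
```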

1.2 Whale optimization algorithm (WOA)

WOA is a population-based swarm intelligence metaheuristic algorithm introduced by Mirjalili and Lewis [30], inspired by the foraging behavior of humpback whales. Humpback whales hunt groups of krill or small fish by swimming along a shrinking circle and producing bubbles along a ‘9’-shaped path. The exploitation phase is carried out using encircling prey and spiral-based bubble net attacking. A random search for prey is used for exploration. The exploitation phase of WOA is explained in Algorithm (2).

Algorithm 2: Encircling prey/bubble net attacking of WOA

To hunt prey for survival, humpback whales encircle the prey, which can be mathematically represented using Eqs. (4) and (5).

$$Y = \left| {\vec{L}.A^{*} \left( t \right) - A\left( t \right)} \right|$$
(4)
$$\vec{A}\left( {t + 1} \right) = \overrightarrow {{A^{*} }} \left( t \right) - \vec{M}*Y$$
(5)

where t represents the current iteration, \(\overrightarrow {{A^{*} }}\) represents the best solution found so far, \(\vec{A}\) is the position vector, | | denotes the absolute value, and \(\vec{L}\) and \(\vec{M}\) represent the coefficient vectors that can be obtained using Eqs. (6) and (7).

$$\vec{M} = 2\vec{m}*\vec{r} - \vec{m}$$
(6)
$$\vec{L} = 2* \vec{r}$$
(7)

where m is linearly decreased from 2 to 0 as the iterations proceed. r represents a random vector drawn from a uniform distribution over [0,1]. Each whale, representing a solution, updates its position using Eq. (5), where the updated position depends on the best position (prey) found so far. The positions of the whales can be controlled by adjusting the vectors L and M. The value of m is decreased to achieve the shrinking encircling behavior using Eq. (8).

$$m = 2 - t\frac{2}{MaxIterat}$$
(8)

where t represents the current iteration and MaxIterat represents the maximum number of iterations. The new position of the whale on the spiral path can be calculated using Eq. (9):

$$\vec{A}\left( {t + 1} \right) = Y^{\prime}*e^{ws} *\cos \left( {2\pi s} \right) + \overrightarrow {{A^{*} }} \left( t \right)$$
(9)

where \(Y^{\prime} = \left| {\overrightarrow {{A^{*} }} \left( t \right) - \vec{A}\left( t \right)} \right|\) indicates the distance between a whale and the best solution (prey). w is a constant that defines the shape of the logarithmic spiral, and s is a random number generated between [− 1,1]. Hence, the updated position of the whale is chosen between the two mechanisms with 50% probability using a random number \(P_{i}\) generated between [0,1], as represented by Eq. (10).

$$\vec{A}\left( {t + 1} \right) = \left\{ {\begin{array}{*{20}c} {\overrightarrow {{A^{*} }} \left( t \right) - \vec{M}*Y} & {if \left( {P_{i} < 0.5} \right)} \\ {Y^{\prime}*e^{ws} *\cos \left( {2\pi s} \right) + \overrightarrow {{A^{*} }} \left( t \right)} & {if \left( {P_{i} \ge 0.5} \right)} \\ \end{array} } \right.$$
(10)
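A hedged Python sketch of a single WOA position update over Eqs. (5)–(10) follows; the population size, the 2-D search space and the spiral constant are illustrative assumptions.

```python
import numpy as np

def woa_step(A, A_best, t, max_iter, rng, w=1.0):
    """One WOA position update for all whales at iteration t (0-based)."""
    n, dim = A.shape
    m = 2.0 - t * (2.0 / max_iter)               # Eq. 8: m decays linearly from 2 to 0
    new = np.empty_like(A)
    for i in range(n):
        M = 2.0 * m * rng.random(dim) - m        # Eq. 6
        L = 2.0 * rng.random(dim)                # Eq. 7
        s = rng.uniform(-1.0, 1.0)               # random spiral parameter in [-1, 1]
        if rng.random() < 0.5:                   # Eq. 10, first branch
            Y = np.abs(L * A_best - A[i])        # Eq. 4
            new[i] = A_best - M * Y              # Eq. 5: shrinking encircling
        else:                                    # Eq. 10, second branch
            Yp = np.abs(A_best - A[i])           # distance to the prey
            new[i] = Yp * np.exp(w * s) * np.cos(2 * np.pi * s) + A_best  # Eq. 9
    return new

# Toy usage: ten whales on a 2-D sphere function, prey = best whale so far
rng = np.random.default_rng(1)
whales = rng.uniform(-5, 5, size=(10, 2))
best = whales[np.argmin((whales ** 2).sum(axis=1))].copy()
whales = woa_step(whales, best, t=0, max_iter=50, rng=rng)
```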

1.3 Comparative investigation of ABC and WOA in terms of exploration and exploitation

In the context of exploration, WOA uses the search-for-prey phase, which depends entirely on a random search agent, a stochastic strategy. Similarly, ABC relies on scout bees performing a random search for exploration. This causes both algorithms to produce solutions concentrated in a local area at the initialization phase, losing diversification, so the search process converges prematurely and returns suboptimal solutions. Hence, both ABC and WOA are weak at exploration.

In the context of exploitation, WOA incorporates the local search process using the encircling prey and bubble net attacking methods. The WOA exploitative phase encourages convergence since the whale positions are updated using the best solution (prey) obtained so far; proper exploitation is thus guided toward the prey by the best solution found so far. Comparatively, ABC exploitation is carried out using the employee bee phase and the onlooker bee phase, where food source positions are updated by changing a single parameter of the old solution (food source), which causes similar food sources to converge at the same optimum locations. Also, the local search of ABC cyclically revisits the same solutions, creating a looping problem that makes the search converge prematurely. Hence, WOA is better at exploitation than ABC.

1.4 Problems that are addressed by the proposed HAW

Many researchers have used ABC and WOA to develop optimal classifiers for medical diagnosis purposes, but the standard ABC still suffers from the following issues, which HAW addresses.

  (i) The local search by the employee and onlooker bees cyclically revisits similar solutions, inducing a looping problem that makes the search process converge prematurely.

  (ii) ABC optimization concentrates solutions in local regions due to a lack of diversified solutions at initialization.

  (iii) The food source positions are updated by changing a single parameter of the old solution (food source), which causes similar food sources to converge at the same optimum locations.

  (iv) Exploitation is performed by two phases, namely the employee bee phase and the onlooker bee phase, whereas the exploration process is done only by the scout bees, which leads to an imbalance between exploration and exploitation.

HAW, which is capable of resolving the above issues, can be used to generate an optimized ANN classifier for accurate and efficient breast cancer diagnosis.

2 Related works

P. Shunmugapriya and S. Kanmani proposed an integrated algorithm of ABC and ACO for finding optimal feature subsets of medical datasets [31]. The global search of ABC is improved by feeding feature subsets generated by ACO into the ABC optimization process. The approach yielded an accuracy of 99.07% using the Wisconsin breast cancer dataset (WBCD). The exploitation uses traditional greedy selection, making the algorithm converge prematurely at local optima. The algorithm is used only for FS, with no parameter optimization. Zorarpacı and Ozel introduced a hybrid algorithm of DE and ABC for optimal binary subset selection [32]. The algorithm combines the high exploration property of DE with an improved onlooker bee phase of ABC. The approach achieved F-measures of 92.2, 96.4 and 97.6 for the decision tree, naive Bayes and RBF network classifiers, respectively, using WBCD. This algorithm, too, is used only for FS, with no parameter optimization.

Shanthi and Bhaskaran presented a modified ABC for FS [33]. The exploitation of the employee bees is improved, with the neighborhood search process guided by the global best solution. The modified ABC is used for FS on the benchmark mammographic image analysis society (MIAS) and digital database for screening mammography (DDSM) datasets for breast cancer diagnosis. Classification is carried out using a self-adaptive resource allocation network. The accuracy was evaluated as 96.89% and 97.17% for MIAS and DDSM, respectively. The algorithm did not focus on the explorative phase and used only randomized initial solutions, with a loss of diversification. Rao et al. applied an FS algorithm using ABC and decision trees based on the gradient boosting model [34]. The features are selected from the Wisconsin breast cancer dataset and Haberman’s survival dataset. A regression tree is used as the classifier, where gradient descent finds the direction of the gradient of the residuals. The classification accuracy is 74.3% for Haberman’s dataset and 92.8% for WBCD. The approach has not been evaluated in terms of complexity.

An efficient ABC is proposed by Badem et al. for optimal learning of deep neural networks (DNN) [35]. This algorithm used ABC and limited-memory Broyden–Fletcher–Goldfarb–Shanno (BFGS). The proposed ABC tuned the parameters of a DNN with cascaded autoencoder layers. The classification accuracy using WBCD is 73.03%. The step size of the neighborhood search is kept static throughout the entire search process, affecting convergence. Garro presented an optimized classification of DNA microarrays using ABC [36]. The optimal feature subsets from breast cancer datasets are selected using ABC and then given to an MLP, a radial basis function (RBF) neural network and a support vector machine (SVM). The accuracy attained is 94.7% for the MLP, 89.5% for the SVM and 73.7% for the RBF network. The algorithm used the standard ABC without any improvement. Palanisamy and Kanmani proposed ABC-based FS for UCI datasets [37]. The system chooses 2 features from the 9 attributes of WBCD and yielded an accuracy of 96.69%. The system is simple but used only the standard ABC.

Optimal FS using ABC for UCI repository datasets has also been proposed [38]. The employee bee phase is modified using a modification rate, where a feature is selected if a random number is greater than the modification rate. The classification accuracy is 75.87%. The algorithm did not focus on the explorative phase and used only randomized initial solutions, with a loss of diversification. Two hybrid algorithms based on ABC and PSO have been proposed [39]. In the first algorithm, the employee bee phase is hybridized with PSO to find new velocity position updates. In the second algorithm, the onlooker and scout bee phases are improved using mutations from the genetic algorithm. The algorithms achieved a highest accuracy of 99.14% with an optimal selection of 13 features using WBCD, but are used only for FS, with no parameter optimization. A hybrid algorithm for FS using the branch and bound approach and ABC has been proposed [40]. The algorithm first applies branch and bound to find a first set of features, then applies ABC to identify a second set, and a union operation forms the new set of optimal features. The algorithm did not address classification.

Schiezaro and Pedrini used optimal FS based on the whale bubble net hunting strategy for UCI repository datasets [41]. The algorithm handles exploitation using the bubble net attacking method, and a global search is carried out by the search-for-prey phase. During the evaluation, the SVM classifier attained an accuracy of 98.77%, precision of 99.15%, recall of 98.64% and F-score of 98.9%. This algorithm is used only for FS, with no parameter optimization. J. Jona and N. Nagaveni presented optimal FS using an integration of ACO and cuckoo search [42]. The local search behavior of ACO is improved using the exploitation of cuckoo search. The algorithm selected an optimal feature set from the 78 texture features derived using GLCM, with input taken from the MIAS dataset. In this approach, 5 features were selected with 94% accuracy, using the SVM classifier for prediction. This algorithm showed a performance increase of 4% and 2% when compared with PSO and ACO, respectively.

A novel hybrid whale–artificial bee colony optimizer framework was introduced by Siddavaatam and Sedaghat for cross-layer optimization in the Internet of Things (IoT) [93]. An efficient MAC for IoT was designed to minimize energy consumption with an extended network lifetime. The hybrid whale–artificial bee colony optimizer framework is used to obtain the optimal nodes and communication parameters in the IoT, saving computation resources of the resource-constrained IoT devices.

3 Materials and methodologies

A wrapper-based method that eliminates the use of statistical measures such as information gain or F-score is used for implementing the proposed HAW. The proposed architecture is depicted in Fig. 1. The input dataset is a breast cancer dataset, divided into three subsets. The first subset, containing 50% of the samples, is used for training; the next 25% of the samples are used for validation; and the remaining 25% are used for testing. Using the optimal set of input features generated by HAW, the optimal features are selected from the three subsets and the other features are rejected. The optimally selected features of the training set are used for training the underlying ANN classifier. The proposed wrapper architecture was implemented in MATLAB 8.5, with backpropagation training done using the Neural Network Toolbox.

Fig. 1

Proposed wrapper architecture

3.1 Breast cancer datasets used by the proposed wrapper architecture

HAW is evaluated using breast cancer datasets such as the WBCD [87], Wisconsin diagnostic breast cancer dataset (WDBC) [88], Wisconsin prognostic breast cancer dataset (WPBC) [89], DDSM [90], MIAS [91] and INbreast database [92]. The description of the datasets used is given in Table 1.

Table 1 Breast cancer datasets used by the proposed wrapper architecture

The optimal initial weights and hidden node size generated by HAW are used as the initial parameter settings of the ANN. The ANN error is calculated on the validation set. If the validation error increases for six consecutive iterations, the training of the ANN is stopped. The fitness of the trained ANN is calculated using Eqs. (12) and (13). The ANN with the highest fitness (best) is selected and tested using the testing set with the optimal feature subsets. The complexity (number of connections) of the final ANN is calculated using Eq. (11).

$${\text{Connect}} = U*V + V*W + V + W$$
(11)

‘U’ gives the number of input features, ‘V’ indicates the hidden node size, and ‘W’ indicates the number of output nodes. The resulting ANN with the fewest connections guarantees the lowest complexity. The fitness of the ANN is calculated by Eq. (13); a higher value of \(ANN Err\) indicates a low-fitness ANN.

$${\text{ANN Err}} = \left( {\frac{{P_{max} - P_{min} }}{l*w}} \right)\mathop \sum \limits_{i = 1}^{l} \mathop \sum \limits_{j = 1}^{w} \left( {A_{j}^{i} - B_{j}^{i} } \right)^{2}$$
(12)
$${\text{Fitness}} = \frac{1}{{\text{ANN Err}}}$$
(13)

‘l’ and ‘w’ are the numbers of output nodes and validation examples, respectively. Pmax and Pmin are the maximum and minimum actual outputs, respectively. \(B_{j}^{i }\) and \(A_{j}^{i}\) are the target output and actual output, respectively.
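As a concrete reading of Eqs. (11)–(13), the following hedged Python sketch computes the connection count and the fitness; the network sizes and sample outputs are invented for illustration.

```python
import numpy as np

def connections(U, V, W):
    # Eq. 11: weights of both layers plus one bias per hidden and output node
    return U * V + V * W + V + W

def ann_err(actual, target, p_max, p_min):
    # Eq. 12: squared error normalized by the output range and validation size
    l, w = actual.shape                    # l output nodes x w validation examples
    return (p_max - p_min) / (l * w) * np.sum((actual - target) ** 2)

def fitness(err):
    return 1.0 / err                       # Eq. 13: lower error means higher fitness

actual = np.array([[0.9, 0.1], [0.2, 0.8]])    # invented outputs: 2 nodes x 2 examples
target = np.array([[1.0, 0.0], [0.0, 1.0]])
print(connections(U=3, V=4, W=2))              # -> 26 connections
print(fitness(ann_err(actual, target, p_max=1.0, p_min=0.0)))
```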

The initial solution representation is given in Fig. 2. The I bits encode the random initial weights, so that \(2^{I}\) different initial weight settings can be explored. The J bits encode the hidden node size, so that \(2^{J}\) hidden node sizes can be explored. The K bits are the feature bits representing the total features: if a feature is selected, its ‘K’ bit is one; otherwise, it is zero. The number of K bits varies with the total number of features in the dataset.

Fig. 2

Initial solution representation
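A minimal sketch of decoding such an I/J/K bit string follows; the bit widths, the weight range and the mapping to concrete values are assumptions for illustration, since the paper fixes them per dataset.

```python
import numpy as np

I, J, K = 8, 4, 9     # assumed bit widths; K = 9 matches the WBCD's nine input features

def decode(bits, w_min=-1.0, w_max=1.0):
    """Split one I+J+K bit string into (initial-weight value, hidden nodes, feature mask)."""
    assert len(bits) == I + J + K
    w_int = int("".join(map(str, bits[:I])), 2)
    weight = w_min + w_int / (2 ** I - 1) * (w_max - w_min)   # one of 2^I weight settings
    hidden = 1 + int("".join(map(str, bits[I:I + J])), 2)     # one of 2^J hidden sizes
    features = np.flatnonzero(bits[I + J:])                   # indices of selected features
    return weight, hidden, features

bits = np.random.default_rng(2).integers(0, 2, size=I + J + K)
print(decode(bits))
```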

3.2 Detailed description of the proposed HAW algorithm

The HAW optimization algorithm is formed by integrating a mutative initialization phase of ABC optimization with the exploitation phase of the whale optimization technique. The standard ABC is weak in exploration because its poor random search process yields localized initial food sources. Hence, the HAW algorithm uses a mutative exploration phase at its initialization so that the algorithm can explore the entire problem space and find new promising regions. The employee bee phase of the ABC optimization process is integrated with the exploitative phase of WOA such that the employee bees follow the encircling prey/bubble net attacking method of whales to update the positions of the food sources. The best food source found at each iteration is considered the target prey of WOA. HAW involves two stages. In the first stage, a mutative initialization phase using different mutations derives a diversified set of candidate solutions. In the second stage, an employee bee attacking phase is applied, in which the optimum set of solutions derived by the mutative initialization phase forms the initial food source positions, and the bees follow the whales’ method of attacking prey. The simulated annealing technique is used in the employee bee attacking phase to let the algorithm escape from local optima and avoid looping problems. A flowchart representation of the HAW algorithm is shown in Fig. 3. HAW optimization is summarized as follows:

  (i) A mutative initialization phase is proposed to derive a set of diversified solutions and expedite the search in the exploration phase.

  (ii) An employee attacking phase is proposed in which the employee bees adopt the encircling prey/bubble net attacking method of whales for updating the current food source positions during their foraging process. The exploitation of the employee bee attacking phase is guided by the best food source (the prey of the whales) found so far.

  (iii) To escape from suboptimal locations and avoid looping problems, a simulated annealing (SA)-based employee attacking phase is proposed.

  (iv) The onlooker bee phase and scout bee phase follow the standard ABC optimization.

Fig. 3

Flowchart representation of the proposed HAW algorithm

3.2.1 Initialization and fitness calculation

A food source indicates a possible solution of the underlying optimization problem. Each food source is generated using ‘dim’ variables, where ‘dim’ is the dimension of the problem space considered. The initial population is generated through a random distribution of food sources using Eq. (14).

$$A_{k}^{l} = A_{\min }^{l} + random\left( {0,1} \right)*\left( {A_{\max }^{l} - A_{\min }^{l} } \right)$$
(14)

\(A_{k}^{l}\) represents the lth variable of food source \(k\), with k = 1, 2…N, where N represents the maximum number of food sources, and l = 1, 2…dim, where ‘dim’ represents the dimension based on the number of parameters of the underlying optimization problem. \(random\left( {0,1} \right)\) is a random number generated between 0 and 1. \(A_{max }^{l}\) represents the maximum bound of the lth variable of the optimization problem, and \(A_{min}^{l}\) gives the minimum bound. The estimated tight bound of Algorithm (3) is \(\theta\)(\(n^{2}\)), where n is the number of food sources. The algorithm for the initial generation of food sources is given in Algorithm (3).


Algorithm 3: Initialization of HAW


3.2.2 Proposed mutative exploration phase

The mutative exploration phase detects multiple food sources based on their quality. Better food sources are selected from the total population and divided into three subpopulations based on the fitness difference between each food source and the best food source in the population, with the help of three threshold values: limit1, limit2 and limit3. The three subpopulations are subjected to different mutations, where higher-fitness food sources are mutated less and lower-fitness food sources are mutated more; thus, the amount of mutation is inversely proportional to the fitness of the food source. The best food sources, with fitness values close to that of the best food source of the total population, are grouped as \(A_{k1}\) food sources. Since the \(A_{k1}\) food sources are the best, a local search is facilitated around them: they are mutated using Gaussian mutations, where Eq. (15) makes small random changes to the \(A_{k1}\) food sources.

$$A_{k1}^{^{\prime}} = A_{k1} + \mu_{g} .G\left( {0,1} \right)$$
(15)

\(A_{k1}^{^{\prime}}\) is the mutated food source generated after the Gaussian mutation; \(\mu_{g}\) is the mutation rate indicating the strength of the Gaussian noise added; \({A}_{k1}\) is the original food source; and \(G\left(\mathrm{0,1}\right)\) is a random number drawn from a Gaussian distribution with zero mean and unit variance.

Food sources with intermediate fitness values are grouped as \(A_{k2}\) food sources, whose fitness shows an intermediate difference from that of the best food source of the total population. Since the \(A_{k2}\) food sources are intermediate, a uniform search process is facilitated around them using uniform mutations, where Eq. (16) generates the uniformly mutated food sources \(A_{k2}^{^{\prime}}\): a randomly chosen variable of a solution is replaced with a uniform random value between the user-defined upper (Ub) and lower (Lb) limits.

$$A_{k2}^{^{\prime}} = Lb + random\left( {0,1} \right)*\left( {Ub - Lb} \right)$$
(16)

The worst food sources, with low fitness values, are grouped as \(A_{k3}\) food sources, whose fitness values differ most from that of the best food source in the population. The \(A_{k3}\) food sources are worse than \(A_{k1}\) and \(A_{k2}\). Hence, a global search is facilitated around the worst food sources, in which the \(A_{k3}\) food sources are mutated using Cauchy mutations via Eq. (17), which are more likely to escape from local optima.

$$A_{k3}^{^{\prime}} = A_{k3} + \mu_{c} .C\left( {0,1} \right)$$
(17)

\(A_{k3}^{^{\prime}}\) is the mutated food source generated after the Cauchy mutation, \(\mu_{c}\) is the Cauchy mutation rate that indicates the strength of the mutation, and \(C\left( {0,1} \right)\) is a random number drawn from a standard Cauchy distribution. The mutated food sources, along with the replaced worst food sources \(A_{x}\), form the new set of food sources \(A_{new}\) eligible for exploitation. The estimated tight bound of Algorithm 4 is \(\theta\)(\(n^{2}\)), where n is the number of food sources mutated in the explorative phase. The mutative exploration phase is described by Algorithm (4).


Algorithm 4: Mutative exploration phase

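Since the pseudocode itself appears as a figure in the original layout, a hedged Python sketch of the phase (Eqs. 15–17) is given below; the thresholds, mutation rates and bounds are illustrative assumptions.

```python
import numpy as np

def mutative_exploration(A, fit, rng, limit1=0.1, limit2=0.3,
                         mu_g=0.05, mu_c=0.5, Lb=-5.0, Ub=5.0):
    """Mutate food sources by fitness group: Gaussian (best), uniform (mid), Cauchy (worst)."""
    gap = fit.max() - fit                    # fitness distance from the best food source
    A_new = A.copy()
    for k in range(len(A)):
        if gap[k] <= limit1:                 # A_k1: small Gaussian step, local search (Eq. 15)
            A_new[k] = A[k] + mu_g * rng.normal(0.0, 1.0, size=A.shape[1])
        elif gap[k] <= limit2:               # A_k2: one variable redrawn uniformly (Eq. 16)
            l = rng.integers(A.shape[1])
            A_new[k, l] = Lb + rng.random() * (Ub - Lb)
        else:                                # A_k3: heavy-tailed Cauchy jump, global search (Eq. 17)
            A_new[k] = A[k] + mu_c * rng.standard_cauchy(size=A.shape[1])
    return np.clip(A_new, Lb, Ub)
```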

3.2.3 Proposed employee bee attacking phase

The food sources generated by the mutative exploration phase, along with the replaced randomly generated food sources, form the initial food source positions of the exploitation phase. Each food source is assigned an employee bee, which searches for a better food source around the current one in the same way a whale searches for prey. An employee bee follows a circular path while searching for a neighborhood food source, with the best food source found so far as the center of the circle. This exploitative mechanism keeps the search process guided by the best optimal locations. The step size β of the search on the circular path is kept high in the initial iterations to facilitate exploration, and is gradually decreased in later iterations to facilitate exploitation. This dynamic step size β lets the employee bees search the entire problem space, reaching remote locations that were not covered in the exploration phase, and eliminates the problems of oscillation and local optima on the circular path. A random number ran is generated using a uniform distribution and checked against the control variables \(C_{1}\) and \(C_{2}\). If ran is less than \(C_{1}\), the food source position is found using Eq. (18).

$$N_{k}^{l} = A_{k}^{l} + \beta *\left( {A_{k}^{l} - A_{d}^{l} } \right)$$
(18)

where \(\beta\) = random (− 1,1) is the step size that is dynamically varied across the iterations, \(N_{k}^{l}\) is the new food source, \(A_{k}^{l}\) is the current food source, and \(A_{d}^{l}\) is a randomly selected food source.

Otherwise, if the randomly produced number is less than \(C_{2}\), a probability check is done using \(P_{i}\): if \(P_{i} < 0.5\), the new neighborhood position is found by searching around a circular path, keeping the best food source found so far, \(A_{k}^{*}\), as the center of the circle, using Eq. (19).

$$N_{k}^{l} = \vec{A}_{k}^{*} - \vec{M}*Y$$
(19)

M represents the coefficient vector obtained using Eqs. (6) and (8). Y is the distance between the current food source and the center of the circle (the best food source), as represented in Eq. (20).

$$Y= \left| {A_{k}^{*} - { }A_{k}^{l} } \right|$$
(20)

If instead \(P_{i} \ge 0.5\), the new neighborhood position is found by searching along a logarithmic spiral path, keeping the best food source \(A_{k}^{*}\) found so far as its center, using Eq. (21):

$$N_{k}^{l} = Y*e^{aw} *\cos \left( {2\pi w} \right) + A_{k}^{*}$$
(21)

where a is a constant that represents the shape of the logarithmic spiral and w is a random number generated between [− 1,1].

If the fitness of \(N_{k}^{l}\) (the neighborhood food source) is less than the fitness of \(A_{k}^{l}\) (the current food source), the employee bee attacking phase may still accept \(N_{k}^{l}\), allowing downhill movements so that the search can escape from local optima in the fitness landscape. The acceptance of worse food sources is based on the probability value \({\text{e}}^{{\frac{{ - \Delta E}}{T}}}\), where \(\Delta E\) represents the fitness difference between the current and the candidate food sources. This is done using a simulated annealing-based selection mechanism in which both better and worse solutions can be accepted in the employee bee attacking phase. Acceptance or rejection of worse solutions is based on a controlling parameter called the simulated annealing temperature, and the probability of accepting worse solutions decreases as the iterations proceed and the temperature decreases. At the initial stages, the temperature ‘T’ is set to a high value, so \(\frac{ - \Delta E}{T}\) tends toward 0, making the probability \({\text{e}}^{{\frac{ - \Delta E}{T}}}\) tend toward 1 and allowing the acceptance of worse solutions. As the iterations grow and ‘T’ decreases, \(\frac{ - \Delta E}{T}\) becomes large and negative, making the probability \({\text{e}}^{{\frac{ - \Delta E}{T}}}\) tend toward 0, so that only better solutions are accepted. This lets the HAW algorithm allow both uphill and downhill movements in the fitness landscape, eliminating the problem of getting stuck at local optima. As the iterations grow, the simulated annealing temperature ‘T’ is cooled down using Eq. (22).

$$T\left( {t + 1} \right) = \emptyset *T\left( t \right)$$
(22)

where \(T\left( {t + 1} \right)\) is the new temperature and \(T\left( t \right)\) is the temperature of the previous iteration. ‘\(\emptyset\)’ is the simulated annealing constant, which is set close to 1. The estimated tight bound of Algorithm 5 is \(\theta\)(\(n^{2}\)), where n is the number of employee bees assigned to the food sources. The employee bee attacking phase is described by Algorithm (5).

Algorithm 5: Proposed employee bee attacking phase


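The following hedged Python sketch ties Eqs. (18)–(22) together with the SA acceptance test; the control probabilities C1/C2, the cooling constant and the simplified coefficient vector are assumptions for illustration (the paper decays M via Eq. (8)).

```python
import numpy as np

def employee_attacking(A, fit, best, fitness, T, rng,
                       C1=0.3, C2=0.9, a=1.0, phi=0.95):
    """One pass of the employee bee attacking phase with SA acceptance."""
    n, dim = A.shape
    for k in range(n):
        ran, Pi = rng.random(), rng.random()
        if ran < C1:                                  # Eq. 18: ABC-style neighbour move
            beta = rng.uniform(-1.0, 1.0)             # dynamic step size
            d = rng.choice([i for i in range(n) if i != k])
            N = A[k] + beta * (A[k] - A[d])
        elif ran < C2 and Pi < 0.5:                   # Eqs. 19-20: shrinking encircling
            M = 2.0 * rng.random(dim) - 1.0           # simplified coefficient vector
            N = best - M * np.abs(best - A[k])
        else:                                         # Eq. 21: spiral around the best source
            w = rng.uniform(-1.0, 1.0)
            N = np.abs(best - A[k]) * np.exp(a * w) * np.cos(2 * np.pi * w) + best
        fN = fitness(N)
        dE = fit[k] - fN                              # positive when N is worse
        if fN > fit[k] or rng.random() < np.exp(-dE / T):
            A[k], fit[k] = N, fN                      # accept better, or worse with prob e^(-dE/T)
    return A, fit, phi * T                            # Eq. 22: cool the temperature

# Toy usage on a 4-D sphere-like fitness
rng = np.random.default_rng(3)
A = rng.uniform(-5, 5, size=(8, 4))
f = lambda x: 1.0 / (1.0 + np.dot(x, x))
fit = np.array([f(x) for x in A])
A, fit, T = employee_attacking(A, fit, A[fit.argmax()].copy(), f, T=1.0, rng=rng)
```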

3.2.4 Onlooker bee phase

The information about the newly generated food sources \(N_{k}^{l}\) is shared with the onlooker bees. Each onlooker bee then computes the probability value \(Z_{k}\) of every food source received from the employee bee attacking phase using Eq. (3). The estimated tight bound of Algorithm 6 is \(\theta\)(\(n\)), where \(n\) is the number of food sources shared by the employee bees.

Algorithm 6: Onlooker bee phase



3.2.5 Scout bee phase

The abandoned food sources are replaced and new food sources are introduced by the scout bees. In each iteration, if a food source is not improved, the counter associated with it is incremented; if the food source does not improve for a certain number of iterations and its counter crosses the threshold limit value, it is replaced by the scout bees through the random generation process of Eq. (1). The estimated tight bound of Algorithm 7 is \(\theta\)(\(n\)), where \(n\) is the number of unimproved solutions. The detailed description of the scout bee phase is given in Algorithm (7).

Algorithm 7: Scout bee phase


4 Performance evaluation of the proposed HAW optimization

The HAW algorithm is used to generate the optimal feature subsets, initial weights and hidden node size of an ANN, which is trained using three different backpropagation-based learning mechanisms: resilient backpropagation (RP), Levenberg–Marquardt backpropagation (LM) and momentum-based gradient descent backpropagation (GD). The HAW performance is analyzed to find the backpropagation variant that achieves the least ANN error with low complexity using WBCD. The best backpropagation variant is then used for training the ANN when tested on the remaining datasets. The HAW-optimized ANN is analyzed over ten independent runs with various generation sizes, and the mean accuracy, mean connections and mean computational time are calculated.

4.1 Parameter settings and experimental setup using WBCD

The implementation of backpropagation uses the default training parameters. A winner-takes-all approach at the output nodes is used for classification. The parameter settings of the ANN classifier are shown in Table 2. HAW utilized the cancer1 dataset [88], which contains preprocessed WBCD samples. Attribute values are rescaled to between 0 and 1, and missing attribute values are filled using the mean values of the non-missing attributes. The training dataset is formed from the first 349 samples (50% of the total dataset), the next 175 samples (25%) are used as the validation dataset, and the last 175 samples (the remaining 25%) are used as the testing dataset.
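A minimal sketch of this 349/175/175 division follows; the file name and loading routine are assumptions, as the paper performs this step in MATLAB.

```python
import numpy as np

data = np.loadtxt("cancer1.dt")            # hypothetical path to the preprocessed WBCD samples
train, val, test = data[:349], data[349:524], data[524:699]
print(len(train), len(val), len(test))     # 349, 175, 175 -> 50%/25%/25% of 699 samples
```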

Table 2 Parameter settings of ANN using WBCD dataset

The HAW parameter settings are shown in Table 3. The values of the thresholds limit1 and limit2 are set using the trial-and-error method. The counter limit is set to 10 to enhance global exploration. The numbers of employee bees and onlooker bees are set according to the size of the solution space. The Gaussian, uniform and Cauchy mutation rates were set by trial and error to enhance the local, uniform and global search processes, respectively. The updating probability and the attacking probability are set by trial and error for different generation sizes so as to avoid convergence at local optima.

Table 3 Parameter settings of HAW using WBCD dataset

4.2 Evaluation of the proposed HAW using WBCD with respect to accuracy, complexity and computational time

HAW is evaluated for complexity, in terms of the number of connections, and accuracy over ten runs for generation sizes 10, 20 and 30, as given in Table 4. HAW-RP achieved the highest mean accuracy of 99.25% at generation size 20. The mean number of connections was 10.40 for HAW-RP, which is low compared to HAW-LM and HAW-GD; thus, HAW obtained its least complex ANN when trained with RP. The validation error convergence of the optimized ANN for RP, LM and GD over various generations is depicted in Fig. 4.

Table 4 Performance of the proposed HAW algorithm for various backpropagations
Fig. 4

Convergence of validation error for HAW

After HAW-RP, HAW-GD was 98.57% accurate, with an average of 14.21 connections at generation size 30. Next, HAW-LM achieved an accuracy of 98.22% at generation size 30, with an average of 15.46 connections. The classification accuracy of the optimized ANN for RP, LM and GD over various generations is depicted in Fig. 5. The accuracy of HAW-RP was 1.05% higher than HAW-LM and 0.69% higher than HAW-GD.

Fig. 5

Performance of HAW with respect to classification accuracy

The confusion matrices, based on the true positives, true negatives, false positives and false negatives of HAW-RP, HAW-LM and HAW-GD for the best network achieved, are shown in Table 5. Figure 6 gives the complexity of the resulting ANN in terms of the number of connections for various generation sizes. HAW-RP achieved the lowest complexity with the fewest connections, followed by HAW-GD and HAW-LM. The mean hidden node count of HAW-RP was 22.5% less than HAW-LM and 20.78% less than HAW-GD, and HAW-RP produced ANNs with 28.08% fewer connections than HAW-LM and 26.81% fewer than HAW-GD.

Table 5 Confusion Matrix of the HAW-optimized ANN across ten runs
Fig. 6

Performance of HAW with respect to the number of connections

The performance metrics of the optimized ANN generated using HAW-RP, HAW-LM and HAW-GD are shown in Table 6. HAW-RP showed higher sensitivity and specificity, followed by HAW-GD and HAW-LM. Figure 7 shows the hidden node count for HAW-RP, HAW-LM and HAW-GD over generation sizes.

Table 6 Performance of HAW-optimized ANN based on different metrics
Fig. 7

Evolution of hidden node count with FS for HAW across different generations

Table 7 shows the effect of FS on the classification performance of the HAW-optimized ANN: with FS, the HAW-optimized ANN improved prediction accuracy with less complexity compared to its performance without FS.

Table 7 Performance of HAW-RP based on FS

Table 8 shows the confusion matrix for the best network achieved using HAW-RP, with selected features such as uniformity of cell size, single epithelial cell size and bare nuclei.

Table 8 Confusion Matrix of HAW-RP-optimized best ANN with features selected

The evaluation of the HAW-optimized ANN with respect to computational time is given in Table 9. The mean computational time over 10 runs was estimated for generation sizes 10, 20 and 30. The computational time of the ANN optimized using HAW-LM was low in comparison with that of HAW-RP and HAW-GD.

Table 9 Performance-based on average computational time

Figure 8 compares HAW with PSO, DE, ABC, BA, ACO, BFO, DA, GA, AIS, MBO, WOA and SSA using the WBCD. HAW attained the highest accuracy, exceeding the above-mentioned algorithms by 9.67%, 7.53%, 10.89%, 12.27%, 12.78%, 8%, 10.03%, 0.97%, 8.83%, 9.91%, 5.03% and 3.39%, respectively. The time complexity of the proposed ANN architecture is O(n ∗ t ∗ (ij + jk)), where n is the number of epochs, t is the number of training examples, and i, j and k are the numbers of nodes in the first, second and third layers of the ANN architecture.

Fig. 8

Comparison between evolutionary methods using WBCD dataset

Figure 9 compares HAW with existing ABC-based hybrid algorithms using WBCD, including ABC-ACO [31], ABC-DE [32], ABC-DA [97] and the ABC-gradient decision tree [34]. The accuracy of HAW was 0.07%, 1.58%, 2.36% and 2.02% higher, respectively, than the above-mentioned algorithms.

Table 10 gives a comparison of the HAW-optimized classifier with existing breast cancer diagnosis schemes using WBCD datasets, showing that the HAW-optimized ANN classifier outperformed the existing schemes.

Table 10 Comparison with existing breast cancer diagnosis schemes

4.3 Performance evaluation of HAW using different breast cancer datasets

HAW-RP yielded the best ANNs, with high accuracy and low complexity. Hence, HAW-RP was tested on the rest of the breast cancer datasets taken for investigation. Table 11 shows the performance of the HAW-RP-optimized ANN for the different breast cancer datasets. The average classification accuracy, F-measure, number of connections and computational time were calculated over ten independent runs for different generation sizes, and the best of the average values was recorded. The HAW-optimized ANN model, namely HAW-RP, showed better performance with a smaller number of connections in comparison with the standard ABC and WOA for the datasets taken for investigation.

Table 11 Performance of HAW-RP-optimized ANN for different datasets

Figures 9 and 10 show the graphical representation of the performance of the standard ABC- and WOA-optimized ANN models and the HAW-optimized ANN model for the various breast cancer datasets taken for investigation, with respect to classification accuracy and the average number of connections. The HAW-optimized ANN model showed higher accuracy and lower complexity in comparison with the standard ABC- and WOA-optimized ANN models. Table 12 shows the confusion matrix, with the most frequently selected feature set, for the various datasets used for the investigation of the HAW-RP-based ANN after ten independent runs.

Fig. 9

Comparison of ABC-based hybrid algorithms using WBCD datasets

Fig. 10

Comparison with other evolutionary methods using WDBC and WPBC dataset

Table 12 Confusion matrix after ten runs for various datasets

Table 13 shows the performance comparison of the HAW-optimized ANN model, in terms of average classification accuracy over ten independent runs, with various classifiers on the different datasets. The HAW-ANN model showed higher accuracy than the existing classifiers for all the datasets taken for investigation.

Table 13 Average classification accuracy for various classifiers

4.4 Comparison with existing approaches using WDBC and WPBC

Figure 10 compares HAW with PSO, DE, ABC, BA, ACO, BFO, DA, GA, AIS, MBO, WOA and SSA using the WDBC. Each algorithm was executed for 10 runs with generation sizes of 10, 20 and 30, and the best result was taken for comparison. The HAW-based ANN attained the highest accuracy, exceeding the above-mentioned algorithms by 7.65%, 6.83%, 12.96%, 17.12%, 8.84%, 5.69%, 9.2%, 7.42%, 12.96%, 11.3%, 7.65% and 5.35%, respectively.

Table 14 gives a comparison of the HAW-optimized classifier with existing breast cancer diagnosis schemes using the WDBC and WPBC datasets, showing that the proposed HAW-optimized ANN classifier outperformed the existing schemes.

Table 14 Comparison with existing breast cancer schemes for WDBC and WPBC dataset

Figure 10 also compares HAW with PSO, DE, ABC, BA, ACO, BFO, DA, GA, AIS, MBO, WOA and SSA using the WPBC. Each algorithm was executed for 10 runs with generation sizes of 10, 20 and 30, and the best result was taken for comparison. The HAW-based ANN attained the highest accuracy, exceeding the above-mentioned algorithms by 7.6%, 6.06%, 11.33%, 13.03%, 6.41%, 9.31%, 10.44%, 5.02%, 10.44%, 7.6%, 2.99% and 2.67%, respectively.

4.5 Comparison with existing approaches using DDSM, MIAS and INbreast databases

Figure 11 shows the comparison of HAW with evolutionary algorithms such as PSO, DE, ABC, BA, ACO, BFO, DA, GA, AIS, MBO, WOA and SSA. Each algorithm was executed for 10 runs with generation sizes of 10, 20 and 30, and the best result was taken for comparison. Using the DDSM database, the HAW-based ANN achieved accuracy 7.98%, 7.16%, 13.3%, 17.48%, 9.17%, 6.01%, 9.53%, 7.74%, 13.3%, 11.64%, 6.81% and 6.35% higher than the above-mentioned algorithms, respectively. Using the MIAS database, the HAW-based ANN achieved the highest accuracy, 10.53%, 10.28%, 10.4%, 8.2%, 8.22%, 12.67%, 10.28%, 7.17%, 9.42%, 10.28%, 8.22% and 7.87% higher than the above-mentioned algorithms, respectively. Using the INbreast database, the HAW-based ANN achieved the highest accuracy, 7.6%, 9.14%, 14.83%, 9.26%, 5.99%, 11.47%, 8.66%, 5.99%, 8.07%, 12.36%, 10.97% and 10.11% higher than the above-mentioned algorithms, respectively.

Fig. 11

Comparison with other evolutionary methods using DDSM, MIAS and INbreast

Table 15 gives a comparison of the HAW-optimized classifier with existing breast cancer diagnosis schemes using the DDSM, MIAS and INbreast datasets, showing that the HAW-optimized ANN classifier outperformed the existing schemes.

Table 15 Comparison with existing breast cancer diagnosis schemes for DDSM, MIAS and INbreast datasets
Table 16 Comparison with existing breast cancer diagnosis schemes

4.6 Comparison with existing approaches that do not involve a complex tuning process for breast cancer diagnosis

Table 16 gives a comparison of the HAW-ANN scheme with existing breast cancer diagnosis schemes that involve either FS or parameter tuning alone. The table shows that performing both FS and parameter optimization of the ANN in parallel is vital: HAW-ANN outperformed existing breast cancer diagnosis schemes that used either FS or parameter tuning of the classifier on its own.

5 Discussions

The advantage of the proposed HAW lies in deriving an optimal ANN through the optimal search of the hidden node size, the initial weight values and the input features using a simple wrapper approach. ANN uncertainty arises from improper selection of the best set of model parameters, and also from the inherent variability present in the ANN input. If the inputs and the other parameters of the model are uncertain, the output of the model is uncertain and inaccurate. This problem of uncertainty in the ANN architecture used in this paper has been addressed by choosing the best parameters of the model using the proposed HAW algorithm, which has also selected the optimal input features from the different datasets that act as the ANN inputs. Uncertainty in the input data has further been reduced by appropriate data division and data processing methods such as normalization and data-filling procedures. However, a detailed uncertainty analysis can be taken up as future work for this research.

The main objective of this research work is to propose a hybrid optimization algorithm to select the optimal feature set to be processed by the neural network, in order to reduce computational time and complexity without sacrificing performance. The optimal feature sets obtained were only 25% of the original set of features; hence, this research work has used a feedforward neural network, considering the amount of data to be processed and to avoid overfitting. The proposed ANN architecture has not been tested on high-dimensional datasets; this could be done as future work, where deep learning models or other decision-making models can be used when the set of features processed by the model is large. For the datasets taken for evaluation, the proposed ANN architecture achieved accurate results.

6 Conclusions

This paper proposed a hybrid algorithm integrating the standard ABC with WOA for simultaneous feature subset selection and parameter optimization of an ANN for breast cancer diagnosis. Performance evaluation of HAW using the backpropagation variants RP, LM and GD was done on different breast cancer datasets. Among the variants, RP achieved the best prediction accuracy with a low-complexity ANN. HAW-RP achieved the highest mean accuracy of 99.25% on the WBCD dataset, with a mean of 10.40 connections, which is low compared to HAW-LM and HAW-GD. For the WDBC, WPBC, DDSM, MIAS and INbreast databases, the HAW-optimized architecture achieved accuracies of 98.5%, 96.3%, 98.8%, 98.7% and 99.1%, respectively. From the results of this study, it is concluded that performing both FS and parameter optimization of the ANN is vital. In comparison with existing research, the average accuracy attained by HAW is promising, and in comparison with existing evolutionary algorithms, the HAW algorithm was more accurate and less complex.

Further, the limitations of the HAW-ANN classifier follow from the ‘no free lunch’ theorem, in the sense that a classifier trained on a particular feature set may not carry over to other feature sets. Accordingly, the HAW-optimized classifier trained on the breast cancer datasets has not yet been tested on other medical datasets or on high-dimensional datasets. The proposed wrapper approach of the HAW-ANN scheme is also complex and requires more computational time than existing breast cancer schemes, since it involves the FS and parameter tuning processes in parallel.

In the future, the proposed hybrid HAW algorithm can be used for segmentation of breast regions and other abnormal regions, such as masses and microcalcifications, from digital mammograms, eliminating the background portions. This can be done by using HAW to generate optimal thresholds for segmentation methods such as region growing and multi-level thresholding. HAW can also be used for segmentation of abnormalities occurring in different parts of the body, such as the brain, liver and kidney. It can further be used to select optimal input feature subsets for various existing classifiers on different medical datasets for disease diagnosis, to find optimal parameters for ANN architectures such as CNNs, RNNs and deep belief networks, and to tune parameters for classifiers such as the SVM, RF and decision tree, as well as for optimal selection of fuzzy rules.