Introduction

The wide adoption of the internet, the Web, and smart devices has increased the volume of data and created critical challenges in mining useful information from it (Zhou et al., 2019; Ezugwu et al., 2022). Different data mining methods have been developed to tackle these challenges using several techniques, including clustering, regression, and classification (Abualigah, 2019; Abualigah et al., 2018). Some of these applications are employed in different areas, such as recommendation systems (Schickel-Zuber & Faltings, 2007), text mining (Chen et al., 2020), and computer vision applications (Dhanachandra et al., 2015; Namratha & Prajwala, 2012). Data clustering has received wide attention due to its simplicity: items are collected into groups according to the similarity of their features (Abualigah et al., 2017), by minimizing the distance between comparable items and their cluster centers. There are two common types of data clustering methods, partitioning and hierarchical. Hierarchical approaches face certain drawbacks with large datasets due to their slow implementations and can be considered time-consuming. Therefore, partitioning methods have been adopted for data clustering due to their efficiency with large datasets (Saxena et al., 2017; Xu & Wunsch, 2005). The most common partitioning methods are K-means and fuzzy C-means (FCM). Such methods generate the centers of the groups in a random manner; thus, they face major limitations, for example, convergence to local optima (Jain, 2010; Abualigah & Diabat, 2020).

In this regard, different optimization techniques have been applied to guide these algorithms (Abualigah, 2020; Abualigah & Diabat, 2021), such as particle swarm optimization (PSO) (Eberhart & Kennedy, 1995), the sine-cosine algorithm (SCA) (Mirjalili, 2016), the genetic algorithm (GA) (Holland, 1992), atom search optimization (ASO) (Zhao et al., 2019), the artificial bee colony (ABC) (Karaboga & Basturk, 2007), the salp swarm algorithm (SSA) (Mirjalili et al., 2017), the gravitational search algorithm (GSA) (Rashedi et al., 2009), the cuckoo search (CS) algorithm (Gandomi et al., 2013), the marine predators algorithm (MPA) (Faramarzi et al., 2020), the Aquila Optimizer (Abualigah et al., 2021), and other optimization algorithms (Abualigah & Diabat, 2020; Abualigah et al., 2020, 2022).

The application of these algorithms improves the performance of clustering methods; however, they still have some drawbacks, especially in solving challenging clustering problems (Abualigah et al., 2021, 2020). For instance, some of them cannot effectively explore the search domain in all problems, whereas others have a low exploitation ability (Mukhopadhyay et al., 2015; Suresh et al., 2009). Therefore, several attempts have been made to overcome these limitations by combining optimization algorithms or improving their local search methods, and the results of these attempts showed an excellent ability to enhance many algorithms (Ewees et al., 2017, 2018). For example, Alswaitti et al. (2018) proposed a kernel density-based PSO method for data clustering. To overcome the shortcomings of the traditional PSO, they applied kernel density estimation with a bandwidth estimation technique to solve the problem of premature convergence. They evaluated the improved PSO method on eleven UCI datasets and showed significant performance gains over the traditional PSO. In Abd Elaziz et al. (2019), an automatic data clustering algorithm was proposed using a hybrid of the sine-cosine algorithm (SCA) and ASO. The main goal of the hybrid method is to automatically find the optimal number of centroids that minimizes the compact-separated index; the sine-cosine algorithm enhances the search ability of the atom search optimization algorithm to find the optimal solution. The method was evaluated on different datasets with several performance measures, and the outcomes showed that the hybrid ASOSCA obtained better results than the traditional ASO and SCA and several other optimization methods.

In Zabihi and Nasiri (2018), a new data clustering method was proposed using a modified version of the ABC algorithm. The main idea of the modified version, called history-driven ABC (Hd-ABC), is to enhance the exploitation capability of the traditional ABC algorithm by employing a memory mechanism. It was evaluated on nine UCI datasets and showed superior performance. Zhou et al. (2019) proposed a clustering method using both density peaks clustering and a modified version of the GSA. They evaluated the combined approach on ten datasets and compared it to several existing optimization algorithms and the traditional k-means algorithm; it showed significant performance with a higher level of stability. In Boushaki et al. (2018), a new variant of the CS algorithm was proposed for data clustering. The main idea is to apply a boundary handling strategy and chaotic maps to enhance the global search ability of the CS. The modified CS algorithm was evaluated on six real-life datasets, compared to eight optimization methods, and showed competitive performance.

A hybrid of MPA and PSO for automatic data clustering was proposed by Wang et al. (2020). The global search of the MPA is improved by using the update strategy of PSO, and the hybrid showed better performance than the traditional MPA, the traditional PSO, and other optimization algorithms. Furthermore, various modified optimization algorithms have been applied to data clustering, such as a multi-objective GA with fuzzy c-means (FCM) (Wikaisuksakul, 2014), an enhanced version of the Grey Wolf Optimizer (Tripathi et al., 2018), a new variant of the harmony search algorithm (Talaei et al., 2020), and a modified version of the multi-verse optimizer (Abasi et al., 2020).

In the same context, a new MH algorithm named the Arithmetic Optimization Algorithm (AOA) was developed in Abualigah et al. (2021). This algorithm emulates the behavior of arithmetic operators, namely subtraction, addition, division, and multiplication, which are used to drive exploration and exploitation. Based on these behaviors, AOA has been applied to solve global and engineering optimization problems. However, like other MH techniques, AOA still needs improvements to balance exploration and exploitation during the search for the optimal solution. In addition, the no-free-lunch (NFL) theorem states that no single algorithm can solve all optimization problems with the same performance. This motivated us to present an alternative version of AOA and apply it to real-world applications.

Motivated by the excellent performance of MH algorithms in data clustering, we developed a new clustering method based on a modified AOA in this paper. This modification uses opposition-based learning (OBL) and the Lévy flight (LF) distribution to improve the ability of AOA to converge towards the optimal solution. In general, OBL is applied to enhance the exploration of AOA, and LF to improve its exploitation. These two techniques have proven their effectiveness in several applications by modifying several MH methods (Elaziz et al., 2020; Elaziz & Oliva, 2018; Elaziz & Mirjalili, 2019). For example, OBL was applied to enhance the performance of the sine-cosine algorithm (SCA) in Elaziz et al. (2017). The brainstorm optimization algorithm was improved using OBL and used as a global optimization and feature selection method in Oliva and Elaziz (2020). In Ewees et al. (2018), a modified version of the grasshopper optimization algorithm based on OBL was applied as a global optimization technique and compared with other methods. Moreover, the LF distribution has been used to enhance the performance of several MH techniques, such as an improved PSO used to improve the quality of flexible job shop green scheduling with crane transportation (Zhou & Liao, 2020). In Yan et al. (2017), the LF distribution was combined with PSO and applied to solve the atomic cluster optimization problem. The Salp Swarm Algorithm was improved using LF and applied to several global optimization problems in Zhang and Wang (2020).

Building on these behaviors of OBL and LF, an alternative modified AOA is presented. The developed method starts by setting the initial values of the solutions, then computes each solution's fitness value and finds the best solution. The next step is to update the current solutions using the AOA, OBL, and LF distribution operators. Updating the solutions is repeated until the terminal conditions are reached, and the best solution is returned.

In summary, our main objectives and contributions are:

  • Propose an alternative global optimization and clustering technique according to the enhanced version of AOA.

  • Develop the performance of AOA using the operators of OBL and LF distribution.

  • Apply the developed method to global optimization problems and real-world clustering datasets.

  • Compare the results of the developed method with other MH techniques.

The rest of this paper is organized as follows. Section 2 describes the background of the applied techniques. Section 3 presents the proposed AAOA clustering method and its experimental evaluation compared to other methods. Section 4 shows the evaluation on the 23 benchmark functions. Section 5 gives the conclusion and future directions.

Background

Arithmetic optimization algorithm

The basic steps of the Arithmetic Optimization Algorithm (AOA) (Abualigah et al., 2021) are introduced in this section. In general, AOA is similar to other MH techniques, with two phases named exploration and exploitation. These two phases are emulated using the basic arithmetic operators (i.e., \( -, +, *\), and /).

The first step in AOA is to generate a set of N agents, each representing a solution to the tested problem. These agents form the population X, given as:

$$\begin{aligned} X=\begin{bmatrix} x_{1,1} &{} \cdots &{} x_{1,j} &{} \cdots &{} x_{1,n}\\ x_{2,1} &{} \cdots &{} x_{2,j} &{} \cdots &{} x_{2,n}\\ \vdots &{} &{} \vdots &{} &{} \vdots \\ x_{N,1} &{} \cdots &{} x_{N,j} &{} \cdots &{} x_{N,n}\\ \end{bmatrix} \end{aligned}$$
(1)

The next step is to compute the fitness function for each agent and determine the best of them, \(X_b\). Then, according to the value of the Math Optimizer Accelerated (MOA) function, AOA performs either exploration or exploitation; the value of MOA is updated as:

$$\begin{aligned} MOA(t)=Min_{MOA}+t \times \left( \frac{Max_{MOA}-Min_{MOA}}{M_t} \right) \end{aligned}$$
(2)

In Equation (2), t is the current iteration and \(M_t\) is the total number of iterations. \(Min_{MOA}\) and \(Max_{MOA}\) are the minimum and maximum values of the accelerated function, respectively (Zheng et al., 2022).

In the AOA exploration phase, the division (D) and multiplication (M) operators are used. This process is formulated as:

$$\begin{aligned} X_{i,j}(t+1)=\left\{ \begin{array}{ll} X_{i,j} \div (M_{OP}+\epsilon ) \times ((UB_{j}-LB_{j}) \times \mu + LB_{j} ), &{} r_2<0.5 \\ X_{i,j} \times M_{OP} \times ((UB_{j}-LB_{j}) \times \mu + LB_{j} ), &{} \text {otherwise} \end{array}\right. \end{aligned}$$
(3)

where \(X_{i,j}\) is the jth dimension of the ith solution, \(\epsilon \) refers to a small value that avoids division by zero, \(r_2\) is a random number in [0, 1], and \(UB_{j}\) and \(LB_{j}\) denote the upper and lower boundaries of the search space at the jth dimension, respectively. \(\mu =0.5\) denotes the control parameter, and the Math Optimizer (\(M_{OP}\)) is formulated as:

$$\begin{aligned} M_{OP}(t)=1-\frac{t^{1/\alpha }}{M_t^{1/\alpha }} \end{aligned}$$
(4)

In Equation (4), \(\alpha =5\) denotes the dynamic parameter that determines the precision of exploitation throughout the iterations. Meanwhile, the exploitation phase of AOA is conducted using the subtraction (S) and addition (A) operators (Elaziz et al., 2021). This is achieved using the following formula:

$$\begin{aligned} X_{i,j}(t+1)=\left\{ \begin{array}{ll} X_{i,j} - M_{OP} \times ((UB_{j}-LB_{j}) \times \mu + LB_{j} ), &{} r_3<0.5 \\ X_{i,j} + M_{OP} \times ((UB_{j}-LB_{j}) \times \mu + LB_{j} ), &{} \text {otherwise} \end{array}\right. \end{aligned}$$
(5)

where \(r_3\) is a random number generated in [0,1]. The agents are then updated iteratively using these operators. The steps of the AOA are given in Algorithm 1.

Algorithm 1: Pseudocode of the original AOA
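To make the above equations concrete, the following is a minimal Python sketch of one AOA iteration over the whole population. It assumes the bounds are given as length-D arrays; the switch condition (exploration when \(r_1 > MOA\)) and the default values \(Min_{MOA}=0.2\) and \(Max_{MOA}=1\) follow the original AOA paper, and the remaining names are illustrative.

```python
import numpy as np

def aoa_step(X, t, M_t, lb, ub, mu=0.5, alpha=5, eps=1e-12,
             min_moa=0.2, max_moa=1.0):
    """One AOA update of the population X (shape N x D).

    lb and ub are length-D arrays of lower/upper bounds. MOA (Eq. 2)
    switches between exploration (Eq. 3) and exploitation (Eq. 5),
    and MOP (Eq. 4) scales the step size over the iterations.
    """
    N, D = X.shape
    moa = min_moa + t * (max_moa - min_moa) / M_t               # Equation (2)
    mop = 1.0 - (t ** (1.0 / alpha)) / (M_t ** (1.0 / alpha))   # Equation (4)
    step = (ub - lb) * mu + lb                      # shared term in Eqs. (3) and (5)

    X_new = X.copy()
    for i in range(N):
        for j in range(D):
            r1, r2, r3 = np.random.rand(3)
            if r1 > moa:                # exploration: division/multiplication, Eq. (3)
                if r2 < 0.5:
                    X_new[i, j] = X[i, j] / (mop + eps) * step[j]
                else:
                    X_new[i, j] = X[i, j] * mop * step[j]
            else:                       # exploitation: subtraction/addition, Eq. (5)
                if r3 < 0.5:
                    X_new[i, j] = X[i, j] - mop * step[j]
                else:
                    X_new[i, j] = X[i, j] + mop * step[j]
    return np.clip(X_new, lb, ub)       # keep agents inside the search domain
```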

Lévy flight distribution

Lévy flight is one of the most popular distribution approaches following a non-Gaussian distribution (Houssein et al., 2020; Chegini et al., 2018). Equation (6) is used to update the agents inside the population according to the following formula.

$$\begin{aligned}&x(t+1) = x(t) \times Levy(Dim) \end{aligned}$$
(6)
$$\begin{aligned}&Levy(Dim)=s \times \frac{u \times \sigma }{|\upsilon |^{\frac{1}{\beta }}} \end{aligned}$$
(7)

In Equation (7), \(s=0.01\) denotes a constant value, and u and \(\upsilon \) denote random numbers in [0, 1]. \(\sigma \) is given by the following formula.

$$\begin{aligned} \sigma =\left( \frac{\Gamma (1+\beta ) \times \sin (\frac{\pi \beta }{2} )}{\Gamma (\frac{1+\beta }{2}) \times \beta \times 2^{(\frac{\beta -1}{2})}} \right) ^{1/\beta } \end{aligned}$$
(8)

where \(\sin \) denotes the sine function, and \(\beta \) is a constant fixed to 1.5.
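As a sketch, Equations (6)-(8) can be implemented in a few lines of Python. Following Mantegna's algorithm, on which this formulation is based, u and \(\upsilon \) are drawn here from normal distributions; the Gaussian draws are an assumption of that algorithm rather than a detail spelled out in the text.

```python
import math
import numpy as np

def levy(dim, beta=1.5, s=0.01):
    """Lévy step vector of length `dim`, Equations (7) and (8)."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)                                    # Equation (8)
    u = np.random.randn(dim)       # Mantegna's algorithm: u ~ N(0, 1)
    v = np.random.randn(dim)       # and v ~ N(0, 1)
    return s * (u * sigma) / np.abs(v) ** (1 / beta)            # Equation (7)

def levy_update(x):
    """Equation (6): perturb a solution x with a Lévy-distributed step."""
    return x * levy(x.size)
```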

Opposition-based learning (OBL)

The OBL strategy was proposed by Tizhoosh (2005) as a machine intelligence method. It has been used in many applications as an efficient search mechanism to enhance several optimization methods (Ewees et al., 2018). OBL creates a new, opposite solution from the current one to improve coverage of the search space.

In the OBL method, the opposite value (\(X^O\)) of a real value X \(\in \) [LB,UB] can be calculated using Equation (9).

$$\begin{aligned} X^O=UB+LB-X \end{aligned}$$
(9)

Opposite point (Ewees et al., 2018): let X = (\(X_1\), \(X_2\), ..., \(X_D\)) be a point in the D-dimensional search space with \(X_j \in [LB_j,UB_j]\), j \(\in \) 1, 2, ..., D. Its opposite is computed element-wise using Equation (10).

$$\begin{aligned} {X^O_j}=UB_j+LB_j-X_j, \quad \text {where} \quad j=1,\dots ,D. \end{aligned}$$
(10)

Furthermore, in the optimization task, the two solutions (\(X^O\) and X) are evaluated using the fitness function; the better one is retained and the other is discarded.
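The mechanism is small enough to state directly in code. A minimal sketch follows, assuming minimization and a caller-supplied fitness function:

```python
import numpy as np

def opposite(x, lb, ub):
    """Equation (10): element-wise opposite of solution x in [lb, ub]."""
    return ub + lb - x

def obl_select(x, lb, ub, fitness):
    """Evaluate a solution and its opposite; keep the fitter one
    (minimization is assumed)."""
    x_op = opposite(x, lb, ub)
    return x if fitness(x) <= fitness(x_op) else x_op
```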

The proposed AAOA

The general framework of the developed method, named AAOA, is given in Fig. 1. AAOA aims to enhance the ability of the AOA to balance exploration and exploitation while searching for the optimal solution. To achieve this aim, the OBL approach and the LF distribution are combined with the operators of the traditional AOA. Each of them performs a specific task: OBL enhances the exploration ability of AOA to discover unexplored regions of the search space, while LF improves the convergence rate towards the optimal solution and helps avoid attraction to local optima. This integration of AOA, OBL, and LF significantly enhances the performance of AOA.

Fig. 1
figure 1

Framework of developed AAOA method

The proposed AAOA algorithm begins by randomly setting the initial values of the N agents (X) using the following formula.

$$\begin{aligned}&X_{ij}=rand\times (UB-LB)+LB, \nonumber \\&\quad i=1,2,...,N, j=1,2,...,D \end{aligned}$$
(11)

In Equation (11), UB and LB are the upper and lower boundaries of the search domain, respectively, and D denotes the dimension of each agent \(X_i\). The next step computes the fitness value of each agent and identifies the best of them, \(X_b\). The agents X are then updated using the combination of AOA, OBL, and LF. This is controlled by a random factor \(R_f\in [0,1]\) that switches between the operators of AOA on one side and the competition of OBL and LF on the other. If \(R_f <0.5\), the operators of AOA are used to update the current solutions; otherwise, either OBL or LF is used, and each of these two techniques has a 50% chance of being applied. This process can be formulated as:

$$\begin{aligned} X_{i}(t+1)=\left\{ \begin{array}{ll} \text {update using the AOA operators as in Eqs. (3)-(5)}, &{} R_f<0.5 \\ \left\{ \begin{array}{ll} \text {apply OBL as in Equation (10)}, &{} rand<0.5 \\ \text {apply LF as in Equation (6)}, &{} \text {otherwise} \end{array}\right. &{} \text {otherwise} \end{array}\right. \end{aligned}$$
(12)

After that, the terminal conditions are checked, and if they are not satisfied, then the updating process is repeated. Otherwise, the best solution \(X_b\) is returned as an output of the developed method. The steps of AAOA are illustrated in Algorithm 2.

Table 1 Unimodal benchmark functions
Table 2 Multimodal benchmark functions
Algorithm 2: Pseudocode of the proposed AAOA

To summarize, the proposed AAOA begins with the generation of a random set of solutions. During the optimization phase, the AAOA's search operators look for promising placements of the current best solution, and the solutions advance from one iteration to the next. As described in Algorithm 2, the AAOA employs the Arithmetic Optimization Algorithm, the Lévy flight distribution, and opposition-based learning; each iteration updates and enhances the prospective solutions using these search approaches according to probability conditions. The three search methods avoid local optima by generating highly distributed solutions, and the mutual processes between them help keep the balance between exploration and exploitation.
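The following Python sketch ties the pieces together into one AAOA loop, reusing the aoa_step, opposite, and levy_update sketches from the background section. The greedy keep-if-better replacement is an assumption made here for concreteness; Algorithm 2 is the authoritative description.

```python
import numpy as np

def aaoa(fitness, lb, ub, dim, n_agents=30, max_iter=500):
    """Sketch of the AAOA main loop (Equations (11) and (12));
    lb and ub are length-dim arrays."""
    X = np.random.rand(n_agents, dim) * (ub - lb) + lb          # Equation (11)
    fit = np.apply_along_axis(fitness, 1, X)

    for t in range(1, max_iter + 1):
        for i in range(n_agents):
            if np.random.rand() < 0.5:                  # R_f branch: AOA operators
                cand = aoa_step(X[i:i + 1], t, max_iter, lb, ub)[0]
            elif np.random.rand() < 0.5:                # 50% of the remainder: OBL
                cand = opposite(X[i], lb, ub)
            else:                                       # otherwise: Lévy flight
                cand = np.clip(levy_update(X[i]), lb, ub)
            f_cand = fitness(cand)
            if f_cand < fit[i]:                         # greedy replacement (assumed)
                X[i], fit[i] = cand, f_cand

    best = int(np.argmin(fit))
    return X[best], fit[best]        # best solution X_b and its fitness
```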

Performance evaluation using 23 benchmark functions

In this section, the performance of the developed method is assessed using a set of classical benchmark functions.

Benchmark description

The mathematical formulations and classifications of the 23 employed mathematical functions are presented in Tables 1, 2, and 3. The benchmark functions in Table 1 are unimodal; this set is used to evaluate the exploitation ability of the proposed optimizer, as each of these functions has only one optimal solution. The functions listed in Table 2 have several peaks, some local optima, and only one global optimum; thus, they are well suited to evaluating the optimization algorithm's exploration. Finally, for examining the balance between exploration and exploitation, the fixed-dimension multimodal benchmark functions of Table 3 are considered challenging tasks. The considered dimensions, the defined search space limits, and the global optimum values (\(f_{min}\)) of the functions are reported in the tables.

Table 3 Fixed-dimension multimodal benchmark functions
Table 4 Parameter values for the comparative algorithms

Experiments and results

In this section, the proposed AAOA algorithm's ability is evaluated in two stages: the first handles a set of challenging CEC benchmark functions, and the second focuses on clustering eight UCI benchmark datasets. The description of these datasets is given in Sect. 5.2.1; each dataset has properties and characteristics that pose different challenges. The proposed AAOA algorithm is compared to the original AOA as well as six well-known algorithms, namely particle swarm optimization (PSO) (Eberhart & Kennedy, 1995), the Grey Wolf Optimizer (GWO) (Mirjalili et al., 2014), the sine cosine algorithm (SCA) (Mirjalili, 2016), the Marine Predators Algorithm (MPA) (Faramarzi et al., 2020), the whale optimization algorithm (WOA) (Mirjalili & Lewis, 2016), and the Salp Swarm Algorithm (SSA) (Mirjalili et al., 2017). Four measures are used in the comparisons: the worst, best, average, and standard deviation of the fitness values. Besides, as a statistical test, the Wilcoxon rank-sum test is applied to check whether there are significant differences between AAOA and the other algorithms at p-value \(< 0.05\).

Fig. 2
figure 2figure 2

Qualitative results for the tested 13 problems

Fig. 3
figure 3

Diversity plots between the best and worst solutions on 8 benchmark functions

Fig. 4
figure 4

Execution time of the AOA and the proposed AAOA for 13 benchmark functions

First experiment: global optimization

In this section, the proposed AAOA is assessed using the 23 popular mathematical functions and the CEC2019 suite, which have several specifications. The proposed AAOA is compared with a set of recent state-of-the-art techniques using numerous statistical analyses to appraise and demonstrate the AAOA's efficiency in handling global optimization challenges. The considered algorithms include the basic AOA, Particle Swarm Optimization (PSO) (Eberhart & Kennedy, 1995), the Grey Wolf Optimizer (GWO) (Mirjalili et al., 2014), the Sine Cosine Algorithm (SCA) (Mirjalili, 2016), the Marine Predators Algorithm (MPA) (Faramarzi et al., 2020), the Whale Optimization Algorithm (WOA) (Mirjalili & Lewis, 2016), and the Salp Swarm Algorithm (SSA) (Mirjalili et al., 2017). For the fairness of the experimental results, all algorithms were implemented under the same settings: the population size was set to 30 and the maximum number of iterations to 500, over 30 independent runs. Table 4 summarizes the parameter settings of the counterpart algorithms, taken from the original papers. All the analyses and simulations were implemented on the Windows 10 operating system with an Intel Core i5, 2.2 GHz CPU, and 16 GB of RAM. All competitors were run on the MATLAB 2018 platform to guarantee an unbiased comparison.

Table 5 The results of the comparative methods on 23 benchmark functions (F1-F23), where the dimension is 10
Table 6 The results of the Friedman ranking test for the comparative methods overall 23 benchmark functions, where the dimension is 10

Qualitative analysis

To validate the developed AAOA technique's performance, the convergence and trajectory plots are shown in Fig. 2. This figure depicts the qualitative measures: the 2D plot of each function is drawn in the first column to illustrate the search space's topology; the solution trajectory is shown in the second column; and the average fitness value and convergence curves are exhibited in the third and fourth columns, respectively.

From the second column, representing the trajectory of the solution, it can be observed that the solution has a high magnitude and frequency in the early iterations, which nearly vanish in the last iterations. This illustrates the high exploration ability of AAOA in the early iterations and its good exploitation in the last iterations. Based on this behavior, AAOA has a high chance of reaching the optimal solution. The average fitness value over all solutions across the iterations, depicted in the third column of Fig. 2, reveals the ability of the AAOA to converge to high-quality solutions in fewer iterations: the AAOA starts with a high average fitness value at the beginning, but before iteration 50 the average becomes small. From the fourth column of Fig. 2, it can be noticed that the convergence curves are smooth in most of the studied functions, while the AAOA attains higher-quality solutions than the AOA.

To illustrate the exploration and exploitation abilities of the proposed AAOA variant in comparison with its basic version (AOA), diversity plots between the best and worst solutions obtained by the AAOA and AOA are shown in Fig. 3 for eight different functions. It can be observed that the AAOA maintains the diversity between solutions better than the traditional AOA.
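The text does not formally define the diversity measure plotted in Fig. 3; one simple proxy consistent with the caption is the distance between the current best and worst solutions at each iteration, as sketched below.

```python
import numpy as np

def best_worst_diversity(X, fit):
    """Euclidean distance between the current best and worst solutions;
    an illustrative proxy for the diversity curves of Fig. 3."""
    return np.linalg.norm(X[np.argmin(fit)] - X[np.argmax(fit)])
```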

Fig. 5
figure 5

Convergence behaviour of the comparative methods on the test functions (F1, F4, F7, and F10), where the dimension is 10

Figure 4 shows the execution time of the AOA and the proposed AAOA for 13 benchmark functions. It is clear from this figure that the execution times of the tested methods (i.e., the original AOA and the proposed AAOA) are approximately equal, despite the modifications made to the original method. This reflects the proposed method's ability to achieve better results in roughly the same time as the original method.

Simulations and discussions of 23 benchmark functions

This section uses the worst, best, average, and standard deviation (STD) values to measure the proposed variant's performance. Moreover, the Wilcoxon rank-sum test at a significance level of 0.05 is used as an indicator of a significant difference between the proposed variant and the other counterparts. The Friedman ranking test is applied to indicate the final rank of the proposed AAOA and to demonstrate its ability to handle most of the employed benchmark functions compared with other state-of-the-art counterparts.
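Both tests are available in SciPy; the snippet below shows how such a comparison can be reproduced, using placeholder arrays in place of the per-run best fitness values collected in the experiments.

```python
import numpy as np
from scipy.stats import ranksums, friedmanchisquare

rng = np.random.default_rng(0)
# Placeholder data: best fitness of 30 independent runs per algorithm;
# in the paper these come from the runs on each benchmark function.
aaoa_runs, pso_runs, gwo_runs = rng.random((3, 30))

# Wilcoxon rank-sum test: is AAOA significantly different from PSO?
stat, p_value = ranksums(aaoa_runs, pso_runs)
h = int(p_value < 0.05)            # h = 1 rejects the null hypothesis

# Friedman test ranks all algorithms over the same runs/functions.
f_stat, f_p = friedmanchisquare(aaoa_runs, pso_runs, gwo_runs)
```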

The data in Table 5 compare the results of the AAOA against those of the basic AOA, PSO, GWO, SCA, MPA, WOA, and SSA. The Worst, Best, Average, and STD values obtained by AAOA reveal its ability to defeat all the other algorithms in about 50\(\%\) of the considered 23 benchmarks, as it displayed the smallest values of the computed metrics on functions F1, F2, F3, F4, F7, F9, F10, F11, F12, F20, and F21; moreover, it has comparable performance on the remaining functions. The p-values attained through the Wilcoxon rank-sum test at a significance level of 0.05 confirm the superiority of the AAOA in comparison with the PSO on 17 functions, as the p-value is less than 0.05 and the null hypothesis is therefore rejected (h=1 means there is a significant difference between the considered optimizers, AAOA vs. PSO). The reported p-values in the cases of GWO, SCA, MPA, WOA, and SSA demonstrate the outperformance of the AAOA in handling about 12 of the 23 functions; again, the null hypothesis is rejected (h=1). For further investigation, the Friedman ranking test is applied to determine the proposed AAOA's rank among the other counterparts on the 23 functions; the obtained ranks are reported in Table 6. The average rank values reveal the AAOA's ability to achieve a leading position among the recent state-of-the-art algorithms: the average rank of AAOA is 2.83, which is smaller than those of the other algorithms, so the AAOA occupies the first position in the final ranking, with MPA second at an average rank of 3.3. By observing the reported data in Tables 5 and 6, one can conclude that the AAOA proves its superiority statistically in comparison with a set of recent state-of-the-art algorithms and the basic version of AOA.

Table 7 The results of the comparative methods on 13 benchmark functions (F1-F13), where the dimension is 100
Table 8 The results of the Friedman ranking test for the comparative methods overall 13 benchmark functions (Dimension =100)

The convergence curves of the proposed AAOA are depicted in Fig. 5 versus the AOA and the state-of-the-art techniques to assess the efficiency of the AAOA's central cores (exploration and exploitation). The curves show the smooth convergence of the AAOA, which achieves higher-quality solutions than the PSO, GWO, SCA, WOA, and SSA, all of which suffered from stagnation at local optima.

Scalability analysis

In this section, the performance of AAOA is examined on the thirteen functions of Tables 1 and 2 with a high dimension of 100 to evaluate the stability of the optimizer as the dimension of the handled optimization problems increases. The worst, best, average, and STD values obtained by the proposed variant and the other techniques (AOA, PSO, GWO, SCA, MPA, WOA, and SSA) are reported in Table 7, together with the p-value and the null hypothesis result of the Wilcoxon rank-sum test at a significance level of 0.05 for AAOA versus the other techniques. The reported data reveal the stability and efficiency of the proposed AAOA, as it provides the optimal solutions for six functions (F1, F2, F3, F4, F9, and F11, where \(f_{min} = 0\); see Tables 1 and 2). Furthermore, it has the closest results to the optimal solutions of the other functions (F5, F6, F7, F8, F10) compared with the other algorithms. The reported p-values are less than 0.05 for 85% of the studied functions. Therefore, one can conclude the high stability and superiority of AAOA when dealing with high-dimensional problems.

The Friedman ranking test results in Table 8 emphasize the superiority of the proposed AAOA. The AAOA has the first rank in eleven of the thirteen studied functions; hence, it is finally placed first among the other techniques for solving high-dimensional problems. The MPA occupies the second position, with an average rank close to that of AAOA. Therefore, AAOA outperforms the contemporary state-of-the-art techniques in providing high-quality solutions for high-dimensional problems.

The convergence rate of the modified variant AAOA is the primary aspect studied to assess the influence of integrating the OBL and LF distribution operators when optimizing high-dimensional problems. Therefore, the convergence curves of AAOA versus the other counterparts are displayed in Fig. 6 for the thirteen studied functions. By inspecting the curves, one can observe that the AAOA converges to the optimal solutions within the first few iterations; meanwhile, the PSO, GWO, and SSA become stuck in local solutions on all the studied functions.

Fig. 6
figure 6

Convergence behaviour of the comparative methods on the test functions (F1-F23), where the dimension is 100

Table 9 shows the results of the comparative methods on the 13 benchmark functions (F1-F13) within 1 second of execution, where the dimension is 10. We chose 1 second, which corresponds to approximately 500 iterations. These comparisons are conducted to further validate the proposed AAOA in solving the given problems compared with other methods under the same execution time. It is evident from Table 9 that the proposed AAOA obtained better results than the other comparative methods, which reflects its ability when the execution is bounded by the same time budget. According to the Wilcoxon test, the proposed method exceeds almost all the comparative methods; for example, in solving F1, the proposed AAOA overcame PSO, GWO, SCA, MPA, WOA, and SSA. The statistical analysis confirmed the ability of the proposed AAOA to obtain better results than the other methods. Moreover, according to the Friedman ranking test, the proposed AAOA obtained the first rank, followed by GWO, MPA, AOA, WOA, PSO, SCA, and SSA.

Table 9 The results of the comparative methods on 13 functions (F1-F13) through 1 second, where the dimension is 10

Performance evaluation using CEC2019 benchmark functions

Within this section, a different set of benchmark functions is used to assess the performance of the developed AAOA algorithm. These functions are collected from the challenging functions of CEC2019, and their descriptions are given in Table 10.

Table 10 CEC2019 benchmark functions

In Table 11, the results of the proposed AAOA method are compared with those of other well-known optimization methods (AOA, PSO, GWO, SCA, MPA, WOA, and SSA) using the advanced benchmark functions (CEC2019). These comparisons are conducted to further validate the proposed method (AAOA) in solving various optimization problems. The results clearly show that the performance of the proposed method is better than all other comparative methods: it obtained the first ranking, while the MPA is located in the second rank. Therefore, AAOA is the recommended variant for this benchmark suite among the other comparable optimizers. Moreover, the obtained results illustrate that the modified version can produce new best solutions for several test cases.

Table 11 The results of the comparative methods using advanced CEC2019 benchmark functions
Table 12 The results obtained by the proposed AAOA and best-published results using CEC2019
Fig. 7
figure 7figure 7

Convergence behaviour of the comparative methods on the tested benchmark functions (CEC2019)

Moreover, the proposed AAOA is further evaluated using ten CEC2019 compared to the state-of-the-art methods published in the literature. The comparative methods include Fuzzy Self-Tuning PSO (FST-PSO) (Nobile et al., 2018), improved BA with variable neighborhood (VNBA) (Wang et al., 2016), novel PSO using prey-predator relationship (PP-PSO) (Zhang et al., 2018), Hybrid KHA with differential evolution (DEKH) (Wang et al., 2014), Chaotic CS (CCS) (Wang et al., 2016), and stud krill herd algorithm (SKH) (Wang et al., 2014).

The best, worst, and STD values attained by AAOA and the other counterparts, along with the p-values based on the Wilcoxon rank-sum test at a significance level of 0.05, are illustrated in Table 12. The listed data show the efficiency of the AAOA in handling nine of the ten functions with the minimum statistical metrics (best, worst, and STD), while it takes the second position on the remaining function. The computed p-values confirm the superiority of the proposed AAOA and provide evidence of a significant difference between the optimizers in favor of AAOA. Accordingly, the AAOA has the lowest average rank based on Friedman's test, as reported in the last lines of the table; consequently, it is ranked as the first optimizer for this set of benchmarks. The VNBA is located in the second rank, DEKH got the third rank, PP-PSO the fourth, SKH the fifth, FST-PSO the sixth, and CCS the last rank.

The plotted curves of Fig. 7 depict the acceleration rates of the AAOA versus AOA, PSO, GWO, SCA, MPA, WOA, and SSA while optimizing the ten functions of CEC2019. The exhibited curves show the convergence of the AAOA to the high-quality solutions with a smooth and fast response. Meanwhile, the AOA, PSO, GWO, SCA, WOA, and SSA suffered from high stagnation in local solutions in several functions. Accordingly, the AAOA proves its efficiency not only in accuracy but also in convergence property.

Second experiment: clustering applications

Datasets description

This experiment evaluates the AAOA using eight UCI datasets, namely Cancer, CMC, Glass, Iris, Seeds, Heart, Vowels, and Water. The descriptions of these datasets are listed in Table 13; a sketch of the clustering objective each candidate solution is scored against follows below.
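As context for the results that follow, each candidate solution in this experiment encodes a set of cluster centroids, and its quality reflects the distance between the items and their nearest centroid. The sketch below shows one such objective; a sum-of-distances criterion is assumed here for illustration, and the paper's exact fitness measure may differ in detail.

```python
import numpy as np

def clustering_fitness(solution, data, k):
    """Sum of Euclidean distances from each item to its nearest centroid;
    `solution` is a flat vector encoding k centroids of dimension d."""
    centroids = solution.reshape(k, data.shape[1])
    # Distance of every item to every centroid: shape (n_items, k).
    dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return dists.min(axis=1).sum()

# Example: evaluate a random 3-centroid solution on random 4-feature data.
rng = np.random.default_rng(0)
data = rng.random((150, 4))
print(clustering_fitness(rng.random(3 * 4), data, k=3))
```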

Results and discussion

This section shows the performance of the proposed AAOA over the eight datasets; the performance results and the Wilcoxon test values are listed in one table per dataset. Regarding the Cancer dataset, Table 14 shows the clustering results for the proposed AAOA and the compared algorithms. In terms of the Worst measure, we can clearly see that the proposed AAOA obtained the best result (i.e., 373.23); this is much better than the second-ranked algorithm (i.e., PSO), which obtained 2007.50, followed by GWO, AOA, SCA, SSA, MPA, and WOA.

The Average measure confirmed these results: the AAOA was ranked first with 248.64, followed by the PSO, GWO, and SCA with 1116.60, 2812.00, and 3158.00, respectively, and the worst algorithm was again the WOA. The AAOA also showed its superiority in the Best measure, where it was ranked first, followed by the PSO, GWO, SCA, and SSA. In this measure, the original AOA was ranked last; however, it was the most stable algorithm based on the Std measure, recording 79.32, whereas the AAOA was ranked second with 90.71. The third and fourth most stable algorithms were WOA and GWO. The PSO showed the worst stability compared to the other algorithms, with 598.12. The centroids obtained by all methods are recorded in Table 15.

The results of the CMC dataset are listed in Table 16. The proposed AAOA obtained the best result based on the Worst measure (i.e., 80.81) and was ranked first in this table. The second method was the PSO with 95.97, followed by the GWO with 310.76, whereas the other methods, AOA, SCA, MPA, WOA, and SSA, showed similar performances. In terms of the Average measure, the AAOA obtained 77.60 and outperformed the second-ranked method (i.e., PSO); the GWO was ranked third, and the remaining methods again showed similar performances to some extent. These results were also confirmed by inspecting the results of the Best measure. In contrast, the MPA showed the most stable behavior of all methods with 0.203, followed by AOA, WOA, SCA, and SSA, whereas the AAOA showed an acceptable Std value of 2.772. The centroids obtained by all methods are recorded in Table 17.

In the Glass dataset, as shown in Table 18, the proposed AAOA method outperformed the other methods in both the Worst and Average measures, obtaining 1.23 and 0.77, followed by the PSO with 10.79 and 6.40, respectively; the remaining methods showed similar results. Regarding the Best measure, the AAOA and PSO obtained the same result (i.e., 0.000) and outperformed all other methods. In the Std measure, the most stable method was MPA, followed by SCA, AAOA, and AOA, respectively, whereas the PSO showed the worst Std result. Table 19 shows the centroids obtained by all methods.

Table 20 records the results of the Iris dataset, and Table 21 shows the centroid results for all methods. Table 20 shows that the AAOA obtained superior results in both the Worst and Average measures, followed by the PSO and GWO, while the AOA, SCA, MPA, WOA, and SSA showed similar results with a significant difference from the AAOA. Meanwhile, the AAOA obtained 0.90 in the Best measure and was ranked second after the PSO, which obtained 0.62. In addition, the AAOA also showed good stability in this dataset and was ranked fourth after the AOA, SSA, and WOA.

Table 13 UCI benchmark datasets

Table 22 records the results of the Seeds dataset. From this table, the AAOA achieved the first rank in the Worst, Average, and Best measures, followed by the PSO and GWO methods. The AOA was ranked fourth in the Worst measure, while the SCA was ranked fourth in both the Average and Best measures; the last rank was recorded by the WOA method. In the Std measure, all methods showed good stability, but the most stable methods were AAOA and AOA. The centroids obtained by all methods for Seeds are recorded in Table 23.

The Heart dataset shows a similar pattern to the Seeds dataset. In Table 24, the proposed AAOA achieved the first rank in all measures, followed by the PSO, except for the Std measure, in which the PSO showed the worst stability among all methods. In this dataset, the WOA performed better than on the Seeds dataset and was ranked fourth in both the Average and Best measures, whereas the SCA recorded the lowest performance. The centroids obtained by all methods for the Heart dataset are recorded in Table 25.

Moreover, by inspecting the results of the Vowels dataset in Table 26, we can see that the AAOA obtained the top results in both the Worst and Average measures, whereas it was ranked second in the Best measure after the PSO method with a slight difference. The WOA obtained the third rank in the Worst measure and the fourth rank in the Average measure after the GWO. Although the MPA was the most stable method in this dataset, it showed the worst performance in the other measures. The AAOA and the compared methods showed good stability except for the PSO and GWO methods. Table 27 shows the centroids obtained by all methods.

Table 14 The results of the comparative methods using Cancer dataset
Table 15 Determining centroid of each cluster for the Cancer dataset
Table 16 The results of the comparative methods using CMC dataset
Table 17 Determining centroid of each cluster for the CMC dataset
Table 18 The results of the comparative methods using Glass dataset
Table 19 Determining centroid of each cluster for the Glass dataset
Table 20 The results of the comparative methods using Iris dataset
Table 21 Determining centroid of each cluster for the Iris dataset
Table 22 The results of the comparative methods using Seeds dataset
Table 23 Determining centroid of each cluster for the Seeds dataset
Table 24 The results of the comparative methods using Statlog (Heart) dataset
Table 25 Determining centroid of each cluster for the Statlog (Heart) dataset
Table 26 The results of the comparative methods using Vowels dataset
Table 27 Determining centroid of each cluster for the Vowel dataset

The results of the Water dataset, given in Tables 28 and 29, also show the superiority of the proposed AAOA method in all measures, followed by the PSO and GWO, respectively, except for the Std measure, in which they were ranked seventh and eighth, respectively; the worst method was the SCA.

Table 28 The results of the comparative methods using Water dataset
Table 29 Determining centroid of each cluster for the Water dataset

Moreover, the proposed AAOA method is evaluated using two statistical analyses (Carrasco et al., 2020), the Wilcoxon rank-sum test and the Friedman test, to check whether there are significant differences between AAOA and the other algorithms at p-value < 0.05. The results for all datasets are listed in Tables 14, 16, 18, 20, 22, 24, 26, and 28. From the results of the Wilcoxon test, we can see that there are significant differences between AAOA and all compared methods on all datasets except for the PSO on the Vowels dataset. The results of the Friedman test also showed the superiority of the AAOA: it achieved the first rank in all datasets, followed by PSO, GWO, and AOA, respectively, whereas the MPA obtained the last rank.

Furthermore, to summarize the performance of all methods over all datasets, the AAOA obtained the first rank in all measures, followed by the PSO, GWO, AOA, WOA, SCA, and SSA, whereas the MPA obtained the last rank (see Table 30 and Fig. 8). These results confirm that the AAOA can solve various clustering problems and obtain better results than the compared algorithms with good stability and low errors. According to the Friedman rank test, the proposed AAOA method obtained the first rank compared to the other comparative methods, followed by PSO, GWO, AOA, SCA, WOA, SSA, and MPA.

Table 30 The results of the Friedman ranking test for the comparative methods using all the used datasets
Fig. 8
figure 8

Ranking of all the comparative methods using all the tested datasets

In addition, Fig. 9 shows the results of the clustering analysis as the coloring of the multiplication signs (objects) into k clusters (circle signs). Figure 10 shows the convergence behavior of the comparative methods over the tested clustering datasets. This figure shows that the AAOA reached better fitness values than the other methods on all datasets except the Glass dataset, on which it ranked second. The AAOA also showed excellent updating behavior, exploring new regions of the search domain to escape from local optima.

Fig. 9
figure 9

The results of the clustering analysis are shown as the coloring of the multiplication signs (objects) into k clusters (circle signs)

Fig. 10
figure 10

Convergence behaviour of the comparative methods using the tested clustering datasets

Conclusion and potential future works

In recent years, different metaheuristic (MH) optimization algorithms have been widely employed to solve various engineering and optimization problems. A new optimization algorithm inspired by arithmetic operations, namely the Arithmetic Optimization Algorithm (AOA), was recently proposed. The exploration and exploitation trends of the AOA require further improvements to address more complex optimization tasks. To this end, in this paper we proposed an ensemble AOA that applies two search mechanisms, namely the Lévy flight (LF) distribution and opposition-based learning (OBL), to boost the search mechanism of the traditional AOA. The new variant, called AAOA, was evaluated using different benchmark functions and datasets. To assess the AAOA as a global optimizer, twenty-three CEC2005 benchmark functions were used. Besides, we used eight UCI datasets to evaluate the AAOA as a data clustering method. In all evaluation experiments, we considered extensive comparisons to well-known optimization algorithms, such as the traditional AOA, PSO, GWO, SCA, MPA, WOA, and SSA. Experimental statistics and outcomes confirmed the superiority of the developed AAOA over the other optimization methods, including the original AOA.

According to the superior performance of the AAOA, it can be considered a promising optimization method that could be further leveraged to address different optimization tasks, such as fog computing scheduling, image segmentation, and feature selection.