1 Introduction

Solving optimization problems is the norm in almost all disciplines of engineering [1, 2] and science [3, 4], and the need for more robust solutions is ever increasing. This means we need capable algorithms that can fit the intricate nature of up-to-date scientific and engineering challenges. A survey of the literature reveals a wide variety of existing optimization methods [5]. These range from traditional techniques based on linear and nonlinear programming [6] to the newer nature-inspired meta-heuristics [7, 8], each with their own strengths and weaknesses. Despite their success in solving well-known optimization problems [2, 9], traditional algorithms suffer from an inherent dependency on gradient information and from the need for a promising initial starting vector within the search space [2, 9]. The existing nature-inspired meta-heuristic optimizers, on the other hand, are highly problem dependent: they may be very successful on certain problems yet fail to provide satisfactory solutions for others [10]. This is partly ascribed to the common tendency of these meta-heuristics to become trapped in local or suboptimal solutions [11].

What makes contemporary optimization problems hard can be summarized in a few points. Such problems are very likely nonlinear, involve numerous decision variables, and their objective functions are, in some cases, complex, bound by several constraints and characterized by many peaks [12]. For these problems, it is important to start from an encouraging initial point in the hope of finally reaching the global optimum. After many years of research, the community has found that traditional methods are often not the best choice for solving these contemporary optimization problems [13]. More specifically, the obtained solution must be accurate enough to be accepted, and the time needed to solve the problem should fall within reasonable limits [14]. To this end, researchers have turned their attention toward nature-inspired meta-heuristics, which have shown very encouraging capabilities in dealing with intricate forms of challenging optimization problems [15, 16]. Meta-heuristic techniques are global optimization methods designed on the basis of simulations and mechanisms inspired by nature and are widely applied to global optimization problems [17, 18].

In contrast to traditional algorithms, meta-heuristic methods have become strikingly popular. This reputation stems from the fact that these methods are very flexible, do not require gradient information and have proven successful in escaping from local minima when solving real-world scientific or engineering problems that have several local solutions [11, 19]. The first and second merits follow from the fact that meta-heuristics tackle optimization problems as black boxes: they only require knowledge of the inputs and outputs of the problem, so there is no need to compute derivatives over the search space. Also, meta-heuristics belong to the family of stochastic optimization methods, in that they make use of stochastic operators. This feature has been broadly confirmed: meta-heuristics have proven successful in staying away from local minima when addressing real problems that often have a large number of them [11, 18]. This explains their suitability for challenging optimization problems in diverse domains [20, 21]. More concretely, meta-heuristics have been harnessed to tackle hard real-life problems in a variety of scientific and engineering disciplines. Examples of such domains encompass, but are not limited to, image processing [22, 23], signal processing [24], process control [25], text clustering [26], classification problems [27] as well as several other domains [28, 29].

1.1 Motivations of the work

According to the “no-free-lunch” (NFL) theorem [30], no single meta-heuristic algorithm can be expected to solve all possible optimization problems [31]. In practice, one meta-heuristic algorithm might do a good job of optimizing certain problems in particular fields but fail to find the global optima in others [11]. This has motivated researchers in this field, ourselves included, to look for new and innovative nature-inspired methods that achieve superior scores on current and new hard real-life problems [32]. The door is still open, and here we present a novel meta-heuristic algorithm based on human behavior, with the famous tale of Ali Baba and the forty thieves as our inspiration, targeting numerical optimization problems.

1.2 Contributions of the work

The core of this paper is to establish a novel nature-inspired algorithm, referred to as Ali Baba and the forty thieves (AFT), to solve global optimization problems. As its name suggests, AFT falls into the category of human-based algorithms, as it is inspired by human interactions and conduct in a human-related story. The behavior of the thieves in the tale of Ali Baba and the forty thieves in pursuing Ali Baba, and the intelligent methods that Ali Baba’s maid used to save him from the thieves, inspired us to simulate this behavior in an optimization algorithm. This anecdote contains behavior that has many similarities with optimization processes. From the point of view of optimization, the thieves are the search agents, the environment (i.e., the town of Ali Baba) is the search space, each position in the town corresponds to a feasible solution, the home of Ali Baba is the objective function and Ali Baba is the global solution. Based on these similarities, the AFT algorithm was developed to mimic the behavior of the thieves and the maid in order to locate the global solution of the considered optimization problems. The performance of AFT was evaluated on sixty-two benchmark test functions, and it was applied to optimize the designs of five engineering problems.

Section 2 presents the literature and related works. Section 3 shows the tale of Ali Baba and the forty thieves and the key concepts of this tale. Section 4 presents the mathematical models and analysis of the AFT method. Some of the possible expansions of AFT from several aspects are given in Sect. 5. Section 6 then presents a conceptual comparison of AFT with other existing optimizers. The experimental, qualitative and statistical analysis results are introduced in Sect. 7. Section 8 presents the applicability and reliability of AFT in solving five engineering problems. The conclusion comments and some further research paths are shown in Sect. 9.

2 Related works

This section looks at the most recent developments in the field of optimization. This field has many branches, such as multi-objective, single-objective, constrained optimization and others [33]. Since the meta-heuristic algorithm proposed in this work is aimed at single-objective optimization problems, the main focus of this section is on the relevant works in single-objective optimization.

2.1 Single-objective optimization problems

In single-objective optimization problems, there is only one objective to be maximized or minimized. This kind of optimization might be subject to a set of constraints, which fall into two categories: equality and inequality [11]. Without loss of generality, single-objective optimization problems can be expressed as either minimization or maximization problems. The search space is defined by a set of variables, an objective, the ranges of the variables and the constraints. For low-dimensional problems, the search space can be plotted in a Cartesian coordinate system and its shape observed. Having a large number of decision variables is the first challenge when addressing optimization problems. The ranges of the variables bound the search space and can differ from one variable to another. The variables can be discrete or continuous, creating either a discrete or a continuous search space. In the first case, there is a finite set of points between any two points in the search space, while in the second case there is an infinite number of points between every two points [11].
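As a concrete illustration (a standard textbook formulation, not quoted from any specific reference), a constrained single-objective minimization problem over these elements can be stated as:

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^{d}} \quad & f(x) \\
\text{subject to} \quad & g_{k}(x) \le 0, \quad k = 1, \ldots, p
  && \text{(inequality constraints)} \\
& h_{l}(x) = 0, \quad l = 1, \ldots, q
  && \text{(equality constraints)} \\
& l_{j} \le x_{j} \le u_{j}, \quad j = 1, \ldots, d
  && \text{(variable ranges)}
\end{aligned}
```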

Usually, an optimization method might begin with an initial range and extend it during optimization. The constraints restrict the search space even further and typically create gaps in it, because solutions in the excluded areas are not acceptable when solving an optimization problem. A set of constraints can even divide the search space into several disconnected regions. Solutions that violate the constraints are called infeasible, while solutions that satisfy them are called feasible; the corresponding portions of the search space are referred to as the feasible and infeasible regions. A constrained search space can render an optimization method ineffective despite its sensible performance on an unconstrained search space [34]. Thus, optimization methods must be equipped with adequate operators to deal effectively with the constraints [34]. Another challenge that arises when tackling optimization problems is the existence of local solutions.

In a single-objective search space, there is usually one global optimal solution that returns the best objective value. However, there are normally several other solutions that yield values close to that of the global optimum [33]. Such a solution is called a local solution, as it is the best solution only when considering the search space in its vicinity; it is not the best solution globally when the whole search space is taken into account. The existence of local solutions causes many optimization algorithms to become trapped in them [8]. A real search space generally contains a large number of local solutions, so an optimization method must be able to efficiently avoid them in order to find the global optimum. However, an algorithm that is able to escape local solutions is not necessarily capable of converging to the global optimum; avoiding local solutions only yields the approximate position of the global optimum. Convergence speed is a further difficulty when solving optimization problems [8]. Needless to say, overly rapid convergence leads to stagnation in local optima. In contrast, abrupt variations in the solutions help avoid local optima but slow down the convergence toward the global optimum. This trade-off is among the key challenges that an optimization algorithm must handle. There are other kinds of difficulties in a single-objective search space, such as isolation of the optimum, dynamic objective functions and many more [11]. Each of these challenges demands special attention. These concepts are outside the scope of this paper, so interested readers are referred to the study conducted by Boussaid [33].
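To make the notion of local solutions concrete, consider the Rastrigin function, a standard multimodal benchmark (our illustrative example, not one discussed above): its global minimum at the origin is surrounded by a grid of local minima that can trap a purely local search.

```python
import numpy as np

def rastrigin(x):
    """Rastrigin function: f(x) = 10 d + sum(x_j^2 - 10 cos(2 pi x_j)).
    The global minimum is f(0, ..., 0) = 0; every near-integer grid
    point lies close to a local minimum."""
    x = np.asarray(x, dtype=float)
    return 10 * x.size + float(np.sum(x**2 - 10 * np.cos(2 * np.pi * x)))

print(rastrigin([0.0, 0.0]))                # 0.0 (global optimum)
print(round(rastrigin([0.995, 0.0]), 2))    # 0.99, a nearby local minimum
```

A greedy descent started near (1, 0) would settle at the local minimum with value near 1 and never reach the global optimum at the origin.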

2.2 Single-objective optimization algorithms

In the literature, optimization algorithms can be split into two broad categories:

  • Deterministic algorithms: these algorithms always locate the same solution for a particular problem if they commence from the same starting point. Their main merit is reliability, as they assuredly find a solution in each run. However, stagnation in local optima is a flaw, since these algorithms typically contain no random behavior when solving optimization problems.

  • Stochastic algorithms: these algorithms benefit from stochastic operators. As a result, they may find a different solution in each run even if the starting point remains unaltered, which makes stochastic methods less reliable than deterministic ones. However, the stochastic behavior has the power to avoid local optima. The reliability of stochastic algorithms can be boosted by raising the number of runs. Stochastic methods fall into two classes:

2.2.1 Individualist algorithms

The stochastic methods in this class start and carry out optimization with a single solution. This solution is randomly changed and enhanced for a predefined number of steps or until a termination criterion is met. The most well-regarded algorithms in this class are Tabu search [35], hill climbing [36] and iterated local search [37]. The chief feature of the algorithms in this set is the low computational effort and the need for few function evaluations.

2.2.2 Collective algorithms

Collective techniques generate and evolve multiple random solutions during optimization. Usually, the collection of solutions collaborates to better identify the global optimum in the search domain. Multiple solutions reduce the chance of stagnating in local optima [38], which is a key merit of these algorithms. However, each of the solutions requires a function evaluation at every iteration, and building efficient cooperation between the solutions is a challenge. Despite these two drawbacks, collective stochastic optimization methods are widely used for optimization problems [11], owing to their desirable features.

Irrespective of the distinctions between collective algorithms, they all pursue the same course of action for finding the global optimum. Optimization first begins with a pool of random solutions, which are combined and changed randomly, quickly and abruptly. This causes the solutions to move globally. This stage is called exploration of the search space, because the abrupt changes attract the solutions toward various areas of the search space [38]. After sufficient exploration, the solutions begin to change sparingly and move locally around the most promising solutions of the search space in order to raise their quality. This phase is called exploitation, and its key aim is to enhance the precision of the best solutions obtained in the exploration phase [39]. Although avoidance of local optima may still occur in the exploitation phase, the coverage of the search area is not as broad as during exploration; in this case, the solutions evade only local solutions in the proximity of the global optimum. We can thus deduce that the exploration and exploitation phases pursue conflicting goals [40], so most methods require the search agents to transition smoothly from exploration to exploitation using adaptive strategies. A convincing recommendation for good performance is to achieve an adequate balance between the two [38]. Owing to their random behavior, meta-heuristics can be deemed stochastic collective algorithms [21, 41]. Continuing from this last point, any meta-heuristic algorithm can fall into one of the following classes:

  • Physics-Based (PB) algorithms: these methods draw on physical principles observed on Earth in particular, or in our universe at the broadest level. The general technique of PB methods differs from other meta-heuristics’ mechanisms, because the search agents of these methods move around the search space according to physical rules grounded in physical processes. Some of the most prominent examples of PB algorithms include the simulated annealing (SA) algorithm [42, 43], gravitational search algorithm (GSA) [44], multi-verse optimizer (MVO) [45], Henry gas solubility optimization (HGSO) [46] and equilibrium optimizer (EO) [47].

  • Evolutionary Algorithms (EAs): EAs follow the Darwinian theory of natural selection, i.e., survival of the fittest, and simulate the well-known evolution mechanisms in biology. These methods often do well at finding near-optimal solutions because they make no assumptions about the underlying fitness landscape. The list of EAs includes, but is not limited to, evolutionary strategy (ES) [48], genetic algorithm (GA) [49, 50], genetic programming (GP) [51] and the differential evolution (DE) algorithm [52].

  • Swarm Intelligence (SI) algorithms: these algorithms use the intelligence of the social collective behavior of various societies of creatures such as birds, bees, ants and the like. This class includes a large variety of algorithms such as particle swarm optimization (PSO) [53], ant colony optimization (ACO) [54], the artificial bee colony (ABC) algorithm [55], grey wolf optimizer (GWO) [56], dragonfly algorithm (DA) [57], salp swarm algorithm [11, 58], coral reefs optimization (CRO) [59] and many others [7, 8].

  • Human-based algorithms: the algorithms of this class originate from human interactions in societies. The inspiration for researchers in this realm comes from experiences and stories related to human conduct and human actions [60]. Previous works in this area include harmony search (HS) [61], the seeker optimization algorithm [62], Human Group Formation (HGF) [63], the Social-Based Algorithm (SBA) [64], the Interior Search Algorithm (ISA) [65] and the football game inspired algorithm (FGIA) [66].

3 Inspiration

The proposed AFT algorithm is based on the well-known tale of Ali Baba and the forty thieves. We have found in this anecdote several intrinsic traits that inspired us to develop the AFT algorithm. Rather than literally retelling the story in this section, we prefer to link some of the events that took place in the tale to the attributes that constitute the AFT algorithm. The tale itself can be found in several books as well as web pages. We refer the interested readers to [67], as an example.

The nature of the tale is search-based: a gang of forty thieves goes after Ali Baba. The ultimate goal of the gang is to catch Ali Baba for revenge and to get their treasure back. The search carried out by the gang for Ali Baba is iterative in nature, proceeding in several rounds, each time reinforcing the solution found in previous iterations. The search is based upon the collective behavior of the thieves, represented as a population in the proposed algorithm. The countermeasures taken by the main character of the tale, named Marjaneh, prevented the gang at each iteration from fulfilling their search mission. The big town where Ali Baba lives represents the search space. The tale shows the success of the forty thieves in tracking down Ali Baba and spotting the location of his house. The looters’ successful actions in achieving their strategic target and finding Ali Baba were based on smart tricks and tactics. However, the acumen of Ali Baba’s savvy maid, Marjaneh, saved Ali Baba’s life on each of these occasions, as summarized below [68].

In the first trial, the gang’s assistant captain disguised himself as a foreigner and entered the town in the hope to hear any talk or find any clue that could lead him to Ali Baba. He managed to get to Ali Baba’s house and marked his door with an ‘X’ sign in his pursuit to later fulfill the ultimate mission of the gang. Marjaneh observed the mark and, in response, she placed similar marks on the doors of all houses in the neighborhood, rendering the plan useless. The tactics followed by the robber are harnessed in the proposed algorithm to maximize the exploration efficiency.

The second trial took place when another assistant of the captain took over the mission. He built upon the procedures previously followed by his comrade. In our algorithm, this is mirrored by making the search in every iteration build upon the best solution found so far in previous iterations. This time, the robber marked Ali Baba’s house with a sign that was not easy to observe by chance. In AFT, this again reflects the utilization and enhancement of previously found solutions [69]. This leads to strong exploration and exploitation features in the proposed algorithm.

The third incident occurred when the captain decided to change the plan and take upon himself the task of capturing Ali Baba, while building upon the achievements attained so far by his two assistants. The captain and his followers succeeded in arriving at Ali Baba’s house, and they took every possible measure not to be discovered while attacking Ali Baba, but they failed.

The final attempt at targeting Ali Baba was made by the captain alone with a totally new plan: he disguised himself as a merchant selling silks and introduced himself to the son of Ali Baba. The perseverance and persistence of the gang’s captain are good traits for a successful search technique.

The approaches adopted in the tale, such as the attempts of the thieves and Marjaneh’s intelligence in disrupting these attempts, are reflected in the exploration and exploitation mechanisms built into the proposed algorithm. They led to the mathematical models developed to design AFT and perform optimization. The proposed algorithm is described in detail below.

4 Ali Baba and the forty thieves algorithm

The overall goal of this work is to present a new optimization method that imitates the tale of Ali Baba and the forty thieves as a coordinated model of human social behavior. The following principles derived from this tale form the basic assumptions of the algorithm:

  • The forty thieves collaborate in a group and get guidance from someone or from one of the thieves to find Ali Baba’s house. This information may or may not be correct.

  • The forty thieves will travel a distance starting from an initial distance until they can locate Ali Baba’s house.

  • Marjaneh can deceive the thieves many times with astute tricks, protecting Ali Baba from them with a certain probability.

The behaviors of the thieves and Marjaneh can be drawn up in such a manner that they can be linked to an objective function to be optimized. This makes it feasible to evolve a new meta-heuristic algorithm as detailed below.

4.1 Random initialization

The AFT algorithm is initiated by randomly initializing the positions of n individuals in a d-dimensional search space, as shown below:

$$\begin{aligned} x = \begin{bmatrix} x_{1}^{1} & x_{2}^{1} & x_{3}^{1} & \ldots & x_{d}^{1} \\ x_{1}^{2} & x_{2}^{2} & x_{3}^{2} & \ldots & x_{d}^{2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{1}^{n} & x_{2}^{n} & x_{3}^{n} & \ldots & x_{d}^{n} \end{bmatrix} \end{aligned}$$
(1)

where x is the position of all thieves, d is the number of variables of a given problem and \(x_j^i\) represents the jth dimension of the ith thief.

The initial positions of the population (i.e., the thieves) can be generated as shown in Eq. 2.

$$\begin{aligned} x_{j}^{i} = l_{j} + r \times (u_{j} - l_{j}) \end{aligned}$$
(2)

where \(x_j^i\) is the jth dimension of the position of the ith thief, which denotes a candidate solution to the problem, \(l_{j}\) and \(u_{j}\) refer to the lower and upper bounds of the jth dimension, respectively, and r is a uniformly distributed random number in the range from 0 to 1, drawn independently for each dimension.
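The initialization of Eqs. 1 and 2 can be sketched as follows (a minimal NumPy sketch; the function name and the fixed seed are our choices):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def init_positions(n, d, lower, upper):
    """Eq. (2): each entry is l_j + r * (u_j - l_j), with r ~ U(0, 1)
    drawn independently for every thief i and dimension j, yielding
    the n x d position matrix of Eq. (1)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    r = rng.random((n, d))
    return lower + r * (upper - lower)

x = init_positions(n=40, d=3, lower=[-10, -10, -10], upper=[10, 10, 10])
print(x.shape)                                   # (40, 3)
print(bool(np.all((x >= -10) & (x <= 10))))      # True
```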

The wit level of Marjaneh with respect to all thieves can be initialized as shown below:

$$\begin{aligned} m = \begin{bmatrix} m_{1}^{1} & m_{2}^{1} & \ldots & m_{d}^{1} \\ m_{1}^{2} & m_{2}^{2} & \ldots & m_{d}^{2} \\ \vdots & \vdots & \ddots & \vdots \\ m_{1}^{n} & m_{2}^{n} & \ldots & m_{d}^{n} \end{bmatrix} \end{aligned}$$
(3)

where \(m_j^i\) denotes the astute level of Marjaneh in relation to the ith thief at the jth dimension.

4.2 Fitness evaluation

The values of the decision variables are inserted into a user-defined fitness function that is evaluated for each thief’s position. The corresponding fitness values are stored in an array as given in the following form:

$$\begin{aligned} f = \begin{bmatrix} f_{1}(x_{1}^{1}, x_{2}^{1}, \ldots, x_{d}^{1}) \\ f_{2}(x_{1}^{2}, x_{2}^{2}, \ldots, x_{d}^{2}) \\ \vdots \\ f_{n}(x_{1}^{n}, x_{2}^{n}, \ldots, x_{d}^{n}) \end{bmatrix} \end{aligned}$$
(4)

where \(x_{d}^{n}\) is the dth dimension of the position of the nth thief.

In the simulation of the AFT algorithm, the solution quality of each thief’s new location is evaluated using the defined fitness function. The thief’s location is then updated if the new solution is better than the current one; otherwise, the thief stays in his current location.
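This greedy replacement rule can be sketched as follows (minimization assumed; the helper name is ours):

```python
import numpy as np

def greedy_update(pos, new_pos, fitness_fn):
    """Accept each thief's new location only where it improves (lowers)
    the fitness; otherwise the thief stays at its current location."""
    old_f = np.apply_along_axis(fitness_fn, 1, pos)
    new_f = np.apply_along_axis(fitness_fn, 1, new_pos)
    keep_new = new_f < old_f
    return np.where(keep_new[:, None], new_pos, pos)

sphere = lambda v: float(np.sum(v**2))
pos = np.array([[3.0, 4.0], [1.0, 1.0]])
cand = np.array([[1.0, 1.0], [2.0, 2.0]])   # first improves, second worsens
updated = greedy_update(pos, cand, sphere)
print(updated)   # the improved row is replaced, the worse one is kept
```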

4.3 Proposed mathematical model

As discussed above, three fundamental cases may occur while the thieves search for Ali Baba. In each case, it is assumed that the thieves search efficiently throughout the surrounding environment, while with some probability Marjaneh’s intelligence forces the thieves to search in random locations. This searching behavior can be mathematically modeled as follows:

Case 1 The thieves may track down Ali Baba with the help of information obtained from someone. In this case, the new locations of the thieves can be obtained as follows:

$$\begin{aligned} x^{i}_{t+1} = gbest_{t} + \Bigl[ Td_{t} \left( best^{i}_{t} - y^{i}_{t}\right) r_{1} + Td_{t} \left( y^{i}_{t} - m^{a(i)}_{t} \right) r_{2} \Bigr] \mathrm{sgn}(rand - 0.5); \quad r_{3} \ge 0.5, \;\; r_{4} > Pp_{t} \end{aligned}$$
(5)

where \(x^{i}_{t+1}\) represents the position of thief i at iteration \(t+1\), \(y^{i}_{t}\) is the position of Ali Baba in relation to thief i at iteration t, \(best^i_t\) represents the best position achieved so far by thief i up to iteration t, \(gbest_t\) represents the global best position obtained so far by any thief up to the tth iteration, \(m^{a(i)}_{t}\) represents Marjaneh’s intelligence level used to camouflage thief i at iteration t, \(Td_{t}\) is the tracking distance of the thieves at iteration t, \(Pp_{t}\) denotes the perception potential of the thieves with respect to Ali Baba at iteration t, rand, \(r_{1}\), \(r_{2}\) and \(r_{4}\) are random numbers generated with a uniform distribution between 0 and 1, the condition \(r_{3} \ge 0.5\) evaluates to 1 or 0 to indicate whether the information obtained by the thieves is true or false, respectively, and \(\mathrm{sgn}(rand-0.5)\) gives either 1 or −1 to change the direction of the search process.

The parameter a in \(m^{a(i)}_{t}\) can be defined as follows:

$$\begin{aligned} a = \lceil (n-1) \cdot \mathrm{rand}(n,1) \rceil \end{aligned}$$
(6)

where rand(n, 1) represents a vector of random numbers generated with a uniform distribution in the range of [0, 1].

Marjaneh updates her astute plans if the quality of the new solution that the thieves come up with is better than their previous position. In this case, Eq. 7 is used to update her plans.

$$\begin{aligned} m^{a(i)}_{t} = {\left\{ \begin{array}{ll} x^{i}_{t} & \text {if } f\left( x^{i}_{t}\right) \ge f\left( m^{a(i)}_{t}\right) \\ m^{a(i)}_{t} & \text {if } f\left( x^{i}_{t}\right) < f\left( m^{a(i)}_{t}\right) \end{array}\right. } \end{aligned}$$
(7)

where \(f(\cdot )\) stands for the score of the fitness function.
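As a hedged sketch, the Case 1 update of Eq. 5 for a single thief can be written as below. Whether rand, \(r_1\) and \(r_2\) are scalars or per-dimension draws is not fixed above; per-dimension draws for \(r_1\) and \(r_2\) and a scalar sign are assumed here, and the variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def case1_step(gbest, best_i, y_i, m_a, Td):
    """Eq. (5): a move around the global best, pulled toward the thief's
    own best position and away from Marjaneh's camouflage plan m_a,
    scaled by the tracking distance Td and flipped by a random sign."""
    d = gbest.size
    r1, r2 = rng.random(d), rng.random(d)
    sgn = 1.0 if rng.random() >= 0.5 else -1.0   # sgn(rand - 0.5)
    return gbest + (Td * (best_i - y_i) * r1 + Td * (y_i - m_a) * r2) * sgn

new = case1_step(gbest=np.zeros(3), best_i=np.ones(3),
                 y_i=np.full(3, 0.5), m_a=-np.ones(3), Td=0.8)
print(new.shape)   # (3,)
```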

The tracking distance parameter \({\rm Td}_{t}\) is defined as given in Eq. 8.

$$\begin{aligned} Td_{t} = \alpha _0\, e^{-\alpha _1 (t/T)^{\alpha _1}} \end{aligned}$$
(8)

where t and T denote the current and maximum number of iterations, respectively, \(\alpha _0\) (\(\alpha _0=1\)) represents the initial estimate of the tracking distance at the first iteration and \(\alpha _1\) is a constant value used to manage exploration and exploitation capabilities.

Equation 8 shows that \({\rm Td}_{t}\) is iteratively updated during the course of iterations of the AFT algorithm. Figure 1 shows the values of \({\rm Td}_t\) over 1000 iterations.

Fig. 1: Proposed exponential iterative function for \(Td_{t}\)

The tracking distance, as shown in Fig. 1, greatly affects the search ability, which has a large impact on both the exploration and exploitation power of the AFT algorithm. As presented in Fig. 1, the parameter \(Td_t\) starts from a value of 1.0 and decays to its lowest value, at which point it is assumed that the thieves have arrived at Ali Baba’s house. Large values of \(Td_t\) result in a global search oriented toward further exploration, which helps avoid local optimal solutions. On the other hand, small values of \(Td_t\) lead to a local search, which increases the exploitation ability of AFT so that the thieves have a good chance of finding Ali Baba.

Similarly, the perception potential parameter \(Pp_{t}\) is defined as given in Eq. 9.

$$\begin{aligned} Pp_{t} = \beta _0 \log \left( \beta _1 (t/T)^{\beta _0}\right) \end{aligned}$$
(9)

where \(\beta _0\) (\(\beta _0=0.1\)) represents a final rough estimation of the probability that the thieves will realize their target at the end of the iterative process of AFT and \(\beta _1\) is a constant value used to manage exploration and exploitation capabilities.

Figure 2 shows the values of \({\rm Pp}_{t}\) over 1000 iterations.

Fig. 2: Proposed logarithmic iterative function for \(Pp_{t}\)

As shown in Fig. 2, by gradually increasing the value of \(Pp_t\), AFT tends to move from a global search to a local search in the most promising areas, where a potential solution could be found. In other words, large values of \(Pp_t\) lead to a local search that intensifies the search in the most appropriate areas of the search space. In contrast, small values reduce the possibility of searching in the vicinity of the current good solutions; a small value thus stimulates AFT to explore the search space on a global scale and to diversify the search across all areas of the search space.

For all of the problems solved in this work, \(\alpha _1\) and \(\beta _1\) are both set to 2.0. These values were found by experimental testing on a large subset of test functions.
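With these settings, the two schedules of Eqs. 8 and 9 can be computed directly (a short sketch; the function names are ours):

```python
import numpy as np

def Td(t, T, a0=1.0, a1=2.0):
    """Eq. (8): tracking distance, decaying from about a0 at the start
    to a0 * exp(-a1) at the final iteration."""
    return a0 * np.exp(-a1 * (t / T) ** a1)

def Pp(t, T, b0=0.1, b1=2.0):
    """Eq. (9): perception potential, growing toward b0 * log(b1)
    as t approaches T."""
    return b0 * np.log(b1 * (t / T) ** b0)

T = 1000
print(round(Td(1, T), 4))    # 1.0    (near a0 at the first iteration)
print(round(Td(T, T), 4))    # 0.1353 (= exp(-2) at the last iteration)
print(round(Pp(T, T), 4))    # 0.0693 (= 0.1 * log(2))
```

Note that \(Td_t\) is monotonically decreasing while \(Pp_t\) is monotonically increasing, matching the exploration-to-exploitation transition described above.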

Case 2 The thieves may grasp that they have been deceived, so they will randomly explore the search space for Ali Baba. In this case, the new locations of the thieves can be obtained as follows:

$$\begin{aligned} x^{i}_{t+1}= {\rm Td}_{t} \left[ \left( u_j - l_j\right) {\rm rand} + l_j\right] ; \;\; r_{3} \ge 0.5,\; r_{4} \le {\rm Pp}_{t} \end{aligned}$$
(10)

The parameter \({\rm Td}_t\) is incorporated in Eq. 10 because the thieves have a good level of knowledge for discerning the most propitious areas of the search space where Ali Baba’s house could be.

Case 3 In order to ameliorate the exploration and exploitation features of the AFT algorithm, this study also considers the search in other positions than those that could be obtained using Eq. 5. In this case, the new locations of the thieves can be obtained as follows:

$$\begin{aligned} x^{i}_{t+1} = gbest_{t} - \Bigl[ Td_{t} \left( best^{i}_{t} - y^{i}_{t}\right) r_{1} + Td_{t} \left( y^{i}_{t} - m^{a(i)}_{t} \right) r_{2} \Bigr] \mathrm{sgn}(rand - 0.5); \quad r_{3} < 0.5 \end{aligned}$$
(11)

The pseudo-code of the AFT algorithm can be briefly described by the iterative steps given in Algorithm 1.

Algorithm 1: Pseudo-code of the AFT algorithm

Algorithm 1 shows that AFT begins solving an optimization problem by randomly generating a set of positions (i.e., potential solutions), considering the upper and lower bounds of the problem variables. After that, the best positions, the global best position of the thieves and Marjaneh’s wit plans are initialized. The quality of each created solution is assessed using a pre-defined fitness function, and the fitness of each solution is recalculated within each iteration in order to identify the thief with the best solution. For each dimension, the new position of the thieves is computed iteratively within each iteration using Eqs. 5, 10 and 11. The feasibility of each new position is examined to see whether it moves out of the search area; in such a case, it is brought back to the boundary, following the simulated steps of AFT. Then, the new positions, the best positions, the global best position of the thieves and the wit plans of Marjaneh are assessed and updated accordingly. All the steps of AFT shown in Algorithm 1, except the initialization steps, are performed iteratively until the termination condition is reached. At the end, the best position of the thieves is reported as the solution of the optimization problem.
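The steps above can be gathered into a compact sketch (minimization assumed). This is our reading of the algorithm, not the authors’ reference implementation: the identification of \(y^i_t\) with the thief’s current position, the plan-update rule and the boundary clamping are assumptions, and all names are ours.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def aft(fitness, lb, ub, n=40, d=2, T=500):
    lb, ub = np.full(d, lb, float), np.full(d, ub, float)
    x = lb + rng.random((n, d)) * (ub - lb)            # Eq. (2): thieves
    m = lb + rng.random((n, d)) * (ub - lb)            # Eq. (3): wit plans
    best = x.copy()                                    # per-thief best
    best_f = np.array([fitness(p) for p in x])
    g_i = int(np.argmin(best_f))
    g, g_f = best[g_i].copy(), float(best_f[g_i])      # global best
    for t in range(1, T + 1):
        Td = np.exp(-2.0 * (t / T) ** 2.0)             # Eq. (8)
        Pp = 0.1 * np.log(2.0 * (t / T) ** 0.1)        # Eq. (9)
        for i in range(n):
            a = rng.integers(n)                        # Eq. (6): plan index
            r1, r2 = rng.random(d), rng.random(d)
            r3, r4 = rng.random(), rng.random()
            sgn = 1.0 if rng.random() >= 0.5 else -1.0
            step = (Td * (best[i] - x[i]) * r1
                    + Td * (x[i] - m[a]) * r2) * sgn
            if r3 >= 0.5 and r4 > Pp:                  # Case 1, Eq. (5)
                new = g + step
            elif r3 >= 0.5:                            # Case 2, Eq. (10)
                new = Td * ((ub - lb) * rng.random(d) + lb)
            else:                                      # Case 3, Eq. (11)
                new = g - step
            new = np.clip(new, lb, ub)                 # keep positions feasible
            f_new = fitness(new)
            if f_new < fitness(x[i]):                  # greedy replacement
                m[a] = x[i].copy()                     # Eq. (7), our reading
                x[i] = new
                if f_new < best_f[i]:
                    best[i], best_f[i] = new.copy(), f_new
                    if f_new < g_f:
                        g, g_f = new.copy(), f_new
    return g, g_f

sphere = lambda v: float(np.sum(np.asarray(v) ** 2))
best_x, best_f = aft(sphere, -10, 10)
print(best_f)   # a small value near 0 on this convex test function
```

On a convex function such as the sphere, the greedy acceptance makes the best fitness monotonically non-increasing, so the sketch settles close to the origin after a few hundred iterations.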

4.4 Exploration ability of AFT

There are many parameters in AFT that lead to exploration, explained as follows:

  • \(Td_{t}\): this parameter controls the amount of exploration in AFT. It determines how close the new locations of the thieves are to the house of Ali Baba. Selecting appropriate values of \(\alpha _0\) and \(\alpha _1\) for \(Td_t\) reduces the probability of stagnation in local optima and increases the probability of approaching the global optimum. Based on experimental tests, \(\alpha _{0} = 1\) and \(\alpha _1 = 2\) offer a good balance between exploration and exploitation.

  • \(Pp_{t}\): this parameter underlines the high exploration capacity of AFT when it takes relatively small values, and it is gradually increased during the iterative process. The choice of the values of \(\beta _0\) and \(\beta _1\) in \(Pp_{t}\) is somewhat arbitrary, but was made based on pilot testing over a large set of test functions. In the initial iterations, the candidates are all far away from each other. Updating the parameter \(Pp_{t}\) improves AFT's ability to avoid stagnation in local optima and to approach the global optimum. Based on empirical testing, \(\beta _0 = 0.1\) and \(\beta _1 = 2\) provide a good balance between exploration and exploitation.

  • \(\mathop {\mathrm {sgn}}({\rm rand}-0.5)\): this parameter manages the direction of exploration. Since rand takes values between 0 and 1 with a uniform distribution, there is an equal probability of negative and positive signs.

  • Marjaneh’s intelligence plans: using this mechanism directly improves AFT's exploration ability.

4.5 Exploitation ability of AFT

The key parameters that help to perform local search and exploitation in AFT can be described as follows:

  • \(Td_{t}\): as iterations pass, exploration fades out and exploitation fades in. Small values of \(Td_{t}\) lead to local searches in promising areas of the search space. As a result, in the last iterations, where the thieves are close to the house of Ali Baba, the position updating process with cases 1, 2 and 3 assists the local search around the best solution, leading to exploitation.

  • \(Pp_{t}\): this parameter controls the exploitation feature by quantifying the amount of in-depth search around the best solution. As iterations pass, the exploitation stage intensifies with the relatively large values this parameter takes. Thus, the position updating process with cases 1 and 2 enhances AFT's ability to intensify local searches in the space, which results in further exploitation.

  • \(\mathop {\mathrm {sgn}}({\rm rand}-0.5)\): this parameter controls the quality of exploitation by changing the direction of the search.

4.6 Computational complexity analysis

Computational complexity of an optimization method can be defined by a function that links the runtime of the method to the input size of the problem. To express this, the widespread Big-O notation is applied here. The time complexity of AFT can be given as follows:

$$\begin{aligned} {\mathcal {O}}({\rm AFT}) = {\mathcal {O}}({\rm problem\;definition}) + {\mathcal {O}}({\rm initialization}) + {\mathcal {O}}(t\,({\rm position\;update})) + {\mathcal {O}}(t\,({\rm cost\;function})) + {\mathcal {O}}(t\,({\rm boundary\;control})) \end{aligned}$$
(12)

As Eq. 12 suggests, the time complexity of AFT relies on the number of iterations (t), the number of thieves (n), the dimension of the problem (d) and the cost of the objective function (c). In concrete terms, the overall time complexity of AFT under the termination method can be computed as follows:

$$\begin{aligned} {\mathcal {O}}({\rm AFT}) = {\mathcal {O}}(1) + {\mathcal {O}}(nd) + {\mathcal {O}}(vtnd) + {\mathcal {O}}(vtnc) + {\mathcal {O}}(vtnd) \end{aligned}$$
(13)

where v denotes the number of evaluation experiments.

The number of iterations (t) is typically greater than the number of thieves (n), the cost of the fitness function (c) and the number of the problem's variables (d). Also, d and c are usually smaller than n. Accordingly, the parameters t and n are the dominant factors in assessing the computational complexity. As \(nd \ll tnd\) and \(nd \ll tcn\), the terms \({\mathcal {O}}(1)\) and \({\mathcal {O}}(nd)\) can be dropped from Eq. 13; also, \(2vtnd \cong vtnd\). Therefore, the time complexity of AFT can be expressed as follows:

$$\begin{aligned} {\mathcal {O}}({\rm AFT}) \cong {\mathcal {O}}(vtnd + vtnc) \end{aligned}$$
(14)

As shown, the time complexity of AFT is of polynomial order, so AFT can be deemed an efficient meta-heuristic optimization algorithm.

4.7 Characteristics of AFT

Human-based algorithms possess two abilities, exploration and exploitation, used to search the space of a problem. In AFT, these abilities are realized by the convergence of the thieves toward the global optimum solution. To be precise, convergence means that most of the thieves gather at the same position in the search space. AFT utilizes several parameters that drive exploration and exploitation, as explained in Subsects. 4.4 and 4.5, respectively; these parameters carry out the convergence process of AFT. The thieves (i.e., search agents) in AFT change their positions according to a mathematical model and tuning criteria implemented by three basic cases that may occur while the thieves search for Ali Baba. These cases are presented in Eqs. 5, 10 and 11. In each case, it is assumed that the thieves search the surrounding environment efficiently, while there is also a percentage of moves, due to Marjaneh's intelligence, that forces the thieves to search random locations. Two important parameters in AFT, referred to as tracking distance and perception potential, are presented in Eqs. 8 and 9, respectively. With these two parameters, AFT can better search the space of all possible solutions to identify the optimal or suboptimal ones. Another important ingredient of AFT is the simulation of Marjaneh's intelligent ways of deceiving the thieves, whereby the thieves explore the search space at different locations and in different directions, which implies that better solutions may be found in other promising areas.

In short, AFT has several distinct merits according to its basic principles, summarized as follows: (1) the position updating models for cases 1 and 3 effectively assist the individuals of the population to explore and exploit every area of the search space; (2) the random search that the thieves perform using case 2 not only enhances the diversity of the population but also maintains the speed of convergence, indicating an efficient balance between exploration and exploitation; (3) the number of parameters in AFT is small, yet they have a good ability to improve its strength and performance; (4) the computational burden of AFT is low, as discussed in Subsect. 4.6.

Consequently, there is considerable room for enhancing the performance of AFT based on the above-mentioned characteristics, as presented in the following section.

5 Possible developments of AFT

To further study the potential performance of the AFT algorithm, it is elaborated from several aspects as shown in the following subsections.

5.1 Self-adaptation of tracking distance of AFT

Self-adaptive tracking distance is used to strike a better balance between exploration and exploitation during the search process [70]. This distance decreases as a function of time, indicating that the exploration capacity gradually fades out while the exploitation capacity fades in. However, search agents trapped in a local optimum require reasonable exploration to escape from it. Some search agents require a large tracking distance to explore the search space, while others exploit a local area with a small tracking distance. It is therefore imperative for each search agent to have its own tracking distance to balance exploration and exploitation. When the fitness value of a search agent worsens or stays unaltered, it denotes that the search agent has likely identified a local optimal area; in this case, the search agent requires a large tracking distance to move away from that area. When the fitness value of a search agent improves, it indicates that the search agent has a superior chance of getting close to the optimal solution. Hence, the tracking distance, Td(t), of the search agents of AFT should be increased. Figure 3 presents an illustration of the self-adaptive tracking distance [70].

Fig. 3 Illustration of self-adaptive tracking distance in AFT [70]

In Fig. 3, the pentagram, sphere, circle and arrowhead stand for the global optimum area, the local optimum area, a search agent and the tracking distance, respectively. Case 1 shows that if a search agent is trapped in a local optimal area, it needs an appropriate tracking distance to strengthen its ability to escape from this area. Case 2, on the other hand, indicates that a search agent rapidly moves to the global optimal area with an increasing tracking distance. In this context, to estimate the situation of the search agents in AFT, the counters \(ns_i\) and \(nf_i\) are presented in Eqs. 15 and 16, respectively, to record the fitness results of the ith search agent [70].

$$\begin{aligned} ns_{i}^{t} = {\left\{ \begin{array}{ll} ns_{i}^{t-1}+1 & {\rm if}\;\; f_i(t) < f_i(t-1)\\ 0 & {\rm if}\;\; f_i(t) \ge f_i(t-1) \end{array}\right. } \end{aligned}$$
(15)
$$\begin{aligned} nf_{i}^{t} = {\left\{ \begin{array}{ll} nf_{i}^{t-1}+1 & {\rm if}\;\; f_i(t) > f_i(t-1)\\ 0 & {\rm if}\;\; f_i(t) \le f_i(t-1) \end{array}\right. } \end{aligned}$$
(16)

Equations 15 and 16 adjust the parameter Td(t) according to the fitness of the objective function, which is employed to estimate the search condition of the search agents. If a search agent fails to obtain a better solution over many iterations, it is stuck in a local optimal area with high probability. If a search agent improves over many iterations, it may be migrating toward the global optimum.
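The success and failure counters of Eqs. 15 and 16 can be sketched directly; the helper name `update_counters` and the sample fitness trace are illustrative only.

```python
def update_counters(ns, nf, f_t, f_prev):
    """Success/failure counters of Eqs. 15 and 16 for one search agent:
    ns counts consecutive improvements, nf consecutive deteriorations;
    either counter resets to 0 when its condition fails."""
    ns = ns + 1 if f_t < f_prev else 0   # Eq. 15
    nf = nf + 1 if f_t > f_prev else 0   # Eq. 16
    return ns, nf

# A steadily improving fitness trace increments ns and keeps nf at 0
ns, nf = 0, 0
for f_prev, f_t in [(5.0, 4.0), (4.0, 3.5)]:
    ns, nf = update_counters(ns, nf, f_t, f_prev)
print(ns, nf)
```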

A threshold \(\theta\) and a probability p are applied to govern the update of the tracking distance of the search agents over the course of iterations. If \(ns^t_i\) exceeds \(\theta\), the tracking distance is increased to speed up the convergence of the ith search agent toward the best search agent. Likewise, if \(nf^t_i\) exceeds \(\theta\), the tracking distance is increased to improve the ability to avoid local optima while searching. The self-adaptive \({\rm Td}_i(t)\) of the ith search agent is defined in Eq. 17:

$$\begin{aligned} {\rm Td}_{i}(t)={\left\{ \begin{array}{ll} {\rm Td}_{i}(t)\cdot r_i(t) & {\rm if}\;\; {\rm count}> \theta \;\, {\rm and}\;\, {\rm rand} < p\\ {\rm Td}_{i}(t) & {\rm otherwise} \end{array}\right. } \end{aligned}$$
(17)

where count comprises two counters \(ns^t_i\) and \(nf^t_i\), and rand is a uniformly distributed random value in the interval [0, 1].

In Eq. 17, when count exceeds \(\theta\) and rand is below p, the search agent requires a large tracking distance to strengthen its exploration capability, so the tracking distance is multiplied by \(r_i(t)\). Otherwise, the tracking distance is left at its current AFT value.

The tracking distance of the search agents is related to the distance at which the thieves follow Ali Baba, as given in Eq. 8. There is also another parameter related to the thieves' perception potential for Ali Baba. Therefore, the tracking distance and perception potential of the ith search agent are used to adapt its \({\rm Td}_i(t)\): the ratio of these two constants can be thought of as the new tracking distance value for the search agent. Here, \(r_i(t)\) stands for the adjustment factor of \({\rm Td}_i(t)\) and is presented in Eq. 18:

$$\begin{aligned} r_{i}(t)={\left\{ \begin{array}{ll} \frac{2}{c} & {\rm if}\;\; c < 1\\ c & {\rm if}\;\; c \ge 1 \end{array}\right. } \end{aligned}$$
(18)

where c is the ratio of the tracking distance and perception potential constants, as shown in Eq. 19.

$$\begin{aligned} c= \left| \log \left( \frac{{\rm Td}_i(t)}{{\rm Pp}_i(t)}\right) \right| \end{aligned}$$
(19)

where \({\rm Td}_i(t)\) and \({\rm Pp}_i(t)\) are the tracking distance and perception potential of search agent \(x_i\) at iteration t, respectively.

Equation 18 states that \(r_i(t)\) is set to \(2/c\) when c is less than 1, to ensure that the search agent obtains a reasonably large tracking distance; otherwise, \(r_i(t)\) is set to c.
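Equations 17-19 combine into a short update rule. The sketch below assumes illustrative values for \(\theta\), p and the current Td and Pp; the function name `adapt_td` is hypothetical.

```python
import math
import random

def adapt_td(td, pp, count, theta=3, p=0.5, rng=random.random):
    """Self-adaptive tracking distance of Eqs. 17-19.
    td, pp: current tracking distance and perception potential;
    count: the relevant counter (ns or nf) of the agent."""
    c = abs(math.log(td / pp))           # Eq. 19
    r = 2.0 / c if c < 1 else c          # Eq. 18
    if count > theta and rng() < p:      # Eq. 17
        return td * r                    # enlarge the tracking distance
    return td                            # keep the current value

random.seed(1)
td = adapt_td(td=0.8, pp=0.4, count=5)
print(td)
```

Here c = |log(0.8/0.4)| = log 2 < 1, so the agent's tracking distance is scaled up by 2/c.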

The original and self-adaptive tracking distance constants are presented in Figs. 1 and 4, respectively.

Fig. 4 Proposed exponential function for the self-adaptive tracking distance

The tracking distance constant in Fig. 4 changes in accordance with the search condition of the search agents, where the red and blue lines indicate the modifications of Case 1 and Case 2, respectively. A search agent is in a state of failure when its search status worsens; in contrast, it is in a state of success when its search status improves.

In sum, during the search process, various search agents are in different search situations. Some of them need vigorous exploration to explore the solution space, while others need extensive exploitation to locate a better solution. Therefore, each search agent adjusts its own Td to balance exploration and exploitation.

5.2 Population hierarchical structure of AFT

In a hierarchical population structure, the search agents are arranged in different layers according to specific characteristics [71, 72]. These layers can be viewed as levels ordered from top to bottom according to the actual effect of each layer on the search agents: the top layer leads the layer below it, which in turn leads the next one, and so on. In this light, the interactive relationship between the layers forms a hierarchical structure that guides the evolution direction of the search agents [72]. Here, a hierarchical population structure is used in the AFT algorithm so that premature convergence can be mitigated and the search agents can elicit correct interactive information to realize a better development. Further, the search agents obtain ample opportunity to escape from local optima and approach the global optimum. A three-layer hierarchical structure is constructed for hierarchical AFT (HAFT) as follows:

  • Bottom layer The distribution of all search agents in the current population is displayed on this layer. Search agents move toward better ones, namely the best search agents on the middle layer. This layer can reveal the landscape of the function sampled by a large number of search agents.

  • Middle layer To effectively guide the development of the general search agents, a predetermined number of best search agents are ranked on this layer. At each iteration, the middle layer leads the bottom layer to fulfill the position update of the search agents. Each search agent needs a large tracking distance and a small perception potential to globally explore the whole search area in the first few iterations of the search process, and these parameters are gradually updated over the iterations. This means that the exploration ability of AFT needs to be supported by a suitably large Td(t), while its exploitation capacity demands a small one. Thus, to improve the exploration ability of AFT, a new Td(t) is presented to supersede the original one given in Eq. 8. In the proposed HAFT, a log-sigmoid transfer function is used to design the new constant Td(t) with the formula given below:

    $$\begin{aligned} {\rm Td}(t) = \frac{T_{0}}{1+ e^{(t-T/2)/L}} \end{aligned}$$
    (20)

    where L is a step length and T is the maximum number of iterations. It can be observed that the value of Td(t) in the graph shown in Fig. 1 decreases rapidly before 500 iterations, indicating that the exploration ability of AFT diminishes quickly. In contrast, the value of Td(t) in Fig. 5 consistently preserves a large value before 500 iterations and then drops rapidly to near zero. This ensures a powerful exploration ability of AFT in the early stage, giving it enough time to find an approximate optimum that can be further improved by the subsequent exploitation. The number of best search agents on this layer dynamically decreases over the iterations, which helps the global optimum search agent on the top layer effectively direct the many elite search agents in the current population and provide a better development tendency for all search agents on the bottom layer.

  • Top layer To provide efficient management for the middle layer, a global optimum search agent is determined and placed on this layer. At each iteration, the pre-identified best search agents on the middle layer are compared with the global optimum search agent; if a better search agent exists, the global optimal one is replaced by it. The global optimal search agent has the best position in the current population and hence attracts many of the best search agents toward it. This strategy can prevent the best search agents from being trapped in local optima and can accelerate the convergence speed of the population. According to this strategy, the formula for updating the positions of the best search agents, oriented by the global optimal search agent, is presented as follows:

    $$\begin{aligned} x^{i}_{t+1} = g_{{\rm opt}_{t}} - \Bigl [{\rm Td}_{t} \left( y^{i}_{t} - m^{i}_{t} \right) r_{2} \Bigr ] \mathop {\mathrm {sgn}}(0.5-r); \quad r_{3} < 0.5 \end{aligned}$$
    (21)

where \(g_{\rm opt}\) denotes the global optimal search agent and r represents a random number with a uniform distribution in the interval [0, 1].

In HAFT, Eq. 21 is used to update the positions of the search agents on the top layer, which can efficaciously alleviate the early convergence of AFT and improve its performance.
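The log-sigmoid tracking distance of Eq. 20 can be sketched as follows; the midpoint is taken at T/2 with T the maximum iteration count, and the values of `T0` and `L` are illustrative assumptions, not tuned settings from the paper.

```python
import math

def td_sigmoid(t, T, T0=2.0, L=50.0):
    """Log-sigmoid tracking distance of Eq. 20 (HAFT): stays near T0 in
    the early iterations, passes T0/2 at t = T/2 and decays toward 0."""
    return T0 / (1.0 + math.exp((t - T / 2) / L))

T = 1000
early = td_sigmoid(10, T)       # early stage: strong exploration
mid = td_sigmoid(T // 2, T)     # midpoint: exactly T0/2
late = td_sigmoid(990, T)       # late stage: near-zero, exploitation
print(round(early, 3), round(mid, 3), round(late, 3))
```

The flat early plateau is what distinguishes this schedule from the rapidly decaying original Td(t) of Fig. 1.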

In order to further clarify the apparent properties of HAFT, Fig. 6 was drawn to show its operating precept on the multimodal landscape with local optima.

Fig. 5 The new tracking distance parameter Td(t)

Fig. 6 The illustrative diagrams of the operating precept of HAFT [71]

It is obvious from Fig. 6a that search agents \(x_3\) and \(x_4\) are heading toward a local optimum, while the global optimal search agent \(g_{\rm opt}\) provides additional guidance to assist them in escaping from it. When search agents \(x_3\) and \(x_4\) do not fall into premature convergence, as shown in Fig. 6b, they can draw others to move toward them. Meanwhile, \(g_{\rm opt}\) accelerates the movement of \(x_3\) and \(x_4\) in order to improve the convergence rate of the population. Thus, the capabilities of exploration and exploitation are both enhanced in the search process. More details about hierarchical population structures can be found in [71, 72].

5.3 Exploration and exploitation

Exploration and exploitation are the two most important properties of meta-heuristic algorithms for achieving success when addressing optimization problems [38]. With regard to these two concepts, empirical experiments have shown that there is a strong relationship between the exploration and exploitation abilities of a particular search method and its convergence speed. In particular, while exploitation procedures are known to improve convergence toward the global optimum, they are also known to raise the likelihood of entrapment in local optima [38]. Conversely, search strategies that promote exploration over exploitation tend to increase the likelihood of locating the areas of the search space where the global optimum is more probable to be found, at the cost of deteriorating the convergence speed of the optimization algorithm [39]. In recent years, the question of how exploration and exploitation are realized in meta-heuristics has remained open and, although it appears trivial, has stayed a source of contention among many researchers [40]. Although many thoughts and notions may sound opposed, there appears to be a common consensus within the research community that an adequate ratio between exploration and exploitation is necessary to ensure reasonable performance of this type of search method.

Meta-heuristics use a set of candidate solutions to explore the search area with the goal of finding satisfactory solutions to an optimization problem. Generally, the search agents with the superior solutions tend to guide the search process toward themselves. As a result of this attraction, the distance between the search agents shrinks and the impact of exploitation grows. Conversely, when the distance between the search agents increases, the influence of the exploration strategy is more pronounced. To quantify the increase and decrease in distance between the search agents, a diversity measurement [73] is taken into account. Under this method, population diversity is stated as follows [38]:

$$\begin{aligned} {\rm Div}_j = \frac{1}{N}\sum _{i=1}^{N} \left| {\rm median}(x^j) - x_i^j\right| \end{aligned}$$
(22)
$$\begin{aligned} {\rm Div} = \frac{1}{m}\sum _{j=1}^{m}{\rm Div}_j \end{aligned}$$
(23)

where \({\rm median}(x^j)\) represents the median of dimension j over the entire population, \(x^j_i\) is dimension j of search agent i, N stands for the number of search agents and m denotes the number of design variables of the optimization problem.

The diversity in each dimension, \({\rm Div}_j\), is the average distance between dimension j of each search agent and the median of that dimension. The balance response is defined as the percentage of exploration and exploitation utilized by a given algorithm. These values are calculated at each iteration using the following formulas [38]:

$$\begin{aligned} {\rm XPL}\% = \left( \frac{{\rm Div}}{{\rm Div}_{\rm max}}\right) \times 100 \end{aligned}$$
(24)
$$\begin{aligned} {\rm XPT}\% = \left( \frac{\left| {\rm Div}-{\rm Div}_{\rm max}\right| }{{\rm Div}_{\rm max}}\right) \times 100 \end{aligned}$$
(25)

where \({\rm Div}_{\rm max}\) stands for the maximum diversity value present in the whole optimization process.

The percentage of exploration (XPL%) corresponds to the relationship between the diversity at each iteration and the maximum diversity reached. The percentage of exploitation (XPT%) represents the level of exploitation [38]. As can be observed, the elements XPL% and XPT% are mutually conflicting and complementary. In assessing the balance response, the use of the median value averts discrepancies by providing a robust reference element. The balance response is also affected by \({\rm Div}_{\rm max}\), the maximum diversity found during the whole optimization process, which is employed as a reference to assess the rate of exploration and exploitation.
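Equations 22-25 translate into a few lines of array code. The population below and the value of \({\rm Div}_{\rm max}\) are toy inputs chosen for illustration; the function name `balance_response` is hypothetical.

```python
import numpy as np

def balance_response(pop, div_max):
    """Population diversity (Eqs. 22-23) and the exploration/exploitation
    percentages (Eqs. 24-25). pop has shape (N, m): N agents, m variables."""
    div_j = np.mean(np.abs(np.median(pop, axis=0) - pop), axis=0)  # Eq. 22
    div = float(np.mean(div_j))                                    # Eq. 23
    xpl = div / div_max * 100.0                                    # Eq. 24
    xpt = abs(div - div_max) / div_max * 100.0                     # Eq. 25
    return div, xpl, xpt

# Three agents spread along the diagonal of a 2-variable space
pop = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]])
div, xpl, xpt = balance_response(pop, div_max=2.0)
print(div, xpl, xpt)
```

As the agents cluster, Div shrinks, so XPL% falls and XPT% rises, mirroring the conflicting-and-complementary behavior described above.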

5.4 Chaos in meta-heuristic algorithms

Chaos theory is one of the most effective strategies used to improve the performance of meta-heuristics by fostering their exploration and exploitation features. Chaos is the irregular motion often encountered in nonlinear dynamic systems [74, 75]: seemingly random, unexpected behavior that a deterministic nonlinear system can present under deterministic conditions. A chaotic system thus changes irregularly and, given a long enough time period, ultimately passes through every state in its space. The applications of chaos in global optimizers fall into two categories.

5.4.1 Chaotic maps and sequences

Chaotic maps are one of the preferable ways to reinforce the performance of meta-heuristics in terms of both local optima avoidance and convergence. They are widely used to improve population diversity and solution quality by substituting for random values and adjusting parameters in the population initialization and iterative loop procedures [74, 75]. Chaotic properties have been used in improved and new meta-heuristics, such as EAs [74, 75], the immune system algorithm [76], PSO [77] and DE [78]. These chaotic meta-heuristics have achieved a high level of performance through the use of chaotic sequences to replace random variables and parameters; in this way, they have presented superb performance compared to the corresponding standard meta-heuristics.

5.4.2 Chaotic local search

Chaotic local search (CLS) is an applicable option that makes use of the randomness and ergodicity of chaos [74, 75]. Chaotic search is a mechanism that can be applied to improve search accuracy and convergence speed. For this reason, CLS has been integrated with several meta-heuristic algorithms and has achieved splendid success in enhancing their performance, for example in chaotic PSO [79], chaotic DE [80] and chaotic GSA [81]. The reported outcomes show that CLS can prominently strengthen search capacity and reduce the chance of getting stuck in local optima. It has been widely demonstrated that meta-heuristics with CLS achieve better performance, in terms of convergence rate and solution accuracy, than the corresponding original versions [74, 75].
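A minimal CLS sketch follows: the current best solution is perturbed with a logistic-map sequence inside a gradually shrinking neighbourhood, keeping any improvement. The function name, shrink factor and radius are illustrative assumptions, not a specific published CLS variant.

```python
def chaotic_local_search(best, fitness, radius=0.1, steps=20, x0=0.7):
    """Greedy chaotic local search around `best`: a logistic-map variable
    z drives perturbations in [-radius, radius], and the radius shrinks
    each step to focus the search."""
    x, f_best = list(best), fitness(best)
    z = x0
    for _ in range(steps):
        z = 4.0 * z * (1.0 - z)                     # chaotic variable in (0, 1)
        cand = [xi + radius * (2.0 * z - 1.0) for xi in x]
        f_cand = fitness(cand)
        if f_cand < f_best:                         # keep only improvements
            x, f_best = cand, f_cand
        radius *= 0.95                              # gradually narrow the search
    return x, f_best

x, f = chaotic_local_search([0.3, -0.2], lambda v: sum(t * t for t in v))
print(f)
```

Because candidates are accepted only when they improve, the returned cost can never exceed that of the starting point.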

5.5 Theoretical analysis of the AFT algorithm

To theoretically analyze the performance of the AFT algorithm from the perspective of complex networks, a relationship among its search agents needs to be established. This analysis is helpful and vital to explain the essence of the algorithm and to discover guiding methods that overcome its limitations in order to better foster its performance [82]. For this purpose, we use the population interaction network (PIN) method reported in [83, 84] to establish the relationship between the search agents of AFT and to explore and analyze the intrinsic phenomena that occur in a complex network. A clustering method is used to classify the search agents [83, 84]: each search agent is regarded as a vertex, and the position update mechanism between search agents generates the edges. The PIN method can be used to obtain both the intrinsic connection of knowledge and the characteristics of the network formed by the population. The way the interaction of the population in AFT is composed is displayed in Fig. 7.

Fig. 7 A descriptive schematic diagram of the PIN method in the AFT algorithm [83, 84]

In Fig. 7, the blue circles, transparent circles, transparent rectangles, blue rectangles and blue diamonds represent the search agents (i.e., vertices) in the current population, the clusters, the search agents \(x_p\) that will be replaced, the search agents that have been replaced and the newly created search agent \(U_n\), respectively. The circle, square and triangle denote the current, newly constructed and old substituted vertices, respectively. It can be noticed from Fig. 7 that the distribution of search agents changes in the whole population and that the number of search agents in each class changes accordingly. In sum, the initial construction process of PIN can be described as follows:

  1. There are three classes and nine basic search agents in the population;

  2. Two search agents, \(x_{s1}\) and \(x_{s2}\), chosen from two classes yield the newly created search agent \(U_{n}\), which is compared with the previous search agent \(x_{p}\). A vertex and two edges are created at the same time;

  3. If the search agent \(U_n\) outperforms \(x_p\), i.e., \(f(U_n) < f(x_p)\), then \(U_n\) replaces \(x_p\);

  4. Another search agent \(x_s\) is chosen from one class to create \(U_n\) to replace \(x_p\), which creates a vertex and an edge;

  5. The replacement is carried out once more when \(f(U_n) < f(x_p)\);

  6. At the next iteration, the clustering method classifies the search agents into three classes again; this continues until the algorithm terminates and the PIN topology is obtained.

Readers are referred to [83, 84] for a detailed description of the PIN method.

6 Comparative analysis of AFT with other meta-heuristics

This section presents a comparative analysis of AFT with other meta-heuristics such as PSO, GSA, DE, GA, covariance matrix adaptation-evolution strategy (CMA-ES) and ant colony optimization (ACO) algorithm.

6.1 Particle swarm optimization

PSO [85] mimics the cooperative social behavior of living creatures such as flocks of birds. Optimization begins with randomly generated solutions known as artificial particles, each of which has a randomly generated velocity. If \(x_i\) is the position of the ith particle with velocity \(v_i\), then the position updating strategy of PSO is given as follows [86]:

$$\begin{aligned} v_i(t + 1)&= wv_i(t) + c_1(Pbest_i - x_i(t)) r_{1} \\&\quad + c_2(Gbest - x_i(t)) r_{2} \end{aligned}$$
(26)
$$\begin{aligned} x_i(t + 1)= x_i(t) + v_i(t + 1) \end{aligned}$$
(27)

where w is the inertia weight, \(c_1\) and \(c_2\) are the cognitive and social constants, respectively, \(r_{1}\) and \(r_{2}\) are uniformly distributed random numbers in the interval [0, 1], \(Pbest_i\) is the local best solution of the ith particle and Gbest is the global best solution among all particles.
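Equations 26 and 27 can be sketched in a few lines; the parameter values, the toy swarm and the function name `pso_step` are illustrative assumptions.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One PSO iteration per Eqs. 26-27: the new velocity blends inertia,
    a pull toward each particle's personal best and a pull toward the
    swarm's global best; positions are then shifted by the velocity."""
    if rng is None:
        rng = np.random.default_rng(0)
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)  # Eq. 26
    return x + v_new, v_new                                        # Eq. 27

x = np.zeros((4, 2))                 # 4 particles in 2 dimensions
v = np.zeros((4, 2))
pbest = np.ones((4, 2))              # assumed personal bests
gbest = np.array([1.0, 1.0])         # assumed global best
x, v = pso_step(x, v, pbest, gbest)
print(x.shape)
```

With zero initial velocity, every particle moves strictly toward the shared best at (1, 1), illustrating the single-strategy update that AFT's three-case mechanism is contrasted against below.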

6.1.1 AFT versus PSO

Similar to PSO, AFT initiates the optimization process by motivating the search agents to move through the search space in search of their target. However, the position updating mechanism of AFT is entirely different from that of PSO. Some of the main differences are described as follows:

  1. In PSO, the movement update of the ith particle is driven by \(Pbest_i\) and Gbest as given in Eq. 26, where the effect of these two parameters determines the new position of the particles in the search space. In AFT, the new positions of the search agents are obtained through three different cases, as given in Eqs. 5, 10 and 11. In other words, PSO updates all solutions with one strategy, presented in Eq. 27, while the search agents of AFT use three strategies to update their positions in the search space.

  2. The PSO algorithm is greatly influenced by the initial values of the cognitive and social parameters as well as the weighting strategy of the velocity vector, since these parameters are used as each particle develops a new position. In the AFT model, the thieves develop new positions with the help of the tracking distance, which is adapted during the iterative loops. This enables AFT to alternate between local and global searches.

  3. The behavior of the thieves' movement is affected by the information given by someone about the whereabouts of Ali Baba's house, which is modeled with a random number (\(r_{3}\)); accordingly, case 1 or case 3, shown in Eqs. 5 and 11, is used, respectively. The inclusion of this random number in the AFT model suddenly redirects the thieves' movement and thus improves exploration and exploitation in AFT. PSO does not use such behavior.

  4. The simulation of the thieves' behavior in Eq. 10 provides an opportunity for random movement of the thieves. This enables AFT to avoid getting stuck in local optimal areas. This behavior is not used in PSO due to the natural behavior of swarms.

  5. The use of Marjaneh's intelligence, formulated in Eq. 7, improves the exploration feature of AFT; there is no such mechanism in the PSO algorithm.

  6. The use of tracking distance and perception potential in AFT enables it to conduct local searches in local areas at some times and global exploration of the search space at others, as well as to strike an appropriate balance between exploration and exploitation. These two parameters are not present in the PSO algorithm.

6.2 Gravitational search algorithm

Gravitational search algorithm is a physics-based algorithm evolved on the basis of the law of gravity [44]. Each individual (i.e., agent) evolves its position according to the gravitational force among individuals. The mechanism of GSA is based on the interaction of masses in the universe by means of the Newtonian law of gravitation. To describe GSA, consider a system with N masses (i.e., agents), where the position of the ith mass is defined as follows:

$$\begin{aligned} X_i = (x^1_i , x^2_i , \ldots , x^d_i )\;\;\;\;\; i\in 1, 2, 3, \ldots , N \end{aligned}$$
(28)

where \(x^d_i\) denotes the position of the ith mass in the dth dimension and d represents the total number of dimensions in the search space.

The mass of the ith agent is computed after calculating the fitness of the current populations, which is defined as follows:

$$\begin{aligned} m_i(t)= & {} \frac{f_i(t)-{\rm worst}(t)}{{\rm best}(t)-{\rm worst}(t)} \end{aligned}$$
(29)
$$\begin{aligned} M_i(t)= & {} \frac{m_i(t)}{\sum _{j=1}^{N}m_j(t)} \end{aligned}$$
(30)

where \(M_i(t)\) and \(f_i(t)\) represent the mass and fitness values of the ith agent at iteration t, respectively, and best(t) and worst(t) represent the best and worst fitness values of the current population in the tth iteration, respectively. For a minimization problem, worst(t) is defined as follows:

$$\begin{aligned} {\rm worst}(t) = \max \; fit_j(t), \;\;\;\;\;\;\;\; j \in \left\{ 1, 2, \ldots , N\right\} \end{aligned}$$
(31)

The gravitational force between agents \(X_i\) and \(X_j\) in the dth dimension can be computed as follows:

$$\begin{aligned} F_{ij}^d(t)= G(t) \frac{M_i(t) \times M_j(t)}{R_{ij}(t)+\epsilon } \left( x_j^d(t) -x_i^d(t) \right) \end{aligned}$$
(32)

where \(R_{ij}(t)\) stands for the Euclidean distance between agents i and j, \(\epsilon\) is a small value used to eschew division by zero and G(t) is a gravitational constant given as a function of time as shown below:

$$\begin{aligned} G(t) = G_0 \times e^{-\alpha \frac{t}{T}} \end{aligned}$$
(33)

where \(G_0\) represents an initial value, \(\alpha\) represents a constant value, and t and T represent the current iteration number and the maximum number of iterations, respectively. The total gravitational force \(F^d_i(t)\) for agent \(X_i\) is given as follows:

$$\begin{aligned} F_i^d(t) = \sum _{j \in Kbest, j \ne i}{\rm rand}_iF_{ij}^d(t) \end{aligned}$$
(34)

where Kbest refers to the set of the first K agents with the best fitness values and biggest masses, K is the number of agents in Kbest and \({\rm rand}_i\) is a uniformly distributed random number in the range [0, 1].

Hence, the acceleration \(a^d_i(t)\) of agent \(X_i\) in the dth dimension at time t can be computed using a law of motion as shown in Eq. 35.

$$\begin{aligned} a_i^d(t)= \frac{F_i^d(t)}{M_i(t)} \end{aligned}$$
(35)

Then, the velocity \(v^d_i(t + 1)\) and position \(x^d_i(t + 1)\) of agent \(X_i\) are updated, respectively, as follows:

$$\begin{aligned} v_i^d(t+1)= & {} {\rm rand} _i \times v_i^d (t) + a_i^d(t) \end{aligned}$$
(36)
$$\begin{aligned} x_i^d(t+1)= & {} x_i^d (t) + v_i^d(t+1) \end{aligned}$$
(37)

where \({\rm rand}_i\) is a uniformly distributed random value in the interval from 0 to 1.
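The update equations above (Eqs. 29–37) can be sketched in Python as follows. The values of \(G_0\) and \(\alpha\), and the linearly shrinking Kbest schedule, are common choices from the GSA literature rather than prescriptions of this text, so treat them as assumptions:

```python
import numpy as np

def gsa_step(X, V, fitness, t, T, G0=100.0, alpha=20.0, eps=1e-10, rng=None):
    """One GSA iteration following Eqs. 29-37 (minimization).

    X: (N, d) positions, V: (N, d) velocities, fitness: (N,) values.
    G0, alpha and the shrinking-Kbest schedule are illustrative defaults.
    """
    rng = np.random.default_rng() if rng is None else rng
    N, d = X.shape
    best, worst = fitness.min(), fitness.max()
    m = (fitness - worst) / (best - worst - eps)         # Eq. 29
    M = m / (m.sum() + eps)                              # Eq. 30
    G = G0 * np.exp(-alpha * t / T)                      # Eq. 33

    # Kbest shrinks linearly from N agents down to 1 over the run.
    K = max(1, int(round(N - (N - 1) * t / T)))
    kbest = np.argsort(fitness)[:K]

    F = np.zeros_like(X)
    for i in range(N):
        for j in kbest:
            if j == i:
                continue
            R = np.linalg.norm(X[i] - X[j])
            # Eq. 32 combined with the random weighting of Eq. 34
            F[i] += rng.random() * G * (M[i] * M[j] / (R + eps)) * (X[j] - X[i])

    a = F / (M[:, None] + eps)                           # Eq. 35
    V = rng.random((N, d)) * V + a                       # Eq. 36
    return X + V, V                                      # Eq. 37
```

Note that the force on agent i carries the factor \(M_i\) while the acceleration of Eq. 35 divides by it, so the two largely cancel, as in the original GSA formulation.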

6.2.1 AFT versus GSA

  1. As shown in Eqs. 5, 10 and 11, the position updating mechanism of the search agents of AFT is totally different from that of GSA as defined in Eq. 37.

  2. GSA uses acceleration and velocity vectors for the movement of its agents, while AFT generates new directions of movement for its search agents by various mechanisms.

  3. AFT has its own adaptive parameters, such as \({\rm Td}_t\) and \({\rm Pp}_t\). GSA does not use these parameters; instead, it has its own parameters such as \(\epsilon\) and \(R_{ij}(t)\).

  4. The AFT algorithm incorporates the concept of random movement of the thieves based on the parameter \(Pp_t\), and uses Marjaneh’s intelligence in its position updating mechanism. Obviously, the generation of new solutions by AFT is very different from the update mechanism of the agents of GSA.

6.3 Conventional differential evolution algorithm

The differential evolution (DE) algorithm is a population-based evolutionary algorithm developed to solve real-valued optimization problems [87]. The evolutionary process of DE involves mutation, crossover and selection strategies similar to those used by GAs. The initialization of each individual \(X_i , i \in \left\{ {1, 2, \ldots , {\rm NP}}\right\}\) in DE is described as follows:

$$\begin{aligned} X_i^d= X_i^{l} + {\rm rand} (0, 1) \cdot (X_i^{u} - X_i^{l}) \end{aligned}$$
(38)

where NP is the population size, \(d \in \left\{ {1, 2, \ldots , D}\right\}\) denotes the dimension of the problem, u and l represent the upper and lower bounds of \(X_i\) in the dth dimension, respectively.

The mutation strategy of DE creates a mutant vector as an intermediate variable \(V_i\) for evolution according to:

$$\begin{aligned} V_i= X_{r1} + F \cdot (X_{r2}-X_{r3}) \end{aligned}$$
(39)

where \(r_1, r_2\) and \(r_3 \in \left\{ {1, 2, \dots , {\rm NP}}\right\}\) are mutually distinct random indices with \(i \ne r_1 \ne r_2 \ne r_3\), and F is a constant scaling factor that controls the amplification of the differential variation.

The crossover strategy of DE boosts diversity by combining the original agent \(X_i\) with the intermediate variable \(V_i\) to form a new agent \(U_i\), defined as follows:

$$\begin{aligned} U_{i}^{d}={\left\{ \begin{array}{ll} V_{i}^{d} &{} {\rm if} \;\; {\rm rand}(0, 1) \le {\rm CR} \;\; {\rm or} \;\; d=d_{\rm rand}\\ X_{i}^{d} &{} {\rm otherwise} \end{array}\right. } \end{aligned}$$
(40)

where CR represents a crossover control parameter and \(d_{\rm rand} \in \left\{ 1, 2, \ldots , D\right\}\) denotes a randomly chosen dimension index.

The selection process in DE is performed in each iteration by comparing \(U_i\) with \(X_i\) using a greedy criterion, so that the better agent is retained in the population for the next iteration. Through these evolutionary processes, DE can converge rapidly and eventually reach the global optimum.
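Putting Eqs. 38–40 and the greedy selection together, one generation of the classical DE/rand/1/bin scheme can be sketched as follows; the F and CR defaults are illustrative, not values prescribed by the text:

```python
import numpy as np

def de_generation(pop, fitness, func, F=0.5, CR=0.9, rng=None):
    """One DE/rand/1/bin generation following Eqs. 38-40 plus the
    greedy selection step (minimization). F and CR are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    NP, D = pop.shape
    new_pop, new_fit = pop.copy(), fitness.copy()
    for i in range(NP):
        # Eq. 39: three mutually distinct indices, all different from i
        r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
        v = pop[r1] + F * (pop[r2] - pop[r3])            # mutation, Eq. 39
        d_rand = rng.integers(D)                         # guaranteed crossover dim
        mask = rng.random(D) <= CR
        mask[d_rand] = True
        u = np.where(mask, v, pop[i])                    # crossover, Eq. 40
        fu = func(u)
        if fu <= fitness[i]:                             # greedy selection
            new_pop[i], new_fit[i] = u, fu
    return new_pop, new_fit
```

Because of the greedy selection, the best fitness in the population is non-increasing from one generation to the next.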

6.3.1 AFT versus DE

Generally speaking, since AFT is a human-based optimization algorithm, it does not need evolutionary processes such as crossover, mutation and selection. The main differences between DE and AFT can be summarized in the following points:

  1. The AFT algorithm preserves search space information over subsequent iterations, while the DE algorithm discards the information of previous generations once a new population is formed.

  2. AFT involves fewer operators to adjust and run as compared to DE, which uses several operations such as selection and crossover. Moreover, AFT utilizes a parameter representing Marjaneh’s plans, while DE does not memorize the best solution obtained so far.

  3. In DE, exploration is enhanced using crossover and selection operations, while in AFT, it is enhanced by allowing the thieves to randomly explore the search space.

  4. In DE, mutation is implemented mainly to enhance exploitation. In AFT, better exploitation is achieved with the perception probability parameter.

6.4 Genetic algorithm

GA was first put forward by Holland [88]. It is a global optimization algorithm inspired by biological mechanisms such as evolution and genetics. When using GAs, the search space is encoded into chromosomes, whereby every possible solution is coded as a chromosome (i.e., individual). Optimization with a GA begins with a group of randomly formed individuals in a population. The fitness score of each individual is computed in each generation, and the variables of the solutions are adjusted based on their fitness values. Since the best individuals are given a higher probability of participating in enhancing other solutions, the random initial solutions are very likely to be improved. Based on a fitness function, chromosomes are selected, and genetic operators such as mutation and crossover are applied to the selected chromosomes to form new ones. The idea is that these chromosomes evolve and continually create better individuals until they reach the global optimum [89].
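The generational loop described above can be illustrated with a minimal real-coded GA generation. The specific operators used here (tournament selection, arithmetic crossover, Gaussian mutation) are illustrative choices of ours, since the text does not fix a particular GA variant:

```python
import numpy as np

def ga_generation(pop, func, pc=0.9, pm=0.1, rng=None):
    """One generation of a minimal real-coded GA for minimization:
    tournament selection, arithmetic crossover, Gaussian mutation.
    Operator and parameter choices are illustrative only."""
    rng = np.random.default_rng() if rng is None else rng
    N, D = pop.shape
    fit = np.array([func(x) for x in pop])

    def tournament():
        # binary tournament: fitter of two random individuals wins
        a, b = rng.integers(N, size=2)
        return pop[a] if fit[a] <= fit[b] else pop[b]

    children = []
    while len(children) < N:
        p1, p2 = tournament(), tournament()
        if rng.random() < pc:              # arithmetic crossover
            w = rng.random()
            c = w * p1 + (1 - w) * p2
        else:
            c = p1.copy()
        if rng.random() < pm:              # Gaussian mutation
            c = c + rng.normal(0, 0.1, D)
        children.append(c)
    return np.array(children)
```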

6.4.1 AFT versus GAs

Both GAs and AFT are population-based techniques; however, the key differences between them can be summarized as follows:

  1. GA, like DE, uses crossover, mutation and selection operations; the AFT algorithm does not use these operations.

  2. The AFT algorithm uses Marjaneh’s intelligence, while GA does not use such a parameter and does not save the best solutions obtained so far.

  3. GA evolves and updates its population using crossover, mutation and selection operations, while AFT improves its exploration ability with the concept of random relocation of the thieves that is managed by a perception probability parameter.

6.5 Covariance matrix adaptation-evolution strategy

The CMA-ES [90, 91] is an evolutionary algorithm for nonlinear, non-convex optimization problems in continuous domains. Specifically, it is a second-order method that adapts a covariance matrix within an iterative procedure according to the principle of maximum likelihood, so as to increase the probability of reproducing successful search steps. At each iteration, the members of the new population are sampled from a multivariate normal distribution \({\mathcal {N}}\) with covariance \(C\in {\mathbb {R}}^{n\times n}\) and mean \(m\in {\mathbb {R}}^n\). The new individuals at generation \(i + 1\) are sampled as:

$$\begin{aligned} x^{i+1} _k \sim m^i + \sigma ^i{\mathcal {N}}\left( 0, C^i \right) \;\;\;\;\;\;\;\;\;\;\;\;\;\;k = 1,\ldots ,\lambda \end{aligned}$$
(41)

where \(\sigma ^i\) is the step size at generation i and \(x^i_k\) is the kth individual at generation i.

The \(\lambda\) sampled points are ranked in ascending order of fitness, and the best \(\mu\) points are chosen. The mean of the sampling distribution given in Eq. 41 is updated using weighted intermediate recombination of these selected points:

$$\begin{aligned} m^{i+1} = \sum _{j=1}^{\mu }\omega _j X_{j:\lambda }^{i+1} \end{aligned}$$
(42)

with

$$\begin{aligned} \sum _{j=1}^{\mu }\omega _j =1, \;\;\;\;\;\;\;\;\; \omega _1 \ge \omega _2\ge \ldots \ge \omega _{\mu }>0 \end{aligned}$$
(43)

where \(\omega _j\) are positive weights, and \(x_{j:\lambda }^{i+1}\) stands for the jth ranked individual of the \(\lambda\) sampling points \(x_{k} ^{i+1}\). In the standard CMA-ES implementation, the weights decrease logarithmically as:

$$\begin{aligned} \omega _j =\log \left(\frac{\lambda - 1}{2} + 1\right) - \log (j) \end{aligned}$$
(44)

The covariance matrix can be adapted for the next generation using a combination of rank-\(\mu\) and rank-one update as follows:

$$\begin{aligned} C^{i+1}= & {} (1 -c_{cov})C^i + \frac{c_{cov}}{\mu _{cov}} p_c^{i+1} p_c^{{(i+1)}^T} \nonumber \\+ & {} c_{cov}\left(1-\frac{1}{\mu _{cov}}\right)\sum _{j=1}^{\mu }\omega _j y_{j:\lambda }^{i+1}(y_{j:\lambda }^{i+1})^T \end{aligned}$$
(45)

where \(\mu _{cov} \ge 1\) weights the rank-one update against the rank-\(\mu\) update, \(c_{cov} \in [0, 1]\) is the learning rate for the covariance matrix update, and \(y_{j:\lambda }^{i+1} = (X_{j:\lambda }^{i+1}- m^i)/\sigma ^i\). The evolution path \(p_c^{i+1}\) and the step size \(\sigma ^i\) are updated by an adaptation formula [91].
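The sampling and recombination steps of Eqs. 41–44 can be sketched as follows. The covariance and step-size adaptation of Eq. 45 is deliberately omitted, so this is only a partial illustration of CMA-ES, not a full implementation:

```python
import numpy as np

def cmaes_sample_and_update_mean(m, sigma, C, func, lam=12, rng=None):
    """Sampling (Eq. 41) and weighted recombination of the mean
    (Eqs. 42-44) for minimization. The covariance and step-size
    adaptation of Eq. 45 is omitted; this is a partial sketch."""
    rng = np.random.default_rng() if rng is None else rng
    n = m.size
    mu = lam // 2
    # Eq. 44 raw weights, then normalized to satisfy Eq. 43
    w = np.log((lam + 1) / 2) - np.log(np.arange(1, mu + 1))
    w /= w.sum()
    # Eq. 41: draw lambda offspring from N(m, sigma^2 C)
    A = np.linalg.cholesky(C)
    X = m + sigma * (rng.standard_normal((lam, n)) @ A.T)
    order = np.argsort([func(x) for x in X])   # rank by ascending fitness
    m_new = w @ X[order[:mu]]                  # Eq. 42
    return m_new, X
```

Even with a fixed covariance matrix and step size, repeated application of this selection-and-recombination step drives the mean toward better regions of the search space.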

6.5.1 AFT versus CMA-ES

Both CMA-ES and AFT are population-based techniques; however, the major differences between them are summarized as shown below:

  1. The CMA-ES parameterizes the multivariate normal distribution \({\mathcal {N}}(\mathbf{m} , \sigma ^2C)\), which consists of three components: the mean vector m, the step size \(\sigma\) and the covariance matrix C. The AFT algorithm does not use these components; rather, it uses Eqs. 5, 10 and 11 to update the positions of its search agents.

  2. The CMA-ES uses two evolution paths that accumulate consecutive steps of the mean vector update, one for the cumulative step-size adaptation and one for the rank-one update of the covariance matrix. The AFT algorithm does not use such evolution paths.

  3. AFT uses several mechanisms to alternate between local and global search when updating its search agents’ positions, whereas the CMA-ES adapts a covariance matrix for the next generation using a combination of the rank-\(\mu\) and rank-one updates.

6.6 Ant colony optimization

ACO is a meta-heuristic algorithm that distributes the search activities to so-called “ants” [92]. The activities are split among agents with simple basic abilities that imitate, to some extent, the foraging behavior of real ants. It is crucial to underline that ACO was not developed as a simulation of ant colonies, but to employ the metaphor of artificial ant colonies as an optimization tool. At the start of processing in ACO, when there is no information about how to go from one point to another, the ants choose their paths completely at random. During processing, the intention is that if an ant has to choose between different paths at a given point, those that have been chosen heavily by the preceding ants (i.e., those with a high trail level) are chosen with a higher probability. Generally, the ACO approach tries to address an optimization problem by iterating the following two steps:

  • A pheromone model, i.e., a parameterized probability distribution over the solution space, is used to construct candidate solutions;

  • The candidate solutions are utilized to adjust the pheromone values in a manner that is deemed to bias future sampling toward higher quality solutions.

At each construction step, an ant probabilistically chooses the next solution component using the pheromone model. An ant moves from node i to node j using the following rule:

$$\begin{aligned} p_{(i, j)} = \frac{\tau _{(i, j)}^{\alpha } \cdot n_{(i, j)}^{\beta }}{\sum _{l \in \mathrm{allowed}(i)} \tau _{(i, l)}^{\alpha } \cdot n_{(i, l)}^{\beta }} \end{aligned}$$
(46)

where \(\tau _{(i, j)}\) is the pheromone value associated with edge (i, j), \(n_{(i, j)}\) is the heuristic value associated with edge (i, j), \(\alpha\) is a positive real parameter that controls the influence of the pheromone value, and \(\beta\) is a positive real parameter that controls the influence of the heuristic value.

Once a solution is built, the pheromone values are updated using Eq. 47, where the amount of pheromone each ant deposits is specified by Eq. 48.

$$\begin{aligned} \tau _{(i, j)} = (1- \rho )\tau _{(i, j)} + \delta \tau _{(i, j)} \end{aligned}$$
(47)

where \(\tau _{(i, j)}\) is the pheromone value associated with edge (i, j), \(\rho \in (0, 1]\) is the pheromone evaporation rate and \(\delta \tau _{(i, j)}\) is the amount of pheromone deposited, typically given by:

$$\begin{aligned} \delta \tau _{(i, j)}^k ={\left\{ \begin{array}{ll} 1/L_k &{} {\rm if} \; {\rm ant} \; k \; {\rm travels} \; {\rm on} \; {\rm edge} \; (i, j)\\ 0 &{} {\rm otherwise} \end{array}\right. } \end{aligned}$$
(48)

where \(L_k\) is the cost of the kth ant’s tour.
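The transition rule of Eq. 46 and the pheromone update of Eqs. 47–48 can be sketched as follows for a symmetric routing problem; the \(\alpha\), \(\beta\) and \(\rho\) defaults are illustrative values, not settings taken from the text:

```python
import numpy as np

def transition_probs(tau, eta, current, unvisited, alpha=1.0, beta=2.0):
    """Probability of moving from `current` to each node in `unvisited`,
    following Eq. 46. The alpha/beta defaults are illustrative."""
    w = (tau[current, unvisited] ** alpha) * (eta[current, unvisited] ** beta)
    return w / w.sum()

def update_pheromone(tau, tours, lengths, rho=0.5):
    """Evaporation and deposit following Eqs. 47-48 on a symmetric graph."""
    tau = (1 - rho) * tau                      # evaporation, Eq. 47
    for tour, L in zip(tours, lengths):
        # successive edges of the closed tour, including the return edge
        for i, j in zip(tour, tour[1:] + tour[:1]):
            tau[i, j] += 1.0 / L               # deposit, Eq. 48
            tau[j, i] += 1.0 / L
    return tau
```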

6.6.1 AFT versus ACO

ACO and AFT may appear similar at first glance, but they are quite different, presenting several differences in their formulation and position updating mechanisms.

  1. Both ACO and AFT distribute the search for the optimal solution among multiple agents. In ACO, the idea is to create a pool of artificial ants that move randomly around an environment. In AFT, the thieves search for Ali Baba by exploring and exploiting each area in the search space using three different cases in the position updating process.

  2. In ACO, the candidate solutions are constructed using a pheromone model. However, AFT uses the local and global best solutions of the thieves to locate the optimum solutions.

  3. In ACO, a new solution is constructed by the probabilistic rule of Eq. 46, which is conceptually different from the position updating strategy of AFT given in Eqs. 5 to 11. The updating strategy of AFT is a kind of directed and undirected search approach, in which new solutions are forced to move toward a better solution.

  4. The AFT algorithm uses a memory parameter (i.e., Marjaneh’s plans) in its updating process. On the other hand, ACO does not use such a parameter in updating its new solutions. Apart from this, AFT also uses a stochastic location updating strategy as shown in Eq. 10 to improve its exploration feature, while ACO does not use such a strategy.

  5. AFT has two parameters that can be adapted during its iterative process to enhance exploration and exploitation features and to balance them. However, ACO does not use any parameters to be adapted over the course of iterations.

As previously discussed, an effective meta-heuristic must strike an appropriate balance between exploration and exploitation, yet there is no rule of thumb [93] for achieving this. Slight differences in solution updates and random distributions can have a significant effect on the performance of the designed algorithms [94]. Therefore, AFT is a good competitor to the existing meta-heuristics.

7 Experimental results and analysis

In this section, to assess the accuracy of the proposed AFT algorithm, we conducted intensive evaluations on a set of 62 test functions, involving commonly used unimodal, multimodal, hybrid and composition functions. These comprise: (1) 23 benchmark test functions (described in Table 33 in "Appendix A"), (2) 29 functions taken from the IEEE CEC-2017 benchmark suite [95] (described in Table 34 in "Appendix B"), and (3) a set of 10 IEEE CEC-2019 functions (described in Table 35 in "Appendix C"). These functions are repeatedly used in the literature to test the performance of new meta-heuristic algorithms. The experiments designed to verify the performance of AFT are outlined as follows:

  • First, exhaustive comparative studies were presented to verify the reliability and accuracy of the AFT algorithm in relation to other meta-heuristics.

  • Second, a set of qualitative measures including search history, trajectory, convergence curves and average fitness values were plotted to examine the adequacy of AFT in addressing several types of test functions.

  • Third, the optimization performance of AFT was studied in light of several developments applied to AFT from several aspects.

  • Fourth, Friedman’s and Holm’s test methods were used to verify the significance of the outcomes obtained by AFT.

7.1 Experimental setup

The results produced by AFT in the optimization of the three benchmark suites mentioned above are compared with those produced by other well-regarded algorithms. The parameter settings of the AFT algorithm and the comparative algorithms are given in Table 1.

Table 1 Parameter setting values of the AFT algorithm and other algorithms

In order to provide a fair comparison between the proposed algorithm and the other selected algorithms, we followed the same initialization process for all compared algorithms. For all experiments, the common parameter settings are as follows: the number of individuals used in the search process is set to 30, the number of iterations is set to 1000, and the maximum number of function evaluations (NFEs) for all benchmark functions is set to \(d\times 10^3\), where d represents the dimension of the test functions. For each function, each algorithm was run 30 times independently to obtain the statistical results. The stop condition for all algorithms was the maximum number of iterations. The average fitness (Ave) and standard deviation (Std) values were computed over the thirty independent runs to assess the accuracy and stability of the proposed algorithm compared to the others. The best scores are shown in bold throughout this paper.
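The evaluation protocol above (30 independent runs per function, reporting Ave and Std of the best fitness) can be sketched as follows; `algorithm` is a hypothetical placeholder for any of the compared optimizers:

```python
import statistics

def evaluate(algorithm, runs=30):
    """Run an optimizer `runs` times and report the Ave and Std of its
    best fitness values, matching the protocol described above.
    `algorithm` is any zero-argument callable returning a best fitness;
    the name is a placeholder, not one of the compared algorithms."""
    scores = [algorithm() for _ in range(runs)]
    return statistics.mean(scores), statistics.stdev(scores)
```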

7.2 Classical benchmark test functions

Twenty-three widely used benchmark test functions were used to evaluate the overall performance of AFT and to compare it to other optimization algorithms. These test functions are minimization problems that can be categorized into three main kinds: unimodal [99], multimodal [96] and fixed-dimension multimodal [99]. "Appendix A" presents the mathematical description of these categories. The characteristics of the unimodal (\(\hbox {F}_1\)–\(\hbox {F}_7\)), multimodal (\(\hbox {F}_8\)–\(\hbox {F}_{{13}}\)) and fixed-dimension multimodal (\(\hbox {F}_{{14}}\)–\(\hbox {F}_{{23}}\)) functions are detailed in Table 33. These three groups of test functions are widely accepted in the literature as benchmark evaluation functions because:

  • The functions in the first group (i.e., unimodal functions) have only one global optimum and no local optima. This group is usually used to assess the convergence behavior as well as the exploitation power of any new or enhanced optimization algorithm.

  • The functions in the second and third groups (i.e., multimodal and fixed-dimension multimodal functions) have several local optima and more than one global optimum. These functions are effective in examining the ability of AFT to avoid local optima and in evaluating its exploration capability.

The average processing times elapsed by AFT and other comparative algorithms in optimizing the classical benchmark test functions are given in Table 2 and graphically illustrated in Fig. 8.

Table 2 Average running time of AFT and other meta-heuristic algorithms
Fig. 8
figure 8

Average running time of the proposed algorithm and other optimization algorithms

It can be seen that AFT outperforms the other algorithms, as it takes less processing time. Therefore, it can be deduced that the computational efficiency of AFT is much better than that of its competitors.

7.2.1 Performance of AFT in unimodal functions

As discussed above, the unimodal functions (\(\hbox {F}_1\)–\(\hbox {F}_7\)) are useful for assessing the exploitation capability of optimization algorithms. The average fitness and standard deviation results of AFT and the other algorithms are displayed in Table 3. These values were recorded after running the experiments 30 times for each algorithm.

Table 3 Results of the AFT algorithm and other meta-heuristics in unimodal benchmark test functions

It can be seen from Table 3 that the performance of AFT is very efficient compared to the other competitors. In particular, the AFT algorithm achieved the best scores for functions \(\hbox {F}_4\), \(\hbox {F}_5\) and \(\hbox {F}_6\). It also achieved the best score, jointly with the SHO algorithm, for functions \(\hbox {F}_1\), \(\hbox {F}_2\) and \(\hbox {F}_3\). From an engineering point of view, it does not matter whether the algorithm finds an optimal result such as 10E−05 or 10E−25; both are considered zero. The SHO algorithm is clearly competitive, as it obtained the best scores for functions \(\hbox {F}_1\) and \(\hbox {F}_3\); however, the difference between it and AFT is very small. The SOA algorithm obtained the best results for function \(\hbox {F}_7\). The proposed AFT algorithm also obtained the best standard deviation values for functions \(\hbox {F}_4\)–\(\hbox {F}_6\). These outcomes clearly indicate that the AFT algorithm is robust, reliable and highly effective compared to other widely used algorithms in the literature. Overall, AFT achieved either the best score or competitive results on functions \(\hbox {F}_1\)–\(\hbox {F}_7\), and its very small Std values reveal a high level of stability.

7.2.2 Performance of AFT in multimodal functions

In order to evaluate the capability of optimization algorithms to avoid local optima and to explore the search space, the multimodal (\(\hbox {F}_8\)–\(\hbox {F}_{{13}}\)) and fixed-dimension multimodal (\(\hbox {F}_{{14}}\)–\(\hbox {F}_{{23}}\)) functions are employed in the literature as benchmarks for this purpose. The outcomes are given in Tables 4 and 5, showing the Ave and Std, over 30 independent runs, for all compared algorithms in multimodal and fixed-dimension multimodal functions, respectively.

Table 4 Results of the AFT algorithm and other meta-heuristic algorithms in multimodal benchmark test functions

The results shown in Table 4 indicate that the performance of AFT is also very efficient when employed to solve multimodal problems. In particular, the AFT algorithm achieved the best scores for functions \(\hbox {F}_{{8}}\), \(\hbox {F}_{{10}}\), \(\hbox {F}_{{12}}\) and \(\hbox {F}_{{13}}\). GSA got the second best score for function \(\hbox {F}_{{13}}\), and SHO got the best scores for functions \(\hbox {F}_9\) and \(\hbox {F}_{{11}}\), where the results of SHO on these functions are close to those released by the AFT algorithm. These outcomes once again confirm the reliability and stability of AFT, since it has very small values of Std.

Table 5 Results of the AFT algorithm and other meta-heuristic algorithms in fixed-dimension multimodal functions

The results given in Table 5 confirm the superiority of AFT, which gained the best mean fitness values, either individually or jointly with other algorithms. The CSA was competitive and matched the best average fitness of the proposed AFT algorithm on the \(\hbox {F}_{{14}}\), \(\hbox {F}_{{16}}\), \(\hbox {F}_{{17}}\) and \(\hbox {F}_{{18}}\) test functions. Although the AFT algorithm did not achieve the best average fitness values for \(\hbox {F}_{{19}}\) and \(\hbox {F}_{{20}}\), its results are very close to those of the algorithms that obtained the best fitness in these two cases (i.e., GSA and SHO). The outcomes in Table 5 indicate that the average fitness values of AFT are better than those of the other algorithms on most of the test functions. With regard to the values of Std, AFT performed better than the other algorithms in six out of ten test functions (\(\hbox {F}_{{14}}\), \(\hbox {F}_{{17}}\) to \(\hbox {F}_{{21}}\)). This supports our earlier conclusion that AFT has a high degree of stability when applied to different search spaces. The results in Tables 4 and 5 indicate that AFT ranks first, in terms of the average fitness value, in twelve out of sixteen test functions (i.e., \(\hbox {F}_8\), \(\hbox {F}_{{10}}\), \(\hbox {F}_{{12}}\) to \(\hbox {F}_{{18}}\) and \(\hbox {F}_{{21}}\)–\(\hbox {F}_{{23}}\)). This reinforces that AFT has good exploration ability when employed on these search problems.

7.3 Performance of self-adaptive AFT algorithm

The amendment value of the self-adaptive tracking distance of the AFT algorithm, referred to as SAFT, is computed using the ratio of the current tracking distance and perception potential constants of the search agents over the course of iterations. The SAFT algorithm has two main parameters, referred to as \(\theta\) and \(\rho\), that influence the search performance; they are the adjustment frequency and the adjustment opportunity of the tracking distance constant, respectively. Small values of \(\theta\) and \(\rho\) may cause the search agents to move too frequently, which implies that they will have difficulty converging. On the other hand, large values of \(\theta\) and \(\rho\) make the search agents lose the globally optimal area and converge to a locally optimal area, which leads to premature convergence. To locate the best settings of \(\theta\) and \(\rho\), several experiments were performed on some of the unimodal functions with different values of these parameters, where \(\theta \in \left\{ 1, 2, 3\right\}\) and \(\rho \in \left\{ 0.2,0.5,0.8\right\}\). The results obtained with these parameter settings after running the experiments 30 times are shown in Table 6.

Table 6 Results of SAFT for different values of \(\theta\) and \(\rho\) on some unimodal benchmark functions

It is evident from the results shown in Table 6 that the parameter settings \(\theta = 3\) and \(\rho = 0.2\) are the best settings.

7.4 Performance of hierarchical AFT algorithm

To illustrate the performance of the hierarchical AFT (HAFT) algorithm, it is tested on 23 benchmark functions with various dimensions. The parameter settings of HAFT are given as follows: \(T_0 = 1\), \(L = 100\), \(n= 30\) and \(T=1000\). The experimental results obtained by HAFT which consist of mean and standard deviation values are given in Table 7.

Table 7 Experimental results of HAFT algorithm in standard benchmark test functions

Table 7 shows that HAFT can effectively and efficiently find optimal solutions for many benchmark functions. This indicates that HAFT is efficacious in balancing the search capability of AFT between exploration and exploitation in order to boost its accuracy from the onset to the end of the optimization process. A comparison between HAFT in Table 7 and AFT in Tables 3, 4 and 5 shows that HAFT has better performance.

7.5 Evaluation of the balance of exploitation and exploration of AFT

The multimodal function with fixed dimension \(\hbox {F}_{{16}}\) shown in Eq. 49 is used as an example to illustrate the evaluation of the balance response of AFT.

$$\begin{aligned} F_{16}(x_1, x_2) =4x_{1}^{2}-2.1x_{1}^{4}+\frac{1}{3}x_{1}^{6}+x_{1}x_{2}-4x_{2}^{2}+4x_{2}^{4} \end{aligned}$$
(49)

In Eq. 49, the range of \(x_1\) and \(x_2\) is set to: \(-5 \le x_i \le 5\) with the dimension set to 2. Figure 9 shows the performance behavior yielded by AFT in the function \(\hbox {F}_{{16}}\), over 500 iterations, in terms of evaluating the balance given by Eqs. 24 and 25.
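For reference, Eq. 49 is the six-hump camel back function, which can be evaluated directly; its two known global minima lie at approximately \((\pm 0.0898, \mp 0.7126)\) with \(F_{16} \approx -1.0316\):

```python
def f16(x1, x2):
    """Six-hump camel back function of Eq. 49 (2-D minimization)."""
    return (4 * x1**2 - 2.1 * x1**4 + x1**6 / 3
            + x1 * x2 - 4 * x2**2 + 4 * x2**4)
```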

Fig. 9
figure 9

Performance of AFT during 500 iterations which describes the balance evaluation given by Eqs. 24 and 25

In Fig. 9, five points (1), (2), (3), (4) and (5) have been chosen to represent the diversity of solutions and the balance assessment at each of them. Point (1) represents a premature stage of the AFT algorithm, where the balance evaluation values of XPL% and XPT% are 90 and 10, respectively. With these percentages, AFT works with a clear tendency to explore the search space, from which it can be inferred that the solutions preserve a high dispersion over the search space. Point (2) corresponds to iteration 70, where the balance evaluation holds a value of XPL% = 70 in conjunction with XPT% = 30. At this position, AFT fundamentally conducts exploration with a low degree of exploitation. Points (3) and (4) correspond to iterations 75 and 100, respectively, where the balance assessments have exploration and exploitation values of XPL% = 25, XPT% = 75 and XPL% = 5, XPT% = 95, respectively. At these percentages, the behavior of AFT has flipped to promote more exploitation than exploration. Under these configurations, the solutions are spread out in several bunches, which reduces the overall diversity. Finally, point (5) represents the last stage of the AFT algorithm. In this situation, the AFT algorithm sustains a clear trend toward the exploitation of the best solutions found, without taking into account any exploration strategy.

7.6 Chaotic maps for AFT

This work integrates four chaotic maps into one of the components of AFT in order to further investigate the effectiveness of chaos theory in improving the exploration and/or exploitation of AFT. The chaotic maps are applied to define the selection probability of the defined cases of AFT, with the aim of improving its performance. The four chaotic maps selected in this study are listed in Table 8.

Table 8 Four different types of chaotic maps used to improve AFT

The chaotic maps shown in Table 8 were selected to exhibit different behaviors, with the initial point set to 0.7 for each of them. As mentioned earlier, the chaotic maps are used to manipulate the selection process of the three defined cases of the AFT algorithm, a process originally governed by a probability drawn uniformly at random (rand). Here, the chaotic maps provide chaotic behavior for this probability, and the value returned by each map lies within the range [0, 1]. The proposed Chaotic AFT (CAFT) algorithm is benchmarked on the 23 benchmark test functions with the chaotic iterative (I), circle (C), logistic (L) and piecewise (P) selection operators. The results are averaged over 30 independent runs. The mean and standard deviation of the best solutions found at the last iteration are reported in Table 9.
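For illustration, the four maps can be implemented as follows. The parameter values (a = 4 for the logistic map, a = 0.5 and b = 0.2 for the circle map, a = 0.7 for the iterative map, and p = 0.25 for the piecewise map) are common choices from the chaos literature and are assumptions here; the exact forms used by CAFT are those of Table 8.

```python
import math

def logistic(x, a=4.0):
    """Logistic map: x_{k+1} = a * x_k * (1 - x_k)."""
    return a * x * (1.0 - x)

def circle(x, a=0.5, b=0.2):
    """Circle map, folded back into [0, 1) with mod 1."""
    return (x + b - (a / (2.0 * math.pi)) * math.sin(2.0 * math.pi * x)) % 1.0

def iterative(x, a=0.7):
    """Iterative map: x_{k+1} = |sin(a*pi / x_k)|; small guard avoids x = 0."""
    return abs(math.sin(a * math.pi / max(x, 1e-12)))

def piecewise(x, p=0.25):
    """Piecewise linear chaotic map with control parameter p."""
    if x < p:
        return x / p
    if x < 0.5:
        return (x - p) / (0.5 - p)
    if x < 1.0 - p:
        return (1.0 - p - x) / (0.5 - p)
    return (1.0 - x) / p

def chaotic_sequence(map_fn, x0=0.7, length=500):
    """Generate a chaotic sequence to stand in for the uniform probability rand."""
    seq, x = [], x0
    for _ in range(length):
        x = map_fn(x)
        seq.append(x)
    return seq
```

Each call to `chaotic_sequence` yields values in [0, 1] that can replace rand in the case-selection step of AFT.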

Table 9 Results of the CAFT algorithm on \(\hbox {F}_1\)\(\hbox {F}_{{23}}\) with several chaotic selection operators

Considering the results in Table 9, it can be said that chaotic maps improve the performance of AFT not only in terms of exploitation but also in terms of exploration. These results are superior to the corresponding results of the standard AFT shown in Tables 3, 4 and 5.

7.7 Analysis of the number of clusters of AFT

To investigate the degree of population interaction of the AFT algorithm using the PIN method reported in [83, 84], it is tested with different numbers of clusters c in the optimization of the group of 23 benchmark functions. The parameter c is set as follows: \(c \in \{3, 7, 11\}\). The other parameters of AFT remain the same during each experiment. The AFT algorithm is run 30 times for each function, and the mean and standard deviation are obtained and displayed in Table 10.

Table 10 The results obtained by the AFT algorithm with different number of clusters (c)

It can be found from Table 10 that the effect of population interaction is reasonably positive and can significantly augment the performance of AFT in optimization. The performance of AFT is best at \(c = 3\) according to the mean rank, although it does not differ substantially for the other values of c. According to Table 10 as well as Tables 3, 4 and 5, it is evident that AFT with the use of PIN somewhat outperforms the standard AFT in optimization.

7.8 Qualitative analysis of AFT

The qualitative results, including search landscapes, convergence curves, average fitness curves in logarithmic shapes, search history and trajectory of the first individual, associated with the AFT algorithm in solving a selected set of test functions, are shown, for up to 200 iterations, in Fig. 10.

Fig. 10
figure 10

Qualitative results of AFT for test functions \(\hbox {F}_{{1}}\), \(\hbox {F}_{{3}}\), \(\hbox {F}_{{5}}\), \(\hbox {F}_{{11}}\), \(\hbox {F}_{{13}}\) and \(\hbox {F}_{{16}}\): Search landscapes, convergence curves, average fitness curves of all thieves, search histories and trajectories in the first dimension of the first thief

The qualitative metric measures used to plot the curves in Fig. 10 can be interpreted as follows:

  • Convergence curves these curves are used to describe how well AFT converges toward the global optimum, which demonstrates its ability to fulfill the exploration and exploitation phases. They show the best solutions found after each iteration. It is observed that the fitness value of the convergence curves improves after each iteration. The convergence curves of the unimodal functions \(\hbox {F}_1\), \(\hbox {F}_3\) and \(\hbox {F}_5\) show some slight differences from those of the multimodal functions \(\hbox {F}_{{11}}\) and \(\hbox {F}_{{13}}\) and the fixed-dimension multimodal function \(\hbox {F}_{{16}}\). This is mainly ascribed to the fact that unimodal functions have only one global optimum and no local optima. Broadly, AFT provides a fast convergence response for \(\hbox {F}_1\) and \(\hbox {F}_{{3}}\), a sensible convergence response for \(\hbox {F}_{{5}}\) and \(\hbox {F}_{{13}}\), and the optimal convergence for \(\hbox {F}_{{11}}\) and \(\hbox {F}_{{16}}\). As shown in the convergence curves, the AFT algorithm first explores the search space, and most of the thieves move toward the optimal areas of the search space in the first 50 iterations.

  • Average fitness curves these curves characterize the average fitness values of the thieves at the end of each iteration. They indicate that, as iterations proceed, the average fitness values decline. This underscores that AFT not only improves the global best fitness of one thief, but also improves the fitness of all thieves.

  • Search history this drawing shows the search history of the thieves during their search for the global optimum. It can be observed that the sample points of the unimodal functions \(\hbox {F}_1\), \(\hbox {F}_3\) and \(\hbox {F}_5\) are distributed in promising areas. In contrast, some of the sample points of the multimodal and fixed-dimension multimodal functions \(\hbox {F}_{{11}}\), \(\hbox {F}_{{13}}\) and \(\hbox {F}_{{16}}\), respectively, are marginally spread around unpromising areas. This is related to the degree of difficulty of these test functions. As the search history drawings demonstrate, the sample points are roughly scattered around the correct optimum solution, and thus this constitutes visual evidence of the ability of AFT in exploitation. It also means that the thieves quickly explore the whole search space at first and gradually move toward the global optimum.

  • Trajectory this plot depicts the thieves’ path in the first dimension. Displaying the trajectory of the thieves across all dimensions complicates the analysis process; thus, we have stored and illustrated the search path of the thieves in one dimension. This search plot presents the exploitation and exploration capacities of AFT. Figure 10 shows that the thieves go through abrupt fluctuations in the initial phases, progressive variations after the initial phases, and a steady situation in the final stages. This confirms the ability of AFT in exploring the search space at the initial stages of the search and exploiting the global optimum areas in the consequent stages.

Overall, the metric measures presented in Fig. 10 show that AFT is able to efficiently explore the search space, exploit each promising area, avoid local optimal solutions, achieve high levels of accuracy and reasonably converge toward optimality.

7.9 Performance of AFT in CEC-2017 benchmark

With the aim of further challenging the proposed AFT algorithm, a recent and challenging benchmark test suite, IEEE CEC-2017, was used. This test suite comprises 30 functions with varying difficulty levels. It should also be pointed out that, due to the unstable behavior of the C-2017-f2 function, particularly in higher dimensions, it has been set aside from this test suite [107]. The remaining 29 benchmark test functions consist of 2 unimodal functions (C-2017-f1 and C-2017-f3), 7 simple multimodal functions (C-2017-f4 - C-2017-f10), 10 hybrid functions (C-2017-f11 - C-2017-f20) and 10 composition functions (C-2017-f21 - C-2017-f30). These benchmark test functions capture many properties of real-world problems, and thus many algorithms in the literature have been examined on them. The performance of AFT on this test suite is compared with a set of well-known algorithms that have previously reported promising performance in the literature. The parameter settings for each optimization algorithm are discussed in Subsect. 7.1 and listed in Table 1.

Table 11 shows the optimization results of AFT and different methods in this test suite. The results of other algorithms were taken from reference [47].

Table 11 Optimization results of the AFT algorithm and other algorithms in IEEE CEC-2017 test functions

The results in Table 11 confirm the superiority of the AFT algorithm over the others in optimizing very challenging optimization functions. The AFT algorithm again reported the best average fitness values in twenty-eight out of twenty-nine benchmark test functions. With regard to the Std values, AFT performed significantly better than the other algorithms in twenty-one out of twenty-nine test functions. This supports our earlier inference that AFT has a high degree of stability when applied to various search domains and test beds. These findings indicate that AFT ranks first in terms of exploration and exploitation capabilities, which translates into high performance in both accuracy and standard deviation. This achievement is further evidence of the ability of AFT to outperform well-studied meta-heuristic algorithms while achieving highly competitive results against high-performance methods on complex test functions.

7.10 Performance of AFT in CEC-C06 2019 benchmark

A set of ten benchmark functions, named IEEE CEC-C06 2019, have been developed by Professor Suganthan and his colleagues [108] for use in evaluating single-objective optimization problems. Table 35 in "Appendix C" describes this group of test functions. These benchmark functions are designed to be scalable. The functions CEC04 - CEC10 are shifted and rotated, while CEC01 - CEC03 are neither shifted nor rotated. Table 1 shows the parameter settings of AFT and other optimization algorithms, and a description of these parameter settings is presented in Subsect. 7.1.

The average fitness and standard deviation values were computed to compare between the accuracy of AFT and other algorithms. The results of DA, WOA, and SSA were taken from [109]. Table 12 displays the Ave and Std values for the AFT algorithm and other algorithms when used to search for the global optimum in the CEC-C06 2019 benchmark test functions.

Table 12 Results of AFT and other meta-heuristic algorithms in IEEE CEC-C06 2019 benchmark test functions

The outcomes presented in Table 12 show that the AFT algorithm ranked first in six out of ten functions (i.e., CEC01, CEC05 and CEC07-CEC10) in terms of the average fitness. The MFO algorithm is in second place, since it achieved the best Ave values in the CEC02 and CEC03 functions, while the MVO algorithm achieved the best Ave value in the CEC04 function. Nevertheless, the results obtained by AFT in CEC02, CEC03 and CEC04 are comparable. To measure the stability of AFT, the Std value was calculated and compared with those of the other competitive algorithms. The AFT algorithm achieved the lowest Std values in seven out of ten CEC-C06 functions. This once again confirms that AFT produces the most stable results among the compared algorithms.

7.11 Statistical test analysis

The average fitness and standard deviation results tabulated in Tables 3, 4, 5, 11 and 12 presented a general view of the performance of AFT and of how stable it was over the 30 independent runs. The qualitative analysis presented in Sect. 7.8 illustrated the exploitation and exploration skills of AFT, but did not quantify them. This subsection presents a statistical analysis using Friedman's and Holm's tests [110] to examine whether the results in Tables 3, 4, 5, 11 and 12 are statistically significant with respect to the results of the other competitors.

The significance level (\(\alpha\)) in Friedman's test is set to 0.05 in the experiments shown below. If the p-value calculated by Friedman's test is less than or equal to \(\alpha\), the null hypothesis is rejected, which indicates that there are significant differences between the performance of the evaluated algorithms. In this study, Friedman's test is followed by Holm's method, as a post hoc test, to counteract the problem of multiple comparisons. The lowest-ranked method according to Friedman's test is used as the control method for the post hoc analysis.
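As a sketch of this procedure on hypothetical scores (not the paper's data), Friedman's statistic is computed from the per-problem ranks and compared with the chi-square critical value, after which Holm's step-down rule screens the post hoc p-values:

```python
def friedman(scores):
    """Friedman chi-square statistic and mean ranks.
    scores[i][j]: result of algorithm j on problem i (lower is better).
    Ties are ignored for simplicity."""
    n, k = len(scores), len(scores[0])
    rank_sums = [0.0] * k
    for row in scores:
        for rank, j in enumerate(sorted(range(k), key=lambda jj: row[jj]), 1):
            rank_sums[j] += rank
    chi2 = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return chi2, [r / n for r in rank_sums]

def holm(p_values, alpha=0.05):
    """Holm's step-down method: reject the i-th smallest p while p <= alpha/(m - i)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for step, i in enumerate(order):
        if p_values[i] <= alpha / (m - step):
            reject[i] = True
        else:
            break
    return reject

# Hypothetical mean errors of 4 algorithms on 6 functions (rows = functions).
scores = [[0.01, 0.30, 0.25, 0.40], [0.02, 0.28, 0.35, 0.33],
          [0.00, 0.15, 0.20, 0.18], [0.05, 0.40, 0.30, 0.55],
          [0.01, 0.22, 0.27, 0.31], [0.03, 0.35, 0.24, 0.47]]
chi2, mean_ranks = friedman(scores)
# chi2 = 13.4 here, exceeding the chi-square critical value 7.815
# (df = 3, alpha = 0.05), so equal performance is rejected.
```

The algorithm with the lowest mean rank then serves as the control method for Holm's post hoc comparisons.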

7.11.1 Statistical test on functions \(\hbox {F}_1\)\(\hbox {F}_7\)

Table 13 shows a summary of the ranking results generated by Friedman’s test when applied to the results presented in Table 3.

Table 13 A summary of the results generated by Friedman’s test on the average results of unimodal test functions in Table 3

The p value given by Friedman’s test on the average results of unimodal functions is 4.280198E-05. Thus, the null hypothesis of equivalent accuracy is rejected, confirming the existence of a statistically significant difference between the comparative algorithms. It can be observed from Table 13 that AFT is statistically significant, and it is the best one among all other algorithms. As shown in Table 13, the proposed AFT algorithm achieved the best statistical score on the set of unimodal test functions studied in this work, followed sequentially by SOA, SHO, CMA-ES, GWO, ACO, CSA, PSO, GA, DE, GSA and MFO in the last rank.

After the application of Friedman's test, Holm's method was used to decide whether there were statistically significant differences between the AFT algorithm and the others. Table 14 displays the statistical outcomes generated by Holm's method when applied to the results shown in Table 13. In Table 14, \(R_0\) represents the Friedman rank assigned to the control method (i.e., AFT), \(R_i\) represents the Friedman rank assigned to algorithm i, z represents the statistical difference between two algorithms, and finally ES is the effect size of AFT on algorithm i.

Table 14 Results of Holm’s method according to Friedman’s results in Table 13 with \(\alpha =0.05\)

Holm’s method in Table 14 rejects the hypotheses that have p-value \(\le 0.00625\). It is apparent from Table 14 that AFT yielded the best statistical results. In addition to that, the p-values presented in Table 14 confirm the reliability of the results generated by AFT in unimodal test functions.

7.11.2 Statistical test on functions \(\hbox {F}_8\)\(\hbox {F}_{{23}}\)

Table 15 presents a summary of the ranking results generated by Friedman’s test when it is applied to the mean results presented in Table 4.

Table 15 The results generated by Friedman’s test on the average results of multimodal functions in Table 4

The p-value generated by Friedman's test on the average fitness values of multimodal functions is 0.091583, on the basis of which the null hypothesis is rejected, indicating a statistically significant difference between the comparative algorithms. It is clearly observed from the results shown in Table 15 that AFT is statistically significant and achieved the best score among the other algorithms. In sum, the ranking produced by Friedman's test on the multimodal functions in Table 4 places AFT first, followed in order by CMA-ES, GWO, SOA, PSO, GA, DE, SHO, ACO, GSA, CSA and, last, MFO.

The statistical results generated by Holm's method on the Friedman results presented in Table 15, which relate to the results of Table 4, are shown in Table 16.

Table 16 Results of Holm’s method according to Friedman’s test results in Table 15 with \(\alpha =0.05\)

In Table 16, the hypotheses with p-value \(\le 0.004545\) were rejected by Holm’s method. The results in Table 16 confirm that AFT is statistically superior when it is compared to other competitors.

A summary of the ranking results obtained based on applying Friedman’s statistical test on the average results in Table 5 is displayed in Table 17.

Table 17 A summary of the ranking results obtained based on Friedman’s test on the average results shown in Table 5

In Table 17, the proposed AFT outperformed all other algorithms, with the lowest rank of 2.2, followed in order by CSA, CMA-ES, DE, GSA, ACO, SHO, MFO, GWO, PSO and, finally, SOA.

The statistical results obtained by applying Holm's test method to the results presented in Table 17, which relate to the results reported in Table 5, are shown in Table 18.

Table 18 Results of Holm’s method according to Friedman’s test results in Table 17

In Table 18, the hypotheses with p value \(\le 0.008333\) were rejected by Holm’s test method. The results in this table illustrate that AFT is statistically the best one, when it is compared to the others.

7.11.3 Statistical test on IEEE CEC-2017 benchmark

Table 19 displays a summary of the statistical results obtained by Friedman’s test when it is applied to the average results shown in Table 11.

Table 19 Ranking results of the Friedman’s test when applied to the Ave results shown in Table 11

The p-value obtained by Friedman's test when applied to the results of the IEEE CEC-2017 test suite is 4.990430E-11. It can be seen from Table 19 that AFT is statistically the best algorithm among all the compared algorithms. As can be noted from Table 19, AFT has the best statistical score in the IEEE CEC-2017 test suite, followed in order by CMA-ES, EO, PSO, GWO and GSA. Holm's test method was then applied to the results in Table 19 to determine whether there were statistically significant differences between AFT and the other algorithms. The results of this method are presented in Table 20.

Table 20 Results of Holm’s method based on the results reported in Table 19

It is obvious from Table 20 that the AFT algorithm delivered the best outcomes.

7.11.4 Statistical test on CEC-C06 2019 benchmark

Table 21 displays the ranking results revealed by the application of Friedman’s test on the results given in Table 12.

Table 21 The ranking results obtained based on Friedman’s test on the IEEE CEC-C06 2019 benchmark with \(\alpha =0.05\)

The p-value computed by Friedman’s test when it was applied to the average results of CEC-C06 functions in Table 12 is 2.535367E-7. As clearly shown in Table 21, the AFT algorithm is statistically significant as it has the best score among all the other algorithms. The order of the algorithms produced after applying Friedman’s test to this test suite is AFT, CSA, MVO, MFO, SSA, SCA, WOA, and DA. The statistical results obtained after applying Holm’s method on these benchmark functions are given in Table 22.

Table 22 Results of Holm’s method based on the Friedman’s statistical test results displayed in Table 21

As displayed in Table 22, the hypotheses having a p-value \(\le 0.016666\) were rejected by Holm's method. It can be seen from Table 22 that the AFT algorithm is robust in optimizing the IEEE CEC-2017 and IEEE CEC-C06 2019 test suites. The statistical findings presented in this subsection also indicate that AFT has better exploitation skills than exploration skills; this can be inferred by comparing the results obtained on the unimodal functions with those obtained on the multimodal, hybrid and composition functions. However, this is not a serious concern, since the exploration of the AFT algorithm remains plausible thanks to the updating mechanism it follows to explore the search space to a large extent. A small degree of exploration is usually not sufficient to find the global optimum, since optimization problems always require a convenient balance between exploitation and exploration. These impressive results make the AFT algorithm applicable to solving real-world engineering problems, as shown below.

8 Real engineering design problems

To further substantiate the robustness of the proposed AFT algorithm, its optimization aptitudes were assessed on five real-world engineering design problems, namely: the welded beam problem, the pressure vessel problem, the tension/compression spring problem, the speed reducer problem and the rolling element bearing problem. What distinguishes these design problems is that they involve a host of constraints. Hence, in order to address these design problems, it is crucial for the AFT algorithm to be equipped with a constraint-handling method.

8.1 Constraint handling

The AFT algorithm was equipped with a static penalty approach, given in Eq. 50, so that it can deal with the constraints of the aforementioned engineering design problems while addressing them, as elaborated in the following subsections.

$$\begin{aligned} \zeta (z) = f(z) \pm \left[ \sum _{i=1}^{m} l_i \cdot {\rm max} (0, t_i(z))^\alpha + \sum _{j=1}^{n} o_j \left| U_j(z) \right| ^\beta \right] \end{aligned}$$
(50)

where \(\zeta (z)\) stands for the penalized objective function, \(f(z)\) is the original objective function, \(l_i\) and \(o_j\) define positive penalty constants, and \(t_i(z)\) and \(U_j(z)\) are the inequality and equality constraint functions, respectively. The values of the parameters \(\beta\) and \(\alpha\) were set to 2.0 and 1.0, respectively.

The static penalty approach assigns a penalty value to each infeasible solution. This feature encouraged us to use the static penalty function to handle the constraints of the aforementioned design problems, since it can assist the search agents of AFT in moving toward the feasible region of the search space of the given problems. The number of search agents and the number of iterations of AFT in solving the following design problems were set to 30 and 1000, respectively.
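A minimal sketch of Eq. 50 in its minimization form (the "+" branch of the \(\pm\)) is given below; the penalty constants \(l_i\) and \(o_j\) are assumed values, as the paper does not report them here.

```python
def penalized(f, ineq, eq, l=1.0e6, o=1.0e6, alpha=1.0, beta=2.0):
    """Build the static-penalty objective of Eq. 50 (minimization form).
    f     : original objective f(z)
    ineq  : inequality constraint functions t_i(z) <= 0
    eq    : equality constraint functions U_j(z) = 0
    l, o  : positive penalty constants (assumed values here)
    """
    def zeta(z):
        p_ineq = sum(l * max(0.0, t(z)) ** alpha for t in ineq)
        p_eq = sum(o * abs(u(z)) ** beta for u in eq)
        return f(z) + p_ineq + p_eq
    return zeta

# Example: minimize z0^2 subject to z0 >= 1, i.e. t(z) = 1 - z0 <= 0.
zeta = penalized(lambda z: z[0] ** 2, ineq=[lambda z: 1.0 - z[0]], eq=[])
```

Feasible solutions keep their original fitness, while infeasible ones are pushed back toward the feasible region by the large penalty term.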

8.2 Welded beam design problem

The main goal of this design problem is to minimize the manufacturing cost of the welded beam design displayed in Fig. 11 [111].

Fig. 11
figure 11

A schematic diagram of a welded beam design

The elements of the welded beam structure shown in Fig. 11 are a beam, A, and the weld required to attach it to the member, B. The constraints of this design problem are the bending stress in the beam (\(\theta\)), the buckling load on the bar (\(P_c\)), the end deflection of the beam (\(\delta\)) and the shear stress (\(\tau\)). To address this problem optimally, it is essential to solve for the structural parameters of the welded beam. Specifically, the design variables of this problem are: the thickness of the weld (h), the thickness of the bar (b), the length of the clamped bar (l) and the height of the bar (t). The variable vector of this problem can be written as \(\mathbf {x} = [x_1, x_2, x_3, x_4]\), where \(x_1\), \(x_2\), \(x_3\) and \(x_4\) represent the values of the variables h, l, t and b, respectively. The cost function to be minimized is formulated in the following form:

Minimize: \(f(\mathbf {x}) = 1.10471x^2_1x_2 +0.04811x_3x_4(14.0+x_2)\)

Subject to the following constraints,

\(g_1(\mathbf {x}) = \tau (\mathbf {x})-\tau _{max}\le 0\)

\(g_2(\mathbf {x}) = \sigma (\mathbf {x})- \sigma _{max} \le 0\)

\(g_3(\mathbf {x}) = x_1 -x_4 \le 0\)

\(g_4(\mathbf {x}) = 1.10471x^2_1 +0.04811x_3x_4(14.0+x_2) -5.0 \le 0\)

\(g_5(\mathbf {x}) = 0.125- x_1 \le 0\)

\(g_6(\mathbf {x}) = \delta (\mathbf {x})- \delta _{max} \le 0\)

\(g_7(\mathbf {x}) = P-P_c(\mathbf {x}) \le 0\)

where the remaining variables of the welded beam design are drawn up as follows:

\(\tau (\mathbf {x})=\sqrt{((\tau ')^2 + (\tau '')^2)+\frac{2\tau '\tau '' x_2}{2R}}, \tau '=\frac{P}{\sqrt{2}x_1x_2}\)

\(\tau ''=\frac{MR}{J}, M=P(L+\frac{x_2}{2}), R=\sqrt{\left(\frac{x_1+x_3}{2}\right)^2+\frac{x_2^2}{4}}\)

\(J=2\left\{ \sqrt{2}x_1x_2\left[ \frac{x_2^2}{12}+(\frac{x_1+x_3}{2})^2\right] \right\} , \sigma (\mathbf {x})=\frac{6PL}{x_4x_3^2}\)

\(\delta (\mathbf {x})=\frac{4PL^3}{Ex_4x_3^3}, P_c(\mathbf {x})=\frac{4.013E\sqrt{x^2_3x_4^6/36}}{L^2}\left( 1-\frac{x_3}{2L}\sqrt{\frac{E}{4G}}\right)\)

where \(P = 6000\) lb, \(L = 14\) in, \(\delta _{\rm max} = 0.25\) in, \(E = 30\times 10^6\) psi, \(G = 12\times 10^6\) psi, \(\tau _{\rm max} = 13600\) psi and \(\sigma _{\rm max} = 30000\) psi. The ranges of the design variables are \(0.1 \le x_i \le 2.0\) for \(i = 1, 4\) and \(0.1 \le x_i \le 10.0\) for \(i = 2, 3\).
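The formulation above can be transcribed into a cost-and-constraint evaluator. The sketch below is an illustration, not the authors' code; it uses the buckling-load expression \(P_c = \frac{4.013E\sqrt{x_3^2x_4^6/36}}{L^2}\left(1-\frac{x_3}{2L}\sqrt{E/4G}\right)\), the form of this benchmark under which the published optimal designs are feasible (an assumption on our part), and the check point is a near-optimal design reported in the literature.

```python
import math

P, L, E, G = 6000.0, 14.0, 30.0e6, 12.0e6
TAU_MAX, SIGMA_MAX, DELTA_MAX = 13600.0, 30000.0, 0.25

def welded_beam(x):
    """Cost and constraint values g_i(x) <= 0 for the welded beam design."""
    x1, x2, x3, x4 = x
    cost = 1.10471 * x1 ** 2 * x2 + 0.04811 * x3 * x4 * (14.0 + x2)
    tau_p = P / (math.sqrt(2.0) * x1 * x2)                      # tau'
    M = P * (L + x2 / 2.0)
    R = math.sqrt(x2 ** 2 / 4.0 + ((x1 + x3) / 2.0) ** 2)
    J = 2.0 * math.sqrt(2.0) * x1 * x2 * (x2 ** 2 / 12.0 + ((x1 + x3) / 2.0) ** 2)
    tau_pp = M * R / J                                          # tau''
    # 2*tau'*tau''*x2/(2R) simplifies to tau'*tau''*x2/R
    tau = math.sqrt(tau_p ** 2 + tau_p * tau_pp * x2 / R + tau_pp ** 2)
    sigma = 6.0 * P * L / (x4 * x3 ** 2)
    delta = 4.0 * P * L ** 3 / (E * x4 * x3 ** 3)
    p_c = (4.013 * E * math.sqrt(x3 ** 2 * x4 ** 6 / 36.0) / L ** 2
           * (1.0 - x3 / (2.0 * L) * math.sqrt(E / (4.0 * G))))
    g = [tau - TAU_MAX, sigma - SIGMA_MAX, x1 - x4,
         1.10471 * x1 ** 2 + 0.04811 * x3 * x4 * (14.0 + x2) - 5.0,
         0.125 - x1, delta - DELTA_MAX, P - p_c]
    return cost, g
```

A design is feasible when every entry of `g` is non-positive; the penalized objective of Eq. 50 can be built directly from this evaluator.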

A comparison of the best solutions obtained from AFT and other optimization algorithms reported in the literature is presented in Table 23.

Table 23 A comparison of the optimal costs attained by AFT and other optimization algorithms for the welded beam problem

Succinctly, the results of Table 23 indicate that the AFT algorithm converges toward the optimal design and offers the best solution among all other competitors. A statistical comparison of AFT with other competitors, after 30 independent runs, in terms of the best score, worst score, average score Ave and standard deviation score Std, is presented in Table 24.

Table 24 Statistical results obtained from AFT and other algorithms for the welded beam design problem

The results of Table 24 show that AFT again behaves much better than the other optimization algorithms in terms of the Ave and Std results. This confirms the reliability of the AFT algorithm in addressing this design problem.

8.3 Pressure vessel design problem

The pressure vessel design is another well-respected engineering problem that has been extensively considered in optimization [101]. The goal of this problem is to minimize the total cost of the material, forming and welding of a cylindrical vessel, which is capped at both ends with hemispherical heads. Figure 12 illustrates a schematic diagram of this design.

Fig. 12
figure 12

A structural representation of the cross section of a pressure vessel design

The variables of this design problem are the thickness of the shell (\(T_s\)), the thickness of the head (\(T_h\)), the inner radius (R) and the length of the cylindrical section of the vessel without considering the head (L). These variables can be outlined by a vector \(\mathbf {x} = [x_1, x_2, x_3, x_4]\), whose elements stand for the values of \(T_s\), \(T_h\), R and L, respectively. The pressure vessel design problem can be mathematically defined as follows:

$$\begin{aligned} \text {Minimize}: f(\mathbf {x}) = 0.6224x_1x_3x_4 + 1.7781 x_2x^2_3 + 3.1661x^2_1x_4+19.84x^2_1x_3 \end{aligned}$$

This problem is subject to four constraints, as outlined below,

\(g_1(\mathbf {x}) = -x_1 +0.0193x_3\le 0\)

\(g_2(\mathbf {x}) = -x_2 +0.00954x_3 \le 0\)

\(g_3(\mathbf {x}) = -\pi x^2_3x_4 -\frac{4}{3}\pi x^3_3+1296000 \le 0\)

\(g_4(\mathbf {x}) = x_4 -240 \le 0\)

where \(0 \le x_1 \le 99\), \(0 \le x_2 \le 99\), \(10 \le x_3 \le 200\) and \(10 \le x_4 \le 200\).
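As with the welded beam, this formulation can be transcribed into a small evaluator (an illustrative sketch; the check point is a near-optimal continuous design from the literature):

```python
import math

def pressure_vessel(x):
    """Cost and g_i(x) <= 0 for the pressure vessel design, x = [Ts, Th, R, L]."""
    x1, x2, x3, x4 = x
    cost = (0.6224 * x1 * x3 * x4 + 1.7781 * x2 * x3 ** 2
            + 3.1661 * x1 ** 2 * x4 + 19.84 * x1 ** 2 * x3)
    g = [-x1 + 0.0193 * x3,                                   # shell thickness
         -x2 + 0.00954 * x3,                                  # head thickness
         -math.pi * x3 ** 2 * x4 - 4.0 / 3.0 * math.pi * x3 ** 3 + 1296000.0,
         x4 - 240.0]                                          # length limit
    return cost, g
```

The first two constraints tie the shell and head thicknesses to the radius, while the third enforces the minimum enclosed volume.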

A comparison of the optimal solutions obtained from the proposed AFT algorithm and other optimization algorithms reported in the literature is given in Table 25.

Table 25 A comparison of the results achieved by AFT and other algorithms for the pressure vessel design problem

As can be clearly seen from Table 25, the AFT algorithm provided the best design with a minimum cost of about 5885.332773. No other competing algorithm was capable of achieving this cost.

The statistical results of the AFT algorithm for the pressure vessel design problem compared to others, in terms of the best, average, worst and standard deviation scores after 30 independent runs, are presented in Table 26.

Table 26 Statistical results obtained from the AFT algorithm and other algorithms in solving the pressure vessel design problem

The statistical results in Table 26 assert that the AFT algorithm performs better than all competing algorithms in terms of the best, Ave and Std values.

8.4 Tension–compression spring design problem

The structure of a tension/compression spring design problem is shown in Fig. 13 [112].

Fig. 13
figure 13

A schematic diagram of a tension/compression spring design

In the tension/compression spring design problem, we strive to reduce the weight of the spring. This optimization problem is subject to constraints on the surge frequency, the shear stress and the minimum deflection. The variables of this design problem are the wire diameter (d), the mean coil diameter (D) and the number of active coils (N). These variables can be characterized as \(\mathbf {x} = [x_1, x_2, x_3]\), where the parameters of \(\mathbf {x}\) represent d, D and N, respectively. The mathematical representation of this problem can be described in the following way:

Minimize: \(f(\mathbf {x}) = (x_3 +2)x_2x^2_1\)

This problem is subject to the following constraints:

\(g_1(\mathbf {x}) = 1-\frac{x^3_2x_3}{71785x^4_1}\le 0\)

\(g_2(\mathbf {x}) =\frac{4x^2_2-x_1x_2}{12566(x_2x^3_1-x^4_1)}+\frac{1}{5108x^2_1}-1 \le 0\)

\(g_3(\mathbf {x}) = 1-\frac{140.45x_1}{x^2_2x_3} \le 0\)

\(g_4(\mathbf {x}) = \frac{x_1+x_2}{1.5} -1\le 0\)

where \(0.05 \le x_1 \le 2.0\), \(0.25 \le x_2 \le 1.3\) and \(2 \le x_3 \le 15.0\).
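This formulation can likewise be transcribed directly (an illustrative sketch; the check point is the near-optimal design commonly reported in the literature, with x = [d, D, N]):

```python
def spring(x):
    """Weight and g_i(x) <= 0 for the tension/compression spring, x = [d, D, N]."""
    x1, x2, x3 = x
    cost = (x3 + 2.0) * x2 * x1 ** 2
    g = [1.0 - x2 ** 3 * x3 / (71785.0 * x1 ** 4),            # minimum deflection
         (4.0 * x2 ** 2 - x1 * x2) / (12566.0 * (x2 * x1 ** 3 - x1 ** 4))
         + 1.0 / (5108.0 * x1 ** 2) - 1.0,                    # shear stress
         1.0 - 140.45 * x1 / (x2 ** 2 * x3),                  # surge frequency
         (x1 + x2) / 1.5 - 1.0]                               # outer diameter
    return cost, g
```

At the near-optimal design the first two constraints are active (close to zero), which is why this problem is a good stress test for constraint handling.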

Table 27 shows a comparison between the AFT algorithm presented in this work and other algorithms reported in the literature.

Table 27 A comparison of the best solutions achieved by AFT and other algorithms for the tension/compression spring design problem

Examining the results of Table 27 in terms of the optimal costs, we can clearly see that AFT scored the best solution among all the optimization algorithms for this problem and achieved the best design with a cost of 0.012665, which no other competitor has achieved. A comparison of the statistical results obtained by AFT and other algorithms reported in the literature for this design problem is presented in Table 28.

Table 28 Statistical results of the proposed AFT algorithm and other optimization algorithms for the tension/compression spring design problem

The results in Table 28 show that the AFT algorithm once again performs much better than the other algorithms in terms of the statistical results.

8.5 Speed reducer design problem

The speed reducer design problem is a challenging benchmark problem because it involves seven design variables [113]. The structural description of a speed reducer design is shown in Fig. 14.

Fig. 14
figure 14

A structural design of a speed reducer problem

The prime objective of this problem is to reduce the weight of the speed reducer, which is subject to the following constraints: transverse deflections of the shafts, surface stress, bending stress of the gear teeth and stresses in the shafts [99]. The seven variables of this problem are: the face width (b), the module of teeth (m), the number of teeth in the pinion (z), the length of the first shaft between bearings (\(l_1\)), the length of the second shaft between bearings (\(l_2\)), the diameter of the first shaft (\(d_1\)) and the diameter of the second shaft (\(d_2\)). These variables can be represented as \(\mathbf {x}= [x_1, x_2, x_3, x_4, x_5, x_6, x_7]\). The mathematical description of this design problem can be given as follows:

Minimize: \(f(\mathbf {x}) = 0.7854x_1x_2^2(3.3333x_3^2+14.9334x_3-43.0934) -1.508x_1(x_6^2+x_7^2)+7.4777(x_6^3+x_7^3)+0.7854(x_4x_6^2+x_5x_7^2)\)

This function is subject to eleven constraints, which are described as follows:

\(g_1(\mathbf {x}) = \frac{27}{x_1x_2^2x_3}-1 \le 0\)

\(g_2(\mathbf {x}) = \frac{397.5}{x_1x_2^2x_3^2}-1 \le 0\)

\(g_3(\mathbf {x}) = \frac{1.93 x_4^3}{x_2x_6^4x_3}-1 \le 0\)

\(g_4(\mathbf {x}) = \frac{1.93 x_5^3}{x_2x_7^4x_3}-1 \le 0\)

\(g_5(\mathbf {x}) = \frac{[(745(x_4/x_2x_3))^2+16.9\times 10^6]^{1/2}}{110x_6^3}-1 \le 0\)

\(g_6(\mathbf {x}) = \frac{[(745(x_5/x_2x_3))^2+157.5\times 10^6]^{1/2}}{85x_7^3}-1 \le 0\)

\(g_7(\mathbf {x}) = \frac{x_2x_3}{40}-1 \le 0\)

\(g_8(\mathbf {x}) = \frac{5x_2}{x_1}-1 \le 0\)

\(g_9(\mathbf {x}) = \frac{x_1}{12x_2}-1 \le 0\)

\(g_{10}(\mathbf {x}) = \frac{1.5x_6+1.9}{x_4}-1 \le 0\)

\(g_{11}(\mathbf {x}) = \frac{1.1x_7+1.9}{x_5}-1 \le 0\)

where the ranges of the design variables \(b, m, z, l_1, l_2, d_1\) and \(d_2\) were used as \(2.6\le x_1\le 3.6\), \(0.7\le x_2\le 0.8\), \(17\le x_3\le 28\), \(7.3\le x_4\le 8.3\), \(7.3\le x_5\le 8.3\), \(2.9\le x_6\le 3.9\) and \(5.0\le x_7\le 5.5\), respectively.
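An illustrative evaluator for this problem (not the authors' code) is sketched below; the coefficient 1.93 in \(g_3\) and \(g_4\) follows the standard benchmark definition, and the check point is a near-optimal design from the literature.

```python
import math

def speed_reducer(x):
    """Weight and g_i(x) <= 0 for the speed reducer, x = [b, m, z, l1, l2, d1, d2]."""
    x1, x2, x3, x4, x5, x6, x7 = x
    cost = (0.7854 * x1 * x2 ** 2 * (3.3333 * x3 ** 2 + 14.9334 * x3 - 43.0934)
            - 1.508 * x1 * (x6 ** 2 + x7 ** 2)
            + 7.4777 * (x6 ** 3 + x7 ** 3)
            + 0.7854 * (x4 * x6 ** 2 + x5 * x7 ** 2))
    g = [27.0 / (x1 * x2 ** 2 * x3) - 1.0,                    # bending stress of teeth
         397.5 / (x1 * x2 ** 2 * x3 ** 2) - 1.0,              # surface stress
         1.93 * x4 ** 3 / (x2 * x6 ** 4 * x3) - 1.0,          # shaft 1 deflection
         1.93 * x5 ** 3 / (x2 * x7 ** 4 * x3) - 1.0,          # shaft 2 deflection
         math.sqrt((745.0 * x4 / (x2 * x3)) ** 2 + 16.9e6) / (110.0 * x6 ** 3) - 1.0,
         math.sqrt((745.0 * x5 / (x2 * x3)) ** 2 + 157.5e6) / (85.0 * x7 ** 3) - 1.0,
         x2 * x3 / 40.0 - 1.0,
         5.0 * x2 / x1 - 1.0,
         x1 / (12.0 * x2) - 1.0,
         (1.5 * x6 + 1.9) / x4 - 1.0,
         (1.1 * x7 + 1.9) / x5 - 1.0]
    return cost, g
```

Several constraints (e.g., the shaft-stress terms \(g_5\), \(g_6\) and \(g_{11}\)) are active at the optimum, which makes this a demanding test of the penalty-based constraint handling.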

Table 29 displays the best designs and optimum costs achieved by the AFT algorithm and other algorithms for the speed reducer design problem.

Table 29 A comparison of the best results obtained by AFT and other algorithms for the speed reducer design problem

Table 29 corroborates that AFT provides the optimal design compared to others with a cost of approximately 2994.471066. The statistical results of AFT and other competitive methods, over 30 independent runs, for the speed reducer design problem are tabulated in Table 30.

Table 30 Statistical results of the AFT algorithm and other optimization algorithms for the speed reducer design problem

According to the statistical results in Table 30, the AFT algorithm found the best results compared to other promising optimization algorithms.

8.6 Rolling Element Bearing Design Problem

The main goal of this design problem is to maximize the dynamic load-carrying capacity of a rolling element bearing. The schematic diagram of this design problem is shown in Fig. 15 [99].

Fig. 15
figure 15

A schematic view of a rolling element bearing

This problem consists of ten decision variables given as follows: ball diameter (\(D_b\)), pitch diameter (\(D_m\)), number of balls (X), inner (\(f_i\)) and outer (\(f_o\)) raceway curvature factors, \(K_{Dmin}\), \(K_{Dmax}\), e, \(\epsilon\) and \(\zeta\). The mathematical representation of this design problem is as follows:

Maximize: \(C_d = f_c X^{2/3}D_b^{1.8}\quad \text{if}\;\;D_b \le 25.4\,\text{mm}\)

\(C_d=3.647f_cX^{2/3}D_b^{1.4}\quad \text{if}\;\;D_b>25.4\,\text{mm}\)

The constraints and \(f_c\) of this design problem are presented as follows:

\(g_1(\mathbf {x}) = \frac{\phi _0}{2\sin ^{-1}(D_b/D_m)}-X +1 \le 0\)

\(g_2(\mathbf {x}) = 2D_b-K_{Dmin}(D-d) \ge 0\)

\(g_3(\mathbf {x}) = K_{Dmax} (D-d)-2D_b \ge 0\)

\(g_4(\mathbf {x}) = \zeta B_w-D_b \le 0\)

\(g_5(\mathbf {x}) = D_m-0.5(D+d) \ge 0\)

\(g_6(\mathbf {x}) = (0.5+e) (D+d) - D_m \ge 0\)

\(g_7(\mathbf {x}) = 0.5 (D-D_m-D_b)-\epsilon D_b \ge 0\)

\(g_8(\mathbf {x}) = f_i \ge 0.515\)

\(g_9(\mathbf {x}) = f_o \ge 0.515\)

$$\begin{aligned} f_c&= 37.91\biggl [1+\biggl \{1.04\left( \frac{1-\gamma }{1+\gamma }\right) ^{1.72}\left( \frac{f_i(2f_o-1)}{f_o(2f_i-1)} \right) ^{0.41}\biggr \}^{10/3}\biggr ]^{-0.3}\nonumber \\&\quad \times \left[ \frac{\gamma ^{0.3}(1-\gamma )^{1.39}}{(1+\gamma )^{1/3}} \right] \left[ \frac{2f_i}{2f_i-1} \right] ^{0.41}\nonumber \\ x&= \left\{ \frac{D-d}{2}-\frac{3T}{4}\right\} ^2 + \left\{ \frac{D}{2}-\frac{T}{4}-D_b\right\} ^2 \nonumber \\&\quad - \left\{ \frac{d}{2}+\frac{T}{4}\right\} ^2 \nonumber \\ y&= 2\left\{ \frac{D-d}{2}-\frac{3T}{4}\right\} \left\{ \frac{D}{2}-\frac{T}{4}-D_b\right\} \end{aligned}$$
(51)

\(\phi _0=2\pi - 2\cos ^{-1} \left( \frac{x}{y} \right)\)

\(\gamma =\frac{D_b}{D_m}, f_i=\frac{r_i}{D_b}, f_o=\frac{r_o}{D_b}, T=D-d-2D_b\)

\(D=160, d=90, B_w=30, r_i=r_o=11.033\)

\(0.5(D+d) \le D_m \le 0.6(D+d), 0.15(D-d)\le D_b \le 0.45(D-d)\)

\(4 \le X \le 50\), \(0.515 \le f_i \le 0.6\) and \(0.515 \le f_o \le 0.6\)

\(0.4 \le K_{Dmin} \le 0.5, 0.6 \le K_{Dmax} \le 0.7\)

\(0.3 \le e \le 0.4\), \(0.02 \le \epsilon \le 0.1\), \(0.6 \le \zeta \le 0.85\)
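Using the fixed parameters above (\(D=160\), \(d=90\), \(B_w=30\)), the capacity objective and the assembly-angle constraint \(g_1\) can be sketched as follows. This is an illustrative Python rendering, not the AFT implementation; the function names are assumptions, and the test point used below is a near-optimal design commonly reported in the literature for this problem.

```python
import math

# Fixed parameters from the problem statement.
D, d, Bw = 160.0, 90.0, 30.0

def dynamic_load_capacity(Dm, Db, X, fi, fo):
    """Dynamic load-carrying capacity C_d of the rolling element bearing.

    Dm: pitch diameter, Db: ball diameter, X: number of balls,
    fi/fo: inner/outer raceway curvature factors.
    """
    gamma = Db / Dm
    fc = (37.91
          * (1 + (1.04 * ((1 - gamma) / (1 + gamma))**1.72
                  * (fi * (2 * fo - 1) / (fo * (2 * fi - 1)))**0.41)**(10 / 3))**-0.3
          * gamma**0.3 * (1 - gamma)**1.39 / (1 + gamma)**(1 / 3)
          * (2 * fi / (2 * fi - 1))**0.41)
    # Piecewise objective: the exponent on Db changes at Db = 25.4 mm.
    if Db <= 25.4:
        return fc * X**(2 / 3) * Db**1.8
    return 3.647 * fc * X**(2 / 3) * Db**1.4

def assembly_angle_constraint(Dm, Db, X):
    """g1: the assembly angle phi_0 must accommodate X balls (g1 <= 0)."""
    T = D - d - 2 * Db
    x = ((D - d) / 2 - 3 * T / 4)**2 + (D / 2 - T / 4 - Db)**2 - (d / 2 + T / 4)**2
    y = 2 * ((D - d) / 2 - 3 * T / 4) * (D / 2 - T / 4 - Db)
    phi0 = 2 * math.pi - 2 * math.acos(x / y)
    return phi0 / (2 * math.asin(Db / Dm)) - X + 1
```

For example, at the design \(D_m \approx 125.719\), \(D_b \approx 21.426\), \(X = 11\), \(f_i = f_o = 0.515\), this evaluation gives \(C_d \approx 8.18 \times 10^4\) with the assembly-angle constraint active, which is in the same range as the optimal costs compared in Table 31.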

Table 31 shows a comparison of the best solutions for the rolling element bearing design obtained by AFT and other optimization algorithms.

Table 31 Optimization results of the rolling element bearing design problem achieved by AFT and other algorithms

As per the optimum costs of the rolling element bearing design problem reported in Table 31, the AFT algorithm achieved the best design, with an optimal cost of about 85206.641. The statistical results of AFT and other optimization methods over 30 runs are shown in Table 32.

Table 32 Statistical results obtained from the AFT algorithm and others for the rolling element bearing design problem

It may be observed from Table 32 that the AFT algorithm has once again obtained the best optimal solutions for the rolling element bearing design problem compared with the other algorithms.

In a nutshell, the overall performance of the proposed AFT algorithm corroborates its reliability and efficiency in addressing the above five classical engineering design problems. We can therefore deduce that the AFT algorithm is an appropriate and effective optimizer and a promising candidate for solving real-world contemporary problems.

9 Conclusion and Future Work

This paper has proposed a novel human-based meta-heuristic algorithm called Ali Baba and the forty thieves (AFT) for solving global optimization problems. The performance of the AFT algorithm was benchmarked on sixty-two basic and challenging test functions drawn from three suites: the classic benchmark functions, the IEEE CEC-2017 benchmark functions and the IEEE CEC-C06 2019 benchmark functions. Several enhancements were incorporated into the AFT algorithm to strengthen its exploration and exploitation abilities. Extensive comparisons with many well-studied, recent and high-performance algorithms have shown that AFT is highly reliable and effective in finding near-optimal or optimal solutions for most of the test functions studied. On the practical side, the AFT algorithm was applied to five engineering design problems as evidence of its reliability and applicability to real-life applications. In future work, a parallel optimization algorithm could be developed by combining the AFT algorithm with other algorithms to further improve its performance. The AFT algorithm can also be extended by adapting, implementing and testing binary and multi-objective versions to solve large-scale real-world problems.