1 Introduction

In recent years, the expanding range of complex industrial and theoretical problems has increased the need for advanced optimization algorithms. Some of these algorithms are called meta-heuristics, and many of them are inspired by the social behaviors of creatures or by natural phenomena. Thanks to their random mechanisms, meta-heuristic algorithms can often escape from local optima and approach the best solution quickly.

1.1 Optimization algorithms

Typically, every meta-heuristic algorithm has two main phases: exploration and exploitation. In the exploration phase, the algorithm tries to probe the whole search space of the problem, and the movements of the search agents are long and random. In the exploitation phase, the search agents move around the promising solutions with short steps [1]. When developing a new meta-heuristic algorithm, finding a balance between exploration and exploitation is a significant challenge [2, 3]. Various meta-heuristic optimization algorithms with high computational efficiency have been introduced for global search [4]. One of the well-known nature-inspired optimization algorithms is the genetic algorithm (GA) [5], which imitates the Darwinian theory of evolution. Operators like selection, crossover, and mutation help the GA generate new solutions and avoid local optima. The other meta-heuristic algorithms used for comparisons in this paper are as follows: the particle swarm optimizer (PSO) [6], ant lion optimizer (ALO) [7], artificial bee colony (ABC) [8], artificial electric field algorithm (AEFA) [9], salp swarm algorithm (SSA) [10], biogeography-based optimization [11], satin bower bird optimization (SBO) [12], poor and rich optimization method (PRO) [13], water evaporation optimization (WEO) [14], multi-verse optimizer (MVO) [15], moth search (MS) [16], grasshopper optimization algorithm (GOA) [17], barnacles mating optimizer (BMO) [18], farmland fertility [19], symbiotic organisms search (SOS) [20, 21], butterfly optimization algorithm (BOA) [22], Harris hawks optimizer (HHO) [23], interactive search algorithm (ISA) [24], gray wolf optimizer (GWO) [25], and whale optimization algorithm (WOA) [3].

The GWO algorithm [25] mimics the hierarchical leadership that gray wolves exhibit as a social behavior for hunting. Four types of wolves participate in this hierarchy: alpha, beta, delta, and omega. According to experiments on unimodal benchmark functions, the GWO shows superior performance in the exploitation phase, while its exploration power is confirmed on multimodal functions. The GWO also performs well on engineering design and real-world problems with unknown search spaces. Mirjalili et al. demonstrated the superiority of the GWO over several other well-known optimization algorithms, attributing it to the high search speed and precision along with the simplicity of the GWO [25]. These strengths make the GWO a valuable component for new hybrid optimization algorithms.

The WOA is another optimization algorithm, inspired by the feeding behavior of humpback whales in nature. The three main activities in the WOA are searching for, encircling, and attacking the prey. The WOA shows high performance in solving different kinds of problems such as feature selection [26, 27] and data clustering [28, 29]. Nevertheless, the GWO and WOA algorithms have some drawbacks, such as slow convergence and getting stuck in local optima.

As the No Free Lunch (NFL) theorem [30] implies, no single heuristic method suits all problems; each performs well only on specific classes of problems. Various approaches have been proposed so far to develop effective and innovative meta-heuristics. Some improve existing methods by adding special evolutionary operators or local search steps. Others combine previous algorithms or apply chaotic maps within them. Hybrid algorithms usually benefit from one algorithm's high exploration ability and the other's exploitation ability [31]. Numerous hybrid meta-heuristics have been introduced so far; some of the recent ones are as follows: GSA-GA [32], PSO-GA [33], GA-GSA [34], PS-ABC [35], and TVAC-GSA-PSO [36].

Applying random number generators is necessary for implementing most optimization algorithms. Chaos theory in mathematics concerns deterministic dynamic systems that exhibit unstable, aperiodic variations. These systems are sensitive to their initial conditions, so a small change in the initial conditions may result in large variations in the outcomes. Chaos-based systems exhibit behavior similar to random systems. One application of chaotic maps in meta-heuristic optimization algorithms is adjusting the parameters, which increases the convergence speed and the possibility of escaping from local optima [37, 38]. Different types of chaotic maps have been applied in the following algorithms: the PSO [39], harmony search (HS) [40], ABC [41], imperialist competitive algorithm (ICA) [42], firefly algorithm (FA) [43], Krill-Herd (KH) [44], butterfly optimization algorithm (BOA) [45], GWO [46], WOA [47], and SSA [48]. The motivation for introducing a new hybrid algorithm in this paper is to develop an algorithm that solves a more extensive scope of problems by combining and modifying existing meta-heuristic methods.

1.2 Research objectives

This study aims to develop a hybrid meta-heuristic algorithm for solving optimization problems. This algorithm, named chaotic GWO and WOA (CGWW), merges the GWO and a modified WOA to gain the advantages of both algorithms. The CGWW algorithm applies the moving steps of the WOA and GWO simultaneously, so it has a rich set of evolutionary operators. Accordingly, the hybrid algorithm can obtain more promising solutions to the problem. The algorithm uses a reliable strategy to initialize the search agents and guide them toward near-optimal solutions. Several chaotic maps are applied in the initialization step and for adjusting the parameters of the proposed algorithm. In the CGWW algorithm, the roulette wheel selection operator is employed to select the target search agents based on their fitness values. The proposed algorithm thus preserves the elite search agents and tries to direct the search agents toward optimal solutions. From another viewpoint, the hybrid algorithm has the properties of a multi-swarm algorithm, with more diversity among population members. The CGWW algorithm is evaluated using 23 benchmark functions and a real-world application for intrusion detection systems. The results are then compared with those obtained by several state-of-the-art optimization methods.

The rest of this paper is organized as follows. A brief description of the applied and well-known meta-heuristic algorithms, the application of chaotic maps for optimization, and different kinds of chaotic maps are presented in Sect. 2. The structure of the proposed hybrid algorithm is presented in Sect. 3. In Sect. 4, the 23 mathematical benchmark problems, the performed experiments, and the generated results are presented. Finally, in Sect. 5, the conclusions and future work are described.

2 Preliminary

Swarm-based meta-heuristic optimization methods are developed based on the evolution of a group of solutions to a problem in a search space. High-quality solutions survive across the iterations of the optimization method so that the optimal solution can be obtained. In the following, the applied techniques and operating mechanisms of some of these meta-heuristic methods are summarized in Sect. 2.1. Using random number generators is necessary for implementing most optimization algorithms, and chaotic maps behave similarly to random number generators. In recent years, chaotic maps have been applied to improve the exploration ability of optimization algorithms. Instead of random generators, chaotic maps can generate a sequence of numbers that visits more places in the search space. They can be used to initialize the population and to adjust the parameters of meta-heuristic algorithms. Because of their dynamic characteristics and better coverage, chaotic maps let meta-heuristic optimization algorithms work at higher speeds [47, 49, 50]. Some meta-heuristic algorithms that apply chaotic functions are discussed in Sect. 2.2. The GWO and WOA algorithms are briefly discussed in Sects. 2.3 and 2.4.

2.1 Meta-heuristic optimization algorithms and hybrid methods

One of the oldest optimization algorithms is the GA, which is inspired by the natural selection process [5]. The GA applies different kinds of operators such as selection, mutation, and crossover. After some generations, the solutions with the highest fitness values survive. The differential evolution (DE) algorithm is much like the GA but with local search facilities [31, 51]. In the DE algorithm, all solutions have an equal chance of being selected for the next generation, whereas in the GA they are chosen based on their fitness. The moth-flame optimization algorithm (MFO) is built on the navigation mechanism of moths [52]. Moths fly in a straight line for long distances by keeping a constant angle to the moon, but they sometimes become trapped in a helix-shaped path around artificial lights. These straight-line and helix-shaped movements of moths implement the exploration and exploitation phases of the MFO algorithm, respectively. The group behavior of honeybees inspired the ABC algorithm [53, 54]. There are different groups of bees in the ABC, each with a specific task. The final goal of all the activities is to produce honey, which represents the solution to the investigated problem. The ABC algorithm has been applied to various theoretical and practical real-world problems [55, 56]. The ALO algorithm is inspired by the trapping and hunting method of ant lions [7]. Ant lions build traps in nature and use them to hunt ants; the stages of the ALO algorithm are designed based on this natural phenomenon. The AEFA, introduced by Anita et al., is developed based on electrostatic forces [9]. In the AEFA, a group of particles establishes a population wherein each particle carries a charge value. This value indicates the fitness of a particle, and the electrostatic force of a particle with a high charge attracts particles with lower charges. Hence, all particles try to move toward particles with higher fitness, which leads to finding the optimum solutions. The SSA, presented by Mirjalili et al., imitates the social behavior of salps [10]. It starts with some random salps, which forage and move in the ocean. The salp population moves toward the food sources, i.e., the salps with the maximum fitness values. After some iterations, near-optimal solutions in the search space are found.

In recent years, some hybrid meta-heuristic optimization algorithms have also been introduced. One of them is the GSA-GA algorithm, introduced by Garg for solving constrained optimization problems [32]. In the GSA-GA algorithm, the problem solutions are first processed with the gravitational search method and then enhanced with the GA operators. The PSO-GA is another hybrid method, also presented by Garg for solving constrained optimization problems [33]. In the PSO-GA algorithm, exploration and exploitation are performed by combining the GA operators with the PSO, and a parameter-free penalty function handles the problem constraints. The GA-GSA is another hybrid optimization algorithm, introduced by Garg for optimizing an industrial system [34]. The GA-GSA maximizes the availability, reliability, and maintainability parameters as objectives to increase the productivity and performance of the industrial system; fuzzy sets are applied to resolve the conflicts between the problem objectives in [34]. The PS-ABC is a hybrid algorithm proposed by Li et al. for solving high-dimensional optimization problems [35]. It combines the global search phase of the ABC algorithm with the local search phase of the PSO to obtain the global optimum, and it considers the aging degree of each search agent's personal best to decide which type of search to perform. Beigvand et al. introduced the hybrid TVAC-GSA-PSO algorithm with time-varying acceleration coefficients for solving large-scale Combined Heat and Power Economic Dispatch (CHPED) problems [36]. The TVAC-GSA-PSO algorithm is based on the self-adaptive learning strategy of swarm-based algorithms, using various particle movements and the Newtonian laws of gravitation.

2.2 Chaotic maps and optimization algorithms

Seven kinds of ABC algorithms are proposed in [41]. In these algorithms, chaotic maps are applied to adjust the parameter values, increasing the convergence speed and helping to escape from local optima. To strengthen the bat algorithm (BA) for global search, chaotic maps are applied in [49]. Thirteen chaotic maps are evaluated in [49], and four different chaotic variants of the BA are developed. Using the sinusoidal chaotic map as the pulse emission rate of the BA produced the CBA-IV algorithm, which performed best. A chaotic map called piecewise linear (PWLCM) is presented by Xiang et al. [57] and applied in the PSO to develop the PWLCPSO algorithm. Another PSO algorithm merged with chaotic maps is introduced in [58]; it uses the PSO together with a new operator named adaptive inertia weight factor (AIWF) to explore the search space. Talatahari et al. presented an optimization method called the chaotic imperialist competitive algorithm (CICA) [59]. The ICA is inspired by the socio-political evolution process of the world's countries. Different kinds of chaotic maps are implemented and evaluated in the CICA, and the results indicate the superiority of the logistic and sinusoidal maps. Chaotic maps are applied to set the light absorption and other parameters of fireflies in the firefly algorithm (FA) to increase its global search capability [43]; the Gaussian map as the absorption coefficient is reported to have the best performance. Chaotic maps are also applied inside the KH algorithm to accelerate its overall convergence [44]. Different types of krill movements are proposed using chaotic maps, among which the singer map performs best.

2.3 The gray wolf optimization algorithm

Mirjalili et al. proposed a meta-heuristic algorithm named gray wolf optimization (GWO), which imitates the leadership mechanism and hunting method of gray wolves in nature [25]. There are four types of wolves, known as alpha, beta, delta, and omega, in the GWO to simulate the leadership hierarchy. Moreover, the GWO algorithm has three fundamental steps: encircling, searching for, and attacking the prey. Among the four types, the alpha wolves are the leaders and manage the others: they control the hunting steps and make the principal decisions. The beta wolves are at the second level of the hierarchy and relay feedback from the others to the alphas. The next level of the hierarchy comprises the delta wolves, which dominate the last level, containing the omega wolves. Equation (1) calculates the distance of each wolf from the alpha, beta, and delta wolves, where X is the position of the current wolf in the search space.

$$ D_{\text{alpha}} = \left| C_{1} \cdot X_{\text{alpha}} - X \right|, \quad D_{\text{beta}} = \left| C_{2} \cdot X_{\text{beta}} - X \right|, \quad D_{\text{delta}} = \left| C_{3} \cdot X_{\text{delta}} - X \right| $$
(1)

To obtain the next position of the current wolf, the values X1, X2, and X3 are calculated by Eq. (2).

$$ X_{1} = X_{\text{alpha}} - A_{1} \cdot D_{\text{alpha}}, \quad X_{2} = X_{\text{beta}} - A_{2} \cdot D_{\text{beta}}, \quad X_{3} = X_{\text{delta}} - A_{3} \cdot D_{\text{delta}} $$
(2)

A, a, and C are control parameters of the algorithm, calculated by Eq. (3). Here r1 and r2 are two random values ranging from 0 to 1, and C is a value that increases the exploration ability of the GWO algorithm. The value of a decreases linearly from 2 to 0 during the iterations of the algorithm.

$$ A = 2a \cdot r_{1} - a, \quad C = 2 \cdot r_{2} $$
(3)

The next position of each wolf in the population is updated by Eq. (4).

$$ X(t + 1) = \frac{X_{1} + X_{2} + X_{3}}{3} $$
(4)
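To make Eqs. (1)–(4) concrete, the following is a minimal Python sketch of one GWO position update (the paper's experiments are implemented in MATLAB; the function name and NumPy representation here are illustrative assumptions):

```python
import numpy as np

def gwo_step(X, X_alpha, X_beta, X_delta, a, rng):
    """Update one wolf's position X using the alpha, beta, and delta leaders."""
    candidates = []
    for leader in (X_alpha, X_beta, X_delta):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        A = 2 * a * r1 - a                  # Eq. (3)
        C = 2 * r2                          # Eq. (3)
        D = np.abs(C * leader - X)          # Eq. (1)
        candidates.append(leader - A * D)   # Eq. (2)
    return sum(candidates) / 3.0            # Eq. (4): average of X1, X2, X3

# 'a' decreases linearly from 2 to 0: a = 2 - t * (2 / max_iterations)
```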

2.4 The whale optimization algorithm

The whale optimization algorithm (WOA) [3] works based on the bubble-net hunting method of humpback whales. In each iteration of the WOA, the search agent with the best fitness (X*) is selected, and the other search agents try to move their positions closer to X*. The distance of each agent to X* is calculated by Eq. (5), where C is a vector calculated by Eq. (6), X is the position vector, r is a random number between 0 and 1, and t is the iteration number.

$$ \vec{D} = \left| \vec{C} \cdot \vec{X}^{*}(t) - \vec{X}(t) \right| $$
(5)
$$ \vec{C} = 2 \cdot \vec{r} $$
(6)

The position of each agent in the next iteration is calculated by Eq. (7). A is a coefficient vector computed by Eq. (8), where \(\vec{a}\) decreases linearly from 2 to 0 over the iterations.

$$ \vec{X}(t + 1) = \vec{X}^{*}(t) - \vec{A} \cdot \vec{D} $$
(7)
$$ \vec{A} = 2\vec{a} \cdot \vec{r} - \vec{a} $$
(8)

The exploitation phase of the bubble-net hunting method contains two types of movements: encircling-shrinking and spiral-shaped movement around the prey. For the exploration phase, random moves are applied to search for prey. The encircling-shrinking is done by Eq. (7) as the value of \(\vec{a}\) decreases from 2 to 0 during the iterations, which shrinks the range of \(\vec{A}\); in this movement, each search agent (whale) tries to get its position closer to the best one (prey) of the previous iteration. In the spiral-shaped move, the search agent follows a spiral-shaped path around the best solution using Eq. (9), where b is a constant that defines the shape of the spiral and l is a random number between −1 and 1. The value of D′ in Eq. (9) indicates the distance between the selected agent and the prey, calculated by Eq. (10).

$$ \vec{X}(t + 1) = \vec{D}' \cdot e^{bl} \cdot \cos (2\pi l) + \vec{X}^{*}(t) $$
(9)
$$ \vec{D}' = \left| \vec{X}^{*}(t) - \vec{X}(t) \right| $$
(10)

The choice between the encircling-shrinking and spiral-shaped movements is made by a random variable p, with a 50 percent chance for each. In the exploration phase of the WOA, each population member moves toward a randomly selected search agent. This movement is called the random search of the WOA; it increases the population diversity and is implemented by Eqs. (11) and (12).

$$ \vec{X}(t + 1) = \vec{X}_{\text{rand}} - \vec{A} \cdot \vec{D} $$
(11)
$$ \vec{D} = \left| \vec{C} \cdot \vec{X}_{\text{rand}} - \vec{X} \right| $$
(12)

Xrand in Eq. (11) is the position of the randomly selected search agent. In this phase, A is a value greater than 1 or less than −1, calculated by Eq. (8); such values force the whales to move away from each other, which improves the diversity of solutions. The WOA initializes a population of whales and updates the whales' positions with the three types of movements mentioned above.
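The interplay of Eqs. (5)–(12) can be summarized in a short sketch. This is an illustrative Python rendering under the usual WOA conventions (scalar A and C per update, b the spiral constant), not the paper's MATLAB implementation:

```python
import numpy as np

def woa_step(X, X_best, X_rand, a, b, rng):
    """One WOA position update for whale X (minimization setting)."""
    r1, r2 = rng.random(), rng.random()
    A = 2 * a * r1 - a                       # Eq. (8)
    C = 2 * r2                               # Eq. (6)
    if rng.random() < 0.5:                   # p chooses the movement type
        if abs(A) < 1:                       # shrinking encircling (exploitation)
            D = np.abs(C * X_best - X)       # Eq. (5)
            return X_best - A * D            # Eq. (7)
        D = np.abs(C * X_rand - X)           # Eq. (12): random search (exploration)
        return X_rand - A * D                # Eq. (11)
    l = rng.uniform(-1, 1)                   # spiral movement (exploitation)
    D_prime = np.abs(X_best - X)             # Eq. (10)
    return D_prime * np.exp(b * l) * np.cos(2 * np.pi * l) + X_best  # Eq. (9)
```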

The CWOA algorithm is the chaotic version of the WOA [47]. In the CWOA, chaotic maps are applied to adjust the variables that balance the exploration and exploitation abilities of the WOA; ten types of chaotic maps are used and evaluated. An improved version of the whale optimization algorithm, called IWOA, has also been suggested to solve 0–1 knapsack problems of different scales, in single and multiple dimensions [60]. A negative penalty value is included in the evaluation function to detect invalid solutions found by the IWOA. Moreover, to balance exploration and exploitation, a local search strategy and Lévy flight trajectories are implemented in the IWOA.

3 Chaotic and hybrid GWO and WOA algorithm

In this section, a chaotic and hybrid algorithm of the GWO and WOA, called CGWW, is proposed; it uses chaotic maps for initialization and for adjusting its parameters.

3.1 Combination of the GWO and WOA algorithms

Exploration and exploitation are the two fundamental phases of an optimization algorithm. In the exploration phase, the algorithm tries to search the entire search space of the problem with long steps. In the exploitation phase, it investigates the search space around the best solutions found during exploration with small movements. The exploration phase of the WOA algorithm [3] is performed by the random search, while the exploitation phase is done by the encircling-shrinking and spiral-shaped moves. For some problems, however, the WOA gets stuck in local optima.

In this paper, a modified version of the WOA algorithm is combined with the GWO to improve the exploration and exploitation operations. In the random search of the WOA, a random whale is selected, as described by Eq. (11) in the previous section. As a modification, the selection of whales in this movement is performed by the roulette wheel operator to enhance the exploration mechanism of the WOA. In this way, whales with better fitness values are more likely to be selected and to become the targets that other whales move toward. The roulette wheel selection operator is one of the principal operators of the GA [5].
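A minimal sketch of the roulette wheel (fitness-proportional) selection used to pick target whales is shown below. Since the paper does not specify how raw objective values are scaled into selection probabilities for a minimization problem, the inversion used here is an assumption:

```python
import numpy as np

def roulette_select(objective_values, rng):
    """Return the index of one agent; lower objective values get higher weights."""
    weights = objective_values.max() - objective_values + 1e-12  # invert for minimization
    probabilities = weights / weights.sum()
    return rng.choice(len(objective_values), p=probabilities)

rng = np.random.default_rng(0)
fitness = np.array([3.2, 0.5, 1.7, 4.0])   # objective values of four whales
target = roulette_select(fitness, rng)     # whale 1 (value 0.5) is the likeliest target
```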

Another defect of the WOA algorithm is premature convergence to low-quality solutions. Two approaches are proposed in this paper to solve this problem. The first is combining the GWO with the WOA to develop a multi-swarm algorithm with high exploration capability. The second is applying chaotic maps to generate solutions with a high degree of diversity. Chaotic maps are used to initialize the population and to adjust various parameters and coefficients of the CGWW algorithm; the maps explained in the next section are applied for developing the CGWW. Some of the most significant properties of the CGWW algorithm are listed below.

  • The chaotic maps are applied to initialize the population of the CGWW algorithm.

  • The CGWW algorithm is developed by combining the GWO and WOA, and it has more diverse movements for the search agents in the search space of the problem.

    • The initial population of the CGWW is divided into two subpopulations. The GWO works on the first one, and the WOA works on the second one.

    • After each iteration, the first and second subpopulations are merged into a total population. Then the fitness evaluation is performed, and the search agents of the total population are divided between the first and second subpopulations based on their fitness values. In this division, both subpopulations receive approximately equal numbers of search agents with high fitness values (see the sketch after this list).

    • The roulette wheel selection operator is applied in the CGWW to select the search agents in the random movements, so fitter agents, rather than uniformly random ones, become the target whales.
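The paper does not spell out the division rule beyond requiring approximately equal shares of high-fitness agents in both subpopulations; dealing the sorted population out alternately is one natural reading, sketched below in Python:

```python
import numpy as np

def merge_and_redivide(population, fitness):
    """Sort the merged population by fitness and deal agents alternately,
    so the GWO and WOA halves get comparable numbers of good agents."""
    order = np.argsort(fitness)      # ascending: best first (minimization)
    ranked = population[order]
    gwo_subpop = ranked[0::2]        # ranks 1, 3, 5, ...
    woa_subpop = ranked[1::2]        # ranks 2, 4, 6, ...
    return gwo_subpop, woa_subpop
```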

The flowchart of the CGWW algorithm is presented in Fig. 1.

Fig. 1

Flowchart of the hybrid CGWW algorithm

Algorithm 1 shows the pseudo-code of the proposed CGWW algorithm.

[Algorithm 1 pseudo-code]

3.2 Chaotic maps and the proposed hybrid algorithm

Random number generators are needed for implementing most optimization methods. In most optimization algorithms, randomness is attained by applying uniform or Gaussian probability distributions. Chaotic systems in mathematics are deterministic, non-linear, dynamic systems with unpredictable, aperiodic changes. They are sources of randomness, and they can perform exploration at higher speeds compared to random searches that rely purely on randomness [43]. Because chaotic maps behave similarly to random systems, they are applied in many meta-heuristic optimization algorithms to improve the exploration phase. This improvement comes from increasing the diversity of population members using chaotic maps. In this way, meta-heuristic algorithms can perform the iterative search faster than standard random searches that use the probability distributions mentioned above. Chaotic systems are sensitive to their initial conditions, and a minor variation in them may result in essential changes in the results [49, 50]. Due to the dynamic behavior and non-repetition characteristics of chaotic maps, they are mainly applied to the initialization and parameter adjustment of meta-heuristic methods to increase the convergence speed and escape from local optima [37, 38]. Various chaotic maps have been designed by mathematicians, physicists, and other researchers; some of the most commonly used one-dimensional maps are presented below [61]. These maps are applied in this paper to implement the CGWW algorithm. In the following functions, k is the iteration number and xk is the kth chaotic value. All of the investigated chaotic maps are adjusted to generate values between 0 and 1.

  1. Chebyshev map: Eq. (13) defines this map [62].

    $$ x_{k+1} = \cos \left( k \cos^{-1} (x_{k}) \right) $$
    (13)
  2. Circle map: This map is defined by Eq. (14), which produces a chaotic sequence between 0 and 1 for a = 0.2 and b = 0.5 [63].

    $$ x_{k+1} = \left( x_{k} + b - \frac{a}{2\pi} \sin (2\pi x_{k}) \right) \bmod 1 $$
    (14)
  3. Gauss/mouse map: Equation (15) defines this map [61].

    $$ x_{k+1} = \begin{cases} 0 & x_{k} = 0 \\ \dfrac{1}{x_{k}} \bmod 1 & \text{otherwise} \end{cases}, \qquad \frac{1}{x_{k}} \bmod 1 = \frac{1}{x_{k}} - \left\lfloor \frac{1}{x_{k}} \right\rfloor $$
    (15)
  4. Iterative map: The iterative chaotic map with infinite collapses can be expressed by Eq. (16), with \(a \in (0, 1)\) [64].

    $$ x_{k+1} = \sin \left( \frac{a\pi}{x_{k}} \right) $$
    (16)
  5. Sine map: Eq. (17) expresses this map [65], where \(0 < a \le 4\).

    $$ x_{k+1} = \frac{a}{4} \sin (\pi x_{k}) $$
    (17)
  6. Singer map: Eq. (18) shows this one-dimensional map, where µ is a parameter between 0.9 and 1.08 [66].

    $$ x_{k+1} = \mu \left( 7.86 x_{k} - 23.31 x_{k}^{2} + 28.75 x_{k}^{3} - 13.302875 x_{k}^{4} \right) $$
    (18)
  7. Sinusoidal map: Eq. (19) expresses the sinusoidal chaotic map [67]; in the particular case of a = 2.3 and \(x_{0} = 0.7\), it simplifies to Eq. (20).

    $$ x_{k+1} = a x_{k}^{2} \sin (\pi x_{k}) $$
    (19)
    $$ x_{k+1} = \sin (\pi x_{k}) $$
    (20)
  8. Tent map: This map is similar to the well-known logistic map and is defined by Eq. (21) [68].

    $$ x_{k+1} = \begin{cases} \dfrac{x_{k}}{0.7} & x_{k} < 0.7 \\ \dfrac{10}{3} \left( 1 - x_{k} \right) & x_{k} \ge 0.7 \end{cases} $$
    (21)
  9. Sawtooth map: It can be formulated with the mod function, as in Eq. (22) [69].

    $$ x_{k+1} = 2 x_{k} \bmod 1 $$
    (22)
  10. Logistic map: Eq. (23) expresses this map [67]. With a = 4, the sequence remains in (0, 1) provided the initial value \(x_{0} \in (0, 1)\) and \(x_{0} \notin \{0.25, 0.5, 0.75\}\).

    $$ x_{k+1} = a x_{k} \left( 1 - x_{k} \right) $$
    (23)
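As an illustration of how such maps can stand in for a uniform random number generator, the sketch below implements three of the maps above in Python; the parameter defaults follow the values stated in the text:

```python
import numpy as np

def logistic(x, a=4.0):                 # Eq. (23)
    return a * x * (1.0 - x)

def tent(x):                            # Eq. (21)
    return x / 0.7 if x < 0.7 else (10.0 / 3.0) * (1.0 - x)

def circle(x, a=0.2, b=0.5):            # Eq. (14)
    return (x + b - (a / (2.0 * np.pi)) * np.sin(2.0 * np.pi * x)) % 1.0

def chaotic_sequence(chaotic_map, x0, n):
    """Iterate a map n times; each value can replace one uniform draw in [0, 1]."""
    values, x = [], x0
    for _ in range(n):
        x = chaotic_map(x)
        values.append(x)
    return np.array(values)

print(chaotic_sequence(logistic, 0.31, 5))
```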

3.3 Advantages of the proposed algorithm

The CGWW algorithm is developed to solve various kinds of continuous optimization problems with the diverse operators and procedures it borrows from the well-known GWO and WOA algorithms. In other words, the strengths of the GWO and WOA are integrated into the CGWW, along with the chaotic maps, to produce a high-performance meta-heuristic algorithm. Some advantages of the CGWW algorithm are as follows:

  • The CGWW algorithm has a high exploration power and can search the problem's search space effectively to find the promising regions. This characteristic is due to the search agents carrying diverse solutions from the combined algorithms.

  • Chaotic maps, thanks to their dynamic behavior, help optimization algorithms explore the search space more thoroughly, which is why they are applied in the CGWW algorithm. Initializing the search agents with the chaotic maps is a second source of the increased exploration ability of the proposed algorithm.

  • Also, during the iterations of the hybrid algorithm, the chaotic maps are applied to avoid stagnation of the solutions, which helps the algorithm find new promising regions in the search space.

  • The proposed algorithm achieves high performance in the exploitation phase due to the operators inherited from the GWO and WOA. These operators search around the promising solutions found in the exploration phase. This ability is further improved by using the roulette wheel selection operator to determine the target solutions for the search agents of the WOA part.

  • Regarding local optima avoidance, the CGWW algorithm performs well by balancing the exploration and exploitation abilities; the chaotic maps also help in establishing this balance. In the first iterations of the proposed algorithm, the diversity of search agents is high because the initial population is generated by the two strategies of the GWO and WOA, along with the chaotic maps. In the final iterations, the diversity decreases and the population members converge to optimal solutions.

  • Some mechanisms are applied to implement elitism in the proposed algorithm. The best solutions of each iteration are saved as elite solutions and inserted into the subsequent populations. Besides, the stagnation of each search agent is examined in each iteration: if its best position has not improved over the previous five iterations, the chaotic maps reinitialize the search agent (a sketch of this check follows the list).

  • The last advantage of the proposed CGWW algorithm is its applicability to most continuous optimization problems without any information about the search space; in other words, the proposed algorithm treats the given problem as a black box. Moreover, with some modifications to the fitness function, the proposed algorithm can also solve some kinds of multi-objective problems.
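A minimal sketch of the stagnation check mentioned above is given below; `chaotic_init`, which draws a fresh position from a chaotic sequence scaled to the search bounds, is a hypothetical helper:

```python
def refresh_if_stagnant(best_history, position, chaotic_init, window=5):
    """Reinitialize an agent if its personal best objective value has not
    improved over the last `window` iterations (minimization setting)."""
    if len(best_history) > window and best_history[-1] >= best_history[-1 - window]:
        return chaotic_init(position.shape)   # fresh chaotic position (assumed helper)
    return position
```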

3.4 The complexity of the proposed algorithm

Considering the structure and implementation method of the proposed CGWW algorithm, its computational complexity is evaluated and discussed in this section. Most meta-heuristic algorithms have limited memory usage, and the proposed algorithm is no exception. In terms of runtime, however, algorithms behave differently, and those with lower computational complexity perform better. For the CGWW algorithm, the number of wolves in the GWO part, the number of whales in the WOA part, the maximum number of iterations, the number of variables in a solution, the applied chaotic map functions, the population sorting method, and the roulette wheel selection operator determine the computational complexity. The merged population, containing the wolves of the GWO and the whales of the WOA, is sorted based on fitness value in each iteration of the CGWW. Whenever sorting is required in the CGWW, the quicksort algorithm is used, with a complexity of O(n²) in the worst case and O(n log n) in the best case. As mentioned in the previous sections, to improve the WOA, the roulette wheel selection operator is applied for selecting the target whales during movements; depending on the implementation, the complexity of this operator is O(n) or O(log n). In every iteration of the proposed algorithm, search agents that have shown no improvement over the last five iterations are reinitialized by the chaotic maps, but these operations require no function evaluations and their cost is trivial. Equation (24) presents the computational complexity of the CGWW algorithm.

$$ \begin{aligned} O(\text{CGWW}) = \; & O\big(t \times \big(O(\text{quicksort}) + O(\text{position updates of the GWO}) \\ & + O(\text{roulette wheel selection for the whales of the WOA}) \\ & + O(\text{position updates of the WOA})\big)\big) \end{aligned} $$
(24)
$$ O(\text{CGWW}) = O\left(t \times \left(n^{2} + \frac{n}{2} d + \frac{n}{2} \log n + \frac{n}{2} d\right)\right) = O\left(tn^{2} + tnd + \frac{tn \log n}{2}\right) $$

In Eq. (24), t is the maximum number of iterations, n is the number of search agents (the wolves of the GWO plus the whales of the WOA), and d is the dimension, i.e., the number of variables.

3.5 Tips for converting the proposed algorithm to a multi-objective optimization method

The proposed CGWW algorithm is intrinsically a single-objective optimization algorithm, but with some modifications it can be transformed into a multi-objective one. For a multi-objective problem, the final result is a set called the Pareto optimal solutions. The CGWW algorithm in its current form cannot be applied to multi-objective problems for the following reasons. First, in each iteration of the CGWW, two solutions are found and saved as the best ones, one for the GWO and one for the WOA, whereas the output of a multi-objective problem is multiple non-dominated solutions. Second, solving a multi-objective problem requires a set of non-dominated solutions (a Pareto set) for updating the positions of the remaining population members. Therefore, the first modification to the proposed algorithm would be to maintain a repository of non-dominated solutions, i.e., the Pareto set. This repository is very similar to the archive in the multi-objective particle swarm optimization algorithm (MOPSO) [70, 71]. During the execution of the multi-objective algorithm, the solutions obtained by the GWO and WOA are compared against the repository members using a dominance operator. If a solution dominates a member of the repository, that member is replaced by it. A solution found during the optimization process may also be non-dominated with respect to all repository members, in which case it has to be added to the repository. By applying these rules, the repository always holds the non-dominated solutions found so far. The repository should have a limited capacity, so when new non-dominated solutions arrive beyond capacity, some members must be removed. The simplest way to remove surplus solutions from the repository is at random, but deleting similar non-dominated solutions is a better approach: since the proposed multi-objective algorithm should obtain uniformly distributed Pareto optimal solutions, removing solutions from the populated regions of the repository is preferable [71]. To do this, the repository can be segmented so that similar solutions are grouped in the same segment. Then a rank is assigned to each repository member based on the number of its neighboring solutions, and solutions with higher rank numbers are selected for removal with the roulette wheel selection operator. After some iterations of the multi-objective algorithm, the repository members will be better distributed; in the ideal situation, the repository holds one solution per segment. Figure 2 illustrates this repository updating process for a problem with two objective functions, f1 and f2. The solid black circles are the candidates for removal from the repository in the example of Fig. 2.

Fig. 2

The repository updating process when it is full (solid black circles are candidates to be removed)

For the multi-objective CGWW algorithm, there should be a single shared repository for the GWO and WOA parts. In each iteration of the multi-objective CGWW, the GWO compares its alpha wolf with the repository members using the dominance operator, and the WOA does the same for its best-found whale.

There is more than one best solution in a multi-objective search space. Therefore, as mentioned previously, the method of selecting targets in the GWO and WOA for updating the positions of the remaining population members in each iteration is the second issue in developing the multi-objective version of the CGWW. This issue can be solved by selecting the targets at random, but a wiser way is to apply the same ranking method and roulette wheel operator used for removing surplus solutions from the repository. Here, however, the non-dominated solutions with lower ranks, i.e., those with less populated neighborhoods, have a higher chance of being selected as targets. For example, the non-dominated solution in the third row and second column of Fig. 2, which has no neighbor, has the highest selection probability. Algorithm 2 presents the pseudo-code of the proposed multi-objective CGWW algorithm.

[Algorithm 2 pseudo-code]

Algorithm 2 indicates that the multi-objective CGWW algorithm first initializes the total population using the chaotic maps. The algorithm then calculates the fitness of each solution and obtains the non-dominated ones. The repository is updated with the new non-dominated solutions; if the repository becomes full, some solutions with populated neighborhoods are selected using the roulette wheel operator and removed. In the next step, the alpha wolf and the best whale, both from the least populated neighborhoods, are chosen from the repository, again with the roulette wheel. The rest of Algorithm 2 is the same as Algorithm 1 for the single-objective CGWW.
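The dominance operator and the repository update described above can be sketched as follows; `remove_from_crowded_segments` stands in for the segment-based ranking and is a hypothetical helper:

```python
import numpy as np

def dominates(f_a, f_b):
    """True if objective vector f_a Pareto-dominates f_b (minimization)."""
    return np.all(f_a <= f_b) and np.any(f_a < f_b)

def update_repository(repo, candidate, capacity, remove_from_crowded_segments):
    """Insert a non-dominated candidate, evict members it dominates,
    and trim crowded segments when the capacity is exceeded."""
    if any(dominates(member, candidate) for member in repo):
        return repo                                   # candidate is dominated
    repo = [m for m in repo if not dominates(candidate, m)]
    repo.append(candidate)
    if len(repo) > capacity:
        repo = remove_from_crowded_segments(repo)     # drop from populated regions
    return repo
```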

In this section, some suggestions have been presented for transforming the CGWW algorithm into a multi-objective approach. As mentioned earlier, however, the CGWW algorithm is intrinsically single-objective in its current form. Still, with some modifications to the objective functions, multi-objective problems can be solved by a single-objective optimization algorithm to some extent. For instance, the GWO algorithm is applied to the feature selection problem in anomaly-based intrusion detection systems in [72], where a weighted-sum fitness function is built from the objectives of the problem. Therefore, to further evaluate the CGWW algorithm, a feature selection problem in intrusion detection systems is solved with the proposed CGWW algorithm in Sect. 4.6.

4 Results of experiments and discussion

Several experiments are conducted to evaluate the CGWW algorithm with some benchmark functions and a real-world application for intrusion detection systems. In the following sections, the details of the performed experiments and their conditions and results are presented.

4.1 Benchmark functions and the optimization algorithms

The CGWW algorithm is evaluated using unimodal, multimodal, and composite benchmark functions [73,74,75,76,77,78,79]. The first group of benchmark functions, called unimodal, includes seven members (F1 to F7), each of which has a single optimum. The unimodal benchmark functions, presented in Table 1, can evaluate the convergence power and exploitation ability of optimization algorithms. The second group, called multimodal (F8 to F13), has large dimensions and is presented in Table 2. Multimodal benchmark functions have many local optima and one global optimum, so they are more challenging than unimodal functions; they can evaluate the exploration and local optima avoidance abilities of optimization algorithms. The optimum value of F8 in Table 2 is a multiple of the problem dimension, which is 30 in the experiments; for the other benchmarks of Tables 1 and 2, the optimum value is zero. The last group of functions is composite, i.e., multimodal with fixed or small dimensions (F14 to F23). Composite benchmark functions, shown in Table 3, are rotated, biased, shifted, and combined versions of unimodal or multimodal ones [80, 81]. Most of the composite benchmarks have complicated shapes and many local optima, so they are more like real-world problems. For the composite benchmark functions, finding a balance between the exploration and exploitation operations is essential to find the global optimum.

Table 1 Unimodal benchmark evaluation functions
Table 2 Multimodal benchmark evaluation functions
Table 3 Composite benchmark evaluation functions
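The exact definitions are given in Tables 1–3; as a flavor of the two main difficulty classes, the sketch below shows the standard forms of the sphere function (the usual F1) and the Rastrigin function (a typical multimodal member of this suite):

```python
import numpy as np

def sphere(x):        # unimodal: a single optimum at the origin
    return np.sum(x ** 2)

def rastrigin(x):     # multimodal: many local optima, one global optimum
    return np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0)

x = np.zeros(30)                  # 30 dimensions, as in the experiments
print(sphere(x), rastrigin(x))    # both are 0 at the global optimum
```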

The proposed CGWW algorithm is compared with eight well-known optimization algorithms to evaluate its performance: the GA [5, 82], PSO [6], MFO [52], ABC [8, 54], SSA [10], AEFA [9], GWO [25], and WOA [3].

4.2 Experimental setup

The proposed CGWW and the compared algorithms have several execution parameters, and the initial values of the significant parameters are adjusted to evaluate the CGWW algorithm. Table 4 presents the initial parameter values of the compared algorithms, adjusted according to the values stated in the literature. The same initialization method is applied to all the algorithms so that they can be compared on an equal footing. The number of search agents and the number of iterations for all algorithms are set to 30 and 500, respectively. The parameter values of the proposed CGWW algorithm are a combination of the values for the GWO and WOA algorithms. The experiments and algorithms are implemented in Matlab R2018a on a system with an Intel Core i5 processor at 2.8 GHz, 8 GB of RAM, and Microsoft Windows 10 64-bit.

Table 4 Parameters values of the compared algorithms

The parameters of Table 4 for the PSO are adjusted according to the values recommended in [6, 83,84,85]. The inertia weight, between 0.4 and 0.9, adjusts the exploration and exploitation phases, while the acceleration rates c1 = c2 = 2 control the moving distance of the particles. For the MFO algorithm, the convergence parameter decreases linearly from −1 to −2, according to [52], while the abandonment limit parameter of the ABC is determined based on [8]. The GA has three parameters: crossover rate, mutation rate, and selection pressure. Crossover is the essential operator of the GA, and 0.7 is a reasonable rate for it; a high probability for this operator combines different solutions to generate new ones. The mutation rate should be low, since a high rate increases the diversity more than necessary and destroys high-quality solutions across generations [5]. The K0 and a are the two main parameters of the AEFA algorithm, used to set Coulomb's constant according to [9]. The most significant parameter of the SSA is c1, which establishes a balance between the exploration and exploitation mechanisms; the other parameters of the SSA, c2 and c3, are random numbers between 0 and 1, according to [10]. The GWO and WOA algorithms have a control parameter named a, which balances exploration and exploitation and changes from 2 to 0 according to [25]. The a2 parameter of the WOA and of the proposed hybrid algorithm varies between −1 and −2, while r1 and r2 are two random values, as stated in [3]. The chaotic maps are used to set the r1 and r2 values in this paper.

For implementing the CGWW algorithm, the chaotic maps described in Sect. 3.2 were used and compared. The comparison results showed that the iterative, circle, and sine maps perform better than the others. All the experiments in this paper were repeated 100 times to obtain more reliable results. For all compared algorithms, the average, standard deviation, worst, and best values of the obtained results are computed and reported as the performance metrics; the first two criteria show the stability of an algorithm in solving the investigated problems. For each test function, the best-obtained solution is shown in boldface in the tables.

4.3 Results of experiments on the benchmark functions

In the following sections, the comparison results of CGWW with eight popular meta-heuristic algorithms for solving the unimodal, multimodal, and composite benchmark test functions are presented to evaluate the proposed algorithm.

4.3.1 Results for solving the unimodal benchmark evaluation functions

The first group of test functions, named unimodal, has only one optimum solution. Accordingly, the unimodal functions (F1 to F7) are suitable for evaluating the convergence speed of optimization algorithms; a high convergence speed reflects a strong exploitation phase. Table 5 presents the execution results of the compared algorithms for the unimodal benchmark functions. The results indicate that the CGWW algorithm finds the best solutions on 6 out of 7 unimodal test functions. In the GWO algorithm, there is a hierarchy of wolves based on solution quality, and one of the responsibilities of this hierarchy is focusing on the high-quality solutions found so far, which improves the exploitation ability of the algorithm. The WOA, for its part, has different motion operators for the whales: the helix-shaped movement and the movement around the current best solution participate in the exploitation phase. Besides, the roulette wheel operator, which steers the whales toward high-quality solutions, improves the exploitation operation. The integration of the above operators and movements in the hybrid CGWW algorithm results in high performance in exploiting the best solutions to find the global optimum. The best, worst, average, and standard deviation results in Table 5 indicate that the proposed CGWW algorithm obtains the minimums for the F1, F2, F3, F4, and F7 functions. For F5, the minimum best solution belongs to the PSO algorithm, but for the other evaluation metrics the CGWW finds the minimum. For F6, the proposed algorithm does not perform acceptably: the best answer belongs to the AEFA, and the minimum worst, average, and standard deviation values are obtained by the SSA. Overall, the CGWW has a high capability for finding the global optimum of the unimodal functions and shows high precision in the exploitation phase.

Table 5 Results of compared algorithms on the unimodal benchmark functions

4.3.2 Results for solving the multimodal benchmark evaluation functions

The second group of test functions, named multimodal (F8 to F13), has one global optimum and many local optima. As the problem dimension increases, the number of local optima grows exponentially in the multimodal functions. These functions can evaluate the strength of optimization algorithms in escaping from local optima and hence estimate their power in the exploration phase. Table 6 presents the execution results of the compared algorithms for the multimodal benchmark functions. The results indicate that the CGWW algorithm finds the best solutions on 5 out of 6 multimodal benchmark functions. The exploration power of the CGWW algorithm comes from several features. First, the GWO, as a part of the CGWW, has remarkable proficiency in exploring the search space and avoiding local optima. Second, the WOA, with its various moving strategies, can avoid local optima through its random movement in most cases and explore the search space thoroughly. Third, the chaotic maps are applied in the initialization phase and during the iterations of the hybrid algorithm to solve the stagnation problem of the search agents. Thus, the proposed algorithm is prevented from getting stuck in local optima, and its exploration ability is improved. The obtained results in Table 6 indicate that the CGWW algorithm finds the minimum values for F9, F10, and F11. For the F8 function, the minimum best solution is found by the proposed algorithm, but the minimum worst and average solutions are obtained by the GA. For the F12 and F13 problems, the CGWW algorithm finds the minimum average, worst, and standard deviation values, while the minimum best answer is found by the PSO algorithm. Therefore, the CGWW shows high performance in the exploration phase and can explore the search space to reach promising areas and the global optimum for most of the multimodal benchmark functions while avoiding local optima.

Table 6 Results of compared algorithms on the multimodal benchmark functions

4.3.3 Results for solving the composite benchmark evaluation functions

The third group of test functions, named composite (F14 to F23), has fewer local optima and dimensions than the multimodal group. To solve the composite benchmark functions, which is an exceptionally challenging task, optimization algorithms must establish a balance between the exploration and exploitation operations. Table 7 presents the execution results of the compared algorithms on the composite benchmark functions. According to the obtained results, the CGWW algorithm finds the best solutions on 8 out of 10 composite benchmark functions. The GWO and WOA algorithms both balance the exploration and exploitation operations acceptably, and the hybrid CGWW algorithm inherits this skill from both. The chaotic maps and the roulette wheel operator also help the hybrid algorithm perform the exploration and exploitation phases more accurately. The hybrid algorithm therefore focuses on exploration with long, sudden movements in the early iterations and then performs exploitation with short, gradual movements in the final iterations. For the F14, F15, F19, F20, and F23 composite benchmark functions, the CGWW algorithm finds the minimum best, worst, average, and standard deviation values. The compared algorithms behave similarly in finding the minimum best, worst, and average values for the F16, F17, and F18 functions, but for the standard deviation the PSO is better than the other algorithms. Finally, the ABC is superior to the others on the F21 and F22 functions in finding the minimum best, worst, average, and standard deviation values. The results of Table 7 confirm that the CGWW is capable of solving complicated optimization problems, because the search spaces of the composite test functions are very similar to those of real-world problems. Therefore, the CGWW balances the exploration and exploitation phases well enough to solve problems with complicated search spaces.

Table 7 Results of compared algorithms on the composite benchmark functions

4.4 Convergence analysis of proposed algorithm

In this section, the convergence speed of the proposed algorithm is compared with that of the other optimization methods. The purpose of the convergence analysis is to demonstrate the exploration and exploitation mechanisms of the CGWW algorithm. The convergence curves of the GA, PSO, ABC, MFO, GWO, WOA, SSA, AEFA, and CGWW algorithms for some of the benchmark test functions are shown in Fig. 3a and b. For each iteration of the mentioned algorithms, the best solution's value, averaged over 100 executions, is plotted as the convergence curve. The search agents investigate the promising areas of the search space during the iterations of the CGWW and perform the exploitation operations: they move at high velocity in the beginning steps and then gradually converge, at a lower speed, to a near-optimal solution. Figure 3 demonstrates that the convergence performance of the CGWW algorithm for the F1, F2, F6, F7, F12, and F13 benchmark functions is much better than that of the other algorithms. The convergence curve of the proposed algorithm is very similar to those of the WOA and GWO for F5; the competition between the WOA and the CGWW is very close, but after iteration 65 the CGWW wins. The same holds for the F21 function between the ABC and CGWW algorithms. For F18, the GWO algorithm has the best convergence curve among the compared algorithms. Finally, for F22 and F23, the ABC algorithm converges best. These results imply that the CGWW shows the fastest convergence for most of the benchmark functions; for the remaining benchmarks, its results are competitive with those of the other compared meta-heuristic optimization methods.

Fig. 3

Convergence curves of compared optimization algorithms

4.5 Results of the proposed algorithm and the literature

In this section, the results of the CGWW are compared with existing results in the literature, applying the experimental conditions reported there to the execution of the proposed algorithm. Table 8 presents the results adopted from [3, 25] for the WOA, PSO, DE, and GWO, along with the results of the CGWW under the same experimental conditions, for solving the 23 benchmark test functions. The number of iterations and search agents for all the algorithms is set to 500 and 30, respectively. The results in Table 8 indicate that the CGWW algorithm finds the best solutions on 14 out of 23 benchmark functions. The results of applying the GA, ABC, SBO, and ALO algorithms to the 13 unimodal and multimodal benchmark functions, extracted from [12], are presented in Table 9. Following the experimental conditions in [12], the number of iterations for the proposed algorithm is set to 1000. The results in Table 9 indicate that the CGWW algorithm finds the best solutions on 12 out of 13 benchmark functions. Together, Tables 8 and 9 show that the CGWW algorithm performs remarkably well on unimodal, multimodal, and composite benchmarks by establishing a proper balance between the exploration and exploitation stages.

Table 8 Results of the CGWW algorithm in comparison with the results in [3, 25]
Table 9 Results of CGWW algorithm in comparison with the results in [12]

4.6 Application of the proposed algorithm for intrusion detection

4.6.1 Feature selection and intrusion detection systems

Intrusion detection systems are classified into two main groups: anomaly detection and misuse (signature-based) detection [72, 86]. Anomaly detection systems build profiles of normal network events, and network events that do not match these profiles are classified as attacks. In misuse detection systems, attacks are detected based on their signatures and according to known attack methods. Such systems cannot identify new network attacks because they decide based on previously known attack types, whereas anomaly detection systems can detect new attacks that do not conform to the normal profiles. The drawback of anomaly detection systems is more false alarms, caused by misclassifying previously unseen normal events. Hybrid models of anomaly and misuse detection provide the ability to detect attacks more efficiently [86]. Because each network event in an intrusion detection system has a large number of features, feature reduction is a significant problem for speeding up attack detection. In feature selection, an optimal subset of features that represents the entire dataset is selected by removing irrelevant and duplicate features. The optimal feature set is then used to construct a classifier with high detection accuracy, whose job is to classify network events and separate normal events from attacks. Several classifiers are applied in intrusion detection systems. In this experiment, the proposed CGWW algorithm is used to select the optimal subset of the available features for network intrusion detection, and the other optimization algorithms mentioned above are compared with it on this feature selection problem.

One of the simplest and most widely used forms of Bayesian networks is the Naive Bayesian network (NBN). The NBN is a data mining tool used in various fields, including as a classifier in intrusion detection systems. Its simplicity stems from the assumption that its constituent features are independent of each other [87]. Since the focus of this experiment is on the feature selection step of intrusion detection systems, the NBN is applied as a simple classifier over the features selected in the previous step.
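For illustration, a minimal sketch of this classification step is given below, assuming scikit-learn's GaussianNB and numeric NumPy feature matrices; the helper name and its arguments are hypothetical.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def classify_with_subset(X_train, y_train, X_test, selected):
    """Train a Naive Bayes classifier on the selected feature columns only.

    `selected` is the index set produced by the feature selection stage;
    X_train and X_test are assumed to be numeric (n_samples, 41) arrays.
    """
    nbn = GaussianNB()
    nbn.fit(X_train[:, selected], y_train)
    return nbn.predict(X_test[:, selected])
```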

4.6.2 The NSL KDD99 dataset

The NSL KDD99 dataset is adopted for the feature selection experiments. It is an updated version of the KDD99 dataset in which most of the original defects are fixed [88]. There are no duplicate records in the training and test sets of NSL KDD99. The dataset contains 125,973 and 22,544 records in the training and test sets, respectively. Each record in the NSL KDD99 dataset includes 41 features of three types: nominal, binary, and numeric. The dataset contains both normal and anomaly records; the anomaly records are categorized according to Table 10.

Table 10 Different types of attacks in NSL KDD99
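Since the NBN classifier sketched above requires numeric inputs, the nominal features must be encoded before classification. The following minimal sketch assumes the records are loaded into a pandas DataFrame and uses integer (label) encoding so that each record keeps exactly 41 columns; the column names are the usual NSL KDD99 nominal attributes, but the helper itself is illustrative only.

```python
import pandas as pd

# The nominal attributes of NSL KDD99; binary and numeric columns are left
# untouched so each record keeps its 41 features.
NOMINAL = ["protocol_type", "service", "flag"]

def to_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Replace each nominal column with integer codes (label encoding)."""
    out = df.copy()
    for col in NOMINAL:
        out[col] = pd.factorize(out[col])[0]
    return out
```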

4.6.3 Evaluation criteria and fitness function for the feature selection problem

In this experiment, three evaluation criteria are used to compare the different optimization algorithms for feature selection [89]. The first criterion is the Detection Rate (DR), the percentage of anomaly events that are correctly classified. The second criterion is the False Positive Rate (FPR), the percentage of normal network events incorrectly identified as attacks. The Accuracy Rate (AR) is the third criterion, defined as the percentage of correctly classified events. The DR and AR are expected to be high, while low values are desired for the FPR. The DR and FPR are two conflicting objectives for the feature selection problem; therefore, similar to the approach presented in [72], a weighted sum of the DR and the complement of the FPR is applied as the fitness function in this experiment (in [72], however, the number of selected features and the AR are the objectives). The fitness function used to evaluate the feature subsets obtained by the optimization algorithms is presented in Eq. (25).

$$ \text{Fitness of each subset} = m \cdot DR + n \cdot (1 - FPR) $$
(25)

Because the DR value is more important for identifying attacks, we set m = 0.7 and n = 0.3. Another important measure in the feature selection problem is the size of the selected feature subset. For simplicity, the number of selected features is fixed at k = 5, 10, 15, and 20. A more thorough way to account for all measures would be a multi-objective algorithm with an additional objective for the number of selected features.
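As an illustration, the fitness of Eq. (25) can be computed from the classifier's predictions as sketched below; the label convention (1 for attack, 0 for normal) and the function name are assumptions made for this example.

```python
import numpy as np

M, N = 0.7, 0.3  # the weights m and n of Eq. (25)

def fitness(y_true, y_pred):
    """Weighted sum of the detection rate and the complement of the FPR."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    dr = np.mean(y_pred[y_true == 1] == 1)   # attacks correctly flagged
    fpr = np.mean(y_pred[y_true == 0] == 1)  # normal events flagged as attacks
    return M * dr + N * (1.0 - fpr)
```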

4.6.4 Solution representation

For the feature selection problem, each search agent of every algorithm encodes a solution. A solution is an array of 41 floating-point numbers, one per feature; each number represents the importance of the corresponding feature in the feature set. After the iterations of each algorithm, the k most significant features of the candidate solution are selected to perform classification over the test set. For example, consider the solution represented in Eq. (26), with ten numbers for ten features. If k = 5, features 3, 4, 6, 8, and 10 hold the five highest values in the array, so they are selected for the classification.

$$ \text{A sample solution for 10 features} = [12.3, 15, 16.7, 32, 11, 87.9, 11.2, 17.2, 15.7, 42] $$
(26)
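Decoding a candidate solution therefore reduces to taking the indices of the k largest values, as in the following minimal sketch (the helper name is illustrative only):

```python
import numpy as np

def select_top_k(solution, k):
    """Return the 0-based indices of the k highest-valued features."""
    return np.argsort(solution)[-k:]

solution = [12.3, 15, 16.7, 32, 11, 87.9, 11.2, 17.2, 15.7, 42]
features = sorted((select_top_k(solution, 5) + 1).tolist())  # 1-based, as in the text
print(features)  # -> [3, 4, 6, 8, 10]
```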

4.6.5 Comparison of optimization algorithms for feature selection

The proposed CGWW algorithm is compared with the GA, PSO, GWO, ABC, SSA, and AEFA for solving the feature selection problem of intrusion detection systems. Table 11 shows the DR and FPR values obtained by the compared algorithms on the NSL KDD99 dataset.

Table 11 shows that the CGWW outperforms the other algorithms for k = 5 and k = 10. For k = 15, the DR value of the CGWW is lower than that of the ABC and higher than those of the others. For k = 20, the DR value of the CGWW is lower than those of the ABC and SSA and higher than those of the remaining algorithms. Table 11 suggests that 10 is a reasonable choice for the number of selected features. Table 11 also shows that the proposed CGWW algorithm yields a high FPR for k = 5 and k = 20. When the number of selected features is 10 or 15, the FPR value of the CGWW algorithm is lower than that of two and four of the compared algorithms, respectively. Table 12 shows the AR values obtained by the compared algorithms on the NSL KDD99 dataset.

Table 11 The intrusion detection and false-positive rate of compared algorithms for k = 5, 10, 15 and 20

The results in Table 12 show that the proposed CGWW algorithm achieves a better AR than six of the compared algorithms for k = 10 and k = 15 selected features. For k = 5 and k = 20, however, the AR value of the CGWW algorithm is less competitive. According to the results of Tables 11 and 12, the CGWW algorithm finds the second-best solutions in most cases for the feature selection problem in intrusion detection systems.

Table 12 The accuracy rate of compared algorithms for k = 5, 10, 15 and 20

The conducted experiments indicate that the proposed CGWW algorithm presents competitive performance compared with the other optimization algorithms on both the benchmark functions and the real-world application.

5 Conclusion

In the present paper, a chaotic and hybrid optimization algorithm named CGWW has been presented. The CGWW algorithm was developed by combining the GWO and an improved WOA to exploit the strengths of both algorithms. Applying chaotic maps instead of random generators in the initialization and controlling parameters of the CGWW algorithm also improves the exploration mechanism. The main modification of the WOA is the use of the roulette wheel selection method to choose search agents with better scores, which improves the exploitation mechanism of the CGWW. The second advantage of the proposed algorithm is its multi-swarm characteristic: a group of whales from the WOA cooperates with the gray wolves of the GWO to establish a more diverse population. To evaluate the proposed algorithm, 23 benchmark problems were employed. Furthermore, the CGWW algorithm was evaluated on the feature selection problem in intrusion detection systems. The experimental results showed that the CGWW algorithm offers remarkably competitive outcomes compared with other optimization algorithms such as the PSO, MFO, ABC, GA, SSA, SBO, ALO, AEFA, GWO, and WOA.

There are some limitations in comparing the CGWW with some of the newer optimization approaches. For some algorithms, the source code was inaccessible or a precise implementation was impossible; for others, the investigated benchmark functions differed from those applied in this paper. Therefore, the CGWW algorithm may be compared with other state-of-the-art meta-heuristics in future work. The proposed algorithm also has some disadvantages. First, it cannot find the optimum solution for one of the unimodal, one of the multimodal, and two of the composite benchmark functions. Hence, better results may be obtained by further investigation of some parameters of the hybrid algorithm: the parameters balancing exploration and exploitation, and the sizes of the sub-populations. Second, as the computational complexity analysis of the CGWW algorithm indicates, adding the roulette wheel selection operator and sorting the total population of the hybrid algorithm to balance high-quality solutions between sub-populations increases the running time of the algorithm. Using chaotic maps also consumes more computational time than a standard random generator does. Nevertheless, the complexity order of the CGWW algorithm does not increase in comparison with those of the other compared meta-heuristics. Finally, developing binary and multi-objective versions of the CGWW algorithm for solving practical optimization problems in different areas could be a direction for future work.