1 Introduction

Metaheuristics [1, 2], which are typically inspired by physical phenomena, animal behaviour or evolutionary concepts, have become very popular over the last few years. In addition to a large number of theoretical works, metaheuristics have been applied to different real-life problems in various fields of study. The reasons behind the popularity of metaheuristics are simplicity, flexibility, a derivative-free mechanism and the capability to avoid local optima [3]. First, metaheuristics are mostly based on very simple concepts from nature and are easy to understand. This simplicity helps researchers simulate different natural phenomena, propose new metaheuristics, hybridize two or more metaheuristics for better efficiency or modify current metaheuristics. Second, flexibility refers to the quick applicability of metaheuristics to different real-life problems without any special changes in the structure of the algorithm, since they treat optimization problems as black boxes. For a metaheuristic, we need to be concerned only about the input(s) and output(s) of a problem. Normally, all a researcher needs to know is how to represent a problem for a metaheuristic by an objective function to be minimized or maximized. Third, most metaheuristics have derivative-free mechanisms, since they optimize problems stochastically. The metaheuristic process is initiated with random solution(s), called the population, and calculation of derivatives of the search space is not necessary. These characteristics make metaheuristics highly appropriate for real-life optimization problems with expensive or unknown derivative information. Finally, due to their stochastic nature, metaheuristics have a superior capability to avoid local optima compared with conventional optimization methods and can also search the entire search space extensively. Usually, the search space of real-life optimization problems is very complex and unknown, with many local optima; hence, metaheuristics are very efficient for optimizing these types of problems.

Although there are different types of metaheuristics, two phases of the search process, exploration (global search) and exploitation (local search), are basic characteristics of all of them. The exploration phase is the process of finding the promising area(s) of the overall search space. Exploitation, on the other hand, refers to searching around the promising regions obtained in the exploration phase. A suitable balance between these two search phases is a crucial task for any metaheuristic and is normally achieved using different stochastic operators.

Generally, metaheuristics can be classified into two groups based on the number of initial solutions: single-solution-based and population-based. In the case of single-solution approaches (for example, Simulated Annealing [4]), the optimization starts with a single candidate solution, which is then modified over the course of iterations. For population-based metaheuristics, the search process is initialized with a random initial population (a set of solutions), and this population is changed during the iterations. Population-based metaheuristics have some advantages over single-solution-based algorithms: greater exploration capability, less chance of sticking in local optima, and the ability to cover more of the promising search space by sharing information among the population. Metaheuristics may also be divided into three major classes on the basis of their nature: Evolutionary, Physics-based and Swarm Intelligence (SI) algorithms.

Usually, Evolutionary Algorithms (EAs) are inspired by the concepts of evolution in nature, such as mutation. Some of the popular EAs are the Genetic Algorithm (GA) [5, 6], Differential Evolution (DE) [7], Evolutionary Programming (EP) [8] and the Biogeography-Based Optimizer (BBO) [9]. The use of several operators such as selection, crossover and mutation, together with computational inefficiency, are the main disadvantages of EAs.

The second branch of metaheuristics is physics-based techniques. Such algorithms typically mimic physical rules, such as gravitational force, ray casting, electromagnetic force, inertia force and weights. Some of the most popular algorithms are Big-Bang Big-Crunch (BBBC) [10], the Gravitational Search Algorithm (GSA) [11], the Black Hole (BH) algorithm [12] and the Galaxy-based Search Algorithm (GbSA) [13]. Although a large number of physics-based metaheuristics exist, they are not so popular because of their complex nature and because their search strategies are not very good.

Finally, the most popular and efficient subclass of metaheuristics is the SI methods. According to Bonabeau et al [14], SI is “The emergent collective intelligence of groups of simple agents”. These algorithms mostly mimic the social behaviour of swarms, herds, flocks or schools of insects and animals in nature, where the search agents navigate using the simulated collective and social intelligence of creatures. Some of the advantages of SI algorithms are easy implementation, fewer parameters to adjust, fewer operators to be used and the capability of storing previous results. The most popular SI techniques are Particle Swarm Optimization (PSO) [15], the Bat Algorithm (BA) [16], Cuckoo Search (CS) [17], the Flower Pollination Algorithm (FPA) [18], the Firefly Algorithm (FA) [19], Ant Colony Optimization (ACO) [20], Artificial Bee Colony (ABC) optimization [21] and the Social Spider Algorithm (SSA) [22]. For example, the PSO algorithm simulates the social behaviour or movement of particles such as bird flocking or fish schooling. It uses the particle best and global best positions to update the velocity and position of the particles, i.e., the solutions. PSO is very popular for its good convergence speed and exploration capability. Moreover, it has few optimization parameters that need to be tuned properly. Many modified variants of PSO [23,24,25,26] have already been proposed. BA searches for the optimal value of a function on the basis of the foraging of prey using the echolocation behaviour of bats, by updating frequency, velocity and position. However, BA requires many variables to be properly tuned for a better search [16]. CS is based on the brood parasitism of cuckoo birds, which lay their eggs in the nests of other host birds, where the highest quality nests with eggs (i.e., best solutions) are carried over to the next generation. Due to the use of Lévy flights, the algorithm is very efficient and performs better than PSO [17], but requires more computational time. FPA is typically associated with the transfer of pollen for reproduction or flowering of plants by different pollinators such as insects. Due to long-distance pollinators and flower consistency [18], FPA can explore a larger search space and converges more quickly. FA is based on the gradual movement of fireflies towards brighter light at night. The main disadvantage of FA is that its computational complexity is very high and it requires a lot of computational time [19]; moreover, its convergence speed is not up to the mark. The social intelligence of ants in finding the shortest path between the nest and a food source using pheromone is the main inspiration for ACO [20]. It can be used in several dynamic applications, but its theoretical analysis is quite difficult, as the probability distributions change with the iterations and the sequences of random decisions; moreover, the time to convergence is also uncertain. On the other hand, ABC [21] mimics the collective behaviour of bees in finding food sources, i.e., honey. The advantages of ABC are robustness and fast calculation; however, it suffers from a major drawback: the search space is limited by the initial solution. The SSA algorithm [22] is based on the foraging behaviour of social spiders and their information-sharing foraging strategy. It is conceptually simple but involves a large number of probabilistic variables.

The No Free Lunch (NFL) theorem [27] is worth mentioning here. The NFL theorem logically proves that there is no single metaheuristic best suited for solving all kinds of optimization problems; a particular algorithm might show very promising results on one set of problems but worse results on a different set. Therefore, proposing new metaheuristics and modifying current approaches (with good convergence speed, good accuracy, less computational time, fewer parameters to be tuned, and good exploration and exploitation capability) is still a very fascinating field of study for computer science researchers.

This paper introduces a new SI-based optimization algorithm, namely the elephant swarm water search algorithm (ESWSA), which is based on the water search strategy of elephant swarms, and applies it to real-life problems. The rest of the paper is organized as follows. In the next section, we discuss some social behaviours and communication techniques of elephants that inform the proposed ESWSA optimization technique. In section 3, the ESWSA metaheuristic is proposed and elaborated. Later, we perform experiments on the global optimization of several numerical benchmark functions and carry out statistical analysis to validate the efficiency of the proposed algorithm. The results of ESWSA are also compared with those of several state-of-the-art techniques such as BA, CS, FPA and PSO. Results regarding the performance of ESWSA on two well-known constrained optimization problems in the field of engineering are given in section 5. Following this, ESWSA is applied to a real-life optimization problem: it is used to reconstruct or infer a Gene Regulatory Network (GRN) from time-series gene expression data. Results analysis and discussion are given in section 7. The conclusion is given in section 8, followed by the references.

2 Behaviours of elephants

Elephants are the largest living terrestrial animals [28] of the family Elephantidae, traditionally categorized into two species, the African and Asian elephants [29]. African elephants are 3–4 m tall and weigh 4,000–7,000 kg, while Asian elephants are 2–3.5 m in height and weigh 3,000–5,000 kg [30]. Elephants are recognized by their large ears, long trunk, long tusks, large limbs and huge body structure. Elephants are often found to live in a “fluid fission–fusion” social environment [31]. In their natural habitat, elephants live in herds comprising 3–35 individuals. The number of elephants in one herd can vary depending on the weather and the availability of food, water, etc. Living in groups is a social habit that plays an important role in protecting the herd members and finding resources for living [32]. Each elephant herd is guided by the biggest and oldest adult cow, while the adult bulls join the herd only periodically for breeding. Meanwhile, young cows and calves stay in the group.

Elephants are well known for their good memory [33] and exhibit advanced intelligence [34] such as self-recognition, an indication of self-awareness and cognition that has also been demonstrated in some apes and dolphins [35]. One study of a captive female Asian elephant suggested that the animal was capable of learning and distinguishing between several visual and some acoustic discrimination pairs [36]. Elephants are also known to use tools.

Social mammals like elephants have very advanced sensing and communication systems; they use all of their senses of hearing, smell, vision and touch, along with an exceptional ability to detect vibrations [37]. Among the different communication techniques, seismic, acoustic and chemical communication are used for long-distance communication, up to 10–12 km away. On the other hand, visual and tactile communications are considered short-distance communication. For seismic communication, elephants are able to pick up seismic signals (i.e., 10–40 Hz vibrations on the earth’s surface caused by the rumbles and movement of elephants), to orient in the direction the vibrations come from and even to respond to them appropriately, using mechano-receptors [38] in the toes or feet and the tip of the trunk, which are extremely sensitive to vibrations. Elephants produce a broad range of acoustic (i.e., sound) signals, from infrasonic rumbles as low as 5 Hz to sounds of over 10,000 Hz [38]. These animals use their ears like a parabola, scanning back and forth while remaining still to listen to low-frequency sounds from other distant elephants. The trunk can amplify audible sounds. The use of chemical or olfactory cues is central to communication between elephants [38]. They often raise their trunks up to smell the air, or use the tips of their trunks to explore the ground or trees as well as to sniff other elephants. The trunk is a fusion of the nose and upper lip and is made up of millions of olfactory receptor cells. On the other hand, the eyesight of elephants is said to be good in dull light but considerably reduced in bright light, reaching a maximum range of 46–100 m [38]. Elephants use many different displays and gestures to communicate with one another. In visual communication, the head, eyes, mouth, ears, trunk and even the whole body are used for signalling messages to one another or to other species. Elephants are extremely tactile animals. They touch each other purposefully using their trunk, tusks, feet, tail, etc. Tactile communication between elephants occurs to express aggressive, defensive, sexual and exploratory behaviour [38].

During dry weather, lack of water creates an overwhelming problem for all animals, especially for huge animals like elephants. In spite of the ruthless living conditions created during drought, elephants are well equipped to survive by finding water resources. Elephants may utilize one or more communication methods to search for water resources, depending on their current conditions. An adult elephant drinks approximately 40–60 gallons per day on average from rivers, water holes, lakes, ponds, etc. If the area is very dry and the duration of the drought is very long, elephants can migrate to other surrounding areas where water is available or plentiful. Typically, they migrate as far as required to find food and water. If the drought area is small, the elephants usually do not go far away; when the drought area is very large, elephants can travel to more remote areas in search of water, where they stay until the rainy season. They dig with their feet, trunks and tusks into dry stream beds or other spots to discover ample water lurking below the surface. Elephants also help other animals by creating or finding new water sources. Elephants show an unselfish and social behaviour during adverse situations such as drought. They communicate and share information among different elephant groups for better water searching, which is another secret of their longevity.

3 Elephant swarm water search algorithm

We can now idealize some characteristics of the water search strategy of elephants in order to develop ESWSA. For simplicity, the following four idealized rules are used to describe the proposed algorithm.

  (i) Elephants roam around in several groups in search of water during drought; these groups together constitute the elephant swarm. Each group consists of a number of elephants, and all groups (i.e., the elephant swarm) work together to find water. The leader of each group (the oldest cow) is responsible for deciding the movement of the group to search for the best water resource. For an optimization problem, each elephant group is identified by its particular velocity and position, and each elephant group of the swarm corresponds to a candidate solution of the problem.

  (ii) Whenever an elephant group finds a water resource, the leader communicates (via seismic, acoustic, chemical, visual and tactile communication) the quantity and quality of the water to the other groups of the swarm. For a maximization problem, the fitness value and the objective function are directly proportional to the quantity and quality of the water resource. A better water level denotes a better solution.

  (iii) Elephants have a very sharp memory. Each elephant group can remember the best location of water supply discovered by its own group so far (local best solution) and the best location of water source discovered so far by the whole swarm, i.e., by all groups (global best solution). Based on these memories (solutions), the elephant group can move from one point to another, i.e., the velocity and position of each elephant group are updated gradually during the search according to some rules (the rules for global search, local search and position update are described later). The long-distance and short-distance communication techniques of elephants are dominant in the global and local search, respectively.

  (iv) Water search in the local and global area is controlled by a probabilistic constant called the switching probability p ∈ [0, 1]. The leader of the group takes a probabilistic decision to switch between local search and global search during the water search. Due to physical proximity and other factors, such as the attenuation of signals over large distances, local water search can constitute a significant fraction p of the overall search activity.

For a d-dimensional optimization problem, the position of the i-th elephant group of a swarm (consisting of N elephant groups) at the t-th iteration is given as \( X_{i,d}^{t} = (x_{i1}, x_{i2}, \ldots, x_{id}) \) and the velocity is represented by \( V_{i,d}^{t} = (v_{i1}, v_{i2}, \ldots, v_{id}) \). The locally best solution found by the i-th elephant group up to the current iteration is given as \( P_{best,i,d}^{t} = (P_{i1}, P_{i2}, \ldots, P_{id}) \) and the global best solution is denoted by \( G_{best,d}^{t} = (G_{1}, G_{2}, \ldots, G_{d}) \). Initially, the elephant groups (positions and velocities) are randomly placed throughout the search space. As the iterations proceed, the velocities and positions of the elephant groups are updated according to some rules.

Most water finding activities can occur at both local and global scales. In practice, adjacent water searches in the not-so-far-away neighbourhood are more likely to be executed by a group than those far away. For this, a constant known as the switching probability p is used to switch between global and local water searches. It is assumed that if the value of a random number is greater than p, a common global water search will be performed; otherwise, an intense local water search will be executed. This randomized condition helps reduce the probability of sticking at local optima. The global and local best solutions are updated after each iteration. As the iterations proceed, the velocities of the elephant groups are updated in different ways for global and local search, according to the following equations, depending on the value of the parameter p:

$$ V_{i,d}^{t + 1} = V_{i,d}^{t} \omega^{t} + rand\left( {1,d} \right) \odot \left( {G_{best,d}^{t} - X_{i,d}^{t} } \right)\quad {\text{if}}\,rand > p \left[ {\text{for global search}} \right] $$
(1)
$$ V_{i,d}^{t + 1} = V_{i,d}^{t} \omega^{t} + rand\left( {1,d} \right) \odot \left( {P_{best,i,d}^{t} - X_{i,d}^{t} } \right)\quad {\text{if}}\,rand \le p \left[ {\text{for local search}} \right] $$
(2)

where rand(1, d) generates a d-dimensional array of random values within [0, 1]; ⊙ denotes element-wise multiplication; \( \omega^{t} \) is the inertia weight at the current iteration, which balances exploration and exploitation. Then, the position of an elephant group is modified according to the following equation.

$$ X_{i,d}^{t + 1} = V_{i,d}^{t + 1} + X_{i,d}^{t} $$
(3)

\( t_{max} \), \( X_{max} \) and \( X_{min} \) denote the maximum iteration number, the upper boundary and the lower boundary of the positions, respectively. Over the course of the iterations, the elephant groups gradually update their positions and finally reach the best water resource position, i.e., the best solution of the optimization problem. The pseudo-code of the proposed ESWSA is given as follows.

figure a: Pseudo-code of the proposed ESWSA.
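For illustration, the following is a minimal Python sketch of the loop described above, combining Eqs. (1)–(3) with the LDIW inertia schedule of Eq. (6) (introduced in section 4.1) and simple clipping to \( [X_{min}, X_{max}] \); the function names, the velocity initialization and the boundary handling are illustrative assumptions rather than the exact implementation used in the experiments.

```python
import numpy as np

def eswsa(fitness, dim, x_min, x_max, n_groups=50, t_max=2000, p=0.6,
          w_max=0.9, w_min=0.4, seed=None):
    """Minimal sketch of ESWSA: Eqs. (1)-(3) with an LDIW inertia schedule (Eq. (6))."""
    rng = np.random.default_rng(seed)
    # Random initial positions and (assumed) random initial velocities of the elephant groups
    X = rng.uniform(x_min, x_max, (n_groups, dim))
    V = rng.uniform(-1.0, 1.0, (n_groups, dim))
    P_best = X.copy()                                   # local best position of each group
    P_val = np.array([fitness(x) for x in X])           # local best fitness values
    g = int(np.argmin(P_val))
    G_best, G_val = P_best[g].copy(), P_val[g]          # global best of the whole swarm

    for t in range(t_max):
        w = w_max - (w_max - w_min) * t / t_max         # inertia weight omega^t
        for i in range(n_groups):
            if rng.random() > p:                        # global water search, Eq. (1)
                V[i] = V[i] * w + rng.random(dim) * (G_best - X[i])
            else:                                       # local water search, Eq. (2)
                V[i] = V[i] * w + rng.random(dim) * (P_best[i] - X[i])
            X[i] = X[i] + V[i]                          # position update, Eq. (3)
            X[i] = np.clip(X[i], x_min, x_max)          # keep the group inside [X_min, X_max]
            f = fitness(X[i])
            if f < P_val[i]:                            # update the local best water source
                P_best[i], P_val[i] = X[i].copy(), f
                if f < G_val:                           # update the global best water source
                    G_best, G_val = X[i].copy(), f
    return G_best, G_val

# Example: minimize the 10-dimensional De Jong (sphere) function
best_x, best_f = eswsa(lambda x: float(np.sum(x ** 2)), dim=10, x_min=-100, x_max=100)
```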

Our proposed ESWSA optimization technique is slightly different from the standard PSO metaheuristic. In the case of PSO, the velocity update formula, i.e., the new search direction, is always affected by three components: the current velocity (\( V_{i,d}^{t} \)), the current particle memory influence (\( P_{best,i,d}^{t} \)) and the swarm memory influence (\( G_{best,d}^{t} \)) [39]. The particle memory influence is associated with the current best position of a particle, and the swarm memory influence is associated with the global best position among all particles. Hence, it may be considered that the particle memory influence is responsible for the local search and the swarm memory influence is responsible for the global search of the optimization. It is interesting to note that random parameters are incorporated along with the global or local search terms, so that the chance of the metaheuristic sticking at local optima is reduced.

However, in the case of ESWSA, the velocity update formula, i.e., the new search direction, is affected either by the current velocity and the current elephant memory influence or by the current velocity and the swarm memory influence, depending on the probabilistic value p, also known as the switching probability. Hence, depending on the value of p, the velocity is updated on the basis of either the current elephant best or the global best solution, i.e., ESWSA switches from local search to global search or vice versa on the basis of the switching probability. During the local search of ESWSA, the velocity is updated according to the current elephant best position, i.e., the search is around the current best solution. During the global search of ESWSA, the velocity is updated according to the global best position of the elephants, i.e., the search is towards the global best solution. In the case of PSO, however, the search is simultaneously affected by both the particle best and the global best solutions. This is the main difference between ESWSA and PSO with respect to the formal hypothesis. Figure 1a and b depicts the velocity and position update procedure that determines the new search direction during an iteration for PSO and ESWSA, respectively.

Figure 1. Depiction of velocity and position update in (a) PSO [39] and (b) the proposed ESWSA.

4 Study of ESWSA for global optimization

To validate the proposed ESWSA algorithm, we have chosen 10 different benchmark functions [40, 41]: De Jong, Rosenbrock, Schwefel P2.22, Noisy Quadric, Rotating hyper-ellipsoid, Ackley, Easom, Griewank, Rastrigin and Alpine. Among these functions, the first five are unimodal and the rest are multimodal. The details of these functions, the respective search ranges of the variables and the corresponding global minimum points are presented in table 1.

Table 1 Different benchmark functions for global optimization.

4.1 Parametric sensitivity of ESWSA

Selection of suitable parameters of ESWSA is a very important task in achieving the best performance on numerical and real-world optimization problems. Use of a trial-and-error scheme or random selection of parameters for a real-life problem may lead to high computational cost and low efficiency. Normally, researchers use several benchmark functions to tune the parameters of an optimization algorithm to achieve the best performance; such tests can also reveal important characteristics of the parametric sensitivity of the algorithm. We carry out extensive simulations on our benchmark functions, which cover a wide range of optimization problems; thus, the derived rules of thumb can be expected to give generally good performance on unknown problems. In ESWSA, the following parameters are employed to guide the search behaviour.

  • Inertia weight (\( \omega^{t} \)): a deterministic parameter (its value changes throughout the search according to a pre-defined rule) that weights the velocity at the current iteration.

  • Switching probability (p): a fixed parameter (its value remains constant throughout the whole search) that denotes the probability of switching between local and global water search.

  • Maximum iteration number (\( t_{max} \)) and population size (N) are also fixed parameters of ESWSA optimization.

4.1a Performance on the basis of inertia weight (\( \omega^{t} \)): Many strategies have been proposed for updating the inertia weight over the course of iterations in PSO. However, we consider only three inertia update techniques for ESWSA. In the Constant Inertia Weight (CIW) technique [42], the value of the inertia weight is constant (normally 0.5 is preferred) throughout the iterations. Thus, CIW can be described using the following equation:

$$ \omega^{t} = constant. $$
(4)

In the case of Random Inertia Weight (RIW) [16], the value of the inertia weight is selected randomly; this strategy is very efficient for finding the optima in a dynamic system. For RIW, the value of the inertia weight is assigned using the following equation:

$$ \omega^{t} = 0.5 + rand/2 $$
(5)

where rand is a function that generates a random number within [0, 1].

Linearly Decreasing Inertia Weight (LDIW) [15, 43, 44] is a very popular and efficient technique for improving the fine-tuning characteristics of PSO, where the value of the inertia weight depends linearly on the iteration number. In LDIW, the value of ω is decreased linearly from an initial large value (\( \omega_{max} \)) to a final small value (\( \omega_{min} \)) according to the following equation:

$$ \omega^{t} = \omega_{max} - \left\{ {\frac{{\omega_{max} - \omega_{min} }}{{t_{max} }}} \right\}t $$
(6)

where t is the iteration index and \( t_{max} \) denotes the maximum number of iterations.
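For reference, the three inertia-weight schedules of Eqs. (4)–(6) can be written as small helper functions; the following is a sketch in Python, with the constant 0.5 and the bounds \( \omega_{max} \) = 0.9 and \( \omega_{min} \) = 0.4 taken from the settings used later in the paper.

```python
import random

def ciw(constant=0.5):
    """Constant Inertia Weight, Eq. (4): the same value at every iteration."""
    return constant

def riw():
    """Random Inertia Weight, Eq. (5): a random value in [0.5, 1.0)."""
    return 0.5 + random.random() / 2

def ldiw(t, t_max, w_max=0.9, w_min=0.4):
    """Linearly Decreasing Inertia Weight, Eq. (6): decreases from w_max to w_min."""
    return w_max - (w_max - w_min) * t / t_max
```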

These three techniques were then applied to the 10-dimensional benchmark problems to observe the impact of these strategies on ESWSA. For this, a swarm of 50 elephant groups (the population) and a maximum of 1000 iterations are considered. The value of the switching probability is set to 0.5. All the techniques were simulated using Matlab 7.6 with 2 GB RAM, a Dual Core processor and the Windows 7 operating system. Each function is tested 50 times for each of the inertia weight update strategies. The resulting median of the best solution and the deviation from it for each benchmark function and each strategy are shown in figure 2 using box plots.

Figure 2. Box plot of raw simulation results for different benchmark functions.

It can be clearly seen that LDIW performed better than the other strategies, as the median of the best solution for LDIW is always nearer to the global minimum and smaller than those of the other two. Moreover, the variation in output is minimum for LDIW. Therefore, we use the LDIW technique for the rest of our experiments and simulations.

4.1b Performance on the basis of switching probability (p): Next, ESWSA with LDIW is applied to all the afore-mentioned 10-dimensional benchmark functions for different values of p, with the same iteration count and population size. The value of p is selected from the set {0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.99}. The experiment is performed 50 times for each of the functions. The performance evaluation criteria are the minimum fitness value, the mean of the fitness values, the median of the fitness values and their standard deviation. Table 2 shows the values of p that yield the minimum fitness, mean, median and standard deviation. It is observed that, for different functions, the best solution is achieved for different values of p. Therefore, we looked for the value of p that most often gives the best performance. For example, the minimum fitness is obtained for benchmark function number 7 at p = 0.6. It can also be observed that p = 0.6 gives better performance in most cases under the different evaluation criteria. Therefore, it can be concluded that p = 0.6 is the most suitable value for numerical and real-life optimization problems, and from this point onwards we use p = 0.6 in all further simulations of the proposed ESWSA optimization technique.

Table 2 Performance of ESWSA for different values of p.

4.1c Performance on the basis of population and maximum iteration number: Next, the performance of ESWSA optimization is observed for different population sizes and maximum iteration numbers. For each function, the population size and maximum iteration number are selected from the sets {25, 50, 75, 100} and {1000, 2000, 3000, 4000, 5000}, respectively. We use the LDIW technique and p = 0.6 for the 10-dimensional function minimization problems. The experiment is performed 50 times for each case, and table 3 shows the mean value of the best fitness for each function and each combination of population size and maximum iteration number. It is observed that the mean fitness is quite satisfactory for all functions except Rosenbrock and Rastrigin, although the values of the best fitness were very small for these two functions. For all unimodal functions (except Rosenbrock), the mean value decreases considerably as the population size and maximum iteration number increase. Moreover, for the multimodal functions, increasing the maximum iteration number has less effect on the mean fitness, whereas increasing the population size decreases the mean fitness considerably. However, increasing the population and iteration numbers leads to more computational time. Therefore, based on table 3, N = 50 and \( t_{max} \) = 2000 are preferred in this work, as well as for other real-life optimization problems, as they balance accuracy and computational time.

Table 3 Performance of ESWSA for different values of population and maximum iteration number.

4.2 Comparison with other state-of-art optimization techniques

In this section, we present the simulation results of ESWSA on the different benchmark functions described in table 1. Moreover, we perform a comparison between ESWSA and other algorithms and a statistical analysis of the simulation results. For comparison purposes, we have selected some well-known optimization methods, namely BA [16, 45, 46], CS [17, 47, 48], FPA [18, 49] and PSO [15, 43, 44]. For all algorithms, the population size and maximum iteration number are set to 50 and 2000, respectively. The search space dimension is restricted to 10, i.e., we consider 10-dimensional function optimization problems. Each algorithm is executed 50 times for each function. The parameter settings for each algorithm in the comparison are as follows:

  1. For ESWSA, the value of the switching probability (p) is set to 0.6 and the inertia weight factor decreases linearly from 0.9 to 0.4.

  2. For BA, the loudness decreasing factor (α), pulse rate decreasing factor (γ), minimum frequency and maximum frequency are set to 0.9, 0.9, 0 and 1, respectively, based on earlier work [16, 45, 46].

  3. For CS, the discovery rate of alien eggs (\( p_{a} \)) is set to 0.25, the same as in previous work [17, 47, 48].

  4. For FPA, the probability switch (p) is fixed at 0.8, following the guidelines provided in [18, 49].

  5. For PSO, the acceleration constants (C1 and C2) are both set to 2 and the inertia weight factor decreases linearly from 0.9 to 0.4, as recommended by the authors [15, 43, 50].

In these numerical experiments, we test and compare the efficiency of the proposed algorithm on the basis of several criteria, namely fitness, reliability, convergence and computational complexity, which are described in the following subsections.

4.2a Fitness test: The final output or fitness value of an optimization algorithm is the most important criterion for proving its efficiency. Using the afore-mentioned parameter settings, we optimized each benchmark function. Here, we observed 3 output criteria, namely the best (minimum) fitness, the worst (maximum) fitness and the median fitness, obtained over 50 program runs. Comparative studies based on these criteria are shown in tables 4–6, where the best values are shown in italics.

Table 4 Comparative study based on best fitness.

From table 4, it can be observed that ESWSA reaches the global minimum point with satisfactory accuracy for all functions except Rastrigin. For the Rastrigin function, only PSO was able to reach the global minimum point. Moreover, for 7 benchmark functions, ESWSA gives the best (minimum) fitness, and these values are far better than those of the other algorithms. BA is the least efficient in this respect.

Next, we compare the algorithms based on the worst (maximum) fitness, whose value should be as small as possible. From table 5, it can be observed that the proposed ESWSA achieves the minimum value of the worst fitness for 7 benchmark functions, whereas CS and PSO each achieve the minimum value of the worst fitness in only 2 cases.

Table 5 Comparative study based on worst fitness.

Next, we observe the median of the 50 best fitness values corresponding to the 50 runs. It is clearly observed (table 6) that the median values for ESWSA are better than those of the other algorithms for all benchmark functions except Rastrigin and Rosenbrock, for which CS performs better.

Table 6 Comparative study based on median fitness.

4.2b Reliability test: An optimization algorithm should reach as close to the global minimum as possible, i.e., it should be successful and reliable in every single run. However, due to the random nature of metaheuristics, the output may differ between runs. Therefore, in this subsection, we test the reliability of ESWSA and compare it with the other algorithms on the basis of standard deviation and success rate.

From table 7, it is clearly noticeable that ESWSA gives output with a very small standard deviation for all functions except Rosenbrock and Rastrigin, whereas the other algorithms yield comparatively larger deviations. For 7 cases, ESWSA achieves the least deviation, while its nearest competitor, CS, has the least standard deviation for 4 cases. This shows that ESWSA produces output with less variation, i.e., it is a more reliable optimization process than the other algorithms. Again, BA has the largest standard deviation for all benchmark functions.

Table 7 Comparative study based on standard deviation.

A simulation is considered successful if and only if the best-found fitness value is smaller than or equal to the success threshold. The overall success rate is calculated as the number of successful runs under a specific success threshold divided by the total number of runs. Thus, a larger overall success rate implies a more reliable algorithm. Here, a run is considered successful if the absolute value of the best-found fitness is smaller than 0.1, for all benchmark functions except the Easom function; for the Easom function, a run is considered successful if the fitness value is less than –0.9 and greater than –1.1. From table 8, it is observed that the proposed ESWSA has a 100% success rate for all functions except Griewank, Rosenbrock and Rastrigin. It is interesting to note that CS has a higher average success rate than ESWSA: 85% and 82%, respectively. For Rastrigin, almost every algorithm has a 0% success rate except PSO, which has only a 4% success rate. BA is the worst algorithm among these.

Table 8 Comparative study based on success rate.
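As a small illustration of how the success rate is computed, the following Python sketch applies the thresholds stated above; the list of best-found fitness values in the example is hypothetical.

```python
def success_rate(best_fitness_per_run, target=0.0, threshold=0.1):
    """Fraction of runs whose best-found fitness lies within `threshold` of `target`."""
    successes = sum(abs(f - target) < threshold for f in best_fitness_per_run)
    return successes / len(best_fitness_per_run)

# Easom: the global minimum is -1, so success means a fitness between -1.1 and -0.9
rate = success_rate([-1.02, -0.95, -0.30], target=-1.0)   # 2 of 3 runs succeed
```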

4.2c Convergence test: The final result comparison cannot completely describe the searching performance of an algorithm. Hence, we further conduct a convergence test of the compared algorithms on each 10-D benchmark function, examining the convergence speed of the proposed ESWSA algorithm against the others. For this purpose, we calculated the mean of the best-found fitness over all 50 runs at each iteration index and plotted it for all algorithms and all functions, as shown in figure 3. It is observed that ESWSA converges faster than the other algorithms except for the Rotating hyper-ellipsoid, Griewank and Noisy Quadric functions, for which CS performs better than ESWSA. PSO has the worst convergence speed. Overall, ESWSA has a satisfactory convergence speed, which is highly desirable for an optimization algorithm; in most cases, ESWSA converges within 250 iterations.

Figure 3. Convergence of different benchmark functions.

4.2d Computational efficiency test: Besides the previous tests, computational complexity is also a major factor in evaluating the efficiency of an evolutionary computation algorithm. For this purpose, we studied the average execution time taken by each algorithm for each benchmark function, which indicates the computational efficiency of the algorithm. Table 9 shows a comparative study based on average execution time. It is clearly shown that ESWSA is faster than all other algorithms for all functions except Rosenbrock. PSO and ESWSA have almost the same performance regarding computational complexity, while CS takes the longest time to execute. This confirms the efficiency of our proposed ESWSA optimization technique.

Table 9 Comparative study based on average execution time.

4.2e Overall performance: We now summarize the performance of ESWSA based on the afore-mentioned evaluation criteria and compare it with the other techniques. For this purpose, we assigned a performance score to each algorithm for each criterion. The value of this score is calculated as the ratio of the number of cases (functions) in which an algorithm achieves the best result for a criterion to the total number of cases. Table 10 shows the comparative study based on these scores to evaluate the overall efficiency of our proposed algorithm. It is clearly noticeable from table 10 that, for all criteria, ESWSA performed better than the others; only in the case of success rate do CS and ESWSA have similar performance. Therefore, it can be concluded that ESWSA is one of the most efficient techniques for global optimization.

Table 10 Comparative study based on score.

4.3 Scalability of ESWSA

In addition to the 10-dimensional benchmark function tests, we also performed a series of simulations on high-dimensional (both 30-D and 50-D) benchmarks to test the scalability of ESWSA. Here, we used N = 100 and \( t_{max} \) = 5000 for the simulations, and the corresponding results are shown in table 11. It is observed that ESWSA gives very good accuracy and success rate for the De Jong, Schwefel 2.22, Noisy Quadric, Rotating hyper-ellipsoid, Griewank and Alpine functions in both cases. In general, the performance of ESWSA degrades for high-dimensional multimodal function optimization, while for unimodal, large-dimensional problems it provides satisfactory accuracy. If the population size (number of elephant groups) and iteration number are increased, accuracy is expected to improve, but computational time will also increase. Therefore, for a real-life optimization problem, sufficient population and iteration numbers need to be chosen to balance accuracy and efficiency.

Table 11 Scalability test of ESWSA.

5 Application of ESWSA for constrained optimization

In order to evaluate the optimization power of the proposed ESWSA in the presence of constraints, two engineering design problems are considered and solved: the three-bar truss and the tension/compression spring design problems. These two problems have objective functions, constraints and decision variables of different natures. In this paper, the constraints of each problem are handled directly: if a solution cannot satisfy all constraints together, it is not considered a feasible solution and is consequently abandoned.

5.1 Three-bar truss design problem

In the three-bar truss design problem [51, 52], the primary objective is to minimize the volume of a statically loaded three-bar truss subject to the stress (\( \sigma \)) constraints on every truss member, given the length (l) and pressure (P), by optimizing the cross-sectional areas (i.e., \( x_{1} \) and \( x_{2} \)). Figure 4 shows a schematic of the three-bar truss design problem. This constrained optimization problem consists of three nonlinear inequality constraints and two continuous decision variables, and the corresponding nonlinear fitness function is given as follows:

$$ \begin{aligned} & {\text{Min}} .\quad f\left( x \right) = \left( {2\sqrt 2 x_{1} + x_{2} } \right)l \\ & {\text{s}} . {\text{t}} .\\ & \quad g_{1} \left( x \right) = \frac{{\sqrt 2 x_{1} + x_{2} }}{{\sqrt 2 x_{1}^{2} + 2x_{1} x_{2} }}P - \sigma \le 0 \\ & \quad g_{2} \left( x \right) = \frac{{x_{2} }}{{\sqrt 2 x_{1}^{2} + 2x_{1} x_{2} }}P - \sigma \le 0 \\ & \quad g_{3} \left( x \right) = \frac{1}{{\sqrt 2 x_{2} + 2x_{1} }}P - \sigma \le 0 \\ & \quad 0 \le x_{i} \le 1;i = 1, 2 \\ & \quad l = 100\, {\text{cm}}, P = 2\,{\text{kN}}/{\text{cm}}^{2} ,\sigma = 2\,{\text{kN}}/{\text{cm}}^{2} . \\ \end{aligned} $$
(7)
Figure 4. Three-bar truss design problem [52].
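As an illustration of the direct (rejection-based) constraint handling described at the beginning of this section, the following Python sketch implements the objective and constraints of Eq. (7) and returns an infinite fitness for infeasible solutions. Returning infinity as the rejection device is an assumption of this sketch, and the resulting function could, for example, be minimized with the ESWSA sketch of section 3.

```python
import math

L_TRUSS, P_LOAD, SIGMA = 100.0, 2.0, 2.0    # l = 100 cm, P = 2 kN/cm^2, sigma = 2 kN/cm^2

def three_bar_truss(x):
    """Eq. (7): truss volume, with infeasible solutions rejected (fitness = +inf)."""
    x1, x2 = x
    if not (0.0 <= x1 <= 1.0 and 0.0 <= x2 <= 1.0):
        return float('inf')                              # outside the variable bounds
    denom = math.sqrt(2) * x1 ** 2 + 2 * x1 * x2
    if denom <= 0.0 or (math.sqrt(2) * x2 + 2 * x1) <= 0.0:
        return float('inf')                              # degenerate cross-sections
    g1 = (math.sqrt(2) * x1 + x2) / denom * P_LOAD - SIGMA
    g2 = x2 / denom * P_LOAD - SIGMA
    g3 = 1.0 / (math.sqrt(2) * x2 + 2 * x1) * P_LOAD - SIGMA
    if g1 > 0 or g2 > 0 or g3 > 0:
        return float('inf')                              # constraint violated: abandon solution
    return (2 * math.sqrt(2) * x1 + x2) * L_TRUSS        # volume to be minimized

# e.g., best_x, best_vol = eswsa(three_bar_truss, dim=2, x_min=0.0, x_max=1.0)
```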

Table 12 shows the performance of different metaheuristics for constrained optimization of the three-bar truss design problem [52]. We considered six criteria for the comparison, with a population of 50 and 2000 iterations for each metaheuristic. The program is executed 10 times and the corresponding best, worst, mean and median fitness, standard deviation and execution time are presented in table 12. It is seen that all the metaheuristics can reach close to the optimal point 263.89584 reported in the paper [52]. The best fitness of ESWSA, CS and FPA is the same and their standard deviation is also very small (for ESWSA it is \( 3 \times 10^{-6} \)). However, the standard deviation for PSO and BA is significant, whereas CS and FPA have almost zero deviation; this indicates that the output of PSO may not reach close to the optimal point in all cases. One of the most important advantages of ESWSA is that it is the most computationally efficient of all the compared metaheuristics, whereas the time requirement of CS and FPA is the highest (almost three times as much).

Table 12 Comparison for constrained optimization of three-bar truss design.

5.2 Tension/compression spring design problem

The objective of the tension/compression spring design problem [52] is to minimize the weight of a tension/compression spring (f) subject to three nonlinear and one linear inequality constraints, according to Eq. (8). Figure 5 shows a schematic of the tension spring design problem. It has three continuous decision variables, namely the wire diameter \( (d\, {\text{or}}\, x_{1}) \), mean coil diameter \( (D\, {\text{or}}\, x_{2}) \) and number of active coils \( (P\, {\text{or}}\, x_{3}) \).

$$ \begin{aligned} & {\text{Min}} .\quad f\left( x \right) = \left( {x_{3} + 2} \right)x_{2} x_{1}^{2} \\ & {\text{s}} . {\text{t}} .\\ & \quad g_{1} \left( x \right) = 1 - \frac{{x_{2}^{3} x_{3} }}{{71785x_{1}^{4} }} \le 0 \\ & \quad g_{2} \left( x \right) = \frac{{4x_{2}^{2} - x_{1} x_{2} }}{{12566\left( {x_{2} x_{1}^{3} - x_{1}^{4} } \right)}} + \frac{1}{{5180x_{1}^{2} }} - 1 \le 0 \\ & \quad g_{3} \left( x \right) = 1 - \frac{{140.45x_{1} }}{{x_{2}^{2} x_{3} }} \le 0 \\ & \quad g_{4} \left( x \right) = \frac{{x_{1} + x_{2} }}{1.5} - 1 \le 0 \\ & \quad 0.05 \le x_{1} \le 2,0.25 \le x_{2} \le 1.3,2 \le x_{3} \le 15. \\ \end{aligned} $$
(8)
Figure 5. Schematic of the tension/compression spring design problem [52].

Table 13 shows the performance of different metaheuristics for constrained optimization of the tension/compression spring design problem [52]. Using the same optimization settings as before, it is seen that all the metaheuristics can reach close to the optimal point 0.012665 reported in the paper [52]. The standard deviation of ESWSA and CS is very small, and it is almost zero for FPA. However, the standard deviation for PSO and BA is significant, which indicates that the output of PSO and BA may not reach close to the optimal point in all cases. One of the most important advantages of ESWSA is that its computational time is the lowest among all the compared methods.

Table 13 Results for constrained optimization of tension/compression spring design problem [52].

6 Application of ESWSA for inference of GRN

Correct inference of the genetic regulations [53] inside a cell from biological databases such as time-series microarray data [54] (which contain the expression levels of many genes at different time instances) is one of the greatest challenges of the post-genomic era for biologists and computer scientists. The Recurrent Neural Network (RNN) [55] is one of the most popular and simplest approaches for modelling the dynamics of gene expression and for inferring the correct dependencies among genes, i.e., the GRN. In this paper, the proposed ESWSA is tested on a benchmark small-scale artificial genetic network inference problem as an application of ESWSA in the field of computational biology and bioinformatics.

A GRN [56] is represented by a directed graph in which nodes denote genes and directed edges denote regulatory interactions between genes. In the canonical RNN model [57, 58], gene regulation is expressed by the following tightly coupled architecture, where it is assumed that each of the total \( N \) neurons in the output unit, \( e_{i}(t + \Delta t) \), is the expression value of a gene at the next time instant, and the neurons in the input unit, \( e_{i}(t) \), are the present-state expression values of the same genes; thus every gene interacts with every other:

$$ e_{i} \left( {t +\Delta t} \right) = \frac{{\Delta t}}{{\tau_{i} }}f\left( {\mathop \sum \limits_{j = 1}^{N} w_{i,j} e_{j} \left( t \right) + \beta_{i} } \right) + \left( {1 - \frac{{\Delta t}}{{\tau_{i} }}} \right)e_{i} \left( t \right), \quad i = 1, 2, \ldots ,N $$
(9)

where f() is a nonlinear function (usually the sigmoid function, \( f(z) = 1/(1 + e^{-z}) \), is used); \( w_{i,j} \) represents the type and strength of the regulatory interaction from the j-th gene towards the i-th gene. A positive (negative) value of \( w_{i,j} \) represents activation (repression) control of gene j on gene i; \( w_{i,j} = 0 \) implies that gene j has no regulatory control over gene i; \( \beta_{i} \) represents the basal expression level and \( \tau_{i} \) denotes the decay rate parameter of the i-th gene. \( \Delta t \) is the incremental time instance (in this work it is set to 1). Hence, the discrete form of the RNN model for the GRN is described by a set of N(N+2) unknown parameters \( \varOmega = \{ w_{i,j}, \beta_{i}, \tau_{i} \} \), called the RNN model parameters, where i, j = 1, 2, …, N.

Usually, the inference of an RNN-based genetic network from time-series gene expression data is performed by optimizing the values of the RNN parameters with the help of a metaheuristic so that the training or learning error is minimized. All metaheuristics use an objective function or a fitness function to measure the goodness of a solution. The most common estimation criterion is the squared error [57, 58], defined as follows:

$$ f = \mathop \sum \limits_{k = 1}^{M} \mathop \sum \limits_{i = 1}^{N} \mathop \sum \limits_{t = 1}^{T} \left( {e_{cal,k,i,t} - e_{exp,k,i,t} } \right)^{2} $$
(10)

where N is the number of genes, T is the number of time instances of the gene expression data, M is the number of datasets, \( e_{cal,k,i,t} \) is the numerically calculated expression value of the i-th gene at time t in the k-th dataset using the obtained RNN model parameters, and \( e_{exp,k,i,t} \) is the actual expression level of the i-th gene at time t in the k-th dataset; f denotes the total squared error between the calculated and observed gene expression data. Therefore, RNN modelling is a nonlinear function optimization problem whose aim is to discover the optimal RNN parameters by minimizing the fitness function, or squared error, so that the calculated gene expression data best fit the observed gene expression data. Moreover, the genetic network inference problem is decoupled [57], i.e., it is divided into several sub-problems, one per gene, to reduce the large dimensionality of the search space.
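To make the decoupled formulation concrete, the following Python sketch evaluates Eq. (9) for a single gene i and the corresponding squared error of Eq. (10) over one dataset; the packing of the N + 2 parameters into a single vector, and the use of the observed expression at time t to predict time t + Δt, are assumptions of this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_step(e_t, i, w_i, beta_i, tau_i, dt=1.0):
    """Eq. (9): expression of gene i at time t + dt, given all expression levels e_t at time t."""
    return (dt / tau_i) * sigmoid(np.dot(w_i, e_t) + beta_i) + (1.0 - dt / tau_i) * e_t[i]

def decoupled_squared_error(params, i, data, dt=1.0):
    """Eq. (10) restricted to gene i and a single dataset (decoupled sub-problem).

    params : vector of N + 2 values (w_i1, ..., w_iN, beta_i, tau_i)
    data   : observed expression levels, array of shape (T, N)
    """
    n_genes = data.shape[1]
    w_i = params[:n_genes]
    beta_i, tau_i = params[n_genes], params[n_genes + 1]
    error = 0.0
    for t in range(data.shape[0] - 1):
        e_cal = rnn_step(data[t], i, w_i, beta_i, tau_i, dt)
        error += (e_cal - data[t + 1, i]) ** 2        # (e_cal - e_exp)^2 for gene i at time t+dt
    return error

# This per-gene squared error is the fitness that the metaheuristic minimizes for gene i.
```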

6.1 Experimental results for inference of small-scale GRN

To explore the effectiveness of ESWSA optimization for inference of an RNN-based GRN, a benchmark small artificial regulatory network containing four genes with simple regulatory dynamics is chosen first. The parameters of this artificial GRN are presented in table 14 and the network is shown in figure 6; there are 8 regulations in total in the network. The artificial time-series data were generated by solving the set of equations (9) using the RNN parameters of table 14, with the initial values of each time series selected randomly. In real life, such time-series data could be obtained from different biological experiments. The number of time instances per dataset is 50 and the number of datasets is 4; hence, in this work there are in total 200 data points for each gene. For each gene, 6 parameters \( \varOmega = \{ w_{i,j}, \beta_{i}, \tau_{i} \} \) need to be identified using ESWSA, where these parameters are treated as the position of an elephant group. The search space was selected as \( w_{i,j} \in [-30, 30] \), \( \beta_{i} \in [-10, 10] \) and \( \tau_{i} \in [0, 20] \), the same as in earlier work [48, 50, 57, 58]. Here, we use ESWSA to obtain the decoupled RNN model parameters.

Table 14 Actual RNN model parameters for small artificial genetic network.
Figure 6. Original small-scale GRN.

First, the most important performance criterion is measured from the network structure point of view, where the inferred network is compared to the original network structure with respect to edge connectivity. The sensitivity (\( S_{n} \)), specificity (\( S_{p} \)), F-score (F) and accuracy (ACC) of the reconstructed network are defined as follows:

$$ S_{n} = \frac{TP}{TP + FN} $$
(11)
$$ S_{p} = \frac{TN}{TN + FP} $$
(12)
$$ F = \frac{2*TP}{2*TP + FP + FN} $$
(13)
$$ ACC = \frac{TP + TN}{TP + FP + FN + TN} $$
(14)

where TP (True Positive) denotes the number of correctly predicted regulations, TN (True Negative) the number of correctly predicted non-regulations, FP (False Positive) the number of incorrectly predicted regulations and FN (False Negative) the number of falsely predicted non-regulations by the inference algorithm. Moreover, the inferred values of the parameters should not deviate much from the original ones, as their sign and magnitude may affect the connectivity of the reconstructed genetic network. Therefore, another performance measure, the Inferred Parametric Error (IPE), is defined to quantify the deviation of the magnitudes of the obtained parameters from the original ones:

$$ IPE = \mathop \sum \limits_{i,j = 1}^{N} \left| {w_{i,j}^{exp} - w_{i,j}^{cal} } \right| + \mathop \sum \limits_{i = 1}^{N} \left| {\beta_{i}^{exp} - \beta_{i}^{cal} } \right| + \mathop \sum \limits_{i = 1}^{N} \left| {\tau_{i}^{exp} - \tau_{i}^{cal} } \right| $$
(15)

where \( w_{i,j}^{exp} , \beta_{i}^{exp} , \tau_{i}^{exp} \) are the actual values of the RNN parameters and \( w_{i,j}^{cal} , \beta_{i}^{cal} , \tau_{i}^{cal} \) are the calculated values of the same parameters. However, due to the random and stochastic nature of metaheuristics, the outputs may vary slightly over different runs, which may lead to different network topologies. Therefore, in this work, each algorithm is executed 10 times and the final gene regulations are obtained by taking their mean. The other optimization parameters for ESWSA, BA, CS, FPA and PSO remain the same as in section 4.2.
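A small sketch of the structural metrics of Eqs. (11)–(14) follows, assuming that the true and inferred networks are given as weight matrices and that an edge is counted as predicted whenever the corresponding weight magnitude exceeds a small threshold; the threshold value is an assumption of this illustration.

```python
import numpy as np

def network_metrics(w_true, w_inferred, eps=1e-3):
    """Sn, Sp, F-score and ACC (Eqs. (11)-(14)) from an edge-wise comparison of two weight matrices."""
    true_edges = np.abs(np.asarray(w_true)) > eps
    pred_edges = np.abs(np.asarray(w_inferred)) > eps
    tp = int(np.sum(true_edges & pred_edges))    # correctly predicted regulations
    tn = int(np.sum(~true_edges & ~pred_edges))  # correctly predicted non-regulations
    fp = int(np.sum(~true_edges & pred_edges))   # spurious regulations
    fn = int(np.sum(true_edges & ~pred_edges))   # missed regulations
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    f_score = 2 * tp / (2 * tp + fp + fn)
    acc = (tp + tn) / (tp + fp + fn + tn)
    return sn, sp, f_score, acc
```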

Table 15 shows a comparative study of the performance of the proposed ESWSA algorithm and other state-of-the-art techniques for inference of the small-scale artificial GRN. It can be seen that ESWSA, CS and FPA are able to detect all TPs and do not include any FPs; therefore, the values of \( S_{n} \), \( S_{p} \), F and ACC are 1 in all of these cases. However, BA and PSO both include 3 FPs. It can also be seen that, among the remaining algorithms, PSO has the lowest computational time whereas FPA has the highest. Moreover, it is interesting to observe that ESWSA has the least training error, computational time and IPE, while PSO has the largest training error. Therefore, overall, ESWSA is preferable if we want to balance inference accuracy and computational time.

Table 15 Comparative study for inference of small artificial GRN using RNN.

Figure 6 shows the original small-scale (4-gene) artificial GRN, where solid arrow-lines denote activation and dashed lines denote inhibition. Figure 7 depicts the GRN reconstructed from the time-series gene expression data using the ESWSA, CS and FPA techniques. All three of these metaheuristics are able to identify all true regulations (both activations and inhibitions, denoted by green coloured arrows) without including any false positives. However, BA and PSO are each able to identify only 6 true regulations and also include 3 false regulations (shown by red coloured arrows) in the inferred network; moreover, each of them has 2 FNs. Figures 8 and 9 show the small-scale GRN inferred by BA and PSO, respectively.

Figure 7. Inferred small-scale GRN by ESWSA, CS and FPA.

Figure 8. Small-scale GRN inferred by BA.

Figure 9. Small-scale GRN inferred by PSO.

7 Results, analysis and discussion

Both PSO and ESWSA depend on two pieces of information: the current particle or elephant best and the swarm or global best. This allows greater diversity and exploration for both PSO and ESWSA. However, due to the combined effect of the particle best and global best, PSO may not reach as close to the global optimal point as ESWSA can. We can see from tables 4–6 that ESWSA performs the best and PSO holds the second position in terms of accuracy, i.e., the minimum, maximum and median fitness of the numerical benchmark functions. From the accuracy obtained on the different numerical benchmark functions and also on the GRN inference problem, it can be concluded that ESWSA and PSO, which use information about the particle and global best positions in the search direction, performed better than other existing metaheuristics such as CS, FPA and BA. From tables 7 and 8, it is also very interesting to observe that ESWSA is better in terms of success rate and standard deviation; this indicates that ESWSA is able to reach near the global optimum most of the time. For constrained optimization, ESWSA has similar accuracy to CS and FPA (see tables 12 and 13). Table 15 also supports this observation for the inference of the RNN-based GRN using ESWSA, where the training error is the least for ESWSA. However, the performance of PSO is not up to the mark for constrained optimization.

One of the main advantages of PSO is that the momentum effect (through the inertia weight) on particle movement can allow faster convergence (e.g., when a particle is moving in the direction of a gradient) and more variety/diversity in the search trajectories [39]. We have used a similar approach for ESWSA, namely LDIW [15], to update the velocity. This strategy leads to better convergence of ESWSA compared with the other methods. From figure 3, it can be seen that ESWSA attains faster convergence for 7 of the numerical functions.

Now, if we consider ESWSA and PSO, both use a velocity update formula but in different ways. To use the particle memory, the swarm memory or both, we need to perform element-wise multiplication (see Eqs. (1) and (2)). However, to determine the search direction, PSO always uses both memories, whereas ESWSA uses either the particle memory or the swarm memory depending on the probability switch. Obviously, the number of element-wise multiplications in ESWSA is almost half of that required by PSO during the velocity update; hence, the computational efficiency of ESWSA is better than that of PSO. On the other hand, as CS and FPA both use the Lévy distribution for global search, their required computational time is greater. Moreover, as CS needs to perform additional operations, such as selecting a nest using a random walk or Lévy flight and emptying a nest, its computational time is the highest. As BA uses more tuning parameters and equations for updating the velocity, position, frequency, etc. during each iteration, the computational efficiency of BA is better than that of CS and FPA but worse than that of ESWSA and PSO. All these observations are made from table 9 for global optimization of the different benchmark functions. In the case of constrained optimization (tables 12 and 13), we have also seen that the computational time of ESWSA is lower than (though comparable to) that of PSO, whereas the other methods required more time to execute; CS and FPA needed almost three times the computational time. In the case of the GRN inference problem, ESWSA needs the least computational time (table 15), whereas PSO requires more execution time than ESWSA but less than the other metaheuristics.

It is interesting to note that the IPE of the GRN inference problem using PSO is very high; this indicates that PSO has become stuck at local minima or that its values tend to move beyond the boundaries of the variables. Due to the combined effect of the particle and swarm memories, a large step may be taken during the velocity update, and there is always a chance of going beyond the limits. Therefore, PSO may sometimes fail to reach near the global optimum. For this reason, PSO is not able to learn the exact values of the regulatory parameters (i.e., the GRN) and consequently the parametric error is huge for inference of the RNN-based GRN. However, ESWSA achieved the least IPE due to its alternate use of the particle and swarm memories (see the pseudo-code of ESWSA). From figures 8 and 9, it can be observed that BA and PSO are not suitable for learning the optimal parameters from time-series data, i.e., GRN inference, due to the inclusion of FPs in the reconstructed network. CS and FPA have similar performance in terms of accuracy for GRN inference, but their IPE, computational time and training error are comparatively higher than those of ESWSA. For constrained optimization, we have seen similar results, where the standard deviation is very significant for PSO; this also indicates that the output of PSO may fluctuate significantly, as it is not very robust against sticking at local minima.

Table 10 summarizes the overall findings of this work. ESWSA is found to be the most efficient optimizer in terms of best (minimum) fitness, worst (maximum) fitness, median fitness, standard deviation, success rate, convergence speed and computational time. It is also suitable for large-dimensional optimization problems. In the case of constrained optimization, ESWSA gives the best results in terms of computational time and best fitness. As an application of ESWSA, we have tested it on a benchmark small-scale GRN inference problem, where it also performed better in many respects (accuracy and output training error) than the other state-of-the-art techniques. Hence, it can be concluded that ESWSA is the most suitable optimization technique compared with the other methods such as BA, CS, FPA and PSO.

8 Conclusion

In this paper, the author has proposed a novel ESWSA to solve global optimization problems. This algorithm is based on the water search strategy of social elephant swarms, which rely on their different short- and long-distance communication techniques. The velocities and positions of the elephant groups are gradually updated according to the current velocity and either the local best or the global best position, depending on a probabilistic condition. ESWSA is theoretically very simple and relatively easy to implement, as there are very few parameters that need to be adjusted during optimization. ESWSA can deal with several different continuous optimization problems and has the potential to be employed to solve real-world problems with satisfactory accuracy.

Initially, in order to evaluate the efficiency of ESWSA, a set of 10 benchmark functions was used, covering a large variety of global optimization problem types. We also compared ESWSA with state-of-the-art optimization algorithms, namely BA, CS, FPA and PSO, based on different criteria such as best-found fitness, worst fitness, mean fitness, median fitness, standard deviation, success rate, convergence speed and computational efficiency. The results showed that the performance of ESWSA is outstanding compared with the afore-listed algorithms under these conditions for most of the benchmark functions. This conclusion was validated by both the simulation results and the statistical analysis of the simulation data.

Next, we tested ESWSA on two well-known constrained optimization problems, namely the three-bar truss and tension/compression spring design problems. For both cases, the performance of ESWSA is outstanding in terms of computational time, best fitness and standard deviation.

As a part of this work, ESWSA has been applied to a benchmark problem of computational biology, i.e., inference of a GRN based on an RNN. The objective was to learn the RNN parameters accurately from time-series gene expression datasets by minimizing the training error. It is observed that the proposed ESWSA is able to reach closest to the global minimum point and can infer all true regulations of the GRN correctly in less computational time than the other existing metaheuristics.

In the future, research on ESWSA can be carried out in the following areas: scheme, algorithm and real-world application. The local and global water search schemes in the current ESWSA may be further improved using advanced and hybrid optimization strategies. In terms of algorithm research, the development of adaptive or self-adaptive strategies for tuning the ESWSA parameters is another interesting direction, which would reduce the time spent on parameter tuning. Lastly, it will be very fascinating to test the effectiveness of the proposed algorithm on more real-world applications.