1 Introduction

Recently, nature-inspired stochastic optimization techniques have received much attention. Such techniques mostly mimic the social or individual behavior of groups of animals or natural phenomena. They start the optimization process by creating a set of random solutions and improve them as candidate solutions for a particular problem. Due to their superior performance compared to mathematical optimization approaches, stochastic optimization methods have found application in many different fields. It has been logically proven by the well-known No Free Lunch (NFL) theorem that there is no single optimization technique capable of solving all optimization problems [1]. This theorem therefore allows researchers to propose new optimization techniques or improve current algorithms for solving a wider range of problems. Some of the most popular and well-known algorithms are particle swarm optimization (PSO) [2], genetic algorithm (GA) [3], differential evolution (DE) [4], simulated annealing (SA) [5], harmony search (HS) [6], ant colony optimization (ACO) [7], gravitational search algorithm (GSA) [8], biogeography-based optimization (BBO) [9-12], grey wolf optimizer (GWO) [13], and the Krill Herd (KH) algorithm [14-20].

The literature shows that hybridizing stochastic optimization techniques is one way to design superior algorithms that exploit the advantages of multiple algorithms when solving optimization problems [21-28]. The PSOGSA algorithm is a hybrid of PSO and GSA that was proposed in 2010 [29] and has been shown to perform well on optimization problems. In this algorithm, the search is carried out by agents that mimic the behavior of GSA in the exploration phase and of PSO in the exploitation phase. In fact, this algorithm was proposed to alleviate the slow exploitation of the GSA algorithm, which was identified as one of its main drawbacks.

Since the proposal of the GSA algorithm, many studies have aimed at improving its performance. In 2011, Hatamlou et al. [30] hybridized GSA with another heuristic method to improve the solutions obtained by GSA for clustering problems. In 2012, an opposition-based learning GSA [31] and the Immune Gravitation Optimization Algorithm (IGOA) [32] were proposed to improve the convergence speed of GSA. The latter study incorporated the characteristics of antibody diversity and vaccination into the GSA algorithm. Similarly to PSOGSA [29], the social and individual thinking of PSO were integrated into GSA for solving a continuous problem, namely parameter identification of a hydraulic turbine governing system [33].

The GSA algorithm was modified by Rashedi et al. [34] to solve binary problems as well. Since BGSA utilizes the same concepts as GSA, however, it suffers from the same drawbacks of GSA mentioned above. Needless to say, binary versions of the improved GSA algorithms in the literature would also outperform BGSA if they were designed to carry the same concepts from continuous optimization over to binary optimization. This motivated the authors to extend the PSOGSA algorithm, one of the best improved GSA variants, to a binary version and investigate its performance, since the original PSOGSA was designed for solving problems with continuous real search spaces (domains) [29].

In the real world, there are many optimization problems with discrete binary search spaces (for instance, dimensionality reduction or feature selection in different fields). There are different methods in the literature for adapting a continuous algorithm to solve binary problems as well. Some of these methods change the structure of the algorithm, whereas others maintain the algorithm's mechanism in binary search spaces. In the literature, stochastic optimization algorithms such as HS, DE, PSO, and GSA [35] have been adapted to solve binary problems. The binary HS algorithm utilizes harmony memory consideration and pitch adjustment rules to perform binary optimization [36]. In addition, binary DE uses a probability estimation operator in order to solve discrete problems [37]. The mechanisms of binary PSO [38] and binary GSA [34] are quite similar since they both use transfer functions to solve binary problems. What distinguishes BPSO and BGSA from the other binary optimization methods mentioned above is that these two algorithms retain the same position and velocity updating procedures.

In this paper, a binary version of PSOGSA called BPSOGSA is introduced in order to solve binary problems. A transfer function and a new position updating procedure are integrated into BPSOGSA. The paper also proposes adaptive values for BPSOGSA in order to balance exploration and exploitation.

The rest of the paper is organized as follows. Section 2 presents a brief introduction to the hybrid PSOGSA. Section 3 discusses the basic principles of the binary version of PSOGSA. The experimental results are demonstrated in Sect. 4. Finally, Sect. 5 concludes the work and suggests directions for future work.

2 The hybrid PSOGSA

The hybrid PSOGSA algorithm was introduced by Mirjalili and Mohd Hashim [29]. The basic idea of PSOGSA is to combine the social thinking ability (gbest) of PSO with the exploration capability of GSA. The PSOGSA algorithm is mathematically modeled as follows.

Similarly to PSO and GSA, every search agent has a position vector representing its current position in the search space:

$$\overrightarrow {{X_{i} }} = \left( {x_{i}^{1} , \ldots ,x_{i}^{d} } \right), \quad i = 1,2, \ldots ,N$$
(2.1)

where N is the number of search agents, d is the dimension of the problem, and $x_{i}^{d}$ is the position of the ith agent in the dth dimension.

So the whole population is represented as a matrix as follows:

$$\vec{X} = \left[ {\begin{array}{*{20}c} {x_{1}^{1} } & {x_{1}^{2} } & \ldots & {x_{1}^{d} } \\ {x_{2}^{1} } & {x_{2}^{2} } & \ldots & {x_{2}^{d} } \\ \vdots & \vdots & \vdots & \vdots \\ {x_{n}^{1} } & {x_{n}^{2} } & \ldots & {x_{n}^{d} } \\ \end{array} } \right]$$
(2.2)

There is another matrix for storing the corresponding fitness value for each search agent as follows:

$$\vec{O} = \left[ {\begin{array}{*{20}c} {o_{1} } \\ {o_{2} } \\ \vdots \\ {o_{n} } \\ \end{array} } \right]$$
(2.3)

The optimization process begins by filling the position matrix with random values. During optimization, the gravitational force from agent j on agent i at a specific time t is defined as follows:

$$F_{ij}^{d} \left( t \right) = G\left( t \right)\frac{{M_{pi} \left( t \right) \times M_{aj} \left( t \right)}}{{R_{ij} \left( t \right) + \varepsilon }}\left( {x_{j}^{d} \left( t \right) - x_{i}^{d} \left( t \right)} \right)$$
(2.4)

where $M_{aj}$ is the active gravitational mass related to agent j, $M_{pi}$ is the passive gravitational mass related to agent i, G(t) is the gravitational constant at time t, ɛ is a small constant, and $R_{ij}(t)$ is the Euclidean distance between agents i and j.

The gravitational constant (G) and the Euclidean distance between two agents i and j are calculated as follows:

$$G(t) = G_{0} \times \exp ( - \alpha \times iter /maxiter)$$
(2.5)
$$R_{ij} \left( t \right) = \left\| {X_{i} \left( t \right),X_{j} \left( t \right)} \right\|_{2}$$
(2.6)

where α is the descending coefficient, $G_{0}$ indicates the initial gravitational constant, iter is the current iteration, and maxiter is the maximum number of iterations.

In a problem space of dimension d, the total force acting on agent i is calculated by the following equation:

$$F_{i}^{d} \left( t \right) = \mathop \sum \limits_{j = 1,j \ne i}^{N} rand_{j} F_{ij}^{d} \left( t \right)$$
(2.7)

where $rand_{j}$ is a random number drawn from a uniform distribution in the interval [0, 1].

The law of motion is also utilized in this algorithm: the acceleration of a mass is proportional to the resultant force and inversely proportional to its mass, so the accelerations of all agents are calculated as follows:

$$a_{i}^{d} \left( t \right) = \frac{{F_{i}^{d} \left( t \right)}}{{M_{ii} \left( t \right)}}$$
(2.8)

where d is the dimension of the problem, t is a specific time, and $M_{ii}$ is the inertial mass of agent i.

During optimization, the best solution obtained so far is saved in the gbest variable, a mechanism inspired by PSO. The reason for using the best solution is that GSA suffers from slow exploitation, which deteriorates in the final iterations [29, 39, 40]. In GSA, the masses' movements are calculated based on their weights, and the weights are directly calculated from the fitness function. Thus, the masses with better fitness values are considered heavy objects and consequently move slowly. According to the concepts of EAs, particles should wander through the search space in the initial iterations. Then, after finding a good solution, they have to gather around it in order to exploit it. In GSA, the masses become heavier and heavier during the run. In the final iterations, the masses have almost the same weights due to gathering around a promising solution, so they attract each other with approximately the same gravitational forces. Therefore, they are not able to move toward the best solution quickly. This problem can be seen in Fig. 1, which shows a simple one-dimensional problem with fitness function $y = x^2$. The problem is even more critical for high-dimensional problems.

Fig. 1
figure 1

Operation of GSA’s masses in two iterations

In order to see how PSOGSA alleviates this drawback, a conceptual model is illustrated in Fig. 2. This figure shows that the hybrid PSOGSA employs the best solution obtained so far to guide the heavy masses toward the global optimum. This method also speeds up the overall movement of the masses, enhancing the exploitation ability of PSOGSA.

Fig. 2
figure 2

Operation of PSOGSA’s particles in two iterations

Equation (2.9) was proposed for combining PSO and GSA:

$$V_{i} \left( {t + 1} \right) = rand \times V_{i} \left( t \right) + c_{1}^{{\prime }} \times ac_{i} \left( t \right) + c_{2}^{{\prime }} \times (gbest - X_{i} (t))$$
(2.9)

where $V_{i}(t)$ is the velocity of agent i at iteration t, $c_{1}^{'}$ and $c_{2}^{'}$ are accelerating factors, rand is a random number drawn from a uniform distribution between 0 and 1, $ac_{i}(t)$ is the acceleration of agent i at iteration t, and gbest is the best solution obtained so far.

In each iteration, the positions of agents are updated as follows:

$$X_{i} \left( {t + 1} \right) = X_{i} \left( t \right) + V_{i} \left( {t + 1} \right)$$
(2.10)

In PSOGSA, all agents are first randomly initialized using a uniform distribution. Each agent is considered a candidate solution. After initialization, the gravitational force, gravitational constant, and resultant forces among agents are calculated by Eqs. (2.4), (2.5), and (2.7), respectively. After that, the accelerations of the agents are defined by Eq. (2.8). In each iteration, the best solution attained so far is updated. After calculating the accelerations and updating the best solution, the velocities of all agents are calculated by Eq. (2.9). Finally, the positions of the agents are updated by Eq. (2.10). The process of updating velocities and positions stops when an end criterion is met. The steps of the hybrid PSOGSA are represented in Fig. 3.

Fig. 3
figure 3

Pseudocode of PSOGSA
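The update rules above can be sketched in Python as follows. This is a minimal illustration of one iteration of Eqs. (2.4)-(2.10), not the authors' implementation: the mass normalization follows the standard GSA scheme of [8], and the function name `psogsa_step`, the element-wise random draw in the velocity update, and the default parameter values are assumptions made for the sketch.

```python
import numpy as np

def psogsa_step(X, V, fitness, gbest, t, T, G0=1.0, alpha=20.0,
                c1=0.5, c2=1.5, eps=1e-10):
    """One iteration of the continuous PSOGSA update.

    X : (N, d) positions, V : (N, d) velocities,
    fitness : (N,) objective values (minimization),
    gbest : (d,) best solution found so far.
    """
    N, d = X.shape

    # Masses from fitness (standard GSA normalization; best agent -> heaviest)
    worst, best = fitness.max(), fitness.min()
    m = (fitness - worst) / (best - worst + eps)
    M = m / (m.sum() + eps)

    # Gravitational constant, Eq. (2.5)
    G = G0 * np.exp(-alpha * t / T)

    # Total randomly weighted force on each agent, Eqs. (2.4) and (2.7)
    F = np.zeros_like(X)
    for i in range(N):
        for j in range(N):
            if i == j:
                continue
            R = np.linalg.norm(X[i] - X[j])  # Euclidean distance, Eq. (2.6)
            F[i] += np.random.rand() * G * (M[i] * M[j] / (R + eps)) * (X[j] - X[i])

    # Acceleration, Eq. (2.8)
    a = F / (M[:, None] + eps)

    # Velocity and position updates, Eqs. (2.9) and (2.10);
    # the paper uses a scalar rand, an element-wise draw is used here
    V = np.random.rand(N, d) * V + c1 * a + c2 * (gbest - X)
    X = X + V
    return X, V
```

Note that the pairwise force loop is O(N^2) per iteration, which is the dominant cost of GSA-style algorithms.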

It was experimentally shown that PSOGSA solves optimization problems better than PSO and GSA [29]. However, the original PSOGSA is a continuous algorithm, which is not capable of solving binary problems directly. In the following section, the binary version of PSOGSA is proposed and investigated.

3 Binary version of PSOGSA (BPSOGSA)

In the original PSOGSA, agents can move continuously around the search space because their position vectors have a continuous real domain. To confine the search agents to a binary search space, the position updating procedure (Eq. 2.10) must be modified. According to [34, 38], a transfer function is also needed to change the position of an agent with a probability derived from its velocity. Transfer functions map velocity values to probability values for updating positions. A set of standard transfer functions can be found in [41]. According to Mirjalili and Lewis [41], a transfer function should provide a high probability of changing the position for a large absolute value of the velocity and a small probability for a small absolute value. Moreover, the range of a transfer function should be bounded in the interval [0, 1] and increase with increasing velocity. The transfer function utilized in [34] is presented as Eq. (3.1) and depicted in Fig. 4.

$$S\left( {v_{i,j}^{k} \left( t \right)} \right) = \left| {\tanh \left( {v_{i,j}^{k} \left( t \right)} \right)} \right|$$
(3.1)
Fig. 4
figure 4

Tangent hyperbolic transfer function

We use this equation to map the velocities of agents in BPSOGSA to probabilities of flipping the elements of their position vectors. After calculating the probabilities, the agents update their positions based on the rule presented in Eq. (3.2).

$${\text{If}}\; rand\; < \;S\left( {v_{i,j}^{k} \left( {t + 1} \right)} \right) \;{\text{then}}\; x_{i,j}^{k} \left( {t + 1} \right) = {\text{complement}}\left( {x_{i,j}^{k} \left( t \right)} \right) \;{\text{else}}\; x_{i,j}^{k} \left( {t + 1} \right) = x_{i,j}^{k} \left( t \right)$$
(3.2)
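Equations (3.1) and (3.2) translate directly into code. The sketch below is an illustration under our own naming (`v_shaped_transfer`, `binary_update` are not from the paper): it maps velocities to flip probabilities and applies the complement rule element-wise.

```python
import numpy as np

def v_shaped_transfer(v):
    """V-shaped transfer function of Eq. (3.1): |tanh(v)|, bounded in [0, 1)."""
    return np.abs(np.tanh(v))

def binary_update(X, V):
    """Position update of Eq. (3.2).

    X : 0/1 array of positions, V : array of velocities (same shape).
    Each bit is complemented with probability S(v); otherwise it is kept.
    """
    S = v_shaped_transfer(V)
    flip = np.random.rand(*np.shape(X)) < S
    return np.where(flip, 1 - X, X)
```

Because the function is V-shaped, a near-zero velocity leaves a bit almost certainly unchanged, while a large velocity of either sign makes a flip almost certain.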

The pseudocode describing the general steps of BPSOGSA is presented in Fig. 5.

Fig. 5
figure 5

Pseudocode of BPSOGSA

It should be noted that we utilize adaptive values for $c_{1}^{'}$ and $c_{2}^{'}$ in this study. According to Mirjalili and Lewis [42], adding gbest to the velocity vector weakens the exploration phase, since it establishes a permanent component of the velocity update. In order to resolve this issue, we utilize adaptive values for $c_{1}^{'}$ and $c_{2}^{'}$ as follows [42]:

$$c_{1}^{{\prime }} = - 2\frac{{t^{3} }}{{T^{3} }} + 2$$
(3.3)
$$c_{2}^{{\prime }} = 2\frac{{t^{3} }}{{T^{3} }} + 2$$
(3.4)

where t indicates the current iteration and T is the maximum number of iterations.
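As a small illustration, Eqs. (3.3) and (3.4) can be computed as follows (the helper name is ours, not from the paper):

```python
def adaptive_coefficients(t, T):
    """Adaptive accelerating factors of Eqs. (3.3) and (3.4).

    c1' decreases cubically from 2 to 0 over the run;
    c2' increases cubically from 2 to 4.
    """
    c1 = -2.0 * t**3 / T**3 + 2.0
    c2 = 2.0 * t**3 / T**3 + 2.0
    return c1, c2
```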

The behavior of these two variables is illustrated in Fig. 6. We adaptively decrease $c_{1}^{'}$ and increase $c_{2}^{'}$ so that the masses tend to accelerate toward the best solution as the algorithm reaches the exploitation phase. Since there is no clear border between the exploration and exploitation phases in evolutionary algorithms, such an adaptive method is one of the best options for allowing an algorithm to transition gradually between these two phases.

Fig. 6
figure 6

Adaptive behavior of coefficients

Theoretically, the BPSOGSA algorithm has the potential to outperform both BPSO and BGSA since it uses the same concepts as PSOGSA. In the following section, a comparative study is provided in order to evaluate the performance of BPSOGSA in practice. Note that the source code of this algorithm can be found at http://www.alimirjalili.com/Projects.html.

4 Experimental results and discussion

In this section, 22 benchmark functions divided into three groups (unimodal, multimodal, and composite) are chosen to evaluate the performance of BPSOGSA [43-47]. Tables 1, 2 and 3 present these benchmark functions, the range of the search space, and the optimum values.

Table 1 Unimodal benchmark functions
Table 2 Multimodal benchmark functions
Table 3 Composite benchmark functions

A comparative study with BGSA, BPSO, and GA is conducted to verify the performance of BPSOGSA. The dimension of all benchmark functions is set to 5 (m = 5). Fifteen bits are used to represent each variable in binary format (one bit is reserved for the sign). Therefore, the dimension of the binary search space, and hence of the search agents, is 75. Note that the initial values of the basic parameters of the algorithms are provided in Table 4.
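The paper does not spell out its exact binary-to-real mapping, so the following sketch shows one plausible decoding under the stated bit budget (15 bits per variable, the first reserved for the sign): the remaining 14 bits form an unsigned integer scaled linearly onto the search range. The helper name `decode` and the linear magnitude mapping are assumptions.

```python
import numpy as np

def decode(bits, hi, n_bits=15):
    """Decode a flat 0/1 vector into real variables in [-hi, hi].

    Each variable occupies n_bits: bit 0 is the sign (1 means negative),
    and the remaining n_bits - 1 bits encode the magnitude, mapped
    linearly onto [0, hi]. This is an illustrative scheme only.
    """
    bits = np.asarray(bits).reshape(-1, n_bits)
    sign = np.where(bits[:, 0] == 1, -1.0, 1.0)
    weights = 2 ** np.arange(n_bits - 2, -1, -1)   # MSB first
    ints = bits[:, 1:] @ weights
    mag = ints / (2 ** (n_bits - 1) - 1) * hi
    return sign * mag
```

With 5 variables at 15 bits each, a search agent is a 75-element bit vector, matching the dimension stated above.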

Table 4 Initial parameters for BPSOGSA, BGSA, BPSO, and GA

The experimental results are presented in Tables 5, 6 and 7. The results are collected over 10 independent runs. The average (ave) and standard deviation (std) of the best solutions obtained in the final iteration are reported, with the best results in bold type.

Table 5 Minimization results of unimodal benchmark functions over 10 independent runs
Table 6 Minimization results of multimodal benchmark functions over 10 independent runs
Table 7 Minimization results of composite benchmark functions over 10 independent runs

4.1 Unimodal benchmark functions

Unimodal test functions are useful for examining the exploitation of the algorithms. As per the results in Table 5, the BPSOGSA algorithm shows the best results on three of the unimodal benchmark functions in terms of the mean and standard deviation of the results. According to the statistical results in Fig. 7, BPSOGSA and BPSO each provide the best results on 43 % of the unimodal benchmark functions, followed by GA with 14 %. BGSA did not show good performance on this set of functions due to the above-mentioned drawback (slow exploitation). Therefore, this is evidence that the proposed algorithm performs well in finding the global solution of unimodal benchmark functions.

Fig. 7
figure 7

Statistical results for unimodal benchmark functions

According to the convergence curves of the algorithms on the unimodal test functions in Fig. 8, the BPSOGSA and BPSO algorithms have the fastest convergence rates. These findings indicate that the BPSOGSA algorithm offers the best exploitation ability and convergence rate.

Fig. 8
figure 8

Convergence curves for unimodal benchmark functions

4.2 Multimodal benchmark functions

The results for the multimodal benchmark functions are provided in Table 6 and Figs. 9 and 10. Multimodal test functions are helpful for benchmarking the local optima avoidance of the algorithms. According to the results of Table 6 and Fig. 9, BPSOGSA shows the best results on four of the multimodal benchmark functions (45 % of the functions). However, the BPSO and GA algorithms outperform BPSOGSA on two (22 %) and three (33 %) multimodal test functions, respectively. Once again, the BGSA algorithm could not provide competitive results.

Fig. 9
figure 9

Statistical results for multimodal benchmark functions

Fig. 10
figure 10

Convergence curves for multimodal benchmark functions

The convergence curves in Fig. 10 show that BPSOGSA has the fastest convergence behavior on four out of nine functions. These findings indicate that the BPSOGSA algorithm is able to avoid local optima, and that this avoidance does not have a negative impact on the convergence speed. The results also evidence the high exploration of BPSOGSA.

4.3 Composite benchmark functions

Composite functions are the most challenging test functions and are suitable for benchmarking exploration and exploitation combined. The results in Table 7 and Figs. 11 and 12 show that BPSOGSA is evidently better than the other algorithms on 84 % of the composite functions. For the function $f_{22}$, BGSA has the best results, followed by BPSOGSA.

Fig. 11
figure 11

Statistical results for composite benchmark functions

Fig. 12
figure 12

Convergence curves for composite benchmark functions

The convergence of the algorithms on the composite functions is illustrated in Fig. 12. This figure shows that BPSOGSA has the fastest convergence rate. These findings indicate that BPSOGSA properly and efficiently balances exploration and exploitation, owing to the employed adaptive values for \(c_{1}^{'}\) and \(c_{2}^{'}\).

Figure 13 shows the overall statistical results on all the benchmark functions, whereby the BPSOGSA algorithm has the best result on 55 % of the benchmark functions.

Fig. 13
figure 13

Overall statistical results

To summarize, the results show that BPSOGSA performs better than BGSA, BPSO, and GA. Therefore, it can be said that the binary version of the hybrid PSOGSA also retains the strengths of both PSO and GSA in solving binary problems. According to this comprehensive comparative study and discussion, we state that this algorithm has merit among binary algorithms.

The high performance of BPSOGSA on unimodal test functions is due to the fact that the algorithm has higher exploitation than GSA: the social component of PSO allows BPSOGSA to exploit accurately around the best mass obtained so far. The high exploration of BPSOGSA originates from an intrinsic characteristic of the GSA algorithm, wherein all search agents affect each other at every iteration (in contrast to the search agents of PSO, which only see pbest and gbest). This high exploration ability helps BPSOGSA outperform the other algorithms on multimodal test functions. The significantly better results of BPSOGSA on composite test functions are due to the adaptive values for \(c_{1}^{'}\) and \(c_{2}^{'}\). These two variables require BPSOGSA to balance exploration and exploitation: it emphasizes exploration in the initial iterations, while exploitation is promoted as the iteration count increases. This causes local minima avoidance and accelerates convergence toward the global optimum over the course of iterations. Finally, the employed v-shaped transfer function obliges the search agents of BPSOGSA to flip their positions when they are moving into unpromising areas of the search space, making it more likely to move toward promising areas than with s-shaped transfer functions.

5 Conclusion

In this paper, a binary version of the hybrid PSOGSA called BPSOGSA was proposed by utilizing the same search behavior as the continuous version. In order to evaluate the performance of BPSOGSA, 22 benchmark functions were employed, and the results were compared with BGSA, BPSO, and GA. The results showed that BPSOGSA provides competitive results and has merit among binary heuristic optimization algorithms in binary search spaces. According to the findings, the BPSOGSA algorithm successfully inherits the advantages of PSOGSA. On the one hand, BPSOGSA shows superior exploration since all search agents participate in updating the position of every search agent. On the other hand, the exploitation of BPSOGSA is very accurate due to the integrated social component of PSO, which accelerates convergence. The findings also showed that the adaptive values for \(c_{1}^{'}\) and \(c_{2}^{'}\) balance exploration and exploitation, so BPSOGSA is able to avoid local optima and converge to promising regions of the search space. The employed transfer function also proved advantageous since it does not force search agents to take a value of 0 or 1.

For future studies, it is recommended to apply BPSOGSA to real-world optimization problems in order to further evaluate its efficiency. Investigating the effect of different transfer functions on BPSOGSA would be interesting as well.