1 Introduction

Optimization methods are divided into two main groups. The first group consists of the traditional mathematical programming techniques (Vanderplaats 1999; Rao 2009; Haftka et al. 1990). Most of these techniques choose search directions using the first- and higher-order derivatives of the functions under consideration. The search directions and the step lengths are modified to reach the desired optimal solution. For multimodal functions, the final result may not be the global optimum. These methods are referred to as gradient-based approaches (GBA).

The second group comprises modern optimization methods that rely on stochastic search directions derived from different behaviors in nature (Yang 2010; Parmee 2001; Gandomi et al. 2013; Du and Swamy 2016; Yang et al. 2016; Siddique and Adeli 2017; Zhou et al. 2013; Akhtar et al. 2020). In these methods, the design variables are sampled randomly in the design space and, through intelligent computational rules, the candidate solutions gradually move toward the optimal design. Because the whole design space is explored, a near-global optimal design can be obtained (Osińskia et al. 2013). Several intelligent methods are based on the social life of creatures, such as particle swarm optimization (PSO) (Kennedy and Eberhart 1995), artificial immune systems (AIS) (Dasgupta 2006; Farmer et al. 1986), the ant colony search algorithm (Dorigo and Stützle 2004) and the harmony search algorithm (Geem 2010; Geem et al. 2001). Another category is inspired by principles of disciplines such as physics and chemistry, for example simulated annealing (Kirkpatrick et al. 1983; Aarts and Korst 1989), quantum computing (Benioff 1980), central force optimization (Formato 2007, 2008), the gravitational search algorithm (Rashedi et al. 2009, 2018) and chemical reaction optimization (Guggenheim and Modern 1967; Lam and Li 2010).

Among the heuristic optimization methods, PSO is attractive and has been studied by many researchers (Kashani et al. 2020). Different variants of PSO have been successfully presented in the literature (Kumar et al. 2020, 2021; Das et al. 2021). Another interesting approach is the gravitational search algorithm (GSA), which was presented by researchers at the Shahid Bahonar University of Kerman and has since received various modifications (Ebrahimi Mood et al. 2015).

The weakness of standard random metaheuristic methods is their slow convergence: these approaches require a large number of particles in the design space and many iterations. If they are merged with deterministic methods such as gradient-based approaches (GBA), the optimal results can be achieved more efficiently with rapid convergence.

In this paper, the PSO and GSA methods are combined with the GBA, and a new method called GPSG (GSA + PSO + GBA) is introduced. To evaluate the performance of the approach, 25 mathematical optimization functions of CEC2005 (Suganthan et al. 2005) and 29 complicated multimodal functions of CEC2017 (Awad et al. 2017) are examined. In addition, three truss design problems from the literature are chosen to verify the convergence of the approach. The numerical results of PSO, GSA and GPSG are compared, and it is observed that the performance of GPSG is much better than that of the other approaches. In the next sections, brief descriptions of PSO and GSA are first outlined and then the details of GPSG are presented. The numerical results of the three approaches are compared with some of the methods in the literature.

2 Optimization algorithms

This section presents the basic ideas of PSO, GSA and GBA, and the proposed GPSG. The pseudo-code of GPSG is outlined to clarify the main steps of the approach.

2.1 Particle swarm optimization (PSO)

Kennedy and Eberhart (1995) developed a metaheuristic optimization method based on a random search. In this algorithm, a random initial population with a certain number of particles is generated in the design space as indicated in (1).

$${{\varvec{X}}}_{i}=\left({x}_{i}^{1}{,\dots ,x}_{i}^{d},\dots ,{x}_{i}^{n}\right)\text{ for }i=1:{NP},$$
(1)

where \({{\varvec{X}}}_{i}\) is the vector of the design variables of particle \(i\), \(n\) is the number of design variables, \(d\) denotes a typical dimension, and \(NP\) is the total number of particles.

The particles share their information with each other, so each particle can adjust its position according to its previous experience, making the best use of its own knowledge and that of its neighbors.

As a result, as the optimization progresses, the movement of each particle toward the best possible response is driven both by its own experience and by the collective intelligence of the swarm.

In the PSO method, the particle velocity can be calculated according to (2).

$${{\varvec{V}}}_{i}^{t+1}=\omega {{\varvec{V}}}_{i}^{t}+{c}_{1}{r}_{1}\left({{\varvec{P}}}_{i}^{t}-{{\varvec{X}}}_{i}^{t}\right)+{c}_{2}{r}_{2}\left({{\varvec{P}}}_{\text{g}}^{t}-{{\varvec{X}}}_{i}^{t}\right),$$
(2)

where \({{\varvec{V}}}_{i}^{t}\) is the velocity and \({{\varvec{X}}}_{i}^{t}\) is the position of particle i at time \(t\). The vector \({{\varvec{P}}}_{i}^{t}\) (Pbest) is the best position found by particle i and \({{\varvec{P}}}_{\text{g}}^{t}\) (Gbest) is the best position found by the whole swarm; the index g stands for global. The parameter \(\omega \) is the inertia weight, which applies the influence of the velocity of the preceding iteration and decreases linearly from 0.9 to 0.4 over time. The scalars \({c}_{1}\) and \({c}_{2}\) are coefficients that determine the significance of Pbest and Gbest, respectively. In this study, an experimental value of 0.5 is considered for both coefficients. The coefficients \({r}_{1}\) and \({r}_{2}\) are random scalars with a uniform distribution in the interval of zero and one, which maintain the randomness of the algorithm. Note that initially, all particle velocities are set to zero. The first term of (2) retains a portion of the velocity of the previous iteration. The second and third terms represent the influence of the particle itself and of the whole swarm, respectively, from the beginning of the search up to time \(t\). PSO keeps the memory of the best results of all previous iterations.

The particles are updated according to (3).

$${{\varvec{X}}}_{i}^{t+1}={{\varvec{X}}}_{i}^{t}+{{\varvec{V}}}_{i}^{t+1}.$$
(3)

By repeating the process, the optimal solution to the problem can be obtained.
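As an illustration, the PSO update of (1)–(3) can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the sphere test function, the bounds of ±5 and the random seed are assumptions for the example, while \(c_1=c_2=0.5\) and the linear decay of \(\omega\) from 0.9 to 0.4 follow the values stated above.

```python
import numpy as np

def pso_step(X, V, pbest, gbest, omega, c1=0.5, c2=0.5, rng=None):
    """One PSO iteration: velocity update, Eq. (2), and position update, Eq. (3)."""
    rng = rng or np.random.default_rng()
    r1, r2 = rng.random(), rng.random()             # uniform scalars in [0, 1)
    V = omega * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)
    return X + V, V

# Usage: minimize the sphere function with NP = 5 particles and n = 2 variables
rng = np.random.default_rng(0)
f = lambda x: np.sum(x * x, axis=-1)
X = rng.uniform(-5.0, 5.0, (5, 2))                  # random initial population, Eq. (1)
V = np.zeros_like(X)                                # initial velocities set to zero
pbest, pbest_f = X.copy(), f(X)
init_f = pbest_f.min()
T = 200
for t in range(T):
    omega = 0.9 - 0.5 * t / (T - 1)                 # inertia decays linearly from 0.9 to 0.4
    X, V = pso_step(X, V, pbest, pbest[np.argmin(pbest_f)], omega, rng=rng)
    better = f(X) < pbest_f
    pbest[better], pbest_f[better] = X[better], f(X)[better]
```

After the loop, `pbest_f.min()` holds the best objective value found, which improves monotonically because Pbest is only replaced by better positions.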

2.2 Gravitational search algorithm (GSA)

The gravitational search algorithm is inspired by the law of gravity in nature, using Newton's laws. To minimize the objective function, a set of particles is randomly assigned in the design space and updated over time. The position of each particle is defined as in (1). In summary, the acceleration vector of the ith particle in iteration \(t\) can be specified as (Rashedi et al. 2009, 2018);

$${{\varvec{a}}}_{i}^{t}=G\left(t\right)\sum_{j=1,\,j\ne i}^{NP} {\text{rand}}_{j}\,\frac{{M}_{j}\left(t\right)}{{R}_{ij}\left(t\right)+\upvarepsilon }\left({{\varvec{X}}}_{j}^{t}-{{\varvec{X}}}_{i}^{t}\right),$$
(4)

where \(G\) is the gravitational constant. \({M}_{j}\) represents the mass of the jth particle and can be evaluated from the objective functions of the particles (Rashedi et al. 2009). \({R}_{ij}\) is the distance between particles i and j. \({\text{rand}}_{j}\) is a random scalar with uniform distribution in the interval of zero and one, and \(\upvarepsilon \) is a small number to prevent numerical errors.

The gravitational constant \(G(t)\) is defined as (Rashedi et al. 2009);

$$G\left(t\right)={G}_{0}{e}^{-\alpha \frac{t}{T}},$$
(5)

where the parameters \({G}_{0}\) and \(\alpha \) are two constant coefficients. \(t\) represents the current iteration and \(T\) is the maximum number of iterations.

The updated mass of the particles is evaluated by (6) and normalized according to the relation (7).

$${m}_{i}\left(t\right)=\frac{{F}_{i}(t)-\text{worst}(t)}{\text{best}(t)-\text{worst}(t)},$$
(6)
$${M}_{i}\left(t\right)=\frac{{m}_{i}\left(t\right)}{\sum_{j=1}^{NP}{ m}_{j}\left(t\right)},$$
(7)

in which

$$\text{best}\left(t\right)=\text{min}\,{F}_{i}\left(t\right)\text{ for }i=1:{NP},$$
(8)
$$\text{worst}\left(t\right)=\text{max}\,{F}_{i}\left(t\right)\text{ for }i=1:{NP},$$
(9)

where \({F}_{i}(t)\) is the fitness value (objective function) of the ith particle at time \(t\).

The updated gravitational velocity vector is calculated from (10), assuming a one-second interval between the iterations.

$${{\varvec{V}}}_{i}^{t+1}={\text{rand}}_{i} .{{\varvec{V}}}_{i}^{t}+{{\varvec{a}}}_{i}^{t}.$$
(10)

Then, the position of the particles is updated according to (3).
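A compact sketch of one GSA iteration, covering Eqs. (4)–(10) and the position update (3), is given below. The sphere test function, the population size, the random seed and the constant \(G_0=1\) of this toy example are assumptions for illustration only (\(\alpha=4\) matches the value used later for the structural problems, while the original algorithm typically uses a larger \(G_0\)).

```python
import numpy as np

def gsa_step(X, V, F, t, T, G0=1.0, alpha=4.0, eps=1e-12, rng=None):
    """One GSA iteration: masses (6)-(9), acceleration (4), velocity (10), position (3)."""
    rng = rng or np.random.default_rng()
    G = G0 * np.exp(-alpha * t / T)                  # Eq. (5): decaying gravitational constant
    best, worst = F.min(), F.max()                   # Eqs. (8) and (9), for minimization
    m = (F - worst) / (best - worst + eps)           # Eq. (6): raw masses
    M = m / (m.sum() + eps)                          # Eq. (7): normalized masses
    a = np.zeros_like(X)
    for i in range(len(X)):                          # Eq. (4): gravitational acceleration
        for j in range(len(X)):
            if i == j:
                continue
            R = np.linalg.norm(X[j] - X[i])
            a[i] += rng.random() * G * M[j] * (X[j] - X[i]) / (R + eps)
    V = rng.random((len(X), 1)) * V + a              # Eq. (10): randomized velocity update
    return X + V, V                                  # Eq. (3): position update

# Usage: minimize the sphere function with 10 particles in 2 dimensions
rng = np.random.default_rng(1)
f = lambda x: np.sum(x * x, axis=-1)
X = rng.uniform(-5.0, 5.0, (10, 2))
V = np.zeros_like(X)
init_f = f(X).min()
gbest_f = init_f
for t in range(100):
    X, V = gsa_step(X, V, f(X), t, T=100, rng=rng)
    gbest_f = min(gbest_f, f(X).min())
```

Heavier (better) particles attract the others, so the swarm drifts toward the best positions found so far while `G` decays with the iterations.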

2.3 The first order gradient-based algorithm (GBA)

Using the Taylor series and retaining the first-order terms, the position of the updated particles is obtained as (11). This search direction defines the gradient-based method (GBM) (Vanderplaats 1999; Rao 2009; Haftka et al. 1990).

$${{\varvec{X}}}_{i}^{t+1}={{\varvec{X}}}_{i}^{t}-\Upsilon \nabla F\left({{\varvec{X}}}_{i}^{t}\right),$$
(11)

in which \(\Upsilon\) is the step length and the operator \(\nabla \) denotes the gradient. By employing (11), the search moves along the negative gradient of the function under consideration, which is referred to as the steepest descent direction and locally yields the largest reduction in the function. Special care should be taken in choosing the step length \(\Upsilon\). The gradient directions are randomized for compatibility with the metaheuristic methods, as explained in the next section.
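The steepest-descent update of (11) can be sketched as follows. The quadratic test function, its analytic gradient and the fixed step length \(\Upsilon=0.1\) are assumptions for illustration; in practice the step length must be chosen with care.

```python
import numpy as np

def gba_step(x, grad_f, step=0.1):
    """Steepest-descent update, Eq. (11): step along the negative gradient."""
    return x - step * grad_f(x)

# Usage: minimize f(x) = x1^2 + 4*x2^2 starting from a fixed point
grad_f = lambda x: np.array([2.0 * x[0], 8.0 * x[1]])   # analytic gradient
x = np.array([3.0, 2.0])
for _ in range(100):
    x = gba_step(x, grad_f)
# x converges to the minimizer (0, 0)
```

For this convex quadratic the iteration contracts each coordinate geometrically; for multimodal functions the same update may stop at a local minimum, which motivates the hybridization below.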

2.4 Hybrid method (GPSG)

Metaheuristic methods such as PSO and GSA have a good capability of exploring the whole design space, and the combination of PSO and GSA improves the quality of exploration (Tsai et al. 2013). However, all the heuristic approaches, and their combinations, have low exploitation (local search) power. On the other hand, the GBA is powerful in exploitation but weak in exploration (global search) and may be trapped in local optima. Thus, merging the three approaches is a good idea for obtaining a fast and powerful method. The heuristic methods are random while the gradient methods are deterministic, so in the combined approach the idea of randomization must be employed. In addition, a fraction of the velocity of each method should be included in the combination.

The velocity of the particles for PSO, GSA and GBA is given in (12), (13) and (14), respectively, with some modifications.

$${{\varvec{V}}\_\text{PSO}}_{i}^{t+1}={c}_{1}{r}_{1}\left({{\varvec{P}}}_{i}^{t}-{{\varvec{X}}}_{i}^{t}\right)+{c}_{2}{r}_{2}\left({{\varvec{P}}}_{\text{g}}^{t}-{{\varvec{X}}}_{i}^{t}\right),$$
(12)
$${{\varvec{V}}\_\text{GSA}}_{i}^{t+1}={{\text{rand}}_{i}.{\varvec{V}}\_\text{GPSG}}_{i}^{t}+{{\varvec{a}}}_{i}^{t},$$
(13)
$${{\varvec{V}}\_\text{GBA}}_{i}^{t+1}=\frac{-\nabla F({{\varvec{X}}}_{i}^{t})}{|\nabla F({{\varvec{X}}}_{i}^{t})|}.sv.$$
(14)

The parameter \(sv\) is given by

$$sv = \left|{{\varvec{V}}\_\text{PSO}}_{i}^{t+1} + {{\varvec{V}}\_\text{GSA}}_{i}^{t+1}\right|.$$
(15)

The velocity of the GBA is normalized to obtain a unit vector along the negative gradient direction. It is then multiplied by \(sv\) so that the length of the GBA search direction corresponds to the resultant of the PSO and GSA velocities. In other words, \(sv\) acts as a step length for the unit vector along the negative gradient.

Comparing (12) with the original PSO velocity (2) shows that the first (inertia) term of (2) is omitted, because its effect is accounted for in the GSA velocity (13).

Finally, the velocity of the combined approach (GPSG) is obtained by (16).

$${{\varvec{V}}\_\text{GPSG}}_{i}^{t+1}={C}_{t} \left({{{\varvec{V}}}\_{\text{PSO}}}_{i}^{t+1}+{{{\varvec{V}}}\_{\text{GSA}}}_{i}^{t+1}+{C}_{g}.{r}_{3}. {{{\varvec{V}}}\_{\text{GBA}}}_{i}^{t+1}\right).$$
(16)

In this formulation, \({r}_{3}\) is a random number between zero and one. \({C}_{g}=2\) is a statistically appropriate coefficient because the expected value of \({C}_{g}{r}_{3}\) is one; therefore, the magnitude of the GBA term (\(sv\)) does not change much on average.

The value of \({C}_{t}\) is taken as 0.5 to prevent a statistical increase in the magnitude of the resultant of the three vectors corresponding to the algorithms.

Finally, the position of each particle is updated by the GPSG velocity presented in (17).

$${{\varvec{X}}}_{i}^{t+1}={{\varvec{X}}}_{i}^{t}+{{\varvec{V}}\_\text{GPSG}}_{i}^{t+1}.$$
(17)

The main idea of the proposed method is that, for minimization problems, the reduction of the objective function along the negative gradient direction (GBA) is greater than along any other direction. However, the drawback of the gradient direction is that the search may fall into a local solution. Thus, the direction of the gradient vector is randomized and its length is normalized for compatibility with the resultant directions generated by PSO and GSA. In each iteration, the resultant of the PSO, GSA and GBA velocities (GPSG) is employed for the optimization process. The fast convergence of the numerical results with a small initial population indicates the high exploration and exploitation ability of the proposed GPSG method.

The steps of the GPSG hybrid method are presented in the following pseudo-code (comments are indicated with %);

[Pseudo-code of the GPSG algorithm]
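The main steps above can also be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the sphere test function, its analytic gradient, the random seed and the simplified GSA acceleration with an assumed \(G_0=1\) are illustrative choices, while \(c_1=c_2=0.5\), \(C_g=2\) and \(C_t=0.5\) follow the values stated above.

```python
import numpy as np

def gsa_acceleration(X, F, t, T, G0=1.0, alpha=4.0, eps=1e-12, rng=None):
    """GSA acceleration, Eqs. (4)-(9) (G0 = 1 is an assumption for this toy example)."""
    rng = rng or np.random.default_rng()
    G = G0 * np.exp(-alpha * t / T)
    m = (F - F.max()) / (F.min() - F.max() + eps)
    M = m / (m.sum() + eps)
    a = np.zeros_like(X)
    for i in range(len(X)):
        for j in range(len(X)):
            if i != j:
                R = np.linalg.norm(X[j] - X[i])
                a[i] += rng.random() * G * M[j] * (X[j] - X[i]) / (R + eps)
    return a

def gpsg_velocity(X, V, pbest, gbest, acc, grad, c1=0.5, c2=0.5, Cg=2.0, Ct=0.5, rng=None):
    """Combine the PSO (12), GSA (13) and GBA (14) velocities as in Eq. (16)."""
    rng = rng or np.random.default_rng()
    r1, r2 = rng.random(), rng.random()
    r3 = rng.random((len(X), 1))
    v_pso = c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)        # Eq. (12)
    v_gsa = rng.random((len(X), 1)) * V + acc                    # Eq. (13)
    sv = np.linalg.norm(v_pso + v_gsa, axis=1, keepdims=True)    # Eq. (15)
    g_len = np.linalg.norm(grad, axis=1, keepdims=True) + 1e-12
    v_gba = -grad / g_len * sv                                   # Eq. (14): unit vector times sv
    return Ct * (v_pso + v_gsa + Cg * r3 * v_gba)                # Eq. (16)

# Usage: minimize the sphere function with 5 particles (analytic gradient assumed known)
rng = np.random.default_rng(0)
f = lambda x: np.sum(x * x, axis=-1)
grad_f = lambda x: 2.0 * x
X = rng.uniform(-5.0, 5.0, (5, 2))
V = np.zeros_like(X)
pbest, pbest_f = X.copy(), f(X)
init_f = pbest_f.min()
T = 200
for t in range(T):
    acc = gsa_acceleration(X, f(X), t, T, rng=rng)
    V = gpsg_velocity(X, V, pbest, pbest[np.argmin(pbest_f)], acc, grad_f(X), rng=rng)
    X = X + V                                                    # Eq. (17)
    better = f(X) < pbest_f
    pbest[better], pbest_f[better] = X[better], f(X)[better]
```

In structural applications the gradient would come from a sensitivity analysis rather than an analytic expression, but the combination of the three velocities is unchanged.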

2.5 Optimization problem formulation

The general form of a constrained optimization problem can be specified as follows:

$$\text{Minimize}: f\left({\varvec{X}}\right),$$
(18)
$$\text{Subject to}: {g}_{k} \left({\varvec{X}}\right)\le 0 \,k=1:K,$$
(19)

where \(f\left({\varvec{X}}\right)\) and \({g}_{k}\left({\varvec{X}}\right)\) represent the objective function and the constraints, respectively, and K is the number of constraints.

To convert the constrained structural problems into unconstrained ones, a penalty function is used (Salajegheh and Salajegheh 2019);

$$F\left({\varvec{X}}\right)=c f\left({\varvec{X}}\right),$$
(20)
$$c=1+\sum_{i=1}^{K}\psi \left(i\right) {q(i)}^{\lambda (i)} \; i=1:K,$$
(21)
$$\psi \left(i\right)=\mu {e}^{1+q(i)},$$
(22)
$$q\left(i\right)=\text{max}\left({g}_{i},0\right),$$
(23)
$$\lambda \left(i\right)=\left\{\begin{array}{c}1\quad if\; q\left(i\right)\le 1\\ 2\quad if\; q\left(i\right)>1\end{array}\right.,$$
(24)

where \(c\) is a scalar penalty coefficient that increases the objective function if the constraints are violated. \(F\left({\varvec{X}}\right)\) is called the penalized objective function. The parameter \(\mu \) is a constant coefficient and \(\lambda (i)\) is determined by (24). In this study, \(\mu \) is taken as 0.5.
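The penalty scheme of (20)–(24) can be sketched as follows; the function below simply evaluates \(c\) for given constraint values, with \(\mu=0.5\) as stated above, and assumes a positive objective value as in a weight-minimization problem.

```python
import numpy as np

def penalized(f_val, g_vals, mu=0.5):
    """Penalized objective, Eqs. (20)-(24): scale f by the penalty coefficient c."""
    q = np.maximum(np.asarray(g_vals, dtype=float), 0.0)   # Eq. (23): constraint violations
    lam = np.where(q <= 1.0, 1.0, 2.0)                     # Eq. (24): violation exponent
    psi = mu * np.exp(1.0 + q)                             # Eq. (22)
    c = 1.0 + np.sum(psi * q**lam)                         # Eq. (21)
    return c * f_val                                       # Eq. (20)
```

For a feasible design (all \(g_k \le 0\)) every \(q(i)\) is zero, so \(c=1\) and the objective is returned unchanged; any violation makes \(c>1\) and inflates the penalized weight.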

3 Numerical investigation

In this research, the mathematical functions of CEC2005 (Suganthan et al. 2005) with 10 variables (n = 10) and CEC2017 (Awad et al. 2017) with 30 variables are used to explore the capabilities of the proposed approach. In addition, three standard benchmark structural problems are considered for the validation of the GPSG algorithm. These are discussed as follows:

3.1 Results of CEC2005 mathematical functions

The results of the methods of PSO, GSA and GPSG for a population of 5 (NP) are presented graphically for some of the functions in Fig. 1. The required data in the process of optimization are presented in Table 1.

Fig. 1 Convergence history for CEC2005 with \({NP}\) = 5

Table 1 Scaling parameters

The vertical axis represents the difference between the obtained results and the exact solution (X*) at each iteration. Each graph is the outcome of the average of 30 independent runs.

The two methods of PSO and GSA show very poor convergence and are trapped in a local optimum in some cases. In the combined GPSG method, however, the convergence of the optimization process is smooth, with much better results.

In addition, the effect of the size of the initial population is examined for the three methods. Populations of 5, 10, 20 and 30 particles are considered, and the results are presented in Figs. 2, 3 and 4. It can be observed that PSO and GSA are very sensitive to the population size, whereas in GPSG the population size does not affect the results much; even with a small population, the enhanced method behaves well.

Fig. 2 Convergence history for F1 by algorithms with different number of particles

Fig. 3 Convergence history for F17 by algorithms with different number of particles

Fig. 4 Convergence history for F25 by GSA algorithm with different number of particles

The summary of the results is given in Table 2. For all the problems, the mean, rank, standard deviation, median and best response of the final iteration are given. The rank is evaluated based on the average of the final iteration over the 30 runs, for the same population size.

Table 2 Results of CEC2005 with different populations

From the statistical point of view, the standard deviation (SD) indicates that the diversity of the final results of the independent runs is lower for the proposed approach, which shows the superiority of the method.

The mean values of the averages of the 25 functions are computed and their ranks are evaluated for each population size. The final results are given in Table 3. It is seen that GPSG possesses the first rank in all cases.

Table 3 Average rank of methods with different populations

3.2 Comparison of the proposed method with several metaheuristic algorithms

The outlined approach with 20 particles is compared with different variants of GSA for CEC2005 (Ebrahimi et al. 2015). The required initial parameters are chosen similarly. The numerical results are presented in Table 4. It can be observed that GPSG attains the first rank for 15 functions, and its average rank over all 25 functions is better than that of the GSA variants.

Table 4 Comparison of GPSG with variants of GSA for CEC2005

For CEC2017, the average, best and standard deviation (SD) results for each function are indicated in Table 5. The average results of the proposed approach are compared with those of 8 other methods, according to the available information, and the same number of function evaluations is chosen for all methods. The superiority of GPSG among the others is observed according to the average rank. The standard deviations are also reported for GPSG; for the other methods, SD results are not available for comparison.

Table 5 Comparison of GPSG with other methods for CEC2017

3.3 Structural problems

The optimal design of structures is a major topic among structural engineers (Mashayekhi et al. 2012, 2016; Gholizadeh 2013; Khatibinia and Yazdani 2018; Bhullar et al. 2020). In this section, three design problems are chosen for truss structures. The weight of the structure is taken as the objective function and the constraints are bounds on member stresses and joint displacements. The cross-sectional areas are continuous design variables. In all the problems, the value of \(\alpha \) in (5) and the maximum number of iterations (\(T\)) are chosen as 4 and 200, respectively.

3.3.1 10-bar plane truss

The 10-bar truss is optimized as shown in Fig. 5. The required information for the truss is given in Table 6.

Fig. 5 10-bar plane truss

Table 6 Information for the 10-bar plane truss

The results are given in Table 7. The numerical results indicate that the best results are obtained with the combined GPSG method, which achieves approximately similar results for all population sizes.

Table 7 Results of 10-member truss algorithms with different populations

The convergence trend for different populations is illustrated in Fig. 6. It is concluded that the GPSG method yields better results which demonstrates the efficiency of this method.

Fig. 6 Convergence history for the 10-bar truss with different number of particles

The rankings of the three algorithms are given in Table 8, and GPSG is ranked first.

Table 8 Average ranking with different populations for 10-bar plane truss

3.3.2 72-bar truss

The 72-member truss shown in Fig. 7 consists of 16 member types due to its geometric shape. The aim is to minimize the weight of the structure. All the specifications of the structure are given in Table 9.

Fig. 7 72-bar truss

Table 9 Information of the 72-bar truss

The results are presented in Table 10. It can be concluded that the GPSG hybrid method yields the most favorable results, and the size of the initial population hardly affects them. The average of the hybrid method with only five particles is better than that of the PSO and GSA methods with 30 particles.

Table 10 Results of 72-member algorithms with different populations

The convergence history of the methods is shown in Fig. 8. The GPSG method yields better results that demonstrate its effectiveness.

Fig. 8 Convergence history for the 72-member truss with different number of particles

The ranking of the three algorithms is shown in Table 11 and the GPSG method is ranked first.

Table 11 Average ranking with different populations for 72-member truss

3.3.3 120-member dome under asymmetric vertical load

The 120-member dome shown in Fig. 9 is composed of 7 member types due to the existing geometric symmetry. Therefore, the number of design variables is reduced from 120 to 7. Stress constraints are applied following the AISC-ASD regulations. The permissible tensile and compressive stresses are given in (25) and (26), respectively.

$${\sigma }_{T}=0.6{F}_{y},$$
(25)
$${\sigma }_{c}=\frac{\left[{F}_{y}\left(1-\frac{{\lambda }_{i}^{2}}{2{C}_{c}^{2}} \right)\right]}{\frac{5}{3}+\frac{3{\lambda }_{i}}{8{C}_{c}}-\frac{{\lambda }_{i}^{3}}{8{C}_{c}^{3}}}\quad \text{ if }{ \lambda }_{i}<{C}_{c} \left(\text{for inelastic buckling}\right),$$
(26a)
$${\sigma }_{C}=\frac{12}{23} \frac{{\pi }^{2}E}{{\lambda }_{i}^{2}} \quad \text{ if }{\lambda }_{i}\geq{C}_{c} \left(\text{for elastic buckling}\right),$$
(26b)

where \(E\) is the modulus of elasticity, \({F}_{y}\) is the yield stress of steel, and \({C}_{c}\) is the boundary value between the elastic and inelastic buckling states, given by (27). The effective length coefficient is \(k\)=1, \({\lambda }_{i}\) is the slenderness ratio of member i from (28), and \({r}_{i}\) is the radius of gyration of each member, evaluated as (29). The asymmetric vertical load is applied to the free nodes in the z-direction according to Table 12.

Fig. 9 120-member dome

Table 12 External asymmetric forces applied to the 120-member dome
$${C}_{c}=\sqrt{\frac{{2\pi }^{2}E}{{F}_{y}}},$$
(27)
$${\lambda }_{i}=\frac{{kL}_{i}}{{r}_{i}},$$
(28)
$${r}_{i}=a{A}_{i}^{b},$$
(29)
$$\text{For hollow pipe section}: a=0.4993 , b=0.6777.$$
(30)
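Equations (25)–(30) can be evaluated as in the following sketch. The material constants \(E\) and \(F_y\) used here (typical steel values in kgf/cm² units) and the sample member data are assumptions for illustration; the actual values for this example are those of Table 13.

```python
import numpy as np

def allowable_stresses(A, L, E=2.1e6, Fy=2400.0, a=0.4993, b=0.6777, k=1.0):
    """Allowable stresses per Eqs. (25)-(30) for a hollow pipe section.
    Units must be consistent (here, assumed kgf/cm^2 for stresses and cm for lengths)."""
    r = a * A**b                              # Eq. (29): radius of gyration
    lam = k * L / r                           # Eq. (28): slenderness ratio
    Cc = np.sqrt(2.0 * np.pi**2 * E / Fy)     # Eq. (27): elastic/inelastic boundary
    sigma_t = 0.6 * Fy                        # Eq. (25): allowable tensile stress
    if lam < Cc:                              # Eq. (26a): inelastic buckling
        num = Fy * (1.0 - lam**2 / (2.0 * Cc**2))
        den = 5.0 / 3.0 + 3.0 * lam / (8.0 * Cc) - lam**3 / (8.0 * Cc**3)
        sigma_c = num / den
    else:                                     # Eq. (26b): elastic buckling
        sigma_c = 12.0 * np.pi**2 * E / (23.0 * lam**2)
    return sigma_t, sigma_c

# Usage for a hypothetical member: A = 10 cm^2, L = 300 cm
sigma_t, sigma_c = allowable_stresses(10.0, 300.0)
```

The allowable compressive stress decreases as the member becomes more slender, switching from the inelastic expression (26a) to the elastic Euler-type expression (26b) at \(\lambda_i = C_c\).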

The required specifications for the 120-member dome are given in Table 13.

Table 13 Properties of the 120-member dome

The results are presented in Table 14. The best results correspond to the combined GPSG method, for which the deviation between different runs is low.

Table 14 Results of 120-member algorithms with different populations

The convergence trend of the results is presented in Fig. 10. The results show that for different populations, the GPSG is the most appropriate method.

Fig. 10 Convergence history for the 120-member truss with different number of particles

The rankings of the three algorithms, GSA, PSO and GPSG, are presented in Table 15. As in the previous examples, the first rank belongs to the suggested GPSG method.

Table 15 Average ranking with different populations for 120-member truss

4 Conclusions

In advanced methods of optimization, the main goal is to find the optimal solution of multimodal functions efficiently. There are two main categories in this field. The traditional gradient-based methods (GBM) start from a pre-assigned solution and move along a search direction constructed from the gradients of the function under consideration. The GBM methods are reliable, efficient and fast for unimodal functions, but they may be trapped in a local optimum for multimodal functions, and the initial point is important for convergence. The second category comprises multipoint metaheuristic approaches. These methods operate on an initial population, and their search directions are based on statistical ideas. They can find the global optimum if a sufficiently large initial population is chosen and the search directions are organized logically. However, the exploitation capability of these approaches is weak and unsatisfactory.

Based on these difficulties, in the present research, the two categories are combined to achieve a new successful approach. A compromise is made between the two categories in terms of their capabilities and shortcomings.

Among the vast number of metaheuristic techniques, the two rather successful methods of PSO and GSA are selected and their combination is merged with GBM. The integration of the three methods is called GPSG.

The resultant of the search directions of the three methods is constructed so as to control the overall speed of GPSG with proper move limits, employing a multipoint initial population and the philosophy of stochastic search.

To verify the proposed method, 25 complicated multimodal functions of CEC2005 and 29 functions of CEC2017 are tested as benchmark examples from the literature. In addition, three structural design problems involving two- and three-dimensional truss structures with stress and displacement constraints are optimized for minimum weight.

The numerical results indicate the superiority of the proposed approach over both PSO and GSA. In addition, the results for CEC2005 are compared with four variants of GSA and the results for CEC2017 are compared with eight other available methods; the first mean rank belongs to the proposed approach. The power of GPSG is investigated in terms of the exploration and exploitation demands. The new approach can reach an appropriate optimal solution with a smaller initial population and fewer independent runs. The convergence history of the approach is smooth and the results are more efficient, reliable and stable.

It was found that the combination of GBA with either PSO or GSA works well; however, the efficiency of the integration of all three approaches is greatly enriched.

As the metaheuristic approaches are not suitable for all optimization problems, the search for more suitable approaches is ongoing. It is intended to incorporate higher-order gradient directions and other variants in the future.