1 Introduction

Optimization refers to the process of finding the optimal solution under specific conditions [1, 2]. Although a large body of research shows that traditional mathematical methods can solve continuous, unimodal, differentiable, and linear problems, optimization problems in the real world are often nonlinear, discontinuous, non-differentiable, and multimodal [3]. Moreover, some complex optimization problems cannot be solved with sufficient speed or accuracy by classical methods [4]. Therefore, many researchers have turned to a different class of solution methods, namely metaheuristic algorithms. Compared with traditional mathematical methods, metaheuristic algorithms do not rely on gradient information and are applicable to a wide range of problems and fields [2, 5, 6].

In recent years, many metaheuristic algorithms based on mechanisms and principles observed in nature have been proposed and are widely used in practical engineering optimization problems. According to their search mechanisms, researchers divide metaheuristic algorithms into three classes: evolutionary algorithms, swarm intelligence algorithms, and physics-based algorithms [4, 7]. Evolutionary algorithms mainly simulate the evolutionary behavior of natural creatures: through selection, crossover, and mutation, good genes are retained in the next generation, thereby improving the population over the iterative process. The genetic algorithm (GA) [8] and the differential evolution algorithm (DE) [9] are typical representatives of evolutionary algorithms. Swarm intelligence algorithms mainly simulate the group behavior of animals and are characterized by continuously collecting spatial information during the iterative process. There are many swarm optimization algorithms, such as particle swarm optimization (PSO) [10, 11] and the gray wolf optimizer (GWO) [12]. Physics-based algorithms are inspired by fundamental laws and phenomena in physics, among which the simulated annealing algorithm (SA) [13] and the gravitational search algorithm (GSA) [14] are well known.

Although many metaheuristic algorithms have been proposed in recent years to optimize complex engineering problems, their disadvantages tend to be exposed when the problem becomes more complex and the local optimal solutions become numerous. The search mechanisms of many metaheuristic algorithms fail to establish a balance between exploration and exploitation, so these algorithms are either easily trapped in a local optimal solution or slow to converge [15, 16]. To overcome these problems, hybrid metaheuristic algorithms have been developed. A hybrid metaheuristic algorithm combines the advantages of different metaheuristic algorithms to improve robustness and efficiency [15,16,17,18]. Many researchers have studied hybrid metaheuristics; the hybrid particle swarm optimization–gravitational search algorithm (PSOGSA) [19] is a common example. More recently, researchers have proposed new hybrid algorithms such as the state transition simulated annealing algorithm (STASA) [20] and the water cycle–moth flame optimization algorithm (WCMFO) [2].

In this paper, we focus on the moth search algorithm (MS) and the fireworks algorithm (FWA). The MS is a new metaheuristic optimization algorithm proposed by Wang [21]. The algorithm solves optimization problems mainly by simulating the behavior of moths at night. Although the MS searches quickly and with high accuracy, its exploration capability is weak. In contrast, the FWA has a strong exploration capability, but its convergence accuracy is low [22, 23]. Therefore, in this paper, we introduce the explosion and mutation operators of the FWA into the MS and propose a new hybrid algorithm called the MSFWA.

The MSFWA combines the global exploration capability of the FWA with the local exploitation capability of the MS, so that the algorithm has both the exploration and exploitation capabilities, which greatly improves the optimization performance of the hybrid algorithm. In order to test the optimization ability and stability of the MSFWA, the algorithm is tested on 23 benchmark functions and six engineering application problems.

The remainder of this article is organized as follows. In Sect. 2, the MS algorithm and the FWA search mechanism are introduced. The proposed MSFWA is described in detail in Sect. 3. In Sect. 4, the proposed algorithm is tested on 23 benchmark functions, and the test results are compared with some commonly used optimization algorithms. The ability of the MSFWA to solve six practical engineering problems is evaluated in Sect. 5. Finally, the study results are summarized in Sect. 6.

2 A brief introduction of MS and FWA

The MS is a recently proposed metaheuristic algorithm based on the phototaxis of moths, i.e., the behavior of night moths flying toward a light source [21]. The FWA is a metaheuristic algorithm proposed by Tan, inspired by the explosion of fireworks [22].

2.1 Moth search algorithm

In the MS [21], moths are divided into three categories. The best moth is called the light source and maintains its original position. Moths that are closer to the best moth fly around it, while moths that are relatively far from the best moth fly directly toward it.

2.1.1 Light source

The best moth (i.e., the light source) keeps its position and guides the behavior of the other moths, as shown in Eq. (1):

$$\begin{aligned} x_i^{t + 1} = x_{\mathrm{best}}^t \end{aligned}$$
(1)

where the parameters t and \(t+1\) refer to the current generation and the next generation, respectively, \(x^t_{\mathrm{best}}\) is the best moth in the tth generation, and \(x^{t+1}_{i}\) is a moth in the \(t+1\)-th generation.

2.1.2 Levy flights

Moths that are close to the best moth fly around it via Levy flights, performing a deep search near the light source to find a new source. A moth in this category updates its position according to the following formulas:

$$\begin{aligned} x_i^{t + 1}&= x_i^t + \alpha L(s) \end{aligned}$$
(2)
$$\begin{aligned} L(s)&= \frac{{(\beta - 1)\Gamma (\beta - 1)\sin \left( \frac{{\pi (\beta - 1)}}{2}\right) }}{{\pi {s^\beta }}} \end{aligned}$$
(3)
$$\begin{aligned} \alpha&= {S_{\max }}/{t^2} \end{aligned}$$
(4)

where \(x^t_i\) and \(x^{t+1}_i\) represent the positions of moth i in generations t and \(t+1\), respectively. \(\alpha \) is a step size that varies with the number of iterations, as shown in Eq. (4). \(S_{\max }\) is the maximum step size, which can take different values depending on the problem and is usually set to 1. L(s) represents the Levy flight, as shown in Eq. (3), where s is a number greater than 0 and \(\Gamma (x)\) is the gamma function. According to experimental results, Levy flights maximize the search efficiency of unknown space when \(\beta = 1.5\); therefore, \(\beta \) was set to 1.5 in this study.
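As an illustration, the Levy-flight update of Eqs. (2)–(4) can be sketched in Python as follows (a minimal sketch; the distribution from which the step length s is drawn is an implementation assumption, since the text only requires s > 0):

```python
import math
import random

def levy(s, beta=1.5):
    """L(s) of Eq. (3): heavy-tailed weight for a step length s > 0."""
    num = (beta - 1) * math.gamma(beta - 1) * math.sin(math.pi * (beta - 1) / 2)
    return num / (math.pi * s ** beta)

def levy_flight_update(x, t, s_max=1.0, beta=1.5):
    """One surrounding-moth update per Eqs. (2) and (4)."""
    alpha = s_max / t ** 2  # Eq. (4): step size shrinks as iterations grow
    # Assumption: draw one step length per dimension, uniformly in [0.5, 2].
    return [xi + alpha * levy(random.uniform(0.5, 2.0), beta) for xi in x]
```

Because \(\alpha \) decays as \(1/t^2\), surrounding moths take large steps early in the run and increasingly fine-grained steps as the population approaches convergence.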

2.1.3 Direct flights

Moths that are farther away from the best moths will fly toward the light source. After their position is updated, some moths will exceed the position of the light source, and some moths remain far away from the light source. The probabilities that the moth exceeds the light source and approaches the light source are each 50%. The position of the moth is updated according to Eq. (5):

$$\begin{aligned} x_i^{t + 1} = \left\{ \begin{array}{ll} \lambda \times (x_i^t + \varphi \times (x_{\mathrm{best}}^t - x_i^t)),& \quad q < 0.5 (a)\\ \lambda \times (x_i^t + \frac{1}{\varphi } \times (x_{\mathrm{best}}^t - x_i^t)),& \quad q \ge 0.5 (b) \end{array} \right. \end{aligned}$$
(5)

where \(q \in [0,1]\) is a random number and \(\varphi \) is the acceleration factor, which is set to the golden ratio based on experiments. The scale factor \(\lambda \) not only speeds up the convergence of the algorithm but also increases the diversity of the swarm. In this study, \(\lambda \) is a random number drawn from the standard normal distribution.
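The direct-flight rule of Eq. (5) amounts to the following one-step update (a minimal Python sketch):

```python
import random

PHI = (1 + 5 ** 0.5) / 2  # golden-ratio acceleration factor

def direct_flight_update(x, x_best):
    """One direct-flight update per Eq. (5)."""
    lam = random.gauss(0.0, 1.0)   # scale factor lambda ~ N(0, 1)
    q = random.random()            # q in [0, 1] selects branch (a) or (b)
    phi = PHI if q < 0.5 else 1.0 / PHI
    return [lam * (xi + phi * (bi - xi)) for xi, bi in zip(x, x_best)]
```

The factor \(\varphi > 1\) tends to carry the moth past the light source, while \(1/\varphi < 1\) leaves it short of the source, matching the 50/50 behavior described above.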

2.2 Fireworks algorithm

In the FWA [22], each fireworks explosion produces sparks, and some of the resulting sparks are selected to explode again. This process of repeatedly exploding fireworks and selecting the next generation is iterated to find the optimal solution. Fireworks explosion and Gaussian mutation are the two ways to generate sparks.

2.2.1 Explosion operator

The explosion amplitude and the number of explosion sparks are the main factors affecting the explosion operator. A firework with better fitness than the others has a smaller explosion amplitude and produces a larger number of sparks. Eqs. (6) and (7) are used to determine the number and amplitude, respectively, of sparks generated by the explosion operator:

$$\begin{aligned} {S_i}&= {M_\mathrm{e}} \cdot \frac{{{y_{\max }} - f({x_i}) + \varepsilon }}{{\sum \nolimits _{i = 1}^N {({y_{\max }} - f({x_i})) + \varepsilon } }} \end{aligned}$$
(6)
$$\begin{aligned} {A_i}&= A \cdot \frac{{f({x_i}) - {y_{\min }} + \varepsilon }}{{\sum \nolimits _{i = 1}^N {(f({x_i}) - {y_{\min }}) + \varepsilon } }} \end{aligned}$$
(7)

where \(S_i\) and \(A_i\) represent the number and amplitude, respectively, of sparks generated by the firework \(x_i\), and \(f(x_i)\) represents the fitness value of \(x_i\). \(y_{\min } = \min [f(x_i)]\) and \(y_{\max } = \max [f(x_i)]\) are the best and worst fitness values in the population (for a minimization problem), \(M_\mathrm{e}\) is a factor that limits the total number of explosion sparks, and A is a factor that limits the explosion amplitude.

To avoid the excessive influence of good sparks, the number of explosion sparks is limited, as described by Eq. (8):

$$\begin{aligned} {S_i} = \left\{ {\begin{array}{ll} \text {round}\,(a \times {M_\mathrm{e}}),& \quad {S_i} < a \times {M_\mathrm{e}}\\ \text {round}\,(b \times {M_\mathrm{e}}),& \quad {S_i} > b \times {M_\mathrm{e}}\\ \text {round}\,({S_i}),& \quad \text {otherwise} \end{array}} \right. \end{aligned}$$
(8)

where a and b are constants that bound, as fractions of \(M_\mathrm{e}\), the minimum and maximum number of sparks generated by the explosion operator. The detailed process of evaluating the sparks generated by the explosion operator can be found in [22].
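Eqs. (6)–(8) can be combined into one routine as follows (a minimal Python sketch for a minimization problem; the parameter defaults are illustrative assumptions):

```python
def explosion_sparks(fitness, m_e=50, amp=40.0, a=0.04, b=0.8, eps=1e-12):
    """Number S_i (Eq. 6, bounded by Eq. 8) and amplitude A_i (Eq. 7)
    of the sparks generated by each firework, for minimization."""
    y_max, y_min = max(fitness), min(fitness)
    sum_s = sum(y_max - f for f in fitness) + eps
    sum_a = sum(f - y_min for f in fitness) + eps
    counts, amps = [], []
    for f in fitness:
        s = m_e * (y_max - f + eps) / sum_s    # Eq. (6): better fitness -> more sparks
        s = min(max(s, a * m_e), b * m_e)      # Eq. (8): clamp into [a*Me, b*Me]
        counts.append(round(s))
        amps.append(amp * (f - y_min + eps) / sum_a)  # Eq. (7): better fitness -> smaller radius
    return counts, amps
```

Note how the two formulas pull in opposite directions: the best firework receives many sparks with a small explosion radius (local exploitation), while poor fireworks receive few sparks spread over a large radius (global exploration).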

2.2.2 Mutation operator

In the FWA, some fireworks are selected for Gaussian mutation to increase population diversity, as shown in Eq. (9):

$$\begin{aligned} x_i^{t + 1} = x_i^t \cdot \text {Gaussian}(1,1) \end{aligned}$$
(9)

where Gaussian (1, 1) represents a random number with a Gaussian distribution with a mean of 1 and a variance of 1.

2.2.3 Selection strategy

After each generation of explosions and mutations, N individuals are chosen from the current fireworks and sparks to form the next generation of fireworks. The best individual is always retained; the remaining \(N - 1\) individuals are selected with probabilities proportional to their distances to the other individuals.
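This distance-based selection can be sketched as follows (a minimal Python sketch; Euclidean distance is an assumption, and the roulette draw uses Python's random.choices):

```python
import random

def select_next_generation(population, fitness, n):
    """FWA selection: keep the best individual, then draw the remaining
    n - 1 with probability proportional to summed distance to all others."""
    best = min(range(len(population)), key=lambda i: fitness[i])

    def dist_sum(i):
        return sum(
            sum((a - b) ** 2 for a, b in zip(population[i], population[j])) ** 0.5
            for j in range(len(population))
        )

    weights = [dist_sum(i) for i in range(len(population))]
    return [population[best]] + random.choices(population, weights=weights, k=n - 1)
```

Weighting by summed distance favors individuals in sparsely populated regions of the search space, which preserves diversity in the next generation.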

3 Hybrid moth fireworks algorithm

The MS has good search capabilities. In the MS [21], the current best moth (the light source) directs the other moths to fly around it or directly toward it, so that the moths continually move toward the optimal solution. Unfortunately, the MS has poor exploration capability and easily falls into local optima. The explosion operator of the FWA has strong exploration capability, and its Gaussian mutation operator provides a way to jump out of local optima, but the search accuracy of the FWA is not high [23]. The complementary strengths and weaknesses of the two algorithms suggest combining the MS and the FWA to exploit their strengths and improve the optimization ability of the combined algorithm.

Fig. 1
figure 1

Different behaviors of moths

In the MSFWA, moths are classified into five categories according to different behaviors as described below and shown in Fig. 1.

Guide moth This is the best individual in the current population; it guides the movements of the other moths and is also known as the light source.

Surrounding moths These are moths that are close to the best individual (the guide moth) and fly around the light source.

Direct flying moths These are moths farther from the guide moth that fly directly toward the light source.

Exploring moths These moths are located far from the best individual, are unaffected by the guide moth, and search for better positions using their own exploration capability.

Mutant moths These moths occur randomly in the population, are genetically altered, and fly randomly in the search space.

The exploring moths and mutant moths are described more fully in Sects. 3.1 and 3.2, respectively.

3.1 Exploring moths

The explosion operator of the FWA is introduced into the MS, and the moths that use it are called "exploring moths". If a moth far from the light source cannot see it, the moth flies to the best position within its own search radius. This improves the exploration capability of the algorithm to a certain extent, as shown in Eq. (10):

$$\begin{aligned} \widehat{x}^d_i&= x_i^t + \Delta h\nonumber \\ \Delta h&= {A_i} \cdot \text {rand}( - 1,1) \cdot \hat{B}\nonumber \\ {A_i}&= {A_{\max }} \cdot \varepsilon \end{aligned}$$
(10)

where \(\varepsilon \) is a control parameter, set to 0.6 in this study, and \(A_i\) is the search radius of each generation of moths. \(A_{\max }\) and \(A_{\min }\) are the maximum and minimum search radii; in this study, \(A_{\max }\) was set to 1 and \(A_{\min }\) to 0.0001. \(\widehat{x}_i^d\) is the dth candidate position that the ith exploring moth plans to search. \(\widehat{B}\) is a D-dimensional vector whose dth component is 0 and whose other components are 1.
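Eq. (10) can be sketched as follows (a minimal Python sketch; the random factor in \(\Delta h\) is interpreted as a uniform draw on [-1, 1], which is an assumption):

```python
import random

def exploring_moth_candidate(x, d, a_max=1.0, eps=0.6):
    """One candidate position per Eq. (10): perturb every dimension except
    the dth within the search radius A_i = A_max * eps."""
    a_i = a_max * eps
    # B-hat zeroes out dimension d and leaves the other dimensions at 1.
    return [xi if j == d else xi + a_i * random.uniform(-1.0, 1.0)
            for j, xi in enumerate(x)]
```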

3.2 Mutant moths

In the MSFWA, the Gaussian mutation operator of the FWA is introduced into the MS to create mutant moths. In each generation, M moths undergo Gaussian mutation, and the mutant moths move randomly in a certain direction in space, as described by Eq. (11):

$$\begin{aligned} x_{i,j}^{t + 1} = x_{i,j}^t + (x_{\mathrm{best},i}^t - x_{i,j}^t) \times \text {Gaussian}(0,1) \end{aligned}$$
(11)

where Gaussian(0, 1) is a random number drawn from a Gaussian distribution with a mean of 0 and a variance of 1.
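Eq. (11) amounts to a one-line update per coordinate (a minimal Python sketch):

```python
import random

def mutant_moth_update(x, x_best):
    """Gaussian mutation per Eq. (11): each coordinate moves toward (or past)
    the best moth by an N(0, 1)-scaled fraction of the remaining distance."""
    return [xi + (bi - xi) * random.gauss(0.0, 1.0) for xi, bi in zip(x, x_best)]
```

A moth already at the best position is left unchanged, while distant moths can take large random jumps; this is what allows the population to escape local optima.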

The pseudocode of the MSFWA is shown in Fig. 2.

Fig. 2
figure 2

Pseudocode of the MSFWA algorithm

4 Results and discussion

Because different metaheuristic algorithms emphasize different aspects of search, an algorithm behaves differently in different applications. Thus, a variety of benchmark functions should be used to test the capability of a metaheuristic algorithm. In this research, the proposed MSFWA was tested on 23 test functions [24]: seven unimodal benchmark functions, six multimodal benchmark functions, and 10 fixed-dimension multimodal benchmark functions. A unimodal benchmark function has no local optimal solution and a unique global optimal solution; such functions are designed to evaluate the exploitation capability of metaheuristic algorithms. In contrast, multimodal and fixed-dimension benchmark functions have many local optimal solutions and a unique global optimal solution; they are designed to evaluate the exploration capability of metaheuristic algorithms. The final test results were subjected to a nonparametric statistical test (Sect. 4.2) to analyze performance.

4.1 Benchmark functions test

To prove the effectiveness of the MSFWA, its performance was compared with that of PSO [11], the moth-flame optimization algorithm (MFO) [25], the sine cosine algorithm (SCA) [26], PSOGSA [19], GWO [12], MS [21], and FWA [22]. To compare convergence speed and accuracy, the population size of all algorithms was set to 50 and the number of iterations to 1000. The parameter settings of the comparison algorithms were the same as in the corresponding references. The parameter settings of each algorithm are listed in Table 1.

Table 1 Parameter settings of each algorithm

The test programs of these algorithms were run independently on a desktop computer with an Intel(R) i3-7100 CPU @ 3.90 GHz in the Windows 10 environment. To reduce errors when comparing the algorithms, each algorithm was run 30 times, and the mean and variance of the results from each algorithm were compared.

First, the performance of the MSFWA on the unimodal test functions was evaluated. The mathematical expressions of the unimodal test functions are listed in Table 2, together with the dimension of each function, the range of variable values, and the optimal solution. As shown in Fig. 3, a unimodal test function has no local optimal solution, only one global optimal solution.

Table 2 Unimodal benchmark functions
Fig. 3
figure 3

Two-dimensional view of the unimodal functions

Table 3 lists the results of each algorithm on the unimodal benchmark functions. The results show that the MSFWA had obvious advantages over the other algorithms on these functions. Except for function F5, the MSFWA performed better than all of the other algorithms, and on F5 it was outperformed only by GWO. Figure 4 shows the convergence curves for functions F5 and F6; on both functions, the MSFWA converged faster and with higher accuracy than the other algorithms, illustrating its significant advantages in convergence speed and precision.

Table 3 Statistical results of the algorithms in 30 runs on each unimodal benchmark function
Fig. 4
figure 4

Convergence plot of the algorithms on single-peak benchmark functions

Second, the MSFWA was evaluated on six multimodal test functions, which have many local optimal solutions and are therefore useful for testing the exploration capability of the algorithms. Table 4 lists the expressions of the multimodal benchmark functions, and Fig. 5 shows two-dimensional plots of some of them. It can be clearly seen that the characteristic feature of a multimodal benchmark function is its multiple local optimal solutions.

Table 4 Multimodal benchmark functions
Fig. 5
figure 5

Two-dimensional view of the multimodal benchmark functions

The results from applying the eight algorithms on the multimodal test functions are recorded in Table 5. The MSFWA again exhibited significant advantages over the other algorithms in analyzing the F9, F10, F12 and F13 test functions, and performed worse than only the MFO when applied to the F8 test function. In analyses of the F11 test function, only the FWA outperformed the MSFWA.

Table 5 Statistical results of the algorithms in 30 runs on each multimodal benchmark function

The convergence curves of each algorithm on the multimodal benchmark functions are plotted in Fig. 6. As shown, the MSFWA successfully avoided falling into local optimal solutions on test functions F9, F10, F12, and F13 and therefore found the global optimal solution faster than the other algorithms. It was mainly the exploring moths and mutant moths that provided the MSFWA with this powerful spatial search capability.

Fig. 6
figure 6

Convergence plot of the algorithms on multimodal benchmark functions

Finally, the MSFWA was tested on 10 fixed-dimensional multimodal benchmark functions. The function expressions, dimensions, variable ranges, and optimal solutions are listed in Table 6. The two-dimensional diagrams of test functions F14, F15, F16 and F18 are shown in Fig. 7, from which we can see the characteristics of each test function. As in previous tests, the eight algorithms were tested on each function 30 times. The average value and standard deviation of the test results are recorded in Table 7.

Table 6 Fixed-dimensional multimodal benchmark functions
Fig. 7
figure 7

Two-dimensional view of the fixed-dimensional multimodal benchmark functions

As can be seen from Table 7, the MSFWA performed best on five of the 10 fixed-dimension test functions, second best on two, and third best on three. Furthermore, the standard deviation of the MSFWA results was relatively small, indicating that the algorithm was relatively stable. The convergence curves of some functions are shown in Fig. 8: the MSFWA converged faster and with higher accuracy than the other algorithms on test function F15, and on test function F23 its convergence speed was surpassed only by those of the MFO and MS while its convergence accuracy was better than that of all the other algorithms. Therefore, the performance of the MSFWA was better than that of the other algorithms on the fixed-dimension functions.

Table 7 Statistical results of the algorithms in 30 runs on each fixed-dimensional multimodal benchmark function
Fig. 8
figure 8

Convergence plot of the algorithms on fixed-dimensional multimodal benchmark functions

4.2 Statistical analysis

The mean and variance of the solutions are general indicators for evaluating the superiority of an algorithm. Additional statistical analyses were performed to ensure that the results described in Sect. 4.1 were not random. The Friedman test is an effective statistical method for detecting significant differences between algorithms. In this study, nonparametric tests at a significance level of \(\alpha = 0.05\) were performed on the results of the 30 independent runs. The average ranking obtained from the Friedman test was used to evaluate the superiority of the MSFWA: the lower the ranking, the better the performance. The average ranking of each algorithm at the 95% confidence level is listed in Table 8.

Based on the average rankings and standard deviations presented in Table 8, the MSFWA was the best algorithm and was superior on most functions. The Friedman test results are portrayed in the box diagram of Fig. 9. As shown, the MSFWA had better search performance and higher stability than the other algorithms; even its worst ranking was lower (better) than the worst rankings of the other algorithms. Therefore, the MSFWA performed well on the test problems.

Table 8 Results of the Friedman test
Fig. 9
figure 9

A schematic view of the results of the Friedman test

4.3 Time complexity of the MSFWA

The time complexity of the MSFWA includes two main factors: the iteration time of the algorithm and the time required to evaluate the objective function of the test problem. According to the pseudocode of the algorithm, the time complexity is \(O(\text {iterations}_{\max } \times (N + M) \times d) + O(\text {iterations}_{\max } \times \text {Obj})\). The first term, \(O(\text {iterations}_{\max } \times (N + M) \times d)\), is the update complexity of the MSFWA operators, where N and M are the population size and the number of mutant moths, respectively, and d is the dimension. The second term, \(O(\text {iterations}_{\max } \times \text {Obj})\), is the complexity of evaluating the objective function of the problem being solved. Therefore, the time complexity of the algorithm depends on the complexity of the loop iteration and the complexity of the problem to be solved.

5 Engineering optimization problems

Testing hybrid algorithms on practical engineering problems is common practice because engineering problems differ from most basic test functions in that the optimal solution is unknown. Moreover, engineering applications are subject to various constraints [27]. Therefore, it was necessary to test the MSFWA on actual engineering problems. Six engineering application problems were tested: the tension/compression spring, I-beam, welded beam, gear train, cantilever beam, and three-bar truss design problems. When solving engineering application problems, constraints must be handled, and many "penalty" functions are available for this purpose [28]. In this study, individual values that were beyond the bounds were replaced by randomly generated values from within the constraints.
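The bound-handling rule used here can be sketched as follows (a minimal Python sketch of the resampling strategy described above):

```python
import random

def repair_bounds(x, lower, upper):
    """Replace any out-of-bounds coordinate with a uniform random value
    drawn from within its bounds; in-bounds coordinates are kept."""
    return [xj if lo <= xj <= hi else random.uniform(lo, hi)
            for xj, lo, hi in zip(x, lower, upper)]
```

Unlike a penalty function, this repair keeps every evaluated individual feasible with respect to the variable bounds, so the objective never has to be computed outside the search domain.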

5.1 Gear train design problem

The gear train design problem is a renowned unconstrained optimization problem in mechanical engineering. As shown in Fig. 10, the main challenge is to choose the numbers of teeth on gears A, B, C, and D, the design variables of the optimization problem [29, 30], so as to minimize the deviation of the gear ratio from a required value. The model of this problem is described by Eq. (12):

$$\begin{aligned} \begin{array}{l} {\mathrm{Consider }}\ \quad \mathbf {x} = [{x_1},{x_2},{x_3},{x_4}] = [{n_A},{n_B},{n_C},{n_D}]\\ {\mathrm{Minimize }}\quad f(\mathbf {x}) = {\left( \frac{1}{{6.931}} - \frac{{{x_2}{x_3}}}{{{x_1}{x_4}}}\right) ^2}\\ {\mathrm{Variable range }}\quad 12 \le {x_1},{x_2},{x_3},{x_4} \le 60 \end{array} \end{aligned}$$
(12)
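The objective is cheap to evaluate. The sketch below (Python, assuming the conventional formulation in which the squared deviation of the ratio from 1/6.931 is minimized) checks the tooth counts (49, 19, 16, 43) commonly reported in the wider literature as the best known solution; these values are an illustrative assumption, not a result computed in this paper:

```python
def gear_train_cost(x1, x2, x3, x4):
    """Squared deviation of the gear ratio x2*x3 / (x1*x4) from 1/6.931,
    with integer tooth counts in [12, 60]."""
    return (1.0 / 6.931 - (x2 * x3) / (x1 * x4)) ** 2

# Tooth counts commonly reported as the best known solution (assumption).
best = gear_train_cost(49, 19, 16, 43)
```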
Fig. 10
figure 10

Gear train design problem

The gear train design problem is a discrete optimization problem: the gear variables are rounded to find the optimal numbers of teeth. For this problem, the MSFWA was compared with ALM [31], GA [32], ABC [33], MBA [34], ISA [29], MVO [35], MFO [25], ALO [36], CS [37], and PSOSCALF [6]. The best gear ratios determined by the MSFWA and the other algorithms are listed in Table 9. The MSFWA found the same gear ratio as MBA, ISA, MVO, MFO, ALO, CS, and PSOSCALF (i.e., it reached the same optimization decision), which shows that the MSFWA can solve this discrete practical problem well.

Table 9 Comparison result of the gear train design problem

5.2 Cantilever beam design problem

As shown in Fig. 11, the cantilever beam design problem concerns a beam formed by five square hollow blocks joined together. The beam is rigidly supported at one end, and a vertical force acts on the free end of the cantilever. The purpose of this problem is to minimize the weight of the cantilever. The model of this problem is described by Eq. (13):

$$\begin{aligned} \begin{array}{l} {\mathrm{Consider }}\ \quad \mathbf {x} = [{x_1},{x_2},{x_3},{x_4},{x_5}]\\ {\mathrm{Minimize }}\quad f(\mathbf {x}) = 0.0624({x_1} + {x_2} + {x_3} + {x_4} + {x_5})\\ {\mathrm{Subject to }}\quad g(\mathbf {x}) = \frac{{61}}{{{x_1}^3}} + \frac{{37}}{{{x_2}^3}} + \frac{{19}}{{{x_3}^3}} + \frac{7}{{{x_4}^3}} + \frac{1}{{{x_5}^3}} - 1\\ {\mathrm{Variable range }}\quad 0.01 \le {x_i} \le 100,i = 1,2,3,4,5 \end{array} \end{aligned}$$
(13)
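As a quick sanity check of Eq. (13), the sketch below (Python) evaluates the objective and constraint at a near-optimal design frequently quoted in the wider literature (the numerical values are an illustrative assumption, not a result from this paper):

```python
def cantilever_weight(x):
    """Objective of Eq. (13): weight proportional to the sum of block sizes."""
    return 0.0624 * sum(x)

def cantilever_constraint(x):
    """g(x) of Eq. (13); the design is feasible when g(x) <= 0."""
    return (61 / x[0] ** 3 + 37 / x[1] ** 3 + 19 / x[2] ** 3
            + 7 / x[3] ** 3 + 1 / x[4] ** 3 - 1)

x = [6.016, 5.309, 4.494, 3.502, 2.153]  # near-optimal design (assumed)
```

At this design the constraint is essentially active (g close to 0), which is typical of the optimum in weight-minimization problems.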
Fig. 11
figure 11

Cantilever beam design problem

A comparison between the MSFWA and MFO [25], MMA [38], GCA-I [38], GCA-II [38], CS [37], and SOS [39] is listed in Table 10. The results show that the MSFWA can effectively solve the cantilever beam problem, indicating that it can be applied to minimizing the weight of this type of beam.

Table 10 Comparison results for cantilever design problem

5.3 Tension/compression spring design problem

The tension/compression spring design is another engineering application in which the goal is to minimize the construction cost of the spring subject to four nonlinear constraints. As shown in Fig. 12, the wire diameter (d), the mean coil diameter (D), and the number of active coils (N) are the three main parameters of the optimization problem. The model and constraints of this problem are described by Eq. (14):

$$\begin{aligned} \begin{array}{l} {\mathrm{Consider }}\ \quad \mathbf {x} = [{x_1},{x_2},{x_3}] = [d,D,N]\\ {\mathrm{Minimize }} \quad f(\mathbf {x}) = ({x_3} + 2){x_2}x_1^2\\ {\mathrm{Subject to }}\quad {g_1}(x) = 1 - \frac{{x_2^3{x_3}}}{{71785x_1^4}} \le 0\\ \qquad \qquad \quad {g_2}(x) = \frac{{4x_2^2 - {x_1}{x_2}}}{{12566({x_2}x_1^3 - x_1^4)}} + \frac{1}{{5108x_1^2}} - 1 \le 0\\ \qquad \qquad \quad {g_3}(x) = 1 - \frac{{140.45{x_1}}}{{x_2^2{x_3}}} \le 0\\ \qquad \qquad \quad {g_4}(x) = \frac{{{x_1} + {x_2}}}{{1.5}} - 1 \le 0\\ {\mathrm{Variable range }}\quad 0.05 \le {x_1} \le 2,0.25 \le {x_2} \le 1.3\\ \quad \qquad \qquad \qquad 2 \le {x_3} \le 15 \end{array} \end{aligned}$$
(14)
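The sketch below (Python) encodes the objective and the four constraints of Eq. (14) and checks them at a best-known design often quoted in the wider literature (the numerical values are an illustrative assumption, not a result from this paper):

```python
def spring_cost(d, D, N):
    """Objective of Eq. (14): (N + 2) * D * d**2."""
    return (N + 2) * D * d ** 2

def spring_constraints(d, D, N):
    """g1..g4 of Eq. (14); a design is feasible when all values are <= 0."""
    return [
        1 - D ** 3 * N / (71785 * d ** 4),
        (4 * D ** 2 - d * D) / (12566 * (D * d ** 3 - d ** 4))
        + 1 / (5108 * d ** 2) - 1,
        1 - 140.45 * d / (D ** 2 * N),
        (d + D) / 1.5 - 1,
    ]

# A best-known design often quoted in the literature (assumption).
d, D, N = 0.051689, 0.356718, 11.288966
```

The first two constraints are nearly active at this design, so a small numerical tolerance is used when checking feasibility.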
Fig. 12
figure 12

Schematic of the tension/compression spring

This design problem was optimized using the MSFWA as well as mathematical optimization [40], constraint correction [6], DE [41], GSA [42], GA [43], PSO [44], GWO [12], WOA [42], MFO [25], and PSOSCALF [6]. The best results of each algorithm are listed in Table 11 and show that the MSFWA achieved the best optimization result.

Table 11 Comparison of results for tension/compression spring design problem

5.4 Welded beam design problem

The purpose of the welded beam design problem is to minimize the construction cost of the welded beam (Fig. 13). The four main parameters of the problem are: weld thickness (h), clamping rod length (l), rod height (t) and rod thickness (b). The modeling expression for this nonlinear optimization problem is described by Eq. (15):

$$\begin{aligned} \begin{array}{l} {\mathrm{Consider }}\ \quad \mathbf {x} = [{x_1},{x_2},{x_3},{x_4}] = [h,L,T,b]\\ {\mathrm{Minimize }}\quad f(\mathbf {x}) = 1.10471x_1^2{x_2} + 0.04811{x_3}{x_4}(14 + {x_2})\\ {\mathrm{Subject to }}\quad {g_1}(\mathbf {x}) = \tau (x) - {\tau _{\max }} \le 0\\ \qquad \qquad \quad {g_2}(\mathbf {x}) = \sigma (x) - {\sigma _{\max }} \le 0\\ \qquad \qquad \quad {g_3}(\mathbf {x}) = \delta (x) - {\delta _{\max }} \le 0\\ \qquad \qquad \quad {g_4}(\mathbf {x}) = {x_1} - {x_4} \le 0\\ \qquad \qquad \quad {g_5}(\mathbf {x}) = 0.10471x_1^2 + 0.04811{x_3}{x_4}(14 + {x_2}) - 5 \le 0\\ \qquad \qquad \quad {g_6}(\mathbf {x}) = 0.125 - {x_1} \le 0\\ \qquad \qquad \quad {g_7}(\mathbf {x}) = P - {P_c}(x) \le 0\\ {\mathrm{Variable range }} \, \quad 0.1 \le {x_1} \le 2,0.1 \le {x_2} \le 10\\ \quad \qquad \qquad \qquad 0.1 \le {x_3} \le 10,1 \le {x_4} \le 2 \end{array} \end{aligned}$$
(15)

The values of the parameters P, L, \(\tau ,\delta ,\sigma \), E, and G, as well as the detailed expressions for \(\tau (x),\delta (x), \sigma (x)\), and \(P_c(x)\) in Eq. (15), can be found in [17].

Fig. 13
figure 13

Schematic of welded beam design

The welded beam design problem is a common engineering application problem that has been studied using many different algorithms (e.g., Random [45], GA [46], Simplex [45], HS [47], GSA [42], MVO [35], MFO [25], CPSO [48], WOA [42], CBO [49], and GWO [12], among others). The optimization results of these algorithms were compared with those of the MSFWA (Table 12). As shown, the MSFWA provided the best optimization result, which indicates that it has obvious advantages over the other tested algorithms in solving this nonlinear constrained problem.

Table 12 Comparison of results for the welded beam design problem

5.5 I-beam design problem

Another typical engineering optimization problem is the I-beam design problem, which aims to minimize the vertical deflection of the beam. As shown in Fig. 14, the length (b), the height (h), and the thicknesses tf and tw are four parameters of the problem. The modeling expression for this problem is described by Eq. (16):

$$\begin{aligned} \begin{array}{l} {\mathrm{Consider }}\ \quad \mathbf {x} = [{x_1},{x_2},{x_3},{x_4}] = [b,h,{t_w},{t_f}]\\ {\mathrm{Minimize }} \quad f(\mathbf {x}) = \frac{{5000}}{{\frac{{{t_w}{{(h - 2{t_f})}^3}}}{{12}} + \frac{{bt_{_f}^3}}{6} + 2b{t_f}{{\left( {\frac{{h - {t_f}}}{2}} \right) }^2}}}\\ {\mathrm{Subject to }}\quad {g_1}(x) = 2b{t_f} + {t_w}(h - 2{t_f}) \le 300\\ {\mathrm{Variable range }}\quad 10 \le {x_1} \le 50, 10 \le {x_2} \le 80\\ \quad \qquad \qquad \qquad 0.9 \le {x_3},{x_4} \le 5 \end{array} \end{aligned}$$
(16)
Fig. 14
figure 14

I-beam design problem

The results from applying the MSFWA and other optimization algorithms (IARSM [50], ARSM  [50], CS [37], SOS  [39], MFO [25], PSOSCALF  [6] ) to solve the I-beam design problem are listed in Table 13. As shown, the MSFWA algorithm optimization results were significantly better than those produced by IARSM, ARSM, CS, SOS, and were similar to those from the MFO and the PSOSCALF. This comparison showed that the MSFWA could solve the I-beam design problem well.

Table 13 Comparison results for I-beam design problem

5.6 Truss design problem with three-bar

The three-bar truss design problem is a famous structural design problem in civil engineering. The goal is to find the cross-sectional areas that minimize the weight of the truss. As shown in Fig. 15, there are two design variables: the cross-sectional areas A1 and A2. The mathematical model of the problem is described by Eq. (17):

$$\begin{aligned} \begin{array}{l} {\mathrm{Consider }}\ \quad \mathbf {x} = [{x_1},{x_2}] = [{A_1},{A_2}]\\ {\mathrm{Minimize }}\, \quad f(\mathbf {x}) = (2\sqrt{2} {x_1} + {x_2})l\\ {\mathrm{Subject to }}\quad {g_1}(\mathbf {x}) = \frac{{\sqrt{2} {x_1} + {x_2}}}{{\sqrt{2} {x_1}^2 + 2{x_1}{x_2}}}P - \sigma \le 0\\ \qquad \qquad \quad {g_2}(\mathbf {x}) = \frac{{{x_2}}}{{\sqrt{2} {x_1}^2 + 2{x_1}{x_2}}}P - \sigma \le 0\\ \qquad \qquad \quad {g_3}(\mathbf {x}) = \frac{1}{{\sqrt{2} {x_2} + {x_1}}}P - \sigma \le 0\\ {\mathrm{Variable range }}\quad 0.05 \le {x_1},{x_2} \le 2\\ {\mathrm{Where }}\quad l = 100\,\text {cm}, P = 2\,\text {kN/cm}^2,\sigma = 2\,\text {kN/cm}^2 \end{array} \end{aligned}$$
(17)
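The model of Eq. (17) can be evaluated directly. The sketch below (Python) checks the design (A1, A2) ≈ (0.78867, 0.40825) that is commonly cited in the wider literature as the best known solution (an illustrative assumption, not a result from this paper):

```python
import math

def truss_weight(x1, x2, l=100.0):
    """Objective of Eq. (17): truss weight (2*sqrt(2)*x1 + x2) * l."""
    return (2 * math.sqrt(2) * x1 + x2) * l

def truss_constraints(x1, x2, P=2.0, sigma=2.0):
    """g1..g3 of Eq. (17); a design is feasible when all values are <= 0."""
    r2 = math.sqrt(2)
    return [
        (r2 * x1 + x2) / (r2 * x1 ** 2 + 2 * x1 * x2) * P - sigma,
        x2 / (r2 * x1 ** 2 + 2 * x1 * x2) * P - sigma,
        1 / (r2 * x2 + x1) * P - sigma,
    ]

x1, x2 = 0.78867, 0.40825  # best-known design (assumed from the literature)
```

The first stress constraint is essentially active at this design, so a small numerical tolerance is used when checking feasibility.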
Fig. 15
figure 15

Three-bar truss design problem

Compared with the other engineering problems, this problem is relatively simple. The MSFWA was compared with CS [37], ALO [36], MFO [25], MVO [35], MBA [34], DEDS [51], PSO-DE [52], PSOSCALF [6], and other algorithms; the results are listed in Table 14. These results show that the optimal solution found by the MSFWA was close to those found by the other algorithms, indicating that the MSFWA can effectively solve this nonlinear constrained problem.

Table 14 Comparison results of the three-bar truss design problem

6 Conclusions and future work

The purpose of this research was to propose a hybrid metaheuristic algorithm that provided superior optimization results and addressed the shortcomings of the existing MS and FWA. To achieve this objective, the explosion and mutation operators of the FWA were introduced into the MS to form a new MSFWA. The performance of the MSFWA was compared with that of the two base algorithms and other widely used optimization algorithms in solving 23 benchmark functions and six engineering application problems.

The results from the benchmark function testing confirmed that the MSFWA had a faster convergence speed and higher precision than the comparison algorithms. Application of the MSFWA on the six types of practical engineering problems showed that the hybrid algorithm had a strong exploration capability for the unknown search space and could effectively solve practical problems that had multiple constraints. Compared with the individual base MS and FWA, the hybrid MSFWA provided two main improvements:

  1. 1.

    The exploration and exploitation capabilities of the hybrid algorithm were more balanced, improving the convergence performance compared with that of the MS;

  2. 2.

    The MSFWA had an improved global exploration capability and did not easily fall into local optima.

In the future work, we will further improve the algorithm to solve practical problems. At the same time, we will continue to study the ability of the MSFWA to solve multi-objective problems.