1 Introduction

Computationally expensive optimization problems are common in science and engineering because evaluating the objective function may require complex simulations based on solving equations (Liu 2011; Liu et al. 2014; Sun et al. 2017; Chugh et al. 2018; Min et al. 2019), such as partial differential equations (Liu and Zeng 2010; Liu 2015b). Users with computationally expensive optimization problems are often interested in whether an algorithm performs well under a given number of function evaluations and a modest accuracy requirement (Moré and Wild 2009).

Particle swarm optimization (PSO) is one of the most popular population-based optimization algorithms (Banks et al. 2007). By modeling the intelligent food-seeking behavior of bird flocks and fish schools, PSO was proposed in 1995 (Eberhart and Kennedy 1995) and soon became an important algorithm for solving the following global optimization problem (Banks et al. 2008).

$$\begin{aligned} \min _{x} f(x), ~~\hbox {s.t.}, ~~ x\in \varOmega \subseteq {\mathbb {R}}^n. \end{aligned}$$
(1)

In model (1), the objective function f(x) is to be minimized by seeking an optimum \(x^*\) in the search space \(\varOmega \). In other words, there is no n-dimensional vector x whose function value f(x) is smaller than \(f(x^*)\):

$$\begin{aligned} f(x^*) \le f(x), ~~\forall x\in \varOmega . \end{aligned}$$
(2)

It is well known that global optimization problems like (1) are often hard to solve, partly because of the lack of mathematical optimality conditions (Floudas and Gounaris 2009) and their potential NP-hardness (Sultanova 2010). There are two different classes of approaches to such problems. One class tries to solve problem (1) while maintaining the convergence of the algorithm; the branch-and-bound algorithm and the DIRECT algorithm (Jones et al. 1993; Liu et al. 2015) belong to this class. Evolutionary computation belongs to the other class, which adopts a large number of intelligence-based strategies and concepts, e.g., nature-inspired, population-based, and heuristic ones. Among this class of algorithms, PSO is one of the most popular (Bonyadi and Michalewicz 2017).

However, PSO suffers from the “asymptotic inefficiency” phenomenon, which is in fact common in global optimization (Liu and Zeng 2015). This phenomenon means that global optimization algorithms can often quickly find good regions containing potential optima, but more and more computational cost is needed to refine the solutions.

Similar to “asymptotic inefficiency,” there is the “smooth-mode” phenomenon in numerical algebra (Liu and Cheng 2014). A popular method to eliminate the “smooth mode” is to adopt a bilevel strategy (Liu and Cheng 2014). Therefore, our motivation is to adopt the bilevel strategy to construct a framework that alleviates the “asymptotic inefficiency” of population-based optimization algorithms. This new framework consists of a two-level search: the fine level and the coarse level. At the fine level, the population searches densely in the search space; on the contrary, at the coarse level, the population conducts a rough search. In other words, the coarse level corresponds to local search and the fine level to global search.

Combining a local optimizer with a global algorithm to accelerate convergence and improve local-search capability is a traditional method. However, the selection of the local solver is generally experience based. In this paper, we focus on designing a bilevel-search framework that alleviates the “asymptotic inefficiency” phenomenon of population-based optimization algorithms and balances exploration and exploitation. We apply the proposed framework to SPSO2011. The framework allows SPSO2011 to alternate between fine-level and coarse-level search and thereby avoids the problem of selecting a suitable local solver. By performing a bilevel search, global and local search, or exploration and exploitation, can be well balanced.

The rest of this paper is organized as follows. In Sect. 2, the canonical particle swarm optimization is reviewed briefly. In Sect. 3, we introduce the tried-and-true bilevel strategy and the proposed bilevel-search framework and then apply it to the SPSO2011 algorithm. Moreover, we present a theoretical analysis of our proposal, which shows it is order-2 stable. In Sect. 4, a sensitivity analysis of the proposed algorithm is conducted, and the algorithm is then compared with two popular PSOs to verify its numerical performance. Finally, we summarize the conclusions and future research directions in Sect. 5.

2 Particle swarm optimization

In this section, we provide a brief review on particle swarm optimization.

The particle swarm was first proposed to produce computational intelligence by exploiting simple analogues of social interaction, rather than purely individual cognitive abilities (Poli et al. 2007). It modeled swarm intelligence such as bird flocking and fish schooling. It was soon developed into a powerful optimization method, particle swarm optimization (PSO), for solving problem (1).

In PSO, each particle i is placed in the search space of the problem to be solved, and its function value is calculated at its current location \(\overrightarrow{x_{ij}}(k)\) (Poli et al. 2007). Particle i changes its location \(\overrightarrow{x_{ij}}(k+1)\) by updating its velocity \(\overrightarrow{v_{ij}}(k+1)\) in the \((k+1)\)th iteration for each dimension j, which combines the best location \(\overrightarrow{g_{ij}}(k)\) found in particle i's neighborhood and its own historical best location \(\overrightarrow{p_{ij}}(k)\).

$$\begin{aligned}&\overrightarrow{v_{ij}}(k+1) = \omega \overrightarrow{v_{ij}}(k)+C_{1,ij}(\overrightarrow{p_{ij}} (k)-\overrightarrow{x_{ij}}(k))\nonumber \\&\quad +C_{2,ij}(\overrightarrow{g_{ij}}(k)-\overrightarrow{x_{ij}}(k)), \end{aligned}$$
(3a)
$$\begin{aligned}&\overrightarrow{x_{ij}}(k+1) = \overrightarrow{x_{ij}}(k)+\overrightarrow{v_{ij}}(k+1). \end{aligned}$$
(3b)
$$\begin{aligned}&i= 1,\ldots ,n, ~j = 1,\ldots ,d, ~k=1,2,\ldots \end{aligned}$$
(3c)

In (3a), \(C_1\sim U(0,\phi _1)\) and \(C_2\sim U(0,\phi _2)\) are uniformly distributed random numbers generated anew for each particle i, each dimension j, and each iteration k; n is the size of the swarm, and d is the dimension of the problem. There are three parameters \(\omega , \phi _1, \phi _2\) in PSO, and the PSO is called canonical in this paper when it satisfies \(\phi _1 = \phi _2\).
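For concreteness, the update rule (3) can be sketched in a few lines of Python. This is a hedged illustration using NumPy; the function name `pso_step`, the array layout, and the vectorized swarm-wide update are our own assumptions, not part of any reference implementation:

```python
import numpy as np

def pso_step(x, v, p, g, omega, phi1, phi2, rng):
    """One canonical PSO velocity/position update, Eqs. (3a)-(3b), for a whole swarm.

    x, v, p : (n, d) arrays of current positions, velocities, and personal bests.
    g       : (n, d) array of neighborhood-best positions (one row per particle).
    """
    n, d = x.shape
    c1 = rng.uniform(0.0, phi1, size=(n, d))  # C_1 ~ U(0, phi_1), redrawn per particle/dimension
    c2 = rng.uniform(0.0, phi2, size=(n, d))  # C_2 ~ U(0, phi_2)
    v_new = omega * v + c1 * (p - x) + c2 * (g - x)  # Eq. (3a)
    x_new = x + v_new                                # Eq. (3b)
    return x_new, v_new
```

Note that, as in (3a), the random coefficients are redrawn independently for every particle, dimension, and iteration.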

Since the proposal of the particle swarm optimization algorithm, many researchers have been attracted by its simplicity and effective performance and have successively proposed improved algorithms. Some are interested in PSO's applications to practical problems and its improvements (Xia et al. 2020a, b; Wei et al. 2020; Wu et al. 2021; Wang et al. 2016, 2020; Liu et al. 2019a, b; Zhang et al. 2020; Martino and Sessa 2020; Ben Ali et al. 2020; Yu et al. 2017; Hussain and Fujimoto 2018; Li et al. 2016). In addition to the above studies, scholars have also proposed the multi-swarm PSO (MsPSO). For instance, one MsPSO divides the whole swarm into several subswarms at the beginning of the evolutionary process and changes the number and size of the subswarms to balance exploration and exploitation (Xia et al. 2018); in another, the original swarm is divided into several subswarms that update their location information based on the shuffled frog-leaping algorithm (Bao and Han 2017); and different topologies were adopted in the multi-swarm PSO proposed in Parrott and Li (2004) to increase the diversity of particles. Theoretical analysis of PSO has also made significant progress, for instance, recent work on PSO's order-2 stability analysis (Liu 2015a), which provided a unified analysis framework for existing approaches and presented a useful stable region of PSO's parameters, and PSO's order-3 stability analysis (Dong and Zhang 2019). Moreover, we have provided a theoretical analysis of how to select a proper regular topology (Liu et al. 2016). A helpful insight obtained in Liu et al. (2016) is that the optimal topology of PSO is not only problem dependent but also computational-budget dependent. See Bonyadi and Michalewicz (2017) for a recent review of PSO and more references therein.

However, while PSO often finds good regions containing potential optima at a low computational cost, it needs much more computational cost to refine these regions. For instance, the standard particle swarm optimization (SPSO2011) algorithm proposed by Clerc (2011) and Clerc and Kennedy (2002) is adopted to solve the two-dimensional F28 function, the ten-dimensional F19 function, and the ten-dimensional F27 function in the CEC2014 test set (Liang et al. 2013). The numbers of function evaluations needed for SPSO2011 to find a function value f that satisfies the following accuracy condition are listed in Table 1:

$$\begin{aligned} \frac{f-f_{\mathrm{min}}}{f_{\mathrm{min}}}< \hbox {error}, \end{aligned}$$
(4)

where \(f_{\mathrm{min}}\) is the minimum of the function.
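As a minimal sketch (the helper name and the positivity assumption on \(f_{\mathrm{min}}\) are ours, not the paper's), condition (4) reads:

```python
def meets_accuracy(f, f_min, error):
    """Relative-error condition (4): (f - f_min) / f_min < error.

    Assumes f_min > 0, as for the shifted optima of the CEC2014 functions.
    """
    return (f - f_min) / f_min < error
```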

Table 1 Number of function evaluations needed for SPSO2011 to find a solution which satisfies condition (4)

Table 1 displays the number of function evaluations needed for SPSO2011 to find a solution satisfying condition (4). When solving the ten-dimensional F19 function in the CEC2014 test set (Liang et al. 2013), the number of function evaluations varies from 14,550 to 22,947 as the error varies from 0.9 to 0.7; however, it varies from 34,532 to 64,029 as the error varies from 0.5 to 0.3. The computational cost thus grows not linearly but roughly exponentially with the required accuracy. The solving processes of the ten-dimensional F27 and two-dimensional F28 functions in the CEC2014 test set are similar. This phenomenon, in which much more computational cost is required the closer the solution gets to the global optimum, is termed “asymptotic inefficiency.” Moreover, the “asymptotic inefficiency” of SPSO2011 impairs its performance on computationally expensive optimization problems.

PSO is not the only algorithm that suffers from “asymptotic inefficiency”; the phenomenon also exists in other popular global optimization algorithms. Table 2 [part of Table 2.2 in Liu and Yan (2021)] lists the number of function evaluations needed for five popular global optimization algorithms to solve the two-dimensional Ackley function. The five algorithms in Table 2 are pattern search (PS, patternsearch from MATLAB), the Nelder–Mead simplex algorithm (NMS) (Higham 1993), Multilevel Coordinate Search (MCS) (Huyer and Neumaier 1999), differential evolution (DE) (Storn and Price 1995), and brain storm optimization (BSO) (Shi 2011). The higher the accuracy, the more computational cost is needed, and the cost grows not linearly but approximately exponentially. In other words, all five algorithms suffer from “asymptotic inefficiency.”

Table 2 Number of function evaluations needed for five algorithms to find a solution which satisfies condition (4) when solving two-dimensional Ackley function

3 Modified particle swarm optimization based on bilevel search

In this section, we propose a modified particle swarm optimization algorithm based on bilevel search. The bilevel-search framework is an important framework for eliminating or alleviating “asymptotic inefficiency,” a common phenomenon for many numerical algorithms, including global optimization algorithms.

3.1 Bilevel-search framework in numerical algebra and numerical optimization

In this subsection, we briefly review the bilevel-search framework, which is adopted in numerical algebra and numerical optimization, and especially focus on its usefulness in eliminating or alleviating an algorithm’s “asymptotic inefficiency.”

“Asymptotic inefficiency” is a phenomenon in which more and more computational cost is needed to find a solution of higher accuracy. It is common in numerical algorithms. For instance, the “smooth-mode” phenomenon often occurs when solving large-scale linear equations discretized from partial differential equations (Xu 1997; Briggs et al. 2000; Liu et al. 2015). Global optimization algorithms often suffer from a similar “asymptotic inefficiency,” too. For example, Tables 1 and 2 show that more and more function evaluations are needed for some popular global optimization algorithms to find a solution of higher accuracy, while the budget of function evaluations for expensive problems is usually fixed rather than unlimited. In this case, the “asymptotic inefficiency” phenomenon further hinders SPSO2011 from finding a higher-accuracy solution.

Fig. 1

An illustration of the bilevel framework in the multigrid method, where a large-scale linear equation needs to be solved at the fine level, while a much smaller-scale linear equation needs to be solved at the coarse level

To eliminate or alleviate the “asymptotic inefficiency,” several strategies have been proposed in numerical algebra. Among them, the bilevel or multilevel framework is popular and may even be the most effective at eliminating the “smooth-mode” phenomenon. By recursively applying the bilevel framework presented in Fig. 1, the multigrid method can eliminate the “smooth-mode” phenomenon perfectly (Briggs et al. 2000). In Fig. 1, fine grids represent the fine level and coarse grids represent the coarse level. Since each grid point represents an unknown quantity, large-scale linear equations need to be solved at the fine level, while much smaller-scale linear equations need to be solved at the coarse level. All these linear equations are discretized from a partial differential equation (PDE) with different step sizes.

The key idea of bilevel methods originates from the observation that the residual of the fine-level (large-scale) linear equation can be reduced at the coarse level with a much lower computational cost (Briggs et al. 2000). A multilevel method can be obtained naturally through a recursion of the bilevel method (Briggs et al. 2000). This kind of bilevel or multilevel method relies on geometric information and is therefore often called the geometric multigrid method. Later, it was extended to the algebraic multigrid method (Stüben 2001) and applied to numerical optimization (Liu and Zeng 2015; Liu and Cheng 2014).

Inspired by the success of the multigrid method, a special bilevel strategy was designed to alleviate the DIRECT global optimization algorithm's asymptotic inefficiency (Liu and Cheng 2014), and it was shown that a three-level recursion of the bilevel DIRECT algorithm possessed the best performance (Liu et al. 2015, 2017b).

In a word, the bilevel framework has been adopted in several numerical algorithms, and it is effective in eliminating or alleviating the “asymptotic inefficiency”.

3.2 Modified SPSO based on bilevel search

Inspired by the success of the bilevel DIRECT algorithm, we propose a bilevel-search framework to alleviate SPSO2011’s “asymptotic inefficiency” in this subsection.

The proposed bilevel version of SPSO2011 includes two main parts, which correspond to two different levels of search. One part is the same as the traditional SPSO2011 algorithm, in which the whole swarm of particles searches the entire feasible space; for convenience, we call it the “swarm search” in this paper. The other part is called the “subswarm search,” in which a subswarm also searches the whole feasible space. More particles work in the “swarm search,” which means the swarm searches at the fine level, while fewer particles work in the “subswarm search,” which means the subswarm searches at the coarse level. Therefore, “bilevel search” is an abbreviation for the swarm search and the subswarm search in this paper.

In fact, exploration and exploitation correspond to the fine-level and coarse-level search, respectively. Generally, a local optimizer is needed to balance global and local search, i.e., exploration and exploitation. In the proposed framework, the swarm search and the subswarm search represent the fine-level and coarse-level search. Different from traditional methods, the proposed framework can balance the fine-level and coarse-level search well and, because of its structure, does not need an additional local optimizer. In other words, the proposed framework can balance exploration and exploitation well through the bilevel search without an additional local optimizer.

An important issue in the bilevel search is how to select the subswarm's particles and exchange information between the swarm and the subswarm. Specifically, we need to determine the size of the subswarm and which particles should be selected. We will provide three different strategies, namely elite, random, and semi-elite, to compose a subswarm in the next subsection.

Another issue is the balance between global search and local search, which is important for all global optimization algorithms. The proposed bilevel-search framework provides a convenient approach to balancing them: by controlling the ratio of evaluation generations at the two levels, SPSO2011 based on the bilevel-search framework can balance global and local search conveniently.

In the proposed bilevel SPSO2011 algorithm, the computational budget FE is regarded as an algorithm parameter, which needs to be determined first. Exhausting the budget is a common stopping condition in global optimization because of the absence of mathematical conditions for the global optimum. After the computational budget FE is determined, it is divided into two parts for the two-level search. Generally, the division can be described as follows:

$$\begin{aligned} FE_\mathrm{s} = \alpha FE, ~~~FE_\mathrm{ss} = (1-\alpha ) FE, \end{aligned}$$
(5)

where \(\alpha \in (0,1)\) is a ratio parameter and \(FE_\mathrm{s}, FE_\mathrm{ss}\) are the computational costs paid in the swarm search and the subswarm search, respectively.

It is shown in Liu et al. (2016) that the optimal topology of PSO is not only problem dependent but also budget dependent. Therefore, in our algorithm, the size of the swarm is not provided by the user, as is traditional, but determined endogenously. We adopt the empirical formula proposed in Liu et al. (2016) to determine the optimal size of the swarm:

$$\begin{aligned} N_\mathrm{s}=c_m\sqrt{FE_\mathrm{s}}, ~~~N_\mathrm{ss}=c_m\sqrt{FE_\mathrm{ss}}, \end{aligned}$$
(6)

where \(c_m\in (0.4, 0.5)\) is a parameter related to the test problems, and \(N_\mathrm{s}\), \(N_\mathrm{ss}\) are the optimal population sizes of the swarm and subswarm calculated according to (6), respectively.

The computational cost at each level equals its swarm size times its number of evaluation generations. To ensure that the division of computational cost between the two levels satisfies Eq. (5), the evaluation generations of the two levels must satisfy the following relationship:

$$\begin{aligned} T_\mathrm{s} = T_\mathrm{ss}\sqrt{\frac{\alpha }{1-\alpha }}, \end{aligned}$$
(7)

where \(T_\mathrm{s}, T_\mathrm{ss}\) are the evaluation generations of the whole swarm and the subswarm, respectively. Based on experience, \(T_\mathrm{ss}\) is set small, namely 1, 2, or 3. Furthermore, the maximum number of iterations of the proposed algorithm is given by

$$\begin{aligned} Iter_\mathrm{max} = ceil\left[ \frac{FE}{T_\mathrm{s}N_\mathrm{s}+T_\mathrm{ss}N_\mathrm{ss}}\right] , \end{aligned}$$
(8)

where \(ceil(\cdot )\) rounds the input number to the nearest integer toward infinity.
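Putting Eqs. (5)–(8) together, the level settings can be derived from the budget FE as in the following sketch. Rounding the swarm sizes and \(T_\mathrm{s}\) to integers is our own assumption, and the defaults \(\alpha = 0.9\), \(c_m = 0.45\), \(T_\mathrm{ss} = 3\) are merely illustrative values drawn from the text:

```python
import math

def bilevel_budget(FE, alpha=0.9, c_m=0.45, T_ss=3):
    """Derive MSPSO's level settings from the total budget FE, Eqs. (5)-(8)."""
    FE_s, FE_ss = alpha * FE, (1 - alpha) * FE            # Eq. (5): budget split
    N_s = round(c_m * math.sqrt(FE_s))                    # Eq. (6): swarm size
    N_ss = round(c_m * math.sqrt(FE_ss))                  # Eq. (6): subswarm size
    T_s = round(T_ss * math.sqrt(alpha / (1 - alpha)))    # Eq. (7): generation ratio
    iter_max = math.ceil(FE / (T_s * N_s + T_ss * N_ss))  # Eq. (8): outer iterations
    return N_s, N_ss, T_s, iter_max
```

For example, with FE = 100,000 and the defaults above, this gives \(N_\mathrm{s} = 135\), \(N_\mathrm{ss} = 45\), \(T_\mathrm{s} = 9\), and 75 outer iterations.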

The bilevel-search-based PSO is summarized in Algorithm 1, where the standard PSO (SPSO2011) proposed by Clerc (2011) is adopted as the original PSO. In other words, the bilevel-search framework is applied to SPSO2011, yielding a modified SPSO algorithm. Therefore, we call the proposed algorithm MSPSO.

[Algorithm 1: the proposed MSPSO algorithm]

In Line 8 of Algorithm 1, the subroutine “SSG” is an abbreviation of “subswarm generation,” which generates a subswarm according to one of three different strategies. After the new subswarm is obtained (Line 9), its particles replace the corresponding particles in the old swarm according to the mapping relationship. The index sequence encodes this mapping, so we need to store the indices of the particles chosen to form the subswarm. The working process of the bilevel-search framework is shown in Fig. 2.

Fig. 2

Working process of the proposed bilevel-search framework

3.3 Design of subswarm

In this subsection, we propose three strategies to generate the subswarm.

A natural way is to select some of the best particles, i.e., those whose historical fitness values are the best, to compose the subswarm. In other words, the elites are selected from the group, so we call this the elite strategy.

First, we sort the particles by their historical best values Swarm.f to obtain the sorted index sequence. Then, we select the first \(N_\mathrm{ss}\) sorted particles and save their indices. We summarize the elite strategy in Algorithm 2.

[Algorithm 2: the elite strategy]

Although the elite strategy makes use of the best particles' experience and therefore enhances the local-search ability, it may waste computational cost searching inside undesirable regions. On the contrary, by selecting individuals randomly from the particle swarm, the random strategy seeks to retain a good exploration capability during the subswarm-search phase. See Algorithm 3 for details.

[Algorithm 3: the random strategy]

The semi-elite strategy combines the elite strategy and the random strategy. The best particle of the whole swarm is selected into the subswarm to enhance the exploitation ability, and the remaining \(N_\mathrm{ss}-1\) individuals are randomly selected from the swarm to retain the exploration ability. See Algorithm 4 for details.

[Algorithm 4: the semi-elite strategy]
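Taken together, the three generation strategies can be sketched as a single index-selection routine. This is a hypothetical helper: representing Swarm.f as a NumPy array and the function name `ssg` are our own assumptions, not the paper's code:

```python
import numpy as np

def ssg(hist_best_f, N_ss, strategy, rng):
    """Subswarm generation ("SSG"): return the indices of the particles
    selected for the subswarm under one of the three strategies.

    hist_best_f : 1-D array of the particles' historical best values (Swarm.f).
    """
    order = np.argsort(hist_best_f)  # best (smallest) values first
    if strategy == "elite":          # Algorithm 2: the N_ss best particles
        return order[:N_ss]
    if strategy == "random":         # Algorithm 3: a uniform random subset
        return rng.choice(len(hist_best_f), size=N_ss, replace=False)
    if strategy == "semi-elite":     # Algorithm 4: the single best particle plus a random rest
        rest = np.setdiff1d(np.arange(len(hist_best_f)), [order[0]])
        return np.concatenate(([order[0]], rng.choice(rest, size=N_ss - 1, replace=False)))
    raise ValueError(f"unknown strategy: {strategy}")
```

Returning indices, rather than copies of the particles, matches the need noted above to store the mapping between subswarm and swarm.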

In Sect. 4.2, we will analyze the effects of these three subswarm strategies and provide some suggestions on how to choose among them.

3.4 Stability analysis of the MSPSO algorithm

In this subsection, we provide a stability analysis of the MSPSO algorithm. First, “stability” is defined below, and the difference between “convergence” and “stability” is pointed out as well.

Definition 1

Let \(\{x(k)\}\) be the sequence of the best positions found by an optimization algorithm during k function evaluations. If

$$\begin{aligned} \lim _{k\rightarrow \infty } x(k)=z, \end{aligned}$$
(9)

where z is a point in the search space, then we say the algorithm is stable. Moreover, if z is a local optimum of the objective function, then the algorithm is convergent. Furthermore, if z is the global optimum of the objective function, then the algorithm is globally convergent.

We now focus on the stability of MSPSO. A stochastic optimization algorithm is order-1 stable (Poli 2009) if

$$\begin{aligned} \lim _{k\rightarrow \infty } E[x(k)]=z \end{aligned}$$
(10)

is satisfied, where \(E(\cdot )\) denotes the mathematical expectation, and it is order-2 stable (Liu 2015a) if

$$\begin{aligned} \lim _{k\rightarrow \infty } E[x(k)]=z, ~~\lim _{k\rightarrow \infty } D[x(k)]=0 \end{aligned}$$
(11)

is satisfied, where \(E(\cdot )\) denotes the mathematical expectation and \(D(\cdot )\) denotes the variance.

In Liu (2015a), it was proved that the canonical PSO reviewed in Sect. 2 is order-2 stable when the parameters satisfy

$$\begin{aligned} \omega \in (-1,1), ~~\phi _1=\phi _2\in \left( 0,\frac{12(1-\omega ^2)}{7-5\omega }\right) . \end{aligned}$$
(12)

Therefore, the region determined by (12) is called the order-2 stable region of the canonical PSO (Liu 2015a). This theoretical result is independent of the topology adopted by PSO and only needs the following weak stagnation assumption:

$$\begin{aligned} x_d(K+t) = x_d(K), \forall t=1, 2, \ldots \end{aligned}$$
(13)

The weak stagnation assumption implies that the best position found by PSO is not improved anymore after the Kth iteration, where d is the particle that finds the best position. Recently, the weak stagnation assumption has been weakened further and can even be removed (Bonyadi and Michalewicz 2016; Cleghorn and Engelbrecht 2018).

Since only the dominant particle d needs to be considered in the stability analysis, and the result is independent of topology, the bilevel-search framework does not change PSO's stability. Moreover, SPSO2011's parameters \(\omega =1/(2\ln 2), \phi _1=\phi _2=0.5+\ln 2\) satisfy condition (12). Thus, we have the following result.

Proposition 1

The proposed MSPSO algorithm is order-2 stable.
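As a quick numerical sanity check (not part of the proof), it is easy to verify that SPSO2011's default parameters lie in region (12); the helper below is our own sketch, with `log` denoting the natural logarithm:

```python
import math

def in_order2_region(omega, phi):
    """Membership test for the order-2 stable region (12), with phi1 = phi2 = phi."""
    return -1 < omega < 1 and 0 < phi < 12 * (1 - omega**2) / (7 - 5 * omega)

# SPSO2011's default parameters
omega = 1 / (2 * math.log(2))  # approx 0.721
phi = 0.5 + math.log(2)        # approx 1.193
```

For this \(\omega \), the upper bound in (12) evaluates to about 1.70, so \(\phi \approx 1.19\) lies safely inside the region.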

Proposition 1 leads to two conclusions. First, the MSPSO algorithm's stability is guaranteed by the swarm-search phases. Second, the subswarm-search phase helps accelerate MSPSO's convergence to the stagnation point \(x_d(K)\). In other words, in the bilevel-search framework, MSPSO balances stability and convergence using the same SPSO2011 algorithm, which can be regarded as the most important property of MSPSO. Therefore, the bilevel-search framework provides a novel and convenient approach to balancing the exploration and exploitation of global optimization algorithms.

4 Numerical experiments

In this section, we report some numerical results. The following algorithms are selected to compare with our proposal:

  • SPSO2011: the standard particle swarm optimization algorithm proposed by Clerc (2011), whose theoretical properties are supported by Clerc and Kennedy (2002);

  • particleswarm: the particle swarm optimization algorithm in MATLAB.

The Hedar test set (Hedar 2005), the CEC2014 test set (Liang et al. 2013), and the CEC2014 computationally expensive test set (Liu et al. 2013) are adopted for our numerical experiments. The Hedar test set contains 27 functions; some have variable dimensions and some have several variants, giving 68 problems in total. The CEC2014 test set contains 30 functions; we test the 2D and 10D versions of these functions, giving 52 problems in total. The CEC2014 computationally expensive test set contains 24 functions.

In our numerical experiments, the algorithms stop only when 20,000 function evaluations are consumed on the Hedar test set, 100,000 on the CEC2014 test set, and 1500 on the CEC2014 computationally expensive test set. Each problem in the Hedar test set is independently tested 50 times, and each problem in the CEC2014 test set and the CEC2014 computationally expensive test set is independently tested 30 times.

After gathering all test data, the data profile method is adopted to analyze them. The data profile method, first proposed in Moré and Wild (2009), is very popular in the mathematical programming community and is also suitable for dynamic and convenient analysis (Liu et al. 2017a): it exhibits several algorithms' performance in one picture, with each curve corresponding to an algorithm, and it adopts the following condition as the “stopping condition”:

$$\begin{aligned} f(x_0)-f(x)\ge (1-\tau )(f(x_0)-f_L), \end{aligned}$$
(14)

where \(f_L\) is the smallest function value found by all tested algorithms (each test problem has its own \(f_L\)), \(x_0\) is the starting point, which also corresponds to each test problem, and \(\tau \) is the accuracy parameter. When the optimal function value of a test problem is known, \(f_L\) can be replaced by the true optimal function value; this variant of the original data profile method was proposed in Yan and Liu (2021), and the “stopping condition” is modified as follows:

$$\begin{aligned} f(x) \le f^*+\tau (1+\left| f^*\right| ), \end{aligned}$$
(15)

where \(f^*\) is the known global optimal function value, which also corresponds to each test problem, and \(\tau \) is the accuracy requirement, which may differ from the \(\tau \) in Eq. (14). Because the global optimal function value of each problem in the CEC2014 test set, the CEC2014 computationally expensive test set, and the Hedar test set is known, the modified data profile method proposed in Yan and Liu (2021) is adopted in this paper.
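Both “stopping conditions” are one-liners; the helper names below are our own, written only to make the two criteria concrete:

```python
def solved_mw(f_x, f_x0, f_L, tau):
    """Original data-profile condition (14) (More and Wild 2009)."""
    return f_x0 - f_x >= (1 - tau) * (f_x0 - f_L)

def solved_known(f_x, f_star, tau):
    """Modified condition (15) for a known global optimum f* (Yan and Liu 2021)."""
    return f_x <= f_star + tau * (1 + abs(f_star))
```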

Three types of numerical experiments are implemented in this paper. In Sect. 4.1, a sensitivity analysis of the parameters adopted in our algorithm is presented. In Sect. 4.2, we test the three proposed subswarm generation strategies. Finally, in Sect. 4.3, we compare our algorithm with the “particleswarm” algorithm in MATLAB.

4.1 Sensitivity analysis of parameters

In this subsection, we analyze the sensitivity of the parameters \(\alpha \) and \(T_\mathrm{ss}\).

First, MSPSO is sensitive to the value of \(\alpha \): MSPSO with a large \(\alpha \) performs better than with a small one. In other words, it is much better to spend most of the computational budget on the swarm search rather than on the subswarm search. Therefore, in MSPSO, \(\alpha \) is often set to 0.9, i.e., \(90\%\) of the computational budget is spent on the swarm search. An advantage of this setting is the following simple relationship between the two levels' evaluation generations:

$$\begin{aligned} T_\mathrm{s} = 3T_\mathrm{ss}. \end{aligned}$$
(16)

Then, we consider the sensitivity of \(T_\mathrm{ss}\). For this purpose, the following algorithms are compared numerically with SPSO2011 (Clerc 2011):

  • MSPSO3v1: where \(T_\mathrm{ss}=1\) in MSPSO, and therefore \(T_\mathrm{s}=3\) according to (16);

  • MSPSO6v2: where \(T_\mathrm{ss}=2, T_\mathrm{s}=6\) in MSPSO;

  • MSPSO9v3: where \(T_\mathrm{ss}=3, T_\mathrm{s}=9\) in MSPSO.

In order to test the sensitivity of \(T_\mathrm{ss}\), the above three MSPSO algorithms all adopt the semi-elite strategy for convenience.

Fig. 3

Sensitivity analysis of \(T_\mathrm{ss}\) on the CEC2014 test set

Figure 3 shows the comparison results of these four algorithms on the CEC2014 test set, where all algorithms stop only when 100,000 function evaluations are consumed. The horizontal axis is the relative computational cost measured as

$$\begin{aligned} \kappa = \frac{\hbox {NFE}}{n+1}, \end{aligned}$$
(17)

where NFE is the number of function evaluations and n is the dimension of the problem. The vertical axis is the proportion of problems solved. Therefore, for any given relative computational cost, the higher the curve, the better the algorithm.

From Fig. 3, we can see that all MSPSOs perform better than SPSO2011 on the CEC2014 test set. Specifically, all MSPSOs solve about \(58\%\) of the problems, while SPSO2011 solves only about \(48\%\). The performance difference between the MSPSOs and SPSO2011 is thus about \(10\%\), which is significant. That is to say, no matter whether \(T_\mathrm{ss}\) equals 1, 2, or 3, MSPSO is always better than SPSO2011 on the CEC2014 test set. In this sense, we conclude that MSPSO is insensitive to the parameter \(T_\mathrm{ss}\).

Among these three MSPSOs, we recommend MSPSO9v3. From Fig. 3, we notice that MSPSO9v3 solves as many problems as MSPSO3v1 at a lower computational cost; besides, MSPSO9v3 solves more problems than MSPSO6v2. Table 3 lists the performance of the three proposed algorithms and SPSO2011 on each function of the CEC2014 test set and exhibits the gap between SPSO2011 under the proposed framework and the original SPSO2011. It is worth noting that nine generations of swarm search and three generations of subswarm search are consumed in each iteration of MSPSO9v3, which implies that relatively rich evaluations in the subswarm search are better than insufficient ones on computationally expensive test sets, e.g., the CEC2014 test set.

Table 3 Average and standard deviation of four algorithms on CEC2014 test set

Figure 4 shows the comparison result of these four algorithms on the Hedar test set with a computational budget of 20,000 function evaluations. From Fig. 4, we see that all MSPSOs still perform better than SPSO2011. In this sense, MSPSO is still insensitive to \(T_\mathrm{ss}\).

From Fig. 4, we find that MSPSO9v3 performs the best, solving about \(55\%\) of the functions on the Hedar test set. Combining Figs. 3 and 4, the overall performance of MSPSO9v3 is the best, especially under the high accuracy \(\tau \) in Fig. 4. Table 4 lists the performance of the three proposed algorithms and SPSO2011 on each function of the Hedar test set, and it exhibits the gap between SPSO2011 based on the proposed framework and the original SPSO2011. Under a high-accuracy requirement, the Hedar test set becomes relatively difficult to solve. In this case, MSPSO9v3 still performs the best, which supports the speculation above: relatively rich evaluations of subswarm search are necessary for computationally expensive optimization problems. Therefore, supported by the above two results, we recommend MSPSO9v3 for computationally expensive optimization problems.

4.2 Three types of subswarms

In this subsection, we consider the effects of different types of subswarms. For this purpose, we compare the following algorithms with the SPSO2011 algorithm:

  • MSPSOelite : MSPSO which adopts the elite subswarm strategy;

  • MSPSOsemi-elite : MSPSO which adopts the semi-elite subswarm strategy;

  • MSPSOrandom : MSPSO which adopts the random subswarm strategy.

In all three MSPSOs, the parameter \(T_\mathrm{ss}\) equals 3.
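The three subswarm strategies can be sketched as selection rules over a ranked swarm. The sketch below is an illustration under simplifying assumptions (the swarm is a list of (position, fitness) pairs with lower fitness better, and "semi-elite" is taken to mean half elite plus half randomly drawn from the remainder); the paper's exact selection rules may differ in detail.

```python
import random

# Sketch of the elite, random, and semi-elite subswarm constructions:
# elite keeps the k best particles, random samples k uniformly, and
# semi-elite mixes the best k//2 with k - k//2 drawn from the rest.

def build_subswarm(swarm, k, strategy="semi-elite", rng=random):
    """Select k particles from swarm (list of (position, fitness) pairs)."""
    ranked = sorted(swarm, key=lambda p: p[1])  # best (lowest fitness) first
    if strategy == "elite":
        return ranked[:k]
    if strategy == "random":
        return rng.sample(swarm, k)
    if strategy == "semi-elite":
        half = k // 2
        return ranked[:half] + rng.sample(ranked[half:], k - half)
    raise ValueError(f"unknown strategy: {strategy}")
```

The semi-elite rule keeps the exploitation pressure of the elite rule while retaining some of the diversity of the random rule, which is consistent with the comparison results reported below.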

Fig. 4

Sensitivity analysis of \(T_\mathrm{ss}\) on the Hedar test set

Table 4 Average and standard deviation of four algorithms on Hedar test set
Fig. 5

Three different subswarms’ effect on the Hedar test set

Fig. 6

Three different subswarms’ effect on the CEC2014 test set

Figures 5 and 6 show the comparison results on the Hedar and CEC2014 test sets, respectively. Both imply that the MSPSOs perform better than SPSO2011, whichever type of subswarm is adopted. These results again support that the bilevel-search framework improves the performance of SPSO2011 and that the proposed MSPSO algorithm is insensitive to the subswarm strategy.

From Fig. 5, we notice that MSPSOsemi-elite and MSPSOrandom perform the best. However, according to Fig. 6, the semi-elite strategy performs better than the others, and the random strategy performs better than the elite strategy, which suggests that the introduction of randomness helps avoid falling into local optima of highly multimodal or extremely nonlinear problems. Overall, the semi-elite strategy, which combines the benefits of the elite and random strategies, performs the best. Therefore, we present the following suggestions.

  • The semi-elite subswarm is suggested as the default setting for MSPSO.

  • The random subswarm is preferred when the fitness function is extremely multimodal or highly nonlinear.

  • The elite subswarm is helpful when the fitness function is relatively easy to solve, for example, unimodal or convex problems.

4.3 Comparison with particleswarm in MATLAB

To verify the proposed algorithm's performance, in this subsection we compare the following algorithms with MSPSO, where \(T_\mathrm{ss}=3\) and the semi-elite subswarm strategy is adopted:

  • SPSO2011: the standard particle swarm optimization algorithm proposed by Clerc (2011), whose theoretical properties are supported by Clerc and Kennedy (2002); in this part of the numerical experiments, only the population size is changed, being set to 100;

  • particleswarm: the particle swarm optimization algorithm in MATLAB; we only change MaxIterations and MaxStallIterations to ensure that the whole budget of function evaluations is consumed, and keep the default settings of the other parameters.

First, we return to SPSO2011’s “asymptotic inefficiency” on the F28 problem in CEC2014 (Table 1). For comparison, both MSPSO and particleswarm are adopted to solve the F28 problem. Table 5 shows the number of function evaluations needed by the three algorithms to find a solution satisfying condition (4). From Table 5, we can see that both SPSO2011 and particleswarm suffer from the “asymptotic inefficiency,” while the proposed MSPSO algorithm alleviates it significantly.

Table 5 Number of function evaluations needed to find F28’s solution which satisfies condition (4)

Figures 7 and 8 show the comparison results on the Hedar and CEC2014 test sets, respectively. We can see that the proposed MSPSO algorithm performs very well on both test sets. Specifically, from Fig. 7, although particleswarm performs the best when \(\kappa <2000\), MSPSO outperforms both particleswarm and SPSO2011 when \(\kappa >2000\). Finally, MSPSO solves about \(55\%\) of the problems on the Hedar test set, \(5\%\) more than SPSO2011 and \(10\%\) more than particleswarm. From Fig. 8, the performance difference between MSPSO and the other two algorithms reaches almost \(10\%\) on the CEC2014 test set.

From Figs. 7 and 8 and Table 6, for any given \(\kappa \), MSPSO performs better than SPSO2011. In other words, MSPSO can solve more problems than SPSO2011 under almost any given number of function evaluations, which implies that the bilevel-search framework is successful in alleviating SPSO2011’s “asymptotic inefficiency,” and therefore it is suitable for computationally expensive optimization problems. From both figures, the performance difference between MSPSO and particleswarm grows larger and larger, which implies that MSPSO is a promising optimization algorithm.

Figures 9 and 10 display the comparison results on the CEC2014 computationally expensive test set. Figure 9 exhibits the modified data profile curves of MSPSO, SPSO2011, and particleswarm with known global optima. From Fig. 9, we find that SPSO2011 solves about \(29\%\) of the problems, particleswarm solves about \(0.08\%\), and MSPSO solves about \(33\%\), which is the best on this test set. To show the difference between MSPSO and SPSO2011 on the CEC2014 computationally expensive test set, the original data profile curves of the two algorithms are displayed in Fig. 10, from which we notice that the difference between MSPSO and SPSO2011 is significant.

Fig. 7

Comparison results on the Hedar test set

Fig. 8

Comparison results on the CEC2014 test set

Table 6 Number of functions solved by three algorithms on two test sets
Fig. 9

Comparison results on the CEC2014 computationally expensive test set

Fig. 10

Original data profile curves on the CEC2014 computationally expensive test set

5 Conclusion and future work

A bilevel-search framework is proposed to alleviate population-based optimization algorithms’ “asymptotic inefficiency.” The framework is adopted to alleviate SPSO2011’s “asymptotic inefficiency” on computationally expensive optimization problems, a phenomenon similar to the “smooth mode” that arises when traditional iterative methods are adopted to solve large-scale linear equations. The proposed framework realizes a bilevel search, the swarm search and the subswarm search, by controlling the sizes of the original swarm and the subswarm. Since bilevel or multilevel algorithms can eliminate the “smooth mode,” our proposal is promising for alleviating PSO’s asymptotic inefficiency and thereby improving PSO’s performance on computationally expensive optimization problems. The theoretical analysis and numerical results support this judgment.

Specifically, the proposed MSPSO is shown to be order-2 stable in theory. According to the numerical results, the proposed MSPSO algorithm improves the performance of the SPSO2011 algorithm significantly: by about \(5\%\) on the Hedar test set, about \(10\%\) on the CEC2014 test set, and about \(4\%\) on the CEC2014 computationally expensive test set. Extensive experiments show that, compared with SPSO2011 and particleswarm in MATLAB, the proposed MSPSO algorithm is a very promising global optimization algorithm for computationally expensive optimization problems.

An important property of the proposed MSPSO algorithm is that it executes both global search and local search within the same SPSO2011 framework. This is quite different from the traditional approaches, which adopt local solvers to improve the local searching abilities of global optimization algorithms. The bilevel-search framework of MSPSO includes the global and local searches as the fine-level and coarse-level searches, respectively. Therefore, an advantage of the bilevel-search framework is that it provides a convenient balance between exploitation and exploration without an additional local optimizer, which is key to MSPSO’s success.

There is still room to improve the design of the subswarm-construction strategy adopted in the proposal. The proposed bilevel framework may be suitable not only for the SPSO2011 algorithm but also for other population-based global optimization algorithms. Moreover, it is valuable to investigate a multilevel-search framework for computationally expensive problems.