
1 Introduction

Particle Swarm Optimization (PSO) was first proposed by Kennedy and Eberhart [1, 2] in 1995 as an evolutionary single-objective optimization algorithm. N particles are initialized at random positions with random velocities in the search space, and the \(i^{\text {th}}\) particle updates its trajectory according to

$$\begin{aligned} v_{i}^{(t+1)}&= wv_{i}^{(t)} + c_1r_1(pbest_{i}^{(t)} - x_{i}^{(t)}) + c_2r_2(gbest^{(t)} - x_{i}^{(t)}) \end{aligned}$$
(1)
$$\begin{aligned} x_{i}^{(t+1)}&= x_{i}^{(t)} + v_{i}^{(t+1)} \end{aligned}$$
(2)

\(r_1\) and \(r_2\) are random numbers drawn from the uniform distribution U(0, 1). \(pbest_{i}^{(t)}\) is the best position (in terms of minimizing the objective) that particle i has visited up to time t, and \(gbest^{(t)}\) is the best position achieved by any particle so far. After sufficient iterations, all particles assume positions \(x_i\) near gbest with velocities \(v_i\approx 0\); in this state, we say that the swarm has converged.
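These update rules translate directly into code. The following sketch (the function name and default coefficient values are ours, chosen only for illustration) applies Eqs. (1) and (2) to a single particle:

```python
import random

def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5):
    """One velocity/position update (Eqs. 1-2) for a single particle.

    x, v, pbest are lists of floats (one entry per dimension);
    gbest is the swarm-wide best position."""
    new_v, new_x = [], []
    for d in range(len(x)):
        r1, r2 = random.random(), random.random()  # r1, r2 ~ U(0, 1)
        vd = (w * v[d]
              + c1 * r1 * (pbest[d] - x[d])
              + c2 * r2 * (gbest[d] - x[d]))       # Eq. (1)
        new_v.append(vd)
        new_x.append(x[d] + vd)                    # Eq. (2)
    return new_x, new_v
```

A converged particle (sitting at pbest = gbest with zero velocity) stays put under this update, matching the convergence criterion described above.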

[3] proposes the EMPSO algorithm to speed up convergence and avoid local minima in single-objective problems. It is a vanilla PSO algorithm aided by exponentially-averaged momentum (EM). The EMPSO update equations are as follows

$$\begin{aligned} M_{i}^{(t+1)}&= \beta M_{i}^{(t)} + (1-\beta )v_{i}^{(t)} \end{aligned}$$
(3)
$$\begin{aligned} v_{i}^{(t+1)}&= M_{i}^{(t+1)} + c_1r_1(pbest_{i}^{(t)} - x_{i}^{(t)}) + c_2r_2(gbest^{(t)} - x_{i}^{(t)}) \end{aligned}$$
(4)

Eq. (3) computes the exponentially-averaged velocity of the \(i^{\text {th}}\) particle up to timestep t. The position update equation for EMPSO remains the same as Eq. (2). The momentum factor must obey \(0<\beta <1\). By recursively expanding Eq. (3), a particle's momentum is seen to be an exponentially weighted sum of all its previous velocities

$$\begin{aligned} M_{i}^{(t+1)}&= (1-\beta )v_{i}^{(t)} + \beta (1 - \beta )v_{i}^{(t-1)} + \cdots + \beta ^{t-2}(1 - \beta )v_{i}^{(2)} + \beta ^{t-1} (1-\beta )v_{i}^{(1)} \end{aligned}$$
(5)
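The equivalence between the recursion of Eq. (3) and the expansion of Eq. (5) is easy to check numerically. The sketch below assumes the initial momentum \(M_i^{(1)}=0\), as implied by the fact that Eq. (5) terminates at \(v_i^{(1)}\):

```python
def momentum_recursive(velocities, beta):
    """Eq. (3): M <- beta*M + (1-beta)*v, starting from M = 0."""
    M = 0.0
    for v in velocities:
        M = beta * M + (1 - beta) * v
    return M

def momentum_expanded(velocities, beta):
    """Eq. (5): exponentially weighted sum of all past velocities.
    velocities[k-1] is v^{(k)}; its weight is beta^{t-k} * (1-beta)."""
    t = len(velocities)
    return sum(beta ** (t - k) * (1 - beta) * velocities[k - 1]
               for k in range(1, t + 1))

vs = [0.5, -1.0, 2.0, 0.25]
assert abs(momentum_recursive(vs, 0.9) - momentum_expanded(vs, 0.9)) < 1e-12
```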

In certain single-objective problems, [3] report a \(50\%\) reduction in the number of iterations to convergence for EMPSO relative to vanilla PSO. Given its superior performance over the vanilla algorithm in the single-objective setting, we hypothesize that EM would yield similar benefits in multi-objective problems. The central setting of multi-objective optimization (MOO) is the following problem

$$\mathop {\textrm{minimize}}\limits _{\textbf{x}\in \mathbb {R}^n} \textbf{f}(\textbf{x}) = [f_1(\textbf{x}), f_2(\textbf{x}), \ldots , f_k(\textbf{x})]$$

i.e., given an input space \(\mathbb {R}^n\), we want to optimize k functions \(f_1, f_2, \ldots , f_k\) in the objective space. In practice, MOO solvers find a Pareto front, which represents a non-dominated set of decision variables \(\textbf{x}_i\in \mathbb {R}^n\). Informally, it is a set of solutions in which no member is strictly better than any other. A comprehensive introduction to MOO can be found in [4]. SMPSO [5] is a state-of-the-art MOO solver based on vanilla PSO. It uses a constricted vanilla PSO whose update equation is
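To make the notion of non-dominance concrete, a minimal Pareto filter for minimization can be written as follows (function names are ours, not from any particular solver):

```python
def dominates(fa, fb):
    """fa dominates fb (minimization): no worse in every objective
    and strictly better in at least one."""
    return (all(a <= b for a, b in zip(fa, fb))
            and any(a < b for a, b in zip(fa, fb)))

def non_dominated(front):
    """Keep only the points not dominated by any other point."""
    return [p for p in front
            if not any(dominates(q, p) for q in front if q is not p)]

pts = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
# (3, 3) is dominated by (2, 2); the rest are mutually non-dominated
assert non_dominated(pts) == [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
```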

$$\begin{aligned}&v_{i}^{(t+1)} = \chi [wv_{i}^{(t)} + c_1r_1(pbest_{i}^{(t)} - x_{i}^{(t)}) + c_2r_2(gbest^{(t)} - x_{i}^{(t)})] \end{aligned}$$
(6)

where \(\chi \) is the constriction factor [6] defined as follows

$$\begin{aligned} \chi = \left\{ \begin{array}{ll} \frac{2}{2-\phi -\sqrt{\phi ^2-4\phi }} &{} \phi > 4 \\ 1 &{} \phi \le 4 \end{array} \right. \end{aligned}$$
(7)

with \(\phi =c_1+c_2\); hence \(\chi \) is a function of \(c_1,c_2\). Since this constriction factor is with respect to vanilla PSO, we denote it as \(\chi \equiv \chi ^{(v)}(\phi )\). The position update equation for constricted vanilla PSO remains the same as Eq. (2). We describe SMPSO in Algorithm 1.
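Eq. (7) translates directly into code. Note that, as written, the active branch evaluates to a negative number; as discussed later, only the modulus of the coefficient is significant:

```python
import math

def chi_v(phi):
    """Constriction factor of Eq. (7) for vanilla PSO.

    For phi > 4 the expression is negative; only |chi| matters
    for the swarm dynamics."""
    if phi > 4:
        return 2.0 / (2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))
    return 1.0
```

For example, \(\chi ^{(v)}(5) = 2/(2-5-\sqrt{5}) \approx -0.382\), while any \(\phi \le 4\) leaves the velocity unconstricted.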

Algorithm 1. The SMPSO algorithm

Line 1 initializes the particles' positions in the input space along with random velocities. As per [4], the external archive for storing leaders is initialized in line 2. Line 5 updates the swarm obeying the constricted vanilla PSO Eqs. (6, 7). Line 6 follows the regular position update Eq. (2). Line 7 performs a turbulence mutation, which introduces diversity of solutions in the swarm so that they do not converge to a single point. Finally, the particles are evaluated and the external archive is updated in lines 8–10. In particular, we focus on line 5 of Algorithm 1 and expand it in Algorithm 2.

Algorithm 2. The \( computeSpeed ()\) subroutine of SMPSO

Lines 2–3 draw \(r_1,r_2\) from the uniform distribution U(0, 1), and lines 4–5 draw \(c_1, c_2\sim U(1.5, 2.5)\). Line 6 computes \(\phi \) and line 7 computes the constriction factor \(\chi ^{(v)}(\phi )\). Lines 8–9 update the particle's velocity according to Eq. (6), where x[i] and v[i] are the position and velocity vectors, respectively, of the \(i^{\text {th}}\) particle. Finally, line 10 performs a velocity constriction based on the boundary of the search space. SMPSO attributes its superiority over other MOO solvers, such as OMOPSO [7] and NSGA-II [8], to the randomized selection of \(c_1, c_2\) along with the constriction factor \(\chi ^{(v)}(\phi )\), which maintains a diversity of solutions in the swarm.
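A minimal sketch of this subroutine for a single particle follows. The inertia weight w and the clamping bounds delta are illustrative placeholders, not the exact values used by SMPSO:

```python
import math
import random

def chi_v(phi):
    # Eq. (7); the negative sign of the active branch is immaterial
    if phi > 4:
        return 2.0 / (2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))
    return 1.0

def compute_speed(x, v, pbest, gbest, delta, w=0.1):
    """Sketch of Algorithm 2 for one particle.

    delta[d] bounds |v[d]| on dimension d (velocity constriction
    derived from the search-space boundary)."""
    r1, r2 = random.uniform(0, 1), random.uniform(0, 1)          # lines 2-3
    c1, c2 = random.uniform(1.5, 2.5), random.uniform(1.5, 2.5)  # lines 4-5
    chi = chi_v(c1 + c2)                                         # lines 6-7
    new_v = []
    for d in range(len(x)):
        vd = chi * (w * v[d]
                    + c1 * r1 * (pbest[d] - x[d])
                    + c2 * r2 * (gbest[d] - x[d]))               # Eq. (6)
        new_v.append(max(-delta[d], min(delta[d], vd)))          # line 10
    return new_v
```

Whatever the random draws, the returned velocity always respects the per-dimension bound, which is the property line 10 enforces.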

2 Motivations

Apart from the external archive, leader selection and mutation, the performance of SMPSO is governed by the dynamics of the swarm, which are solely dictated by the \( computeSpeed ()\) subroutine (Algorithm 2). Thus, the incorporation of EM into SMPSO must occur within the \( computeSpeed ()\) function (line 5 of Algorithm 1). As a first attempt, we formulate the desired \( computeSpeed() \) in Algorithm 3. We name our EM-aided SMPSO algorithm EM-SMPSO.

Algorithm 3. The \( computeSpeed ()\) subroutine of EM-SMPSO

Akin to the draws in Algorithm 2, we draw \(\beta \sim U(0,1)\) in line 6. Line 8 computes the appropriate constriction factor for EM-SMPSO. Note that the function \( ConstrictionFactor() \) now takes two arguments \((\phi ,\beta )\) instead of one; this is because EM directly affects the swarm dynamics, and hence we need a different constriction factor \(\chi \equiv \chi ^{(m)}(\phi ,\beta )\). Lines 9–11 are the update equations of constricted EMPSO. It can be shown that the constriction coefficient is

$$\begin{aligned} \chi ^{(m)}(\phi ,\beta ) = \left\{ \begin{array}{ll} \frac{2}{2-\phi -\sqrt{\phi ^2-4(1-\beta )\phi }} &{} \phi > 4(1+\beta )^{-1} \\ 1 &{} \text {otherwise} \end{array} \right. \end{aligned}$$
(8)

From a theoretical standpoint, adopting a positive or a negative constriction coefficient is equivalent, because only the modulus \(|\lambda |\) is significant [9]. Moreover, note that \(\beta =0\) implies that the effect of momentum is absent, and it can easily be confirmed that \(\chi ^{(m)}(\phi , 0)=\chi ^{(v)}(\phi )\). Thus, our derivation is consistent with that of vanilla PSO.
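This consistency check is mechanical to verify. The sketch below implements Eqs. (7) and (8) and confirms that \(\chi ^{(m)}(\phi ,0)=\chi ^{(v)}(\phi )\) on both branches:

```python
import math

def chi_v(phi):
    """Eq. (7): vanilla constriction factor."""
    if phi > 4:
        return 2.0 / (2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))
    return 1.0

def chi_m(phi, beta):
    """Eq. (8): constriction factor under exponentially-averaged momentum.
    The active branch's square root is real, since
    phi > 4/(1+beta) >= 4*(1-beta)."""
    if phi > 4.0 / (1.0 + beta):
        return 2.0 / (2.0 - phi
                      - math.sqrt(phi * phi - 4.0 * (1.0 - beta) * phi))
    return 1.0

# beta = 0 removes the momentum effect: chi_m must reduce to chi_v
for phi in (3.0, 4.5, 5.0):
    assert abs(chi_m(phi, 0.0) - chi_v(phi)) < 1e-12
```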

Fig. 1. Pareto Fronts on ZDT1 and ZDT2

In Fig. 1, we present the Pareto fronts of EM-SMPSO (Algorithm 3) on the ZDT [10] bi-objective problems. The fronts are poor compared to those obtained by SMPSO, i.e., significantly fewer points in the external archive and a fragmented Pareto front. The SMPSO Pareto fronts, on the other hand, are smooth and dense. The Pareto fronts were obtained using the jmetalpy [11] framework. In the single-objective realm, a blanket introduction of EM into the swarm dynamics significantly improved performance over vanilla PSO across various objective functions; the Pareto fronts demonstrate that this does not carry over to the multi-objective case. It is instructive to analyse the component of SMPSO that is pivotal to its superior performance: the constriction factor. Drawing \(c_1,c_2\sim U(1.5, 2.5)\) entails that \(\phi \sim U(3, 5)\) in Algorithm 3. The midpoint of this distribution is \(\phi =4\), which is also the value at which Eq. (7) switches between its two branches. We say that the constriction factor is active if the first branch is taken. Hence, over the entire evolution of the swarm, the constriction factor is activated with probability \(\frac{1}{2}\). It is in this sense that SMPSO is a fairly constricted algorithm: the constriction factor is activated or unactivated with equal chance. EM-SMPSO with \(\phi \sim U(3,5)\) and \(\beta \sim U(0,1)\) is not a fairly constricted algorithm because of the way \(\chi ^{(m)}(\phi ,\beta )\) is defined. We prove this fact in Sect. 3.
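These activation probabilities can be verified by simulation. The following sketch models \(\phi \sim U(3,5)\) and \(\beta \sim U(0,1)\) as in the text, and estimates how often each constriction factor's first branch is taken:

```python
import random

random.seed(0)
N = 200_000

# SMPSO: constriction is active when phi > 4, with phi modelled as U(3, 5)
smpso_hits = sum(1 for _ in range(N) if random.uniform(3, 5) > 4.0)

# EM-SMPSO (Eq. 8): active when phi > 4/(1 + beta), with beta ~ U(0, 1)
em_hits = sum(1 for _ in range(N)
              if random.uniform(3, 5) > 4.0 / (1.0 + random.uniform(0, 1)))

p_smpso, p_em = smpso_hits / N, em_hits / N
assert abs(p_smpso - 0.5) < 0.01   # fairly constricted: active half the time
assert p_em > 0.85                 # active far more than half the time
```

The EM-SMPSO activation probability comes out near 0.92, far from the balanced behaviour of SMPSO, which anticipates the unfairness value derived in Sect. 3.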

3 Finding a Fairly Constricted Algorithm

We first develop simple mathematical formulae to assess the fairness of any variant of the EM-SMPSO algorithm where \(\phi \sim U(\phi _1, \phi _2)\) and \(\beta \sim U(\beta _1, \beta _2)\). The probability densities of \(\phi \) and \(\beta \) are \(p_{\phi }(\phi )\) and \(p_{\beta }(\beta )\), respectively. Let \( E \) be the event that \(\phi >4(1+\beta )^{-1}\), corresponding to the definition in Eq. (8). We wish to find a formula for P(E)

$$\begin{aligned} P(E) = \int \int _{\phi>4(1+\beta )^{-1}} p_{\phi }(\phi )p_{\beta }(\beta )\, \textrm{d}{\phi }\,\textrm{d}{\beta } = \int \int _{\beta >4\phi ^{-1}-1} p_{\beta }(\beta )p_{\phi }(\phi )\, \textrm{d}{\beta }\,\textrm{d}{\phi } \end{aligned}$$

Using simple calculus, the above double integral simplifies to

$$\begin{aligned} P(E) = \int _{\phi _l}^{\phi _g} \int _{4\phi ^{-1}-1}^{\beta _2}p_\beta (\beta )p_\phi (\phi )\textrm{d}{\beta }\,\textrm{d}{\phi } + \int _{\phi _g}^{\phi _2}p_\phi (\phi )\textrm{d}{\phi } \end{aligned}$$
(9)

where \(\phi _l = \max (\phi _1, 4(1+\beta _2)^{-1})\) and \(\phi _g = \min (4(1+\beta _1)^{-1}, \phi _2)\). Additionally, we define the unfairness metric \(\mu =P(E)-\frac{1}{2}\) to ease mathematical analysis; note that it satisfies \(-0.5\le \mu \le 0.5\). It is a measure of how far away an algorithm is from being fairly constricted: \(\mu =0\) corresponds to a fairly constricted algorithm, whereas \(\mu >0\) is over-constricted and \(\mu <0\) is under-constricted. It can be shown that Algorithm 3 corresponds to an unfairness value

$$\begin{aligned} \mu&= 1-2\ln (4/3) \approx 0.42 \end{aligned}$$

It is thus over-constricted compared to SMPSO by a large margin, and we have been able to explain the suboptimal nature of the Pareto fronts using a fairness analysis of the constriction factor. We wish to utilize the full range of the momentum parameter and hence set \(\beta _1=0, \beta _2=1\). In computing the probability integral, we posit \(\phi _l=\phi _1\) and \(\phi _g=\phi _2\), which amounts to exercising the choices \(\phi _1\ge 2\) and \(\phi _2\le 4\) respectively. Hence

$$\begin{aligned} P(E) = \int _{\phi _1}^{\phi _2}\int _{4/\phi -1}^{1}\textrm{d}{\beta }\frac{\textrm{d}{\phi }}{\phi _2-\phi _1} = 2 - 4\frac{\ln (\phi _2/\phi _1)}{\phi _2-\phi _1} \end{aligned}$$
(10)

By fixing \(\phi _1 = 2\) and solving the resulting transcendental equation, it can be shown that using \(c_1,c_2\sim U(1, 1.7336)\) and \(\beta \sim U(0,1)\) results in a fairly constricted algorithm. We call it Fairly Constricted Particle Swarm Optimization (FCPSO). Note that there may exist other parameter sets that are also fairly constricting; in this work, we have derived only one such set and subsequently used it for benchmarking.
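The endpoint 1.7336 can be reproduced numerically. The sketch below (function names are ours) solves \(P(E)=\frac{1}{2}\) from Eq. (10) by bisection with \(\phi _1=2\), exploiting the fact that \(P(E)\) is increasing in \(\phi _2\) on \((\phi _1, 4]\):

```python
import math

def p_active(phi1, phi2):
    """Eq. (10): P(E) for phi ~ U(phi1, phi2), beta ~ U(0, 1),
    valid when 2 <= phi1 < phi2 <= 4."""
    return 2.0 - 4.0 * math.log(phi2 / phi1) / (phi2 - phi1)

def solve_phi2(phi1=2.0, hi=4.0, tol=1e-10):
    """Bisect on phi2 until P(E) = 1/2, i.e. a fairly constricted scheme."""
    lo = phi1 + 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if p_active(phi1, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

phi2 = solve_phi2()
# phi = c1 + c2 with c1, c2 ~ U(1, phi2 / 2); here phi2 / 2 is about 1.7336
```

The root comes out at \(\phi _2 \approx 3.4672\), i.e., \(c_1,c_2\sim U(1, 1.7336)\), matching the parameters quoted above.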

4 Results

We first present the Pareto fronts of the ZDT1 and ZDT3 problems. At a first qualitative look, the Pareto fronts of FCPSO match those of SMPSO. The solution points are densely packed and well-connected, unlike the fragmented Pareto fronts of the naive EM-SMPSO algorithm (Fig. 1).

Fig. 2. Pareto Fronts of FCPSO

4.1 Assessment with Quality Indicators

We choose the following five metrics to assess the performance of FCPSO: Inverted Generational Distance (IGD), Spacing (SP), Hypervolume (HV), the \(\epsilon \)-indicator (EPS), and the number of function evaluations (FE). Computation of these indicators is done with the jmetalpy framework; a thorough description of them can be found in [12]. All quality indicators were measured after letting the swarm evolve for 25,000 function evaluations. For the FE measurement itself, we allow the swarm to evolve until \(95\%\) of the hypervolume (HV hereafter) of the theoretically computed Pareto front is reached. The theoretical fronts were obtained from [13, 14] and [15]. All quality indicator values of FCPSO are accompanied by the corresponding values from SMPSO for the sake of comparison. Each measurement was repeated 20 times for statistical testing; the resultant p-values are written as subscripts in the tables. Due to space constraints, however, we do not show values of all quality indicators for all problems (Fig. 2).

Table 1. ZDT & DTLZ : FE and HV

Bi-objective ZDT and Tri-objective DTLZ: The ZDT [10] suite of multi-objective problems is a list of bi-objective problems for the evaluation of MOO algorithms. Along with the tri-objective DTLZ [16] problems, these constitute the basic problems that any MOO algorithm must be able to solve accurately. We show that FCPSO matches the performance of the state-of-the-art SMPSO on these problems. Please refer to Table 1 for the FE and HV measurements. FCPSO matches SMPSO in most problems, occasionally outperforming (and underperforming) it. A peculiarity to be noted for the DTLZ-2, 4 and 5 problems is that the number of FEs is atypical of the other problems, and hence we have not included their statistical p-values. This is probably due to the particular nature of these problems, or to the swarm initialisation implemented in jmetalpy.

5-Objective and 10-Objective DTLZ: We test the algorithm on a harder variant of the DTLZ problems with 5 objectives. The quality indicator values are shown in Tables 2 and 3. For 10-objective DTLZ, we do not report HV values, as jmetalpy took too long to evaluate them; thus, we only have IGD, EPS and SP values for these problems. The values are available in Tables 4 and 5.

Table 2. 5-DTLZ : HV and IGD

With respect to the spacing (SP) quality indicator, FCPSO outperforms SMPSO in all problems except DTLZ2, DTLZ4 and DTLZ5, in both the 5-objective and 10-objective realms. There is, however, one notable exception in the 10-objective realm where SMPSO dominates with respect to SP; nevertheless, its gap over FCPSO is not significantly high.

Table 3. 5-DTLZ : EPS and SP
Table 4. 10-DTLZ : IGD and EPS
Table 5. Spacing : 10-DTLZ and 10-WFG

5-Objective and 10-Objective WFG: The WFG test suite was proposed in [17] to overcome the limitations of the ZDT/DTLZ test suites. First, ZDT is limited to two objectives. Second, the DTLZ problems are not deceptive (a notion developed in [17]) and none of them features a large flat landscape. Moreover, the authors of [17] state that the nature of the Pareto front for DTLZ-5, 6 is unclear beyond 3 objectives. Lastly, the difficulty of each of the previously mentioned problems is fixed. Hence, the WFG problems are expressed as a generalised scheme of transformations that map an input vector to a point in the objective space. The WFG test suite is harder, and constitutes a rigorous attempt at creating an infallible, robust benchmark for MOO solvers.

Table 6. 5-WFG : HV and IGD
Table 7. 5-WFG : EPS and SP

Tables 6 and 7 contain the results for the 5-objective WFG problems. The results for the 10-objective WFG problems are in Tables 5 and 8. FCPSO matches SMPSO within a small margin in most problems, if not outperforming it.

Table 8. 10-WFG : IGD and EPS

5 Discussion, Conclusion and Future Works

At the time of its appearance, SMPSO was the state-of-the-art MOO solver in comparison with other algorithms such as OMOPSO and NSGA-II. Its success is tied to its use of velocity constriction, which we have theoretically analysed and extended to the case of exponentially-averaged momentum. Moreover, there is a dearth of literature on the stochastic analysis of evolutionary algorithms. In the realm of single-objective PSO, [18] analysed the stability of PSO while accounting for the stochastic nature of \(r_1, r_2\) in the PSO update Eq. (1). We have performed an analysis in a similar vein. The idea proposed in this work is simple, but it could be applied to the stochastic analysis of other evolutionary algorithms.

In this paper, we have discussed the motivations for introducing exponentially-averaged momentum into the SMPSO framework. Having defined specific notions of constriction fairness, we have successfully incorporated exponentially-averaged momentum into SMPSO and demonstrated its performance on MOO problems. It would be beneficial to develop a large number of parameter schemes that are also fairly constricting and compare their performance. Finding a parameterization \((\phi _1, \phi _2, \beta _1, \beta _2)\) that ranges smoothly over the entire range of unfairness would help in comprehensively profiling quality indicators. Moreover, the unfairness value of an EM-SMPSO algorithm is not absolute in itself, i.e., multiple parameter schemes could result in the same value of unfairness. A thorough assessment could enable the creation of selection mechanisms, density estimators, and alternate notions of elitism tailored to the usage of EM in swarm-based MOO algorithms.