1 Introduction

In 1995, the idea of function optimization by a swarm of conceptual particles that fly through the parameter space in search of the optima was introduced (Kennedy and Eberhart 1995; Eberhart and Kennedy 1995). Currently, the basic particle swarm optimizer (PSO) and its variants constitute one of the most well-known families of global optimizers over the real or continuous parameter space. In PSO, each candidate solution is modeled as a particle, and several such particles together form a swarm. Particles fly through the multidimensional search space following a characteristic dynamics. At any instant, each particle has a position and a velocity. The swarm is initialized with randomly chosen position vectors and velocities. Each particle then adapts its search pattern by learning from its own experience as well as the experience of the others. The aim of each particle is to move into a better region of the search space with a definite velocity, influenced by the information collected by itself and by the other members of the swarm throughout its lifetime. PSO does not require any gradient information to optimize a numerical objective function. It is simple and uses only elementary mathematical operators. Since 1995, PSO has been extensively investigated by researchers, resulting in a vast number of variants of the algorithm. The dynamics of the particles and the selection of parameters have been investigated both theoretically and empirically. PSO has been applied to a wide range of real-world problems from diverse domains. For comprehensive surveys of the research on and with PSO, the readers are directed to references such as (Eberhart and Shi 2004; Engelbrecht 2006; Clerc 2008; Valle et al. 2008; Liang et al. 2006; Banks et al. 2007; Poli 2008).

The main drawback of PSO, being a stochastic process, is false and/or premature convergence, especially on multimodal functions. Diversity is lost as the globally best member dominates the dynamics of the particles in the swarm. Premature convergence occurs when a globally best particle, sitting at one of the local optima, traps the population. This has led researchers to propose alternative variants and modifications, mostly algorithmic, that lead to better performance. The modifications include tuning of selection parameters to achieve a better exploration/exploitation trade-off (Shi and Eberhart 1999; Ratnaweera et al. 2004; Clerc and Kennedy 2002), designing various proximity topologies to replace the traditional global topology (Kennedy 1999; Suganthan 1999; Kennedy and Mendes 2002, 2006), using multiswarm techniques (Nasir et al. 2012; Van den Bergh and Engelbrecht 2004; Liang and Suganthan 2005; Lovbjerg et al. 2001), and hybridizing auxiliary search methods with PSO (Chen et al. 2007; Andrews 2006; Higashi and Iba 2003; Zhang and Xie 2003; Shelokar et al. 2007; Das et al. 2005). PSO has also been modified by using deterministic sampling, as in the species-based optimization algorithm (Cho et al. 2011), and by using simplex search techniques for unconstrained optimization problems (Fan and Zahara 2007). However, the majority of such variants preserve the population diversity at the cost of slow convergence or complicated algorithmic structures. Avoiding premature convergence without greatly reducing the convergence speed and without making the algorithmic structure too complicated still remains a challenge for PSO researchers. The findings reported in this paper are an attempt to mitigate some of these problems.

This article presents a new variant of the nature-inspired PSO algorithm that exploits the concept of governance in human society. The new technique is called the Democracy-Inspired Particle Swarm Optimizer with the concept of Peer Groups (DPG-PSO). Two leaders are selected on the basis of votes, and these leaders exert influence over the particles that vote for them. The technique can competitively solve multidimensional, non-linear, non-convex, and multimodal optimization problems by exploiting a new peer-influenced topology. DPG-PSO is tested on standard benchmark functions and is statistically found to be more robust and accurate than seven other PSO variants.

The rest of the paper is organized in the following way. Section 2 briefly explains the background of PSO along with the concepts of democracy and the peer group. Section 3 introduces the democracy-inspired optimization technique, outlining its basic steps. Section 4 presents and discusses the experimental comparisons of the proposed method with other well-known PSO variants, namely Ageing Leader Challenger Particle Swarm Optimization (ALC-PSO) (Chen et al. 2013), Fully Informed Particle Swarm (FIPS) (Mendes et al. 2004), Dynamic Multi-Swarm Particle Swarm Optimization (DMS-PSO) (Liang and Suganthan 2005), Orthogonal Learning Particle Swarm Optimization (OLPSO) (Zhan et al. 2010), Comprehensive Learning Particle Swarm Optimization (CLPSO) (Liang et al. 2006), Hierarchical Particle Swarm Optimization with Time Varying Acceleration Coefficients (HPSO-TVAC) (Ratnaweera et al. 2004), and GPSO (Shi and Eberhart 1998). Section 5 discusses the control parameters of the proposed technique to show its robustness. Section 6 compares the democratic and undemocratic approaches for the DPG-PSO algorithm to show the superiority of the proposed democracy-inspired algorithm. The absence of center-seeking bias of the algorithm is illustrated in Sect. 7 with the help of experimental results on shifted functions. Finally, conclusions are drawn in Sect. 8.

2 Background

2.1 Basic PSO and related works on PSO

PSO, as the name suggests, uses a swarm of particles, each of which models a candidate solution of the problem at hand. A particle is characterized by its position vector \(x_{i} =\{x_i^1 ,x_i^2 ,\ldots ,x_i^D \}\), representing a point in the D-dimensional real space (\(R^{D}\)), and its velocity vector \(v_{i} =\{v_i^1 ,v_i^2 ,\ldots ,v_i^D \}\). The velocity and position of a particle are updated in the following way:

$$\begin{aligned}&v_i^d \leftarrow v_i^d +c_1 *rand1_i^d *( {pbest_i^d -x_i^d })+\;c_2 *rand2_i^d\nonumber \\&\quad \qquad *( {gbest_i^d -x_i^d }), \end{aligned}$$
(1)
$$\begin{aligned} x_i^d \leftarrow x_i^d +v_i^d , \end{aligned}$$
(2)

where \(x_i^d\) and \(v_i^d\), respectively, represent the d-th component of the position and velocity of the i-th particle. The best position of the particle, i.e., the position at which the i-th particle yields its best fitness value, is termed \(pbest_i\), where \(pbest_i =\{pbest_i^1 ,pbest_i^2 ,\ldots ,pbest_i^D \}\). Using similar terminology, the socially best position known to the i-th particle is given by \(gbest_i =\{gbest_i^1 ,gbest_i^2 ,\ldots ,gbest_i^D \}\). Note that for the gbest PSO topology, \(gbest_i\) denotes the best position found so far in the entire swarm and the subscript i is not necessary. On the other hand, for the lbest PSO model, \(gbest_i\) denotes the best position found by some particle in a neighborhood of the i-th particle. There are different types of neighborhood topologies, namely the ring, pyramid, von Neumann, and others. The ring neighborhood topology has not only been extensively used in PSO (Kennedy and Mendes 2002), but has also been implemented in the differential evolution (DE) family of algorithms, see for example (Omran et al. 2006; Das et al. 2009). Without loss of generality, our method can be integrated with both the gbest and lbest PSO models with any proximity topology.

The constant \(c_1\) scales the attraction of a particle towards the best position (hitherto-optimum position) it has found so far, while the constant \(c_2\) does the same for the best position found so far by the entire swarm or by a neighborhood of the current particle. These are often called acceleration coefficients. Venter and Sobieski (2003) termed ‘\(c_1\)’ the self-confidence and ‘\(c_2\)’ the swarm confidence. \(rand1_i^d\) and \(rand2_i^d\) are uniformly distributed random numbers bounded within the range [0,1] and are instantiated freshly for each ordered pair (d, i). The velocity of a particle is limited to a maximum denoted by \(v_\mathrm{max} =[v_\mathrm{max}^1 ,v_\mathrm{max}^2 ,\ldots ,v_\mathrm{max}^D ]\). If the velocity exceeds this limit, appropriate steps are taken to reduce it; reassigning the offending velocity component to \(sign( {v_i^d })\,v_\mathrm{max}^d\) remains a popular means of limitation.
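As a concrete illustration, Eqs. (1) and (2) together with the velocity clamp can be sketched in a few lines of Python. The vectorized function below, including its name and array layout, is our own illustrative sketch under the gbest topology, not code from the cited papers:

```python
import numpy as np

def pso_step(x, v, pbest, gbest, c1=2.0, c2=2.0, vmax=1.0, rng=None):
    """One basic-PSO update per Eqs. (1)-(2), with velocity clamping.

    x, v, pbest : (N, D) arrays of positions, velocities, personal bests.
    gbest       : (D,) best position found by the whole swarm.
    vmax        : per-component velocity limit; offending components are
                  reassigned to sign(v_i^d) * vmax.
    """
    rng = rng or np.random.default_rng()
    r1 = rng.random(x.shape)     # rand1_i^d, fresh for every (d, i) pair
    r2 = rng.random(x.shape)     # rand2_i^d
    v = v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    v = np.clip(v, -vmax, vmax)  # equivalent to the sign(v) * vmax clamp
    return x + v, v
```

After each step, one would evaluate the objective at the new positions and refresh `pbest` and `gbest` accordingly; that bookkeeping is omitted here.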

Parameter control and adaptation have attracted PSO researchers for a long time. The inertia weight \(\omega \) was first introduced by Shi and Eberhart (1998) to dampen the velocity carried over from the previous iteration and thereby influence convergence. They showed that while a larger inertia weight is advantageous for wide-scale exploration, a smaller one may increase the ability of local refinement. Since then, almost all PSO variants have used the inertia weight as an integral part of the velocity update rule. To balance the local and global search abilities, Shi and Eberhart put forth a scheme to decrease \(\omega \) linearly with the number of iterations in the following way:

$$\begin{aligned} \omega =\omega _\mathrm{max} -( {\omega _\mathrm{max} -\omega _\mathrm{min} })\,.\,\frac{g}{G}, \end{aligned}$$
(3)

where g is the iteration index and G is a predefined maximum number of iterations. \(\omega _\mathrm{max}\) and \(\omega _\mathrm{min}\) are usually set to 0.9 and 0.4, respectively. Thus, the time-varying inertia weight \(\omega \) provides the necessary balance between the local and global search abilities of a particle. A fuzzy rule-based non-linear adaptation of \(\omega \) was proposed by Shi and Eberhart (2001). In the context of dynamic system optimization, a significant modification of the inertia weight as \(\omega =0.5+rand( {0,1})*0.5\) was experimented with (Eberhart and Shi 2001). Logarithmic and exponential decreases of the inertia weight have been investigated in (Eberhart and Shi 2001; Liao et al. 2012). Quande et al. presented a novel PSO with a piecewise-varied inertia weight: the chosen piecewise function is decreasing in both of its parts, the first part being non-linear, which aids the search for the global optimum and restricts premature convergence, and the second part being linear, as found in conventional PSO (Shayeghi et al. 2010). Another way of realizing convergent behavior of the particles is through the constriction factor \(\chi \) introduced by Clerc and Kennedy (2002) in the following way:

$$\begin{aligned}&v_i^d \leftarrow \chi ( v_i^d +c_1 *rand1_i^d *( {pbest_i^d -x_i^d })+c_2 *rand2_i^d\nonumber \\&\quad \qquad *( {gbest_i^d -x_i^d })), \end{aligned}$$
(4)

where \(\chi =\frac{2}{\left| {2-\phi -\sqrt{\phi ^2-4\phi } } \right| }\) and \(\phi =c_1 +c_2 \).
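Both the linearly decreasing inertia weight of Eq. (3) and the constriction factor above are straightforward to compute; a minimal sketch (the function names are ours):

```python
from math import sqrt

def inertia_weight(g, G, w_max=0.9, w_min=0.4):
    """Linearly decreasing inertia weight of Eq. (3):
    w = w_max - (w_max - w_min) * g / G."""
    return w_max - (w_max - w_min) * g / G

def constriction(c1=2.05, c2=2.05):
    """Clerc-Kennedy constriction factor:
    chi = 2 / |2 - phi - sqrt(phi^2 - 4*phi)|, with phi = c1 + c2;
    phi must exceed 4 for the square root to be real."""
    phi = c1 + c2
    return 2.0 / abs(2.0 - phi - sqrt(phi * phi - 4.0 * phi))
```

For the canonical choice \(c_1 = c_2 = 2.05\) (so \(\phi = 4.1\)), this yields \(\chi \approx 0.7298\), the value commonly quoted in the literature.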

Besides the inertia weight and constriction factor, the acceleration coefficients can greatly influence the performance of PSO, as indicated by the "social-only" and "cognitive-only" models (Quande et al. 2010). A fixed value of 2, as suggested in Kennedy and Eberhart (1995) and Eberhart and Kennedy (1995), is mostly preferred by researchers for both coefficients. The Time Varying Acceleration Coefficients (TVAC) employed in a PSO variant called HPSO (Hierarchical PSO)-TVAC (Ratnaweera et al. 2004) and the use of ad hoc values of the acceleration coefficients are some of the modifications that highlight the influence of the acceleration coefficients on the performance of PSO. In lbest PSO models, owing to the neighborhood structures, the particles show better resistance to trapping in local optima. The disadvantage of these models, however, is the deficient leadership in the swarm: many particles exert influence on any one particle, resulting in decentralized behavior, and such PSO variants usually exhibit slow convergence. In the context of lbest PSOs, a multitude of neighborhood topologies have been investigated; see for example (Kennedy and Mendes 2002). Usually a large neighborhood proves effective for simpler optimization problems, while a restricted neighborhood is better for complicated problems with multiple modes. Topologies that change over time have also been researched and shown to be robust on various functional landscapes (Suganthan 1999; Kennedy 1997). The Fully Informed Particle Swarm (FIPS) algorithm was proposed by Mendes et al. (2004); FIPS adjusts the particle velocity by taking into account the positions found in other neighborhoods in addition to the particle's own neighborhood optimum.
The same idea was used in conjunction with Euclidean neighborhoods to induce significant niching behavior in PSO (for the detection of multiple optima simultaneously on a multimodal fitness landscape) by Qu et al. (2012).

Even the particle at the optimum position may not have the best value along every dimension. The swarm should benefit from utilizing the best position found along each dimension separately, and this was the inspiration for Comprehensive Learning PSO (CLPSO), which gives better results because of the increase in swarm diversity. Randomized variable neighborhoods are used in the standard PSO model (SPSO'07) (Bratton and Kennedy 2007). Janson and Middendorf proposed a hierarchical version of PSO (HPSO) (Janson and Middendorf 2005) that arranges the particles in a hierarchy defining the neighborhood structure: every particle has itself and its parent in the hierarchy as neighbors.

Several efforts have been made to devise improved hybrid algorithms by synergizing PSO with other local or global optimization techniques. Incorporating the operators of a Genetic Algorithm (GA), such as selection (Angeline 1998), crossover (Chen et al. 2007), and mutation (Andrews 2006; Higashi and Iba 2003), inside the conventional PSO framework has led to efficient optimization in many instances. Several attempts have been made to hybridize PSO with another very powerful real-parameter optimizer called Differential Evolution (DE); a detailed account can be found in Xin et al. (2012). Apart from synergy with other algorithms, deflection, stretching, and repulsion techniques (Parsopoulos and Vrahatis 2004) have been combined with PSO to improve its performance. Mutation strategies for parameters like the inertia weight (Miranda and Fonseca 2002), diversity-maintaining schemes, as well as distance-dependent schemes to prevent particles from congesting in a particular area of the search space (Lovbjerg and Krink 2002; Blackwell and Bentley 2002) have also been proposed for PSO.

Some researchers have attempted to integrate the concepts of aging and lifespan into the PSO framework to obtain improved search on multimodal landscapes. A trapezoidal aging function (Dehuri et al. 2006) was used to mark the lifespan of particles in the context of data clustering. A PSO with aging leaders and challenger particles has recently been proposed (Chen et al. 2012). Multipopulation approaches (more familiarly called multiswarms) (Tillett et al. 2005) have received considerable attention from PSO researchers over the past few years. In PSO variants with multiple sub-populations, the sub-populations themselves can be treated as a special type of neighborhood structure; consequently, this category of PSO variants can also suffer from slower convergence. Recently, a very competitive variant of PSO was proposed (Zhan et al. 2011) that uses an orthogonal learning strategy to discover, via orthogonal experimental design (OED), the useful information that lies in the learning experiences of a particle.

Different types of topologies have been used to improve the performance of PSO. Some PSO algorithms use a fully connected neighborhood topology: the best solution found by any particle acts as the focal point of aggregation for all particles, and each particle has information concerning every particle in the swarm. On the other hand, there are also topologies in which every particle has knowledge pertaining only to a restricted neighborhood. One common topology is the ring topology (Kennedy and Mendes 2002), in which each particle is connected to just two neighbors. Kennedy claimed that PSO with a small neighborhood might perform better on complex problems, because it searches slowly for the optimum, while PSO with a large neighborhood would perform better on simple problems, where it converges faster albeit with a higher chance of premature convergence. A dynamically adjusted neighborhood topology (Suganthan 1999) was introduced with a gradually increasing neighborhood. Hu and Eberhart (2002) adopted a topology in which, at each generation, the particles nearest in terms of performance are included in a particle's new neighborhood. In the fully informed particle swarm (FIPS) proposed by Mendes and Kennedy, each individual is influenced by the differently weighted successes of all its neighbors, rather than just the best one and itself. A particular problem may be better handled by one structure, while some other application might require another structure to be solved correctly.

New adaptive techniques have been introduced into classical particle swarm optimization to give the adaptive two-layer particle swarm optimization algorithm with elitist learning strategy (ATLPSO-ELS) (Lim and Isa 2014), which has better search capability than the classical algorithm. In ATLPSO-ELS, both the current swarm and the memory swarm are evolved, the latter tending to distribute around the problem's optima. Better control of the exploration/exploitation search in both the current and memory swarms is achieved by two adaptive division-of-labor modules that self-adaptively divide the swarms into explorative and exploitative groups. Additionally, OED and stochastic perturbation techniques are used to enhance the search efficiency of the swarms and to mitigate premature convergence.

The traditional PSO, consisting of only two searching layers, has also been replaced by the multilayer particle swarm optimization (MLPSO) (Wang et al. 2014) to overcome premature convergence to local minima. The MLPSO strategy increases the diversity of the searching swarms to solve complex problems efficiently.

DPSO (Kaveh and Zolghadr 2013) was the first optimization technique inspired by principles of democracy, but its scope as well as its implementation is markedly different from the technique proposed in this paper. Principally, as far as the algorithm is concerned, in DPSO democracy is implemented by engaging every particle in decision making directly, whereas in DPG-PSO the particles enjoy the right to select their own respective leader and exercise their democracy indirectly through this leader. In DPSO, there is an extra term in the velocity update equation, a direction vector constructed on the basis of the entire population, while in DPG-PSO there is no extra term in the velocity update equation. Additionally, in DPSO the basic topology remains the same as in basic PSO, but in DPG-PSO a new variant of the ring topology, referred to as the peer-based topology, is used.

2.2 Concepts of democracy and peer group algorithm

Basic PSO introduced the concept of a leader (the globally best particle), and its importance has already been proved. However, single leadership does not always ensure a better result in optimization algorithms and can lead to unwanted premature convergence. This realization calls for the introduction of a democratic form of government. In the present day, democracy is based on the people electing, from their midst, those who take the decisions and perform the actions that can potentially benefit the state as a whole, making democracy a form of governance by the people, of the people, and for the people. Democracy also allows the cross-fertilization of beneficial ideas among people; thus, true democracy is seen when people gain knowledge and advance together (Fishkin 2005).

An added advantage of democracy is that it checks the manipulation of the masses by a single leader and introduces, in a real sense, the concept of the Opposition. Choice is essential in any democratic structure. The population should always have the benefit of a competent alternative to the ruling party, which can replace the incumbent leader and lead the population into a more advanced developmental state. The presence of the Opposition forces the leader to perform; in the absence of any competition, the leader may not have enough incentive to perform as well as he/she can or should.

Without inherent stability and unity in a society, no leader can guide it to its goals. Moreover, each individual is influenced to a large extent by his/her friends and family, even though the individuals in that circle may not have exceptionally strong leadership qualities. This realization leads to the introduction of a new topology in the PSO system based on the concept of a peer group or peer circle.

In this paper, the focus is on exploiting the human-society concepts of freedom in governance and peer influence. In basic principle, this novel strategy allows each particle to choose its leader by using the concept of democracy. This differs from the previous use of the democracy concept in PSO (Kaveh and Zolghadr 2013; Kaveh 2014) in that the choice of the people is exercised indirectly through the leader, rather than directly as reflected by the direction vector in the velocity update equation of DPSO. The particles use a closed topology in which they interact with only certain nearby particles, which form their respective peer groups. As a result, the particles update themselves properly by choosing the proper leader and by taking proper inputs from their peer circles.

3 Democracy-inspired particle swarm optimization with concept of peer group (DPG-PSO)

3.1 Procedure for DPG-PSO

The original PSO uses one leader of the swarm; DPG-PSO, however, uses two. One of the leaders acts as the Governor, who leads the majority of the particles; the other acts as the Opposition, who leads a certain group of particles, keeps the Governor's power in check, and watches whether the leader is guiding the society towards the optimum position. The leaders are selected by a voting mechanism in which each particle votes for either the Governor or the Opposition and selects that candidate as its leader for the particular run. The concept of the peer group states that a particle should also consider its lbest. Thus, the gbest concept of the globally best topology PSO is modified by the introduction of a Leader, with both the Governor and the Opposition having a chance to lead the society. Hence, the misleading effects caused by gbest are expected to be reduced.

Thus the velocity and position update equations are modified to

$$\begin{aligned}&v_i^d \leftarrow \omega *v_i^d +c_1 *rand_1 *( {pbest_i^d -x_i^d })\nonumber \\&\qquad \quad *( {1-\frac{FE}{FE_{max} }})+c_2 *rand_2 *( {lbest_i^d -x_i^d })*\nonumber \\&\qquad \quad ( {1-\frac{FE}{FE_{max} }})+c_3 *rand_3 *( {Leader_i^d -x_i^d })\nonumber \\&\qquad \quad *\exp ( {1-vote_{Leader} }), \end{aligned}$$
(5)
$$\begin{aligned}&x_i^{d} \;\;\leftarrow x_i^d +c_4 \times rand_4 \;\times v_i^d. \end{aligned}$$
(6)

3.2 Modified velocity update equation

The original PSO velocity equation has been modified to include the ideas of peer group and democracy as shown in (5) and (6). The velocity update equation now contains four terms:

  1)

    Inertia factor

    The concept of inertia, i.e., the particle's tendency to follow its current velocity, is the same here as in the basic PSO equation, given by \(\omega \times v_{i}^{d}\).

  2)

    Personal best

    The distance of a particle's current position \(x_{i}^{d}\) from its personal best \(pbest_{i}^{d}\) is calculated. This is multiplied by the acceleration factor \(c_{1}\) and a uniform random number \(rand_{1} \in [0, 1]\). However, to ensure faster convergence, this term is further multiplied by a factor that reduces its impact with time. As the iterations progress, a particle is expected to approach its optimum position; consequently, the distance between \(pbest_{i}^{d}\) and \(x_i^d\) shrinks and this term contributes little to the velocity update. The impact of this term is therefore made to diminish with time using the expression:

    $$\begin{aligned}&c_1 \times rand_1 \times ( {pbest_i^d -x_i^d })\times ( {1-\frac{FE}{FE_\mathrm{max} }}). \end{aligned}$$
  3)

    Local best

    Each particle searches its peer circle and finds the best position among them. Encompassing the concept of the peer group, this term attracts the particle towards its own local best position. Thus, the difference between the particle's current position \(x_{i}^{d}\) and its local best \(lbest_{i}^{d}\) is calculated and multiplied by another acceleration factor \(c_{2}\) and a random real number \(rand_{2}\) uniformly distributed within [0, 1]. Here also, to ensure fast convergence, the impact of this term is slowly reduced using the expression:

    $$\begin{aligned} c_2 \times rand_2 \times ( {lbest_i^d -x_i^d })\times \left( {1-\frac{FE}{FE_{max} }}\right) . \end{aligned}$$
  4)

    Leader factor

    This term embodies the concept of democracy and gives significance to the votes of the people. The Leader is chosen as either the Governor or the Opposition through votes. The particles are attracted to their voted Leader and tend to move towards it. Thus, the difference between \(Leader_{i}^{d}\) and \(x_{i}^{d}\) is taken and multiplied by a new coefficient \(c_{3}\), called the attraction coefficient, and a uniform random number \(rand_3 \in [0,1]\). With the passage of time, oVote decreases and gVote keeps increasing owing to the expected better leadership quality of the Governor, so the influence of the Opposition starts weakening. Although this aids faster convergence, it can hamper convergence if the Governor, leading the majority of the mass, gets caught in a local minimum. It is at times like these that the Opposition is expected to lead the people to the global optimum. Thus, the effect on those following the Opposition should be strengthened even if oVote is very small. An empirically developed expression is therefore multiplied with this term to counter such scenarios.

    $$\begin{aligned} c_3 \times rand_3 \times ( {Leader_i^d -x_i^d })\times \exp ( {1-vote_{Leader} }). \end{aligned}$$
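Under the assumption that x, v, pbest, lbest, the dimension-wise Leader positions, and the corresponding normalized vote shares are each stored as an N-by-D array, the four-term update of Eqs. (5) and (6) can be sketched as follows. The coefficient values below are illustrative placeholders, not the paper's tuned settings:

```python
import numpy as np

def dpg_pso_step(x, v, pbest, lbest, leader, vote_leader, fe, fe_max,
                 w=0.72, c1=1.49, c2=1.49, c3=1.49, c4=0.5, rng=None):
    """One DPG-PSO update following Eqs. (5)-(6).

    leader[i, d]      : d-th component of the Governor or Opposition that
                        particle i voted for along dimension d.
    vote_leader[i, d] : normalized vote share of that Leader.
    fe, fe_max        : current and maximum function-evaluation counts;
                        the pbest and lbest terms decay by (1 - FE/FE_max).
    """
    rng = rng or np.random.default_rng()
    decay = 1.0 - fe / fe_max
    r1, r2, r3, r4 = (rng.random(x.shape) for _ in range(4))
    v = (w * v
         + c1 * r1 * (pbest - x) * decay                        # personal best
         + c2 * r2 * (lbest - x) * decay                        # peer-circle lbest
         + c3 * r3 * (leader - x) * np.exp(1.0 - vote_leader))  # Leader factor
    x = x + c4 * r4 * v       # Eq. (6): randomly scaled velocity step
    return x, v
```

Note how the `exp(1 - vote_leader)` factor boosts the Leader term for a poorly voted Opposition while leaving a well-voted Governor's influence essentially unchanged, as discussed under the Leader factor above.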

Voting is the most essential element of democracy. To implement the voting mechanism, a mathematical model of a biased roulette wheel is used: the range [0,1] is divided into two asymmetric parts, one per candidate, each proportional to the number of votes of the respective candidate, namely gVote and oVote, as calculated by the normalization procedure. The range needs to be biased initially because the population is generated with random values, so the particles at first lack the ability to decide on their own which Leader can lead them better. Hence, initially, the best particle of the population is selected as the Governor and the second best as the Opposition; gVote is given a value \(\Phi \) and oVote is set to \(1-\Phi \). This initial value \(\Phi \) biases the votes. In subsequent iterations, each particle polls a vote for each dimension, which is a random number generated within the range [0, 1]. Based on this vote, its Leader is selected in the following manner:

$$\begin{aligned} Leader_i^d = \left\{ {{\begin{array}{ll} Governor^{d},&{} \qquad vote_i^d \le gVote \\ Opposition^{d},&{} \qquad vote_i^d >gVote \\ \end{array} }} \right. \end{aligned}$$
(7)

Once the Leader is selected, the particles update the values of gVote and oVote to register that their Leader is preferred over the other. The vote count of the chosen Leader is increased by one part in N * M, where N is the swarm size and M is the number of dimensions. Thus, along with the Leader selection, the votes are updated in the following manner.

$$\begin{aligned} \begin{array}{ll} \text {If } vote_i^d <gVote, &{} gVote\leftarrow gVote+\frac{1}{N*M} \\ \text {else,} &{} oVote\leftarrow oVote+\frac{1}{N*M}. \\ \end{array} \end{aligned}$$
(8)

The votes polled in favor of each candidate are accumulated. This helps in identifying which Leader has acted as the better Leader over time. The votes obtained at the end of each iteration are therefore normalized using (9) and (10) and carried forward to the next iteration.

$$\begin{aligned} gVote\leftarrow \frac{gVote}{gVote+oVote} \end{aligned}$$
(9)
$$\begin{aligned} oVote\leftarrow \frac{oVote}{gVote+oVote} \end{aligned}$$
(10)

\(vote_{Leader}\) is the normalized number of votes received by a particular Leader, be it the Governor or the Opposition. The importance of the empirical expression \(\exp ( {1-vote_{Leader} })\) is seen in the following way:

Case 1: \(Leader_i^d \leftarrow Governor^d\). In this case, \(vote_{Leader} \leftarrow gVote\), which tends to 1 with the passage of time, so \(\exp ( {1-gVote})\approx \exp ( 0)\approx 1\) as \(gVote\rightarrow 1\). Thus, the factor hardly alters the impact of the term for a Governor who has taken the majority into confidence, or the manner in which he influences them.

Case 2: \(Leader_i^d \leftarrow Opposition^d\). In this case, \(vote_{Leader} \leftarrow oVote\), and if oVote is very small then \(\exp ( {1-oVote})\approx \exp ( 1)\approx 2.72\) as \(oVote\rightarrow 0\). Thus, the Opposition is still able to influence the particles that vote for it, even when the Governor fails and leads the majority of the swarm to a local optimum. In such situations, the Opposition leads its believers, and slowly the whole swarm, to the global optimum. The voting mechanism under the biased roulette scheme is illustrated in Fig. 1.

Fig. 1
figure 1

Voting mechanism under the biased roulette scheme. a Shows the division of the range [0,1] in the initial run. b Shows the distribution of the range when the Governor is the better Leader, so that gVote increases at the cost of oVote. c Shows the distribution of the range when the Opposition is the better Leader
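Reading N as the swarm size and M as the number of dimensions, one round of the voting scheme of Eqs. (7)-(10) can be sketched as follows. The function name is ours, and we normalize both counts by their common total, which we read as the intent of Eqs. (9) and (10):

```python
import numpy as np

def vote_and_update(g_vote, o_vote, n_particles, n_dims, rng=None):
    """One biased-roulette voting round per Eqs. (7)-(10).

    Each of the N particles casts one uniform random vote per dimension;
    votes <= gVote select the Governor (Eq. 7), each vote increments its
    Leader's count by 1/(N*M) (Eq. 8), and the counts are renormalized to
    shares summing to 1 (Eqs. 9-10). Returns the boolean choice matrix
    (True = Governor) and the updated, normalized vote shares."""
    rng = rng or np.random.default_rng()
    votes = rng.random((n_particles, n_dims))       # vote_i^d ~ U[0, 1]
    chose_governor = votes <= g_vote                # Eq. (7)
    g_vote += chose_governor.sum() / (n_particles * n_dims)     # Eq. (8)
    o_vote += (~chose_governor).sum() / (n_particles * n_dims)
    total = g_vote + o_vote                         # Eqs. (9)-(10)
    return chose_governor, g_vote / total, o_vote / total
```

In a full implementation, the returned choice matrix would determine \(Leader_i^d\) (the Governor's or Opposition's d-th component) for the velocity update of Eq. (5).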

3.3 Modified position update equation

The basic position update equation of the original PSO has also been changed to introduce a certain randomness and a weight factor on the velocity term. \(c_{4}\) is the velocity scaling factor, and \(rand_{4}\) is a uniform random number within the range [0,1].

In DPG-PSO, a particle does not change its position depending largely on the velocity term. Its current position is made more dominant than its velocity by scaling down the latter using \(c_{4}\), which is set to a value less than 1. To understand the dynamics behind this, consider a scenario in which a particle has already reached the global best while the other particles are still distributed far away from it. Owing to its tendency to move towards the swarm, the particle might get dislocated from the global optimum. With \(c_{4}\), the position term dominates over the velocity term and keeps the particle at the global optimum it has reached. The impact of the velocity term is varied stochastically across dimensions using \(rand_{4}\) to enhance the explorative nature of the particles.

3.4 Peer-based topology

The learning process of the particles in any PSO is highly dependent on the topology, i.e., on how they interact in the defined neighborhood structure. The basic PSO has a global (fully connected) neighborhood in which every particle can learn from the best particle in the entire swarm. This results in faster convergence but increases the susceptibility of the swarm to being trapped in a local optimum. The explorative nature of the swarm can be enhanced by using a topology, such as the ring topology, in which the neighborhood of a particle comprises a few individuals rather than the single best particle of the swarm.

An example of a simple ring topology defined on the index graph of the particles is illustrated in Fig. 2. We assume there are N particles organized on a ring with respect to their indices, such that particles \(P_{N}(t)\) and \(P_{2}(t)\) are the two immediate neighbors of particle \(P_{1}(t)\). Only the immediate neighbors play a role in the position and velocity update equations of any particle.
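In code, the wrap-around neighbor indexing of the ring amounts to the following (a trivial helper using the paper's 1-based particle indices; the function name is ours):

```python
def ring_neighbors(i, n):
    """Immediate neighbors of particle i on a ring of n particles,
    1-based: particle 1 has neighbors n and 2; particle n has n-1 and 1."""
    left = n if i == 1 else i - 1
    right = 1 if i == n else i + 1
    return left, right
```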

Fig. 2
figure 2

Diagram showing the ring neighborhood topology in PSO. The shaded region indicates the neighborhood of the \(i\)th particle, containing two other particles, for i = 1 and swarm size N

Drawing inspiration from the ring topology, we introduce a new topology in this paper, called the peer-based topology, where particles interact not only with their two immediate neighbors, as in the ring topology, but also with certain particles beyond them. The base neighborhood size is 2, and the particles are arranged in the topology shown in Fig. 3.

Fig. 3
figure 3

Diagram showing interaction between Particle \(i (P_{i})\) and its neighbors in peer-based topology

A partly connected topology is one in which the population does not share a single global best. An influential variant of the ring topology is one in which knowledge of possible optima diffuses better than in the basic ring topology of PSO. A trade-off between partly connected and influential topologies is necessary for a democracy-inspired, voting-based PSO, so that the populism scheme can be implemented in the swarm without any particular Leader being disproportionately influential.

In a ring topology, all the particles may be thought of as forming a ring with a constant neighborhood size of 2. The proposed topology alters the basic ring structure by letting the neighborhood size vary between 2 and 3 depending on the particle dynamics.

To understand how this new topology works, let us compare how lbest is updated under it and under the ring topology. In the ring topology, the local best of each particle is updated by computing the cost function of particle i and comparing it with the cost value of the lbest of its neighboring particles i + 1 and i − 1. If the cost value of particle i is lower than the lbest of a neighboring particle, then particle i assumes the role of lbest for that particle. In the peer-based topology, in addition to this ring-based interaction, particle i can also assume the role of lbest for particles i + 2 or i − 2, if its cost value is lower than the cost value of their lbest. This leads to a fundamental difference from the basic ring topology in one particular case. Suppose that the lbest of particle i + 1 or i − 1 is the pbest of particle i, and this pbest has a lower cost value than the present cost value of particle i. Even then, particle i is capable of assuming the role of lbest for particles i + 2 or i − 2, provided its cost value is lower than that of their lbest. Thus a particle can influence the behavior of its neighbors even though its current position is weaker than its own personal best, reflecting the nature of peer-influenced human societies.
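This update rule can be sketched as follows, under our reading that particle i's current position replaces the lbest of any peer at ring distance 1 or 2 whose lbest has a higher cost; the 0-based indexing and function name are ours:

```python
def peer_lbest_update(i, positions, costs, lbest_pos, lbest_cost):
    """Peer-based lbest update for particle i in a minimization setting:
    i's current position overwrites the lbest of neighbors at ring
    distance 1 (as in the ring topology) and distance 2 (the peer
    extension) whenever i's current cost is lower."""
    n = len(positions)
    for offset in (-2, -1, 1, 2):
        j = (i + offset) % n  # wrap-around ring index, 0-based
        if costs[i] < lbest_cost[j]:
            lbest_pos[j] = positions[i]
            lbest_cost[j] = costs[i]
```

With hypothetical values echoing the figures (particle i at cost 1.5, lbest costs of i − 2 and i + 2 equal to 2 and 2.5, and lower lbest costs at i ± 1), only the distance-2 peers adopt particle i's position.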

To illustrate the concept of the peer-based topology, let us consider a minimization problem. Figure 3 shows five connected particles from the population (two on either side of the central particle i), together with hypothetical cost values of their lbest and pbest positions. Also shown is the (hypothetical) cost value of the current position of particle i, at which the algorithm is stuck at a particular time. Simplified values are used for illustration, and the minimum is assumed to be at 0, so that a smaller cost value corresponds to a better fitness.

Thus particles i − 2 and i + 2 change their lbest values under the influence of particle i's new current value. This is because the cost value of the lbest of particle i − 2 is 2 and that of particle i + 2 is 2.5, both of which are greater than the current cost value of particle i, which is 1.5. The updated lbest values are shown in Fig. 4, where the arrows indicate the updating action.

Fig. 4
figure 4

Diagram showing a part of the particle population for illustration purposes, after updating the cost values

3.5 Generation of opposition and its update equation

The Opposition is generated by imbibing certain properties from the best particle of the swarm (the Governor) while retaining some random properties of its own. The basic aim in generating the Opposition is to introduce enough randomness into its properties that it is not a replica of the Governor. This guarantees that the Opposition differs from the Governor and can lead the particles to better results when the Governor fails to do so.

While generating the Opposition, a random number rnd\(_{d}\), uniformly distributed within [0, 1], is generated for each dimension d and compared with a parameter pro. If rnd\(_{d} <\) pro, then Opposition\(_{d}\) is set to a random real number within the range \([X_\mathrm{min}, X_\mathrm{max}]\); otherwise Opposition\(_{d}\) is taken from the best particle:

$$\begin{aligned} Opposition_d = \left\{ {\begin{array}{ll} rand\times (X_\mathrm{max} -X_\mathrm{min} )+X_\mathrm{min}, &{}\quad rnd_d < pro \\ Governor_d, &{}\quad rnd_d \ge pro \\ \end{array}} \right. \end{aligned}$$
(11)

where \(d\in \{1, 2, 3, \ldots, N\}\)

To prevent the Opposition from having so many random properties that convergence takes a long time, the parameter pro is kept small and is set empirically to 1/N.
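Eq. (11) with pro = 1/N translates directly into code (a sketch; the function name is ours, and N is taken as the number of dimensions of the Governor vector):

```python
import random

def generate_opposition(governor, x_min, x_max, pro=None):
    """Eq. (11): each dimension of the Opposition is a fresh uniform
    random value in [x_min, x_max] with probability pro, and is copied
    from the Governor (the current best particle) otherwise. pro
    defaults to 1/N so the Opposition stays close to the Governor."""
    if pro is None:
        pro = 1.0 / len(governor)
    return [random.uniform(x_min, x_max) if random.random() < pro else g
            for g in governor]
```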

The Opposition updates itself in each run in the same stochastic manner in which it is generated, by learning from the best particle of the swarm at the end of each iteration.

The flowchart of DPG-PSO is illustrated in Fig. 5 and the steps involved are given as follows.

Fig. 5
figure 5

Flowchart representation of the DPG-PSO algorithm.

Step 1 Initialization: The initial positions and velocities of all the particles are randomly generated within the n-dimensional search space. The best particle is selected as the Governor, and the Opposition is formed by a process of mutation. The votes for the Governor (gVote) and the Opposition (oVote) are set to initial values to control the voting mechanism under the biased roulette system.

Step 2 Selecting the Leader using the voting system: Each particle generates a random number within the range [0, 1], which acts as its vote. Based on this vote, its Leader is selected.

Step 3 Velocity and position updating: Each particle updates its velocity and its position using (5) and (6), respectively.

Step 4 Updating pbest, lbest, Governor, and Opposition: If the new position of particle \(i\,(i\in \left\{ {1,2,\ldots ,M} \right\} )\) is better than pbest\(_{i}\), then pbest\(_{i}\) is updated to \(x_{i}\). After the completion of each iteration, lbest\(_{i}\) is updated if the position of a particle in its peer group becomes better than lbest\(_{i}\). The Governor also updates itself to the best position found in this iteration, and the Opposition updates itself by taking certain properties of the best particle.

Step 5 Updating the votes: The individual votes received in this iteration are added to the previous votes of the Governor and the Opposition. At the end of each iteration, gVote and oVote are normalized with respect to the total votes in that iteration.

Step 6 Terminal condition check: If the number of FEs exceeds \(x_{FE}\), the algorithm terminates. Otherwise, go to Step 2 for the next iteration.
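Step 2 can be sketched as a biased roulette. This is our sketch: we assume the Governor's share of [0, 1] is gVote normalized by the total votes, consistent with the description of the voting mechanism; the function name is ours:

```python
import random

def select_leader(g_vote, o_vote, governor, opposition):
    """Biased roulette for Step 2: a particle casts a uniform vote in
    [0, 1]; if the vote falls within the Governor's normalized share of
    the range, the particle follows the Governor, else the Opposition."""
    g_share = g_vote / (g_vote + o_vote)  # Governor's slice of [0, 1]
    return governor if random.random() < g_share else opposition
```

As gVote grows at the cost of oVote, more particles are led by the Governor, and vice versa.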

According to the above procedure, DPG-PSO introduces the new concept of voting, allowing each particle to choose its Leader between the Governor and the Opposition. It also abandons the concept of gbest in favor of lbest, with each particle interacting with two or three of its neighboring particles to find its lbest.

4 Experimental verifications and comparisons

4.1 Benchmark function tested

Experimental tests are performed on the eighteen standard benchmark functions listed in Table 2. These functions have been frequently used for testing new PSO variants. Table 2 also shows the global optimum value and the search range of each function. Since these are minimization problems, the optimum is the global minimum in each case. On these benchmarks, DPG-PSO is compared with seven well-known variants of PSO: GPSO, FIPS, HPSO-TVAC, DMS-PSO, ALC-PSO, CLPSO, and OLPSO. The eighteen benchmark functions are classified into three groups: the first group \(f_1 -f_6\) contains six unimodal functions; the second group \(f_7 -f_{10}\) contains four multimodal functions, which are more complex owing to their higher dimensionality and multiple local optima; and the last group \(f_{11} -f_{18}\) contains three rotated and five shifted functions. For the details of the shift vectors and the orthogonal rotation matrices corresponding to functions \(f_{11} -f_{18}\), see (Suganthan et al. 2005).

4.2 Experimental setup

The experimental results tabulated in Table 3 are obtained by running the algorithms on the benchmark functions of Table 2 with the parameter settings indicated in Table 1. All the algorithms are tested in 30 dimensions with a swarm size of 20 particles. To remove statistical dependence, each algorithm is independently run 30 times and the results are tabulated in Table 3. The simulation environment is Matlab 2013a on a workstation with an Intel Core i5 2.50 GHz processor. For the competitor algorithms, we use the same parameter settings as mentioned in the respective literature; this parameter setup is provided in Table 1.

Table 1 Parameter settings of the different variants of PSO used (meaning of the symbols are provided in the respective literatures)
Table 2 Details of the benchmark functions
Table 3 Experimental results on the benchmark functions showing mean error, minimum error, and standard deviation over 30 runs and 200,000 FEs, Wilcoxon’s rank-sum test (denoted by P values), and average ranking of the algorithms based on mean error

4.3 Experimental results

The DPG-PSO algorithm and the other well-known variants of PSO are tested on the eighteen benchmark functions \(f_{1}\)–\(f_{18}\) of Table 2. Table 3 gives the mean error, standard deviation, and least error value over 30 runs, performed to remove statistical dependence. The algorithms are ranked according to the mean error values they obtain. For minimization problems, the best error value indicates the run securing the least error among the 30 runs of a particular algorithm. In Table 3, the best results, comprising the mean error and the least error, are marked in boldface for each of the functions described in Table 2. The error value is defined as \({\left| {f(\varvec{x}_\mathrm{best} ) - f(\varvec{x}_\mathrm{opt} )} \right| }\), where \({\varvec{x}_\mathrm{best}}\) is the best solution returned by an algorithm and \({\varvec{x}_\mathrm{opt}}\) is the global minimum position of the corresponding function. The functions are tested with 30 dimensions and a swarm size of 20, and the exhaustion of 200,000 function evaluations (FEs) is taken as the termination criterion. If the results of a particular algorithm are unavailable for certain benchmark functions, they are marked N/A (not available). The average rank, computed as the mean of all the ranks that could be evaluated over the benchmark functions, is then used to assess the overall performance of each algorithm.

4.4 Comparison

  1.

    Unimodal functions:

    Table 3 shows that, for the six unimodal benchmark functions \(f_1 -f_6\) of Table 2, DPG-PSO secures the 1st rank for five unimodal functions and the 8th rank for Rosenbrock’s function \(f_{5}\). The convergence plots of Fig. 6a, c, e, g, i, and k show that DPG-PSO descends to the global optimum quite steeply. From Table 3, all the other techniques fail to optimize these functions decently, except on \(f_{6}\), where the mean and least error values obtained by all the algorithms are the same and thus cannot be ranked. \(f_{3}\) and \(f_{5}\) are the only two functions on which DPG-PSO stagnates and fails to reach the global minimum. Despite that, it still manages to secure the 1st rank for \(f_{3}\) when compared to the other optimization techniques.

  2.

    Multimodal functions:

    Table 3 shows that, for the multimodal functions \(f_7 -f_{10}\), DPG-PSO gives the least mean error as well as the least best error among all the algorithms. Furthermore, its standard deviation over 30 runs is 0. Fig. 6d, f, j, and m show that the numerous local minima fail to interfere with DPG-PSO, while the other variants stagnate in the local minima. DPG-PSO finds the global minimum for all these test functions except \(f_{8}\), where it yields a mean error value of 8.88e\(-\)16. However, it still works better than the other algorithms and secures the first rank for \(f_{8}\).

  3.

    Shifted and rotated functions:

    Table 3 indicates that, for the rotated multimodal functions \(f_{11} -f_{13}\) of Table 2, DPG-PSO finds the global minima of the rotated functions successfully despite the added complexity, as shown in Fig. 6b, h, and l. Additionally, it maintains zero standard deviation over 30 runs. DPG-PSO performs fairly well on the shifted functions \(f_{14} -f_{18}\) compared to the other PSO variants. HPSO-TVAC and CLPSO obtain the first rank for functions \(f_{14}\) and \(f_{15}\), respectively, whereas DMS-PSO achieves the first rank for both \(f_{16}\) and \(f_{17}\). DPG-PSO achieves ranks 4th, 3rd, 2nd, and 4th out of the 8 algorithms on these four functions, in that order. Function \(f_{18}\) is included to show that the proposed algorithm performs on par with its peers on the shifted sphere function. The convergence plots of the shifted functions are shown in Fig. 6n, o, p, q, and r, respectively. For the shifted functions, and the shifted as well as rotated functions, no single PSO variant performs consistently well because of the complexity of the functions.

In terms of Wilcoxon’s rank-sum test, the proposed algorithm outperforms all the other competing algorithms on all but three of the test functions, namely \(f_{5}\), \(f_{14}\), and \(f_{15}\). On \(f_{14}\) and \(f_{15}\) it still manages to secure P values of the order of 10\(^{-7}\) and 10\(^{-4}\), respectively. On \(f_{5}\) it falls behind only two algorithms, securing the third rank with a P value of the order of 10\(^{-5}\).

Fig. 6
figure 6figure 6

Convergence characteristics of DPG-PSO for various parameter settings. a Varying \(\omega _{1}\) on sphere function \(f_1\), b varying \(\omega _{1}\) on \(f_{12}\), c varying \(c_{1}\) on Schwefel’s problem 2.22 \(f_{2}\), d varying \(c_{1}\) on Rastrigin function \(f_{7}\), e varying \(c_{2}\) on unimodal Schwefel’s problem 2.22 \(f_{2}\), f varying \(c_{2}\) on Griewank function \(f_{9}\), g varying \(c_{3}\) on Quadric function \(f_{4}\), h varying \(c_{3}\) on rotated Ackley function \(f_{12}\), i varying \(c_{4}\) on Schwefel’s \(f_{2}\), j varying \(c_{4}\) on non-continuous Rastrigin \(f_{10}\), k varying \(c_{4}\) on Ackley \(f_{8}\), l varying \(c_{4}\) on rotated Rastrigin \(f_{11}\), m varying \(\Phi \) on Griewank function \(f_{9}\), n convergence plot for shifted rotated high-conditioned elliptic \(f_{14}\), o convergence plot for shifted Rosenbrock’s function \(f_{15}\), p convergence plot for shifted rotated Weierstrass \(f_{16}\), q convergence plot for shifted rotated Rastrigin’s \(f_{17}\), r convergence plot for shifted sphere function \(f_{18}\)

In terms of average ranking, DPG-PSO stands first with a score of 1.87, followed by ALC-PSO with 2.47, while the other algorithms perform in a broadly similar manner, obtaining average ranks between 4 and 6. This demonstrates the efficiency of the proposed algorithm.

5 Parameter tuning

DPG-PSO contains six control parameters which govern the swarm behavior of the particles. The inertia factor \(\omega \) is varied between 0.1 and 0.5 to give the particles sufficient freedom to link and interact with one another. It is tested on the unimodal sphere function and the multimodal Ackley function, with similar results in terms of convergence value and success rate. However, \(\omega \) = 0.1 does not yield a steep convergence rate for the rotated Ackley function, as the swarm gets trapped in a local minimum. The acceleration coefficient \(c_{1}\) is varied between 1 and 2.5 to test robustness, and the performance of DPG-PSO is found to be insensitive to \(c_{1}\): test results on unimodal Schwefel’s problem 2.22 and the multimodal Rastrigin function show that it reaches the global minimum in both cases. The parameter \(c_{2}\) is tested within the range 0.5 to 2 and is likewise found to have little influence, as the convergence plots of Fig. 6e and f show successful convergence on unimodal Schwefel’s P2.22 and the rotated Griewank function. Parameter \(c_{3}\) is varied within (0, 1] because it is associated with an exponential term; larger values would drive that term very high and let the Opposition overpower the Governor despite having fewer votes. DPG-PSO is found to be insensitive to \(c_{3}\) as well, as can be seen from Fig. 6g, h, showing the convergence plots of the unimodal Quadric function and the rotated Ackley function. Parameter \(c_{4}\) is tuned in (0.4, 1) and is found to have no significant influence on the convergence for unimodal Schwefel’s problem 2.22, the multimodal non-continuous Rastrigin, the multimodal Ackley, and the rotated Rastrigin functions in Fig. 6i–l, respectively. Parameter \(\Phi \) is varied from 0.6 to 0.9 to show robustness; DPG-PSO performs in much the same manner throughout, as can be seen in Fig. 6m for the convergence plot of the Griewank function.

6 Undemocratic approach

In this section, we investigate the performance of DPG-PSO when the Opposition is absent and there is no democracy. A single Leader then leads the entire population, and there is no voting scheme; only the peer-based topology is retained. This comparison demonstrates the efficiency of the democratic system. Three functions, one from each group (unimodal, multimodal, and rotated and shifted), are analyzed through their convergence plots with 6000 function evaluations as the termination criterion, all other parameter values remaining the same. In Fig. 7a, it can be seen that the convergence plot of the sphere function attains its minimum of 0 within 5200 function evaluations for the proposed algorithm, whereas with the undemocratic approach the convergence plot only reaches a minimum of around 10\(^{-150}\) and the convergence rate is very slow. The convergence plots for the multimodal Rastrigin function and the rotated Griewank function in Fig. 7b, c, respectively, show that the scenario is much the same in all cases: the convergence rate is slower for the undemocratic approach than for the proposed algorithm. The undemocratic approach also gets trapped in local minima and remains stagnant for a number of function evaluations before proceeding again, which results in unnecessary consumption of time and function evaluations. Thus the democratic approach is far superior to the undemocratic one.

Fig. 7
figure 7

Convergence plots of a unimodal Sphere function, b multimodal Rastrigin’s function, and c rotated Griewank’s function for democratic and undemocratic approaches used

7 The origin-seeking bias and DPG-PSO

Conventional PSO algorithms are often biased towards the center of the search space (Monson and Seppi 2005; Spears et al. 2010). If the search bounds are symmetrical (i.e., \({\forall d,\,-L\le x^d\le L,\,L\, \mathrm{being\, a\, constant}}\)), the center coincides with the origin (a vector of zeroes), which is indeed the situation for all 18 benchmarks reported in Table 2. If the corresponding function has its global optimum at the origin, it may become easier for a PSO to detect that optimum. However, as has been shown in (Monson and Seppi 2005), if the function has a center offset, i.e., the global optimum is shifted to a location other than the origin of the feasible search volume, PSO can become fairly inefficient.
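Such center-offset benchmarks are built by translating the optimum away from the origin. A sketch follows; the actual shift vectors o are those listed in Table 4, and the sphere function here is only an example:

```python
def make_shifted(f, o):
    """Return a shifted version of benchmark f: the global optimum,
    originally at the origin, is moved to the shift vector o via
    f_shifted(x) = f(x - o)."""
    return lambda x: f([x_d - o_d for x_d, o_d in zip(x, o)])

sphere = lambda x: sum(v * v for v in x)          # optimum at the origin
shifted_sphere = make_shifted(sphere, [1.0, -2.0, 3.0])
```

An optimizer with an origin-seeking bias does well on `sphere` but degrades on `shifted_sphere`, which is exactly the behavior probed in Table 4.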

In this section, we demonstrate that the proposed DPG-PSO algorithm is not biased towards a global optimum at the absolute origin. For this, we report comparative results between DPG-PSO and the basic PSO algorithm (GPSO in Table 1) on the shifted versions of six functions (\(f_2, f_3, f_4, f_6, f_8\), and \(f_{9}\)) from Table 2. Among the 18 benchmarks of Table 2, these six functions were tested in their original forms (with the global optimum lying at the absolute origin of the search space) in Table 3. The global optimum of each function is now shifted by a random amount, as indicated in Table 4, where o is the array containing the shift of each dimension from the origin. The proposed algorithm is run 30 times and compared with the basic variant of PSO; we report the mean error, least error, and standard deviation of the errors in Table 4. The same table also shows the P values obtained from Wilcoxon’s rank-sum test between DPG-PSO and GPSO. A close look through Table 4 reveals that DPG-PSO yields statistically significantly better results than GPSO on all instances of the shifted functions. Moreover, the performance of GPSO on these shifted variants deteriorates considerably compared to its performance on their original counterparts in Table 3. These results indicate the absence of an origin-seeking bias in DPG-PSO on the standard benchmarking instances.

Table 4 Comparison on the shifted functions over 30 runs and 200,000 FEs, showing mean error, minimum error, standard deviation, and P values from Wilcoxon’s rank-sum test

8 Conclusion

In this paper, a new PSO variant called DPG-PSO has been proposed, employing the concepts of a democratic form of governance and the social interaction of peer groups. The new strategy of using two leaders gives the particles the freedom to choose their leader and breaks from the orthodox system of following a single leader. The proposed peer-based topology enables the particles to learn more from their neighbors. These emulations of human behavior and styles of governance help DPG-PSO efficiently surmount the problem of premature convergence. From the experimental results on standard benchmark functions against competitive algorithms, we can safely infer that DPG-PSO makes PSO perform better, providing more accurate results with a faster rate of convergence. It has also been demonstrated that the proposed algorithm does not possess the origin-seeking bias of conventional PSO algorithms, as it keeps up its good performance on functions with randomly shifted optima. It is also not very sensitive to its control parameters and hence appears to be robust.

For future research, the concepts of democracy and the peer group can be extended to other evolutionary computation algorithms and studied further. The democracy-based search strategies may also be extended to multiobjective optimization scenarios.