1 Introduction

Optimisation problems arise in virtually every area of science and engineering. Approaches for solving them fall into two groups: classical approaches and heuristic approaches. Classical approaches such as linear and non-linear programming are often not efficient enough, since they suffer from the curse of dimensionality and require preconditions, such as continuity and differentiability of the objective function, that are usually not met.

Heuristic approaches, which are usually bio-inspired, include genetic algorithms, evolution strategies, differential evolution and many others. Heuristics avoid most of the drawbacks of classical approaches. Among heuristics, particle swarm optimisation (PSO) has shown particularly promising behaviour.

PSO is a stochastic, population-based optimisation technique introduced by Kennedy and Eberhart (1995). It belongs to the family of swarm intelligence computational techniques and is inspired by social interaction in humans and animals (especially bird flocking and fish schooling).

Some features that make PSO efficient in solving optimisation problems are the following:

  • In comparison with other heuristics, it has fewer parameters to be tuned by the user.

  • Its underlying concepts are simple and it is easy to code.

  • It provides fast convergence.

  • It imposes a lower computational burden than most other heuristics.

  • It provides high accuracy.

  • Its computational behaviour is largely insensitive to the initial solutions.

  • Its behaviour is not strongly affected by an increase in dimensionality.

  • Many efficient strategies exist in PSO for mitigating “premature convergence”, so its success rate is high.

However, in many optimisation problems, all or some of the decision variables are discrete. Solving such problems is more challenging than solving problems with purely continuous variables. Since the basic PSO variants are applicable only to continuous problems, appropriate strategies must be adopted to make them applicable to discrete problems. This paper analyses in depth all strategies adopted in PSO for tackling discrete problems and discusses their pros and cons. The paper is organised as follows: in Sect. 2, an overview of PSO is presented. In Sect. 3, all strategies adopted in PSO for dealing with discrete problems are analysed in detail. As an example of the wide applicability of discrete PSO, its applications to power system problems are reviewed in Sect. 4. Finally, conclusions are drawn and some directions for future research are proposed in Sect. 5.

2 PSO overview

PSO starts with the random initialisation of a population (swarm) of individuals (particles) in the n-dimensional search space, where n is the dimension of the problem at hand. The particles fly over the search space with adjusted velocities. Each particle keeps two values in its memory: its own best experience, i.e. the one with the best fitness value (the best fitness corresponds to the least objective value, since fitness is inversely proportional to the objective), whose position and objective value are called \(P_i\) and \(P_{best}\) respectively; and the best experience of the whole swarm, whose position and objective value are called \(P_g\) and \(g_{best}\) respectively. Let the position and velocity of particle i be denoted by the following vectors:

$$\begin{aligned} X_i&= \left( {X_{i1}, X_{i2}, \ldots , X_{id}, \ldots , X_{in}} \right)\\ V_i&= \left( {V_{i1} ,V_{i2} ,\ldots ,V_{id} ,\ldots ,V_{in} } \right)\\ \end{aligned}$$

The velocities and positions of particles are updated in each time step according to the following equations:

$$\begin{aligned} V_{id} \left( {t+1} \right)&= V_{id} \left( t \right)+C_{1} r_{1d} \left( {P_{id} -X_{id} } \right)+C_2 r_{2d} \left( {P_{gd} -X_{id} } \right)\end{aligned}$$
(1)
$$\begin{aligned} X_{id} \left( {t+1} \right)&= X_{id } \left( t \right)+V_{id} \left( {t+1} \right) \end{aligned}$$
(2)

where \(C_1\) and \(C_2\) are two positive constants (acceleration coefficients) and \(r_{1d}\) and \(r_{2d}\) are two random numbers drawn from the uniform distribution on the interval [0,1]. According to (1), the velocity update equation consists of the following three terms:

  1. The first term models the tendency of a particle to continue in the direction it has been traversing and is called “inertia”, “habit” or “momentum”.

  2. The second term is a linear attraction toward the particle’s own best experience, scaled by the random weight \(C_{1}r_{1d}\). This term is called “memory”, “nostalgia” or “self-knowledge”.

  3. The third term is a linear attraction toward the best experience of all particles in the swarm, scaled by the random weight \(C_{2}r_{2d}\). This term is called “cooperation”, “shared information” or “social knowledge”.

The procedure for implementation of PSO is as follows:

  1. Particles’ velocities and positions are initialised randomly, the objective values of all particles are calculated, the position and objective value of each particle are set as its \(P_{i}\) and \(P_{best}\) respectively, and the position and objective value of the particle with the best fitness (least objective) are set as \(P_{g}\) and \(g_{best}\) respectively.

  2. Particles’ velocities and positions are updated according to Eqs. (1) and (2).

  3. Each particle’s \(P_{best}\) and \(P_{i}\) are updated: if the current fitness of the particle is better than its \(P_{best}\), then \(P_{best}\) and \(P_{i}\) are replaced with the current objective value and position vector respectively.

  4. \(P_{g}\) and \(g_{best}\) are updated: if the current best fitness of the whole swarm is better than \(g_{best}\), then \(g_{best}\) and \(P_{g}\) are replaced with the current best objective value and its corresponding position vector respectively.

  5. Steps 2–4 are repeated until the stopping criterion (usually a prespecified number of iterations or a quality threshold for the objective value) is met.

It should be mentioned that, since the velocity update equation is stochastic, the velocities may become too high, so that the particles become uncontrolled and leave the search space. Therefore, velocities are bounded by a maximum value \(V_{max}\), that is (Eberhart et al. 2001):

$$\begin{aligned} \text{If } \left| {V_{id}} \right|>V_{max}\ \text{then}\ V_{id} = \text{sign}\left( {V_{id}} \right)V_{max} \end{aligned}$$
(3)

where \(\text{sign}(\cdot)\) denotes the sign function.

However, the primary PSO characterised by (1) and (2) does not perform desirably, mainly because it possesses no mechanism for adjusting the trade-off between the explorative and exploitative capabilities of the search. The inertia-weight PSO was introduced to remove this drawback. In inertia-weight PSO, which is the most commonly used PSO variant, each particle's velocity from the previous time step is multiplied by a parameter called the inertia weight. The corresponding update equations are as follows (Shi and Eberhart 1998, 1999):

$$\begin{aligned} V_{id} \left( {t+1} \right)&= \omega V_{id } \left( t \right)+C_{1 } r_{1d} \left( {P_{id} -X_{id} } \right)+C_2 r_{2d} \left( {P_{gd} -X_{id} } \right) \nonumber \\ X_{id} \left( {t+1} \right)&= X_{id } \left( t \right)+V_{id } \left( {t+1} \right) \end{aligned}$$
(4)

The inertia weight adjusts the trade-off between the exploration and exploitation capabilities of PSO: the larger the inertia weight, the stronger the exploration capability, and vice versa. Commonly, it is decreased linearly during the course of the run, so that the search effort is focused mainly on exploration at the initial stages and more on exploitation at the later stages.
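As a concrete illustration, the procedure of Sect. 2, with the inertia weight of Eq. (4) and the clamping rule of Eq. (3), can be sketched in Python. The swarm size, coefficient values and clamping fraction below are illustrative assumptions, not prescriptions from the cited works:

```python
import random

def pso(objective, dim, bounds, n_particles=20, iters=200,
        w_start=0.9, w_end=0.4, c1=2.0, c2=2.0):
    """Minimal inertia-weight PSO, Eq. (4), with a linearly decreasing
    inertia weight and velocity clamping, Eq. (3)."""
    lo, hi = bounds
    v_max = 0.2 * (hi - lo)                        # illustrative clamp
    X = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    P = [x[:] for x in X]                          # personal best positions P_i
    p_best = [objective(x) for x in X]             # personal best objectives
    g = min(range(n_particles), key=lambda i: p_best[i])
    P_g, g_best = P[g][:], p_best[g]               # global best

    for t in range(iters):
        w = w_start - (w_start - w_end) * t / max(1, iters - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (P[i][d] - X[i][d])
                           + c2 * r2 * (P_g[d] - X[i][d]))
                V[i][d] = max(-v_max, min(v_max, V[i][d]))     # Eq. (3)
                X[i][d] = max(lo, min(hi, X[i][d] + V[i][d]))  # Eq. (2)
            f = objective(X[i])
            if f < p_best[i]:                      # step 3: update P_i, P_best
                p_best[i], P[i] = f, X[i][:]
                if f < g_best:                     # step 4: update P_g, g_best
                    g_best, P_g = f, X[i][:]
    return P_g, g_best
```

For example, minimising the sphere function \(\sum_d x_d^2\) in two dimensions typically drives the returned objective value close to zero within a few hundred iterations.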

3 Strategies in PSO for tackling discrete optimisation problems

In this section, the strategies adopted in PSO for tackling discrete problems are analysed in depth.

3.1 Rounding off

This is the most commonly used approach for tackling discrete/integer variables (Zhang and Xie 2003; Laskari et al. 2002; Venter and Sobieszczanski-Sobieski 2004; Salman et al. 2002; Onate Yumbla et al. 2008; Yare and Venayagamoorthy 2007; Jin et al. 2007; AlRashidi and El-Hawary 2007; Parsopoulos and Vrahatis 2002; Abdelaziz et al. 2009; Sivanagaraju et al. 2008; Ziari and Jalilian 2010; Eajal and El-Hawary 2010; Ziari et al. 2012; Yoshida et al. 2000; Fukuyama 2001). In this approach, discrete/integer variables are treated the same as continuous variables during the course of optimisation, but are then rounded off to the nearest discrete/integer value. In some studies, the round-off procedure is applied at each iteration, whereas in most cases it is applied once at the end of optimisation. The reported results show that accuracy is not much affected by rounding off. The main advantages of this approach are its simplicity and low computational cost. However, it has two crucial disadvantages: firstly, the solution may enter infeasible regions during rounding off; secondly, the fitness value at the rounded point may differ considerably from that at the original point, and it is even possible that a discrete point farther from the continuous optimum has better fitness than the rounded-off point.
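The round-off rule and its second pitfall can be sketched as follows. The objective `f` below is a hypothetical example constructed to show the failure mode (a penalised region near the nearest grid point), not one taken from the cited studies:

```python
import math

def snap(x, step=1.0):
    """Round-off strategy: map a continuous value to the nearest grid point."""
    return round(x / step) * step

def best_neighbour(f, x, step=1.0):
    """Safer variant: evaluate both adjacent grid points, keep the fitter one."""
    lo, hi = math.floor(x / step) * step, math.ceil(x / step) * step
    return lo if f(lo) <= f(hi) else hi

# Hypothetical objective: continuous optimum at 0.6, but a penalty region
# makes the *nearest* grid point (1.0) far worse than the farther one (0.0).
f = lambda x: (x - 0.6) ** 2 + (100.0 if x > 0.9 else 0.0)

snap(0.6)               # nearest grid point: 1.0, which lands in the penalty
best_neighbour(f, 0.6)  # fitter discrete point: 0.0
```

Here simple rounding picks the penalised point, while comparing both neighbours recovers the better discrete solution.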

3.2 Approaches for solving binary problems

Any optimisation problem, whether continuous or discrete, can be represented as a binary problem. Furthermore, some problems, especially engineering problems, are binary in nature and can only be represented by binary coding. Devising approaches for the effective handling of binary problems in PSO is therefore of high importance. In binary PSO, each particle contains a combination of 0's and 1's, and the \(P_i\)'s and \(P_g\)'s are binary vectors as well.

3.2.1 Conventional binary approach

In this approach, at each iteration \(t\), the velocity update equation (1) is used, but \(V_{id}\) is then mapped into the interval [0,1] via the sigmoid function (Kennedy and Eberhart 1997; Ting et al. 2006; Khalil et al. 2006; Li et al. 2006; Liao et al. 2007; Liu and Gu 2007; Wu and Tsai 2008; Chang and Lu 2002; Yin and Lu 2009; Robinson 2005):

$$\begin{aligned} s\left( {V_{id} } \right)=\frac{1}{1+exp\left( {-V_{id} } \right)} \end{aligned}$$
(5)

Then a random number \(r\) in [0,1] is generated and

$$\begin{aligned} X_{id} \left( {t+1} \right)=\left\{ {{\begin{array}{ll} 1&{r<s\left( {V_{id} } \right)} \\ 0&{ Otherwise} \\ \end{array} }} \right. \end{aligned}$$
(6)

That is, \(s(V_{id})\) represents the probability that bit \(X_{id}\) takes the value 1; indeed, in this approach the concept of velocity has changed meaning. This process is applied to all dimensions of all particles and repeated until the termination criterion is met.

One crucial issue in binary PSO is velocity clamping. According to (6), the higher the magnitude of the velocity, the lower the likelihood of the bit changing, and thereby the lower the exploration capability of the algorithm. The velocity should therefore be clamped to a small enough \(V_{max}\). Indeed, in binary PSO, in contrast to the continuous versions, a smaller \(V_{max}\) results in more efficient exploration. Typically, \(V_{max}\) is set to 4, so that each bit always retains a probability of at least \(s(-V_{max}) = 0.018\) of taking either value.
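A minimal sketch of one update step of conventional binary PSO, combining Eqs. (1), (5) and (6) with the usual \(V_{max}=4\) clamp (coefficient values are illustrative):

```python
import math
import random

def sigmoid(v):
    """Eq. (5): map a velocity to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def binary_step(X, V, P, P_g, c1=2.0, c2=2.0, v_max=4.0):
    """One conventional-binary-PSO update of every bit of every particle.

    X, V: lists of position (bit) and velocity vectors, modified in place.
    P, P_g: personal best bit vectors and the global best bit vector.
    """
    for i in range(len(X)):
        for d in range(len(X[i])):
            r1, r2 = random.random(), random.random()
            V[i][d] += c1 * r1 * (P[i][d] - X[i][d]) + c2 * r2 * (P_g[d] - X[i][d])
            V[i][d] = max(-v_max, min(v_max, V[i][d]))   # clamp to V_max = 4
            X[i][d] = 1 if random.random() < sigmoid(V[i][d]) else 0  # Eq. (6)
```

Note that with `v_max = 4.0`, `sigmoid(-4.0)` is about 0.018, so no bit ever becomes deterministically frozen.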

3.2.2 Angle modulation-based approach

This approach transforms a high-dimensional binary problem into a four-dimensional continuous problem. It utilises a trigonometric function as a bit-string generator (Pampara et al. 2005; Liu et al. 2007):

$$\begin{aligned} g\left( X \right)= \sin \left( {2\pi \left( {X-a} \right)\cdot b\cdot \cos \left( A \right)} \right)+d \end{aligned}$$
(7)

where \(A=2\pi c\left( {X-a} \right)\), \(X\) is a single element from a set of evenly spaced sample points determined by the number of bits to be generated, \(a\) is the horizontal shift, \(b\) is the maximum frequency of the sine function, \(c\) controls the frequency of the cosine term and \(d\) indicates the vertical shift of the function. Standard PSO is used to optimise the simple four-dimensional tuple \((a, b, c, d)\). The optimised parameters are substituted into (7), and the resulting function is sampled at evenly spaced intervals to produce one bit per interval; the set of all generated bits represents the binary vector solution to the original problem. Generating the bit string involves feeding each sampled point into Eq. (7) and evaluating the output: if the output is positive, the corresponding bit is set to 1; otherwise, it is set to 0.

In summary, the angle modulation-based approach significantly diminishes the complexity of the problem by reducing its dimensionality. As expected, the reported results show higher accuracy at lower computational cost.
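The bit-string generation step can be sketched as below. Sampling Eq. (7) at the unit-spaced integer points \(X = 0, 1, \ldots\) is one common convention assumed here, not the only possible one:

```python
import math

def generate_bits(a, b, c, d, n_bits):
    """Sample Eq. (7) at evenly spaced points and threshold at zero.

    (a, b, c, d) is the four-dimensional tuple tuned by standard PSO;
    each sample point X yields one bit of the binary solution.
    """
    bits = []
    for x in range(n_bits):
        A = 2 * math.pi * c * (x - a)
        g = math.sin(2 * math.pi * (x - a) * b * math.cos(A)) + d
        bits.append(1 if g > 0 else 0)  # positive output -> bit 1
    return bits
```

A swarm thus searches only the continuous \((a, b, c, d)\) space, regardless of how long the generated bit string is.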

3.2.3 Boolean approach

In this approach, modified flight equations based on the “XOR” and “OR” operators of Boolean algebra are utilised (Afshinmanesh et al. 2005):

$$\begin{aligned} V_i \left( {t+1} \right)&= C_1 \otimes \left( {P_i \oplus X_i } \right)+C_2 \otimes \left( {P_g \oplus X_i } \right)\end{aligned}$$
(8)
$$\begin{aligned} X_i \left( {t+1} \right)&= X_i \left( t \right)\oplus V_i \left( {t+1} \right) \end{aligned}$$
(9)

A noticeable feature of this approach is its velocity-bounding mechanism, which is inspired by the “negative selection” mechanism of immune systems. At each iteration and for each particle, the number of 1's in the binary velocity vector \(V_i\), called the velocity length \(L_i\), is counted. If it is below a specified threshold \(L_{max}\), \(V_i\) is considered a non-self antigen and is left unchanged. Otherwise, if \(L_i\) exceeds the threshold, \(V_i\) is considered a self antigen and negative selection is applied to it: randomly chosen 1's in \(V_i\) are flipped to 0's until \(L_i = L_{max}\).

In Marandi et al. (2006), the notion of inertia weight, which is missing in (8), is incorporated, so that the flight equation becomes:

$$\begin{aligned} V_i \left( {t+1} \right)=\omega \odot V_i \left( t \right)+C_1 \otimes \left( {P_i \oplus X_i } \right)+C_2 \otimes \left( {P_g \oplus X_i } \right) \end{aligned}$$
(10)

where \(\odot\) represents the “AND” operator.

In an extension of Boolean PSO, a mutation operator with a prespecified probability is applied to particles' velocities in order to enhance exploration capability. This mutation operator can only flip 0 bits to 1 and does not work in reverse (Deligkaris et al. 2009).
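One Boolean-PSO step, Eqs. (8) and (9) together with the negative-selection clamp, can be sketched on bit lists as follows. Treating the coefficients \(C_1, C_2\) as fresh random bit masks each step is an assumption of this sketch, and “+” is implemented as bitwise OR:

```python
import random

def boolean_step(x, p_i, p_g, l_max):
    """One Boolean-PSO update (Eqs. 8-9) on lists of bits.

    Velocity: V = C1 AND (P_i XOR X) OR C2 AND (P_g XOR X), then
    negative selection flips random 1-bits of V to 0 until at most
    l_max survive. Position: X = X XOR V.
    """
    n = len(x)
    c1 = [random.randint(0, 1) for _ in range(n)]   # random binary C1
    c2 = [random.randint(0, 1) for _ in range(n)]   # random binary C2
    v = [(c1[d] & (p_i[d] ^ x[d])) | (c2[d] & (p_g[d] ^ x[d]))
         for d in range(n)]
    ones = [d for d in range(n) if v[d]]
    random.shuffle(ones)                 # negative selection: bound L_i
    for d in ones[l_max:]:
        v[d] = 0
    x = [x[d] ^ v[d] for d in range(n)]  # Eq. (9)
    return x, v
```

The clamp guarantees that at most `l_max` bits of the position can flip per step, which bounds the step size in Hamming distance.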

3.2.4 Quantum-based binary PSO

In this binary approach, each particle possesses, besides its position vector, a quantum vector \(Q_i =\left\{ {Q_{i1}, Q_{i2}, \ldots, Q_{in}} \right\}\), where \(Q_{id}\) represents the probability that \(X_{id}\) takes the value 1. That is, for each bit \(X_{id}\), a random number in [0,1] is generated and, if it is below \(Q_{id}\), \(X_{id}\) takes the value 1. At each iteration, \(Q\) is updated as follows (Shuyuan et al. 2004):

$$\begin{aligned} Q_{self-based} (t)&= \alpha P_i (t)+\beta \left( {1-P_i (t)} \right)\end{aligned}$$
(11)
$$\begin{aligned} Q_{social-based} (t)&= \alpha P_g (t)+\beta \left( {1-P_g (t)} \right)\end{aligned}$$
(12)
$$\begin{aligned} Q\left( {t+1} \right)&= C_{1}\, Q\left( t \right)+C_{2}\, Q_{self-based} \left( t \right) + C_{3}\, Q_{social-based} \left( t \right) \end{aligned}$$
(13)

After updating \(Q\) by (11)–(13), \(X\) is updated. The main drawback of this approach is the heavy computational burden required for tuning the additional parameters \(\alpha\) and \(\beta\).
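One iteration of this scheme, Eqs. (11)–(13) followed by resampling the bits, can be sketched as below; the particular values of \(\alpha\), \(\beta\) and \(C_1, C_2, C_3\) are illustrative assumptions (with \(C_1+C_2+C_3=1\) so that \(Q\) stays in [0,1]):

```python
import random

def quantum_update(Q, X, P_i, P_g, alpha=0.3, beta=0.1,
                   c1=0.4, c2=0.3, c3=0.3):
    """One quantum-binary-PSO step: update the probability vector Q per
    dimension via Eqs. (11)-(13), then resample the bits of X from Q."""
    for d in range(len(Q)):
        q_self = alpha * P_i[d] + beta * (1 - P_i[d])     # Eq. (11)
        q_social = alpha * P_g[d] + beta * (1 - P_g[d])   # Eq. (12)
        Q[d] = c1 * Q[d] + c2 * q_self + c3 * q_social    # Eq. (13)
        X[d] = 1 if random.random() < Q[d] else 0         # sample bit from Q
    return Q, X
```

Bits agreeing with both bests pull \(Q_d\) toward \(\alpha\), disagreeing bits toward \(\beta\), so \(\alpha > \beta\) biases sampling toward the bests.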

3.2.5 Binary PSO based on estimation of distribution

Evolutionary paradigms that use information gathered during the optimisation process to construct probabilistic models of the distribution of promising regions in the search space, and utilise these models to generate new solutions, are called “estimation of distribution algorithms” (EDAs). New solutions are generated by sampling the search space according to the estimated distribution, which is re-estimated after each iteration. Among EDA models, the univariate marginal distribution (UMD) is applied in Wang (2007). To apply UMD to binary PSO, firstly all local bests are selected. Based on these, the UMD model estimates the distribution of promising regions in the search space and uses a probability vector \(P=\left( {P_1, P_2, \ldots, P_n} \right)\) to characterise the distribution of promising solutions, where \(P_d\) is the probability that the dth dimension of a promising solution takes the value 1. The probability vector guides particles to search the binary space in the following way:

$$\begin{aligned} \text{If rand}\left( . \right)<\beta, \text{ then } X_{id} \left( {t+1} \right)=\left\{ {{\begin{array}{ll} 1&{\text{If rand}\left( . \right)< P_d} \\ 0&{\text{Otherwise}} \\ \end{array} }} \right. \end{aligned}$$
(14)

Otherwise:

$$\begin{aligned} X_{id} \left( {t+1} \right)=P_{gd} \left( t \right) \end{aligned}$$

From (14), it is obvious that the larger \(\beta\) is, the more elements of \(X_i \left( t \right)\) are sampled from the vector P.

The probability vector P is initialised by the following rule:

$$\begin{aligned} P_d =\frac{\mathop \sum \nolimits _{i=1}^{N_p} P_{id}}{N_p} \end{aligned}$$
(15)

At each iteration, P is updated via:

$$\begin{aligned} P_d =\left( {1-\lambda } \right)P_d +\lambda \frac{\mathop \sum \nolimits _{i=1}^{N_p} P_{id}}{N_p} \end{aligned}$$
(16)

where \(\lambda \in \left( {0,1} \right)\) is a learning rate.
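The sampling rule (14) and the update rule (16) can be sketched as follows; `P_gd` here denotes the bit vector of the global best, as in the text:

```python
import random

def umd_sample(P, P_g, beta):
    """Eq. (14): build a particle's next position bit by bit. With
    probability beta the bit is drawn from the UMD probability vector P,
    otherwise it is copied from the global best P_g."""
    new = []
    for d in range(len(P)):
        if random.random() < beta:
            new.append(1 if random.random() < P[d] else 0)
        else:
            new.append(P_g[d])
    return new

def umd_update(P, local_bests, lam):
    """Eq. (16): blend P toward the bit-wise mean of the local bests with
    learning rate lam in (0, 1). With lam = 1 this reduces to the
    initialisation rule of Eq. (15)."""
    n_p = len(local_bests)
    return [(1 - lam) * P[d] + lam * sum(lb[d] for lb in local_bests) / n_p
            for d in range(len(P))]
```

For instance, if all four local bests have a 1 in dimension 0 and half of them in dimension 1, then with \(\lambda = 0.5\) a uniform initial vector moves to \((0.75, 0.5)\).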

3.3 Trinary PSO

Trinary PSO addresses problems in which each bit possesses three different states. The three states are represented by three unit phasors: 1∡\(-120^{\circ}\), 1∡\(0^{\circ}\) and 1∡\(120^{\circ}\). For initialisation, each dimension of each particle is set randomly to one of the three states. The flight equation in trinary PSO is as follows (Moradi and Fotuhi-Firuzabad 2008):

$$\begin{aligned} V_i \left( {t+1} \right)=V_i \left( t \right)+C_{1}\, r_{1}\, P_i +C_{2}\, r_{2}\, P_g \end{aligned}$$
(17)

After the velocity update, the phase of \(V_i\) is computed and mapped into the interval [0,1] as the probability of the states; thus, the angles of the states 1∡\(-120^{\circ}\), 1∡\(0^{\circ}\) and 1∡\(120^{\circ}\) are mapped into \(1/6\), \(1/2\) and \(5/6\) respectively. Then, at each iteration, the differences between the phase of \(V_i\) and these three numbers (\(1/6, 1/2, 5/6\)) are calculated, and the resulting values (\(d_{1}, d_{2}, d_{3}\)) are transformed via the following function:

$$\begin{aligned} T\left( {d_k} \right)=\frac{a\cdot \exp \left( {\tan \left( {\pi \left( {0.5-d_k} \right)} \right)} \right)}{1+a\cdot \exp \left( {\tan \left( {\pi \left( {0.5-d_k} \right)} \right)} \right)} \end{aligned}$$
(18)

where k = 1,2,3 and \(a\) is a constant.

After transformation by (18), a random number r in [0,1] is generated and new states of particles’ dimensions are determined according to the following rules:

  1. If r is more than all of \(T\left( {d_1} \right)\), \(T\left( {d_2} \right)\) and \(T\left( {d_3} \right)\), or less than all of them, one state is chosen randomly for the next iteration.

  2. If r is more than all of \(T\left( {d_1} \right)\), \(T\left( {d_2} \right)\) and \(T\left( {d_3} \right)\), 1∡\(-120^{\circ}\) is selected as the state for the next iteration.

  3. If r is less than all of \(T\left( {d_1} \right)\), \(T\left( {d_2} \right)\) and \(T\left( {d_3} \right)\), one of the states associated with \(d_1\) and \(d_2\) is selected as the state for the next iteration.

After the particles are updated, their objective values are calculated and the \(P_i\)'s and \(P_g\) are updated. The main issue in the proposed trinary PSO is that there exists an overlap between the first and third states.

3.4 Penalty approach

In this approach, discrete variables are treated as continuous ones, and points lying between the allowed discrete values are penalised. The penalty function is as follows (Kitayama et al. 2006; Li et al. 2005; Kitayama and Yasuda 2006):

$$\begin{aligned} P\left( X \right) = \mathop \sum \limits _{i=1}^{n_d} \frac{1}{2}\left[ {\sin \frac{2\pi \left\{ {X_i^c -0.25\left( {d_{i,j+1} +3 d_{i,j}} \right)} \right\} }{d_{i,j+1} -d_{i,j}}+1} \right] \end{aligned}$$
(19)

where \(d_{i,j}\) and \(d_{i,j+1}\) are the jth and \(\left( {j+1} \right)\)th discrete values of the ith discrete variable, \(X_i^c\) represents the continuous value between them and \(n_d\) is the number of discrete variables. According to (19), the penalty value is small around the discrete values and large in the regions away from them. The augmented objective function is:

$$\begin{aligned} \varphi \left( X \right)= f\left( X \right)+S\, P\left( X \right) \end{aligned}$$
(20)

where \(S\) is the penalty factor. To initialise \(S\), \(S\left( {X_i} \right)\) is computed for all particles via:

$$\begin{aligned} S\left( {X_i} \right)=1+P\left( {X_i} \right) \end{aligned}$$
(21)

Then, the initial value of \(S\), called \(S_{init}\), is determined as:

$$\begin{aligned} S_{init} =\mathop {\min }\limits _i S\left( {X_i} \right) \end{aligned}$$
(22)

During the course of optimisation, \(S\) is updated at each iteration \(t\) by:

$$\begin{aligned} S\left( {t+1} \right)=S\left( t \right)\cdot \exp \left( {1+P\left( {X_i} \right)} \right) \end{aligned}$$
(23)

The main drawback of the penalty approach is its heavy computational burden, which severely limits its application to complex optimisation problems.
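The per-variable penalty term of Eq. (19) can be sketched as below; the helper first locates the interval \([d_{i,j}, d_{i,j+1}]\) containing the continuous value. One can verify that the penalty vanishes exactly at the discrete values and peaks midway between them:

```python
import math

def interval_penalty(x_c, grid):
    """Eq. (19) for a single variable: (near) zero at the discrete values
    in the sorted list `grid` and maximal (= 1) midway between them."""
    # locate the interval [d_j, d_{j+1}] that contains x_c
    j = max(k for k in range(len(grid) - 1) if grid[k] <= x_c)
    d_j, d_j1 = grid[j], grid[j + 1]
    return 0.5 * (math.sin(2 * math.pi * (x_c - 0.25 * (d_j1 + 3 * d_j))
                           / (d_j1 - d_j)) + 1)
```

At \(x^c = d_j\) the sine argument is \(-\pi/2\), giving a penalty of 0; at the interval midpoint it is \(+\pi/2\), giving the maximum penalty of 1. The augmented objective of Eq. (20) then simply adds \(S\) times the summed penalties to \(f(X)\).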

3.5 Approaches based on modification of flight equations

In Liu et al. (2009), Al-Kazemi and Mohan (2005), Pan et al. (2008), Tasgetiren et al. (2007), Wu and Tsai (2011) and Tao et al. (2010), PSO's conventional flight equations are modified to make them appropriate for discrete problems. In Liu et al. (2009), the following new flight equations are applied to discrete optimisation problems:

$$\begin{aligned} X_i^1 \left( t \right)&= X_i \left( t \right)+\alpha V_i \left( t \right)\end{aligned}$$
(24)
$$\begin{aligned} X_i^2 \left( t \right)&= \beta \cdot \text{PMX}\left[ {X_i^1 \left( t \right), P_i \left( t \right)} \right]\end{aligned}$$
(25)
$$\begin{aligned} X_i \left( {t+1} \right)&= \gamma \cdot \text{PMX}\left[ {X_i^2 \left( t \right), P_{l_i} \left( t \right)} \right] \end{aligned}$$
(26)

where PMX and \(P_{l_i}\) represent partially matched crossover and particle i’s neighborhood best, respectively.

In another approach called multi-phase discrete PSO, the flight equations are modified as Al-Kazemi and Mohan (2005):

$$\begin{aligned} V_{id} \left( {t+1} \right)&= \text{Min}\left( {V_{max}, \text{Max}\left( {-V_{max}, V_{id} \left( t \right)+C_i \left( t \right)\left( {P_{gd} -X_{id}} \right)} \right)} \right)\end{aligned}$$
(27)
$$\begin{aligned} X_{id} \left( {t+1} \right)&= \left\{ {{\begin{array}{ll} 1&{\text{If } V_{id} \left( {t+1} \right)+X_{id} \left( t \right)>0.5} \\ 0&{\text{Otherwise}} \\ \end{array} }} \right. \end{aligned}$$
(28)

where \(C_i \left( t \right)\) equals either 1 or \(-1\), depending on the current phase of the corresponding particle. After every pcf (phase change frequency) iterations, the sign of \(C_i \left( t \right)\) is reversed. The sign of \(C_i \left( t \right)\) determines whether particle i is attracted toward or repelled from the global best.

Moreover, in Pan et al. (2008) and Tasgetiren et al. (2007), new flight equations particularly appropriate for the flowshop scheduling problem are introduced. In Wu and Tsai (2011), a new flight equation for integer problems is devised, and in Tao et al. (2010), new flight equations based on simple cyclic rules are proposed.

Taking an overall view of the approaches in the literature based on modification of the flight equations, it can be said that the proposed flight equations are normally tailored to a particular problem and, in most cases, do not work well on other problems.

3.6 Set-based approach

This approach uses a set-based representation scheme wherein particles’ positions and velocities are represented as crisp sets and sets with possibilities respectively (Chen et al. 2010; Yue-Jiao et al. 2012).

If E is a crisp set, a set with possibilities V defined on E is represented by:

$$\begin{aligned} V=\left\{ {e/P\left( e \right)| e\in E} \right\} \end{aligned}$$
(29)

That is, each element \(e\in E\) possesses a possibility \(P\left( e \right)\in \left[ {0,1} \right]\) in \(V\).

In this approach, the flight equations in (4) are used; owing to the new nature of the velocities, the operators in (4) must be redefined. The redefined operators are as follows:

  1. Coefficient \(\times\) velocity (multiplication of a coefficient and a set with possibilities) is defined as follows:

    $$\begin{aligned} cV=\left\{ {e/P^{{\prime }}\left( e \right)| e\in E} \right\} \end{aligned}$$
    (30)

    where

    $$\begin{aligned} P^{{\prime }}\left( e \right)=\left\{ {{\begin{array}{ll} 1&{If\,\,cP\left( e \right)>1} \\ cP(e)&{Otherwise} \\ \end{array} }} \right. \end{aligned}$$
    (31)
  2. Position \(-\) position (subtraction of two crisp sets) is defined as follows: if A and B are two position sets, then A\(-\)B is:

    $$\begin{aligned} A-B=\left\{ {e| e\in A\,\,and\,\,e\notin B} \right\} \end{aligned}$$
    (32)
  3. Multiplication of a coefficient and a crisp set is defined as:

    $$\begin{aligned} cE^{{\prime }}=\left\{ {e/P^{{\prime }}\left( e \right)| e\in E} \right\} \end{aligned}$$
    (33)

    where

    $$\begin{aligned} P^{{\prime }}\left( e \right)=\left\{ {{\begin{array}{lll} 1&{If\,e\in E^{{\prime }}\,and\,c>1} \\ c&{If\,e\in E^{{\prime }}\,and\,0\le c\le 1} \\ 0&{If\,e\notin E^{{\prime }}} \\ \end{array} }} \right. \end{aligned}$$
    (34)
  4. Velocity plus velocity (addition of two sets with possibilities) is defined as follows.

If \(V_1 =\left\{ {e/P_1 \left( e \right)| e\in E} \right\} \) and \(V_2 =\left\{ {e/P_2 \left( e \right)| e\in E} \right\} \) are two velocity vectors, then:

$$\begin{aligned} V_1 +V_2 =\left\{ {e/max\left( {P_1 \left( e \right),P_2 \left( e \right)} \right)| e\in E} \right\} \end{aligned}$$
(35)

Particles' velocities are updated at each iteration according to the operators defined in 1–4.

To update positions, a random number \(\alpha\) in (0,1) is generated for each particle. Then, for each element e in the dth dimension, if the corresponding possibility \(P\left( e \right)\) in \(V_{id}\) is not smaller than \(\alpha\), element e is retained in the crisp cut set, that is:

$$\begin{aligned} cut_\alpha \left( {V_{id} } \right)=\left\{ {e|e/P\left( e \right)\in V_{id}\,\,and\,P\left( e \right)\ge \alpha } \right\} \end{aligned}$$
(36)

After constructing \(cut_\alpha \left( {V_{id}} \right)\), the new position of particle i is determined by learning from the elements in \(cut_\alpha \left( {V_{id}} \right)\).
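The redefined operators (30)–(36) can be sketched on Python sets and dictionaries (a velocity is a mapping element → possibility, a position a plain set); non-negative coefficients are assumed:

```python
def scale(v, c):
    """Eqs. (30)-(31): coefficient x velocity, capping possibilities at 1."""
    return {e: min(1.0, c * p) for e, p in v.items()}

def minus(a, b):
    """Eq. (32): position - position on crisp sets."""
    return a - b

def coeff_times_set(c, s, universe):
    """Eqs. (33)-(34): turn the crisp set `s` into a set with possibilities
    over `universe`, assigning min(c, 1) to members and 0 to non-members."""
    return {e: (min(1.0, c) if e in s else 0.0) for e in universe}

def plus(v1, v2):
    """Eq. (35): velocity + velocity as element-wise max of possibilities."""
    return {e: max(v1.get(e, 0.0), v2.get(e, 0.0)) for e in set(v1) | set(v2)}

def alpha_cut(v, alpha):
    """Eq. (36): keep the elements whose possibility reaches alpha."""
    return {e for e, p in v.items() if p >= alpha}
```

For instance, over the universe {1, 2, 3}, scaling the crisp set {1, 2} by 0.5 gives possibilities (0.5, 0.5, 0.0), and an \(\alpha\)-cut at 0.6 of a summed velocity keeps only the strongly supported elements.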

3.7 Hybrid approach

In Nema et al. (2008), PSO is hybridised with the branch and bound (BB) algorithm, an efficient classical technique for handling discrete variables. This hybridisation combines the global search capability of PSO with the rapid convergence of BB to form a more efficient approach.

At the beginning of this hybrid approach, BB is invoked to determine an initial feasible solution for PSO; thereafter, whenever an improvement in \(P_g\) is achieved, \(P_g\) is passed to the BB module as the starting point. Thus, BB is not called at every iteration, but only when \(P_g\) improves. When BB completes, \(P_g\) is updated and PSO resumes.

The main advantage of this approach is that it requires fewer function evaluations, owing to the deterministic nature and speed of BB. On the other hand, the incorporation of BB, which is derivative-based, limits the applicability of the hybrid approach.

3.8 Other approaches

Besides the approaches explained above for tackling discrete variables in PSO, there exist some other approaches in the literature that are not covered here due to space limitations (Clerc 2004; Pang et al. 2004a, b; Niasar et al. 2009; Hoffmann et al. 2011; Sha and Hsu 2006; Qin et al. 2011). Table 1 summarises the different discrete-variable handling approaches adopted in PSO, together with their applicability, pros and cons.

Table 1 Different discrete variable handling approaches adopted in PSO, their applicability, pros and cons

4 Applications of DPSO in power system problems

Most power system problems can be cast as optimisation problems, and most of these contain discrete/integer/binary variables. For instance, the location of a compensator is an integer variable; the reactive power of a static var compensator and the setting of a tap changer are discrete variables; and the states of switches in distribution systems (closed/open) are binary variables.

In the literature, discrete PSO has been used for optimal power flow (Onate Yumbla et al. 2008; AlRashidi and El-Hawary 2007), optimal scheduling of generator maintenance (Yare and Venayagamoorthy 2007), transmission network expansion planning (Jin et al. 2007), distribution system reconfiguration (Sivanagaraju et al. 2008; Liu and Gu 2007; Chang and Lu 2002; Yin and Lu 2009; Wu and Tsai 2011), allocation of active power line controllers (Ziari and Jalilian 2010), optimal capacitor placement (Eajal and El-Hawary 2010; Khalil et al. 2006), integrated distribution system planning (Ziari et al. 2012), reactive power control (Fukuyama 2001), unit commitment (Ting et al. 2006), reliability analysis (Robinson 2005), power system defensive islanding (Liu et al. 2007) and switch placement (Moradi and Fotuhi-Firuzabad 2008).

For each of the mentioned power system problems, a particular type of discrete/binary variable handling strategy has been adopted. The round-off strategy is used in (Onate Yumbla et al. 2008; Yare and Venayagamoorthy 2007; Jin et al. 2007; AlRashidi and El-Hawary 2007; Sivanagaraju et al. 2008; Ziari and Jalilian 2010; Eajal and El-Hawary 2010; Ziari et al. 2012; Yoshida et al. 2000; Fukuyama 2001); conventional binary PSO has been applied in Ting et al. (2006), Khalil et al. (2006), Liu and Gu (2007) and Robinson (2005); angle-modulated PSO is used in Liu et al. (2007); trinary PSO has been used in Moradi and Fotuhi-Firuzabad (2008); and finally an approach based on modification of the flight equations is introduced in Wu and Tsai (2011). Table 2 lists the PSO applications in discrete power system problems and the type of discrete PSO adopted in each.

Table 2 PSO applications in discrete power system problems and their type of discrete PSO

5 Conclusions and future research directions

In this paper, all strategies adopted in PSO for tackling discrete problems have been analysed. In conclusion, the following are proposed as directions for future research:

  • In the round-off approach, when the final continuous optimum is found at the end of the continuous optimisation, it is recommended that, instead of simple rounding, the two discrete values nearest to the continuous optimum be evaluated and the one with the better fitness be taken as the discrete optimum.

  • In conventional binary PSO, replacing the sigmoid function with other functions may lead to superior results.

  • Devising a time-varying (preferably time-decreasing) velocity bound for conventional binary PSO may yield better results.

  • In Boolean PSO, transforming other basic PSO variants (like constricted PSO) to Boolean structure may lead to more promising results.

  • In binary PSO, using different probability distribution functions for generating the probability vector (the vector whose entries indicate the probability that the corresponding bits of the position vector take the value 1) is recommended; certain probability functions may lead to superior performance.

  • In the penalty approach, devising more appropriate and simpler penalty functions is of high value.

  • Approaches like estimation of distribution-based PSO and set-based PSO should be tested on a wider range of discrete problems in order to evaluate their performance.

  • Devising new flight equations more appropriate for discrete problems can be a potential research direction.

  • Focusing more on binary PSO and devising more efficient binary-coded approaches for solving continuous and binary problems may attract further attention to binary-coded PSO in the near future.

  • Only a limited number of theoretical analyses of discrete-variable handling in PSO have been conducted; more theoretical work in this area is needed.