1 Introduction

Most engineering design problems can be regarded as MOPs. Ideally, a designer tends to attain an economical solution which represents the best performance for a particular problem. From a practical point of view, structural optimization is not a single-objective problem (SOP) since in fact the designer needs to consider different objective functions such as minimizing the life cycle cost and maximizing the performance of the structure through minimizing its seismic energy input, maximizing its hysteretic energy dissipation and so on. Moreover, MOPs propose sets of optimal solutions known as Pareto optimal sets instead of one single optimal point. Pareto optimal sets are more desirable since decision makers can access a variety of optimal scenarios and choose the one which best suits the demands of a particular project. In the last few decades, evolutionary algorithms (EAs) have attracted considerable attention of researchers. Since EAs eliminate the differentiability requirement for objective functions in classical methods, they can be applied to discrete and continuous optimization problems. Most EAs start the search with random initial points in the search space which removes the difficulty of finding an appropriate initial point. Furthermore, EAs can be applied to highly nonlinear problems. These characteristics make EAs a suitable choice for engineering optimization problems including structural optimization problems. However, in most structural optimization problems each function evaluation may entail multiple structural analyses. Hence, function evaluations can be such costly that make optimization process impractical for real-world problems especially when nonlinear time history analysis must be conducted. As an example of such time consuming processes the reader may refer to the GA-based optimal design of the 3-story building presented by Gong et al. (2013) which lasted 105 h on a desktop computer before converging. Therefore, more emphasis should be put on the number of function evaluations required by EAs for structural problems. One of the disadvantages of EAs is their need for many number of function evaluations before converging to some acceptable approximation of the Pareto front. Therefore, designing algorithms for handling the aforementioned drawback is essential. Regarding the performance of MOPs, there are few researches that tried to reduce the number of fitness evaluations. For instance, Chafekar et al. (2005) proposed a method called OEGADO to deal with this problem. They verified their algorithm for a few benchmark problems. However, the method yields acceptable results within a few thousands function evaluations. Eskandari and Geiger (2008) proposed another GA-based method called FastPGA and applied it to a number of unconstrained continuous benchmark problems that could be completed within a few thousands of function evaluations. There are some other techniques that can also be used to reduce the number of function evaluations. These include employing methods for estimating the values of objective functions instead of exact evaluations. An example of such approach can be found in (Davarynejad et al. 2011) where a fuzzy-based technique is utilized for function approximations. It was successfully applied to the unconstrained continuous ZDT problems to obtain acceptable results within 1000 real function evaluations. The reader may refer to (Santana-Quintero et al. 2010) for a more detailed discussion on objective function approximations.

Among EAs, the PSO method is well known for its fast convergence. However, the method was originally proposed for optimization of unconstrained single-objective continuous problems. There exist some extensions of PSO for multiobjective problems in the literature. However, as it is also discussed by Tong et al. (2016), only a few of them are developed for solving problems with discrete and mixed discrete and continuous design variables. In this paper, the original PSO is extended to achieve a fast converging PSO-based algorithm called FC-MOPSO which can solve constrained/unconstrained continuous/discrete as well as mixed continuous and discrete MOPs within a few hundreds of function evaluations. The performance of the method is verified against some well-known EAs.

2 Unconstrained continuous single-objective PSO

The original PSO method was first introduced by Kennedy and Eberhart (1995) for optimization of continuous nonlinear single-objective functions. The method was proposed based on the observation that groups of individuals such as birds work together to improve their performance. Since then PSO has gained much popularity among researchers because of its simplicity and fast convergence. The idea of PSO is to start the search for an optimal solution through some initial random particles with initial random velocities (or initial zero velocities) and to update the velocities by considering two main aspects, cognition learning and social learning. In later versions of PSO as discussed by Clerc and Kennedy (2002), a factor known as constriction factor is applied to the velocity formulation to control explosion of the system. Hence, the basic formulation used in this paper for updating the position of each particle (or the flight stage in PSO) is as follows.

$$ \begin{array}{l}{\boldsymbol{v}}_{\boldsymbol{i}}^{\boldsymbol{t}+1}=\chi \left[{\boldsymbol{v}}_{\boldsymbol{i}}^{\boldsymbol{t}}+{\phi}_1{\boldsymbol{r}}_1\circledast \left({\boldsymbol{pbest}}_{\boldsymbol{i}}-{\boldsymbol{x}}_i\right)+{\phi}_2{\boldsymbol{r}}_2\circledast \right.\\ {}\kern2.76em \left.\left({\boldsymbol{lbest}}_{\boldsymbol{i}}-{\boldsymbol{x}}_i\right)\right]\end{array} $$
(1)
$$ {\boldsymbol{x}}_{\boldsymbol{i}}^{\boldsymbol{t}+1}={\boldsymbol{x}}_{\boldsymbol{i}}^{\boldsymbol{t}}+{\boldsymbol{v}}_{\boldsymbol{i}}^{\boldsymbol{t}+1} $$
(2)

In this formulation, which has been proposed for the case of single-objective PSO, \( {\boldsymbol{v}}_i^t=\left[{v}_{i_1}^t,{v}_{i_2}^t,\dots, {v}_{i_s}^t\right] \) is the vector of velocity for the i th particle at the t th generation, \( {\boldsymbol{x}}_i^t=\left[{x}_{i_1}^t,{x}_{i_2}^t,\dots, {x}_{i_s}^t\right] \) is the position of the i th particle at the t th generation and s is the number of design variables. χ = 2α/(ϕ 1 + ϕ 2 − 2) is the constriction factor and ϕ 1 and ϕ 2 are the maximum cognition learning rate and the maximum social learning rate, respectively. r 1 and r 2 are vectors consisting of random numbers between zero and one, pbest i is the best position of i th particle found so far and lbest i is the local best or the best particle found in the neighborhood of the i th particle. Furthermore, if a = [a 1, a 2,  … , a n] and b = [b 1, b 2,  … , b n] then a ⊛ b ≔ [a 1 b 1, a 2 b 2,  … , a n b n].

It is to be noted that many different neighborhood topologies have been used in the literature. In this paper, we will use a random neighborhood topology which will be updated at each generation. Although it might be possible to obtain better results by applying certain neighborhood topologies, the random topology suggests a more general approach which is less problem dependent.

Clerc and Kennedy (2002) have shown that the PSO system is stable if ϕ 1 + ϕ 2 > 4 and α ∈ (0, 1). In this paper, ϕ 1 = ϕ 2 = 2.05 which has been recommended by Simon (2013) is used. Clerc and Kennedy (2002) also discuss that consistent convergence to local minima can be guaranteed although convergence on global optima cannot be proven except for some special class of functions. Evidently, such theoretical discussions about the convergence of a PSO system is valid only if t → ∞. Furthermore, this theoretical background applies only to the unconstrained continuous single-objective PSO. It would be very difficult to provide similar mathematical discussions for PSO systems which deal with these difficulties: (1) multiobjectiveness (2) discreteness and, (3) constraints. That is why FC-MOPSO will be formulated as an extension of the given PSO variant which has some theoretical background. (1) and (2) represent the main idea of PSO algorithms. However, some modifications must be implemented for handling MOPs. In Section 3, we will propose a modification that preserves the basic rationale of PSO so that FC-MOPSO can inherit the stability characteristics of its ancestor.

By changing the value of α, a tradeoff between the exploration characteristic of a PSO system and its exploitation characteristic can be obtained. Larger values of α (or larger values of the constriction factor) indicate allowing the swarm to experience larger velocities which results in more exploration. A smaller α, however, emphasizes on more exploitation. In this paper, it is recommended to use the maximum allowable value of α which is practically a value near unity like α = 0.9. Because, FC-MOPSO receives much information form the swarm that may result in premature convergence or stagnation if small values of α are used. It is also possible to define a pattern for updating α at each generation. For example, Larger values of α can be used at earlier generations when the swarm is meant to emphasize on exploration while smaller values can be adopted at final generations when the swarm is supposed to converge to the solution. A simple way for accomplishing it, is to define α as a linear or exponential function of t. Using non-constant α values, however, requires parameter tunings which is not desired because it would make the method more problem dependent. Regarding the above discussions, α = 0.9 is used throughout this paper.

3 Unconstrained continuous FC-MOPSO

3.1 Definitions

Without loss of generality, any multi-objective problem can be defined as follows.

$$ \begin{array}{l}\mathrm{minimize}\ f\kern-1.5em \left(\mathrm{x}\right)\hfill \\ {}\mathrm{subject}\ to\hfill \\ {}{g}_j(x)\le 0,for\ j\in \left[1,p\right]\hfill \\ {}{h}_j(x)=0,for\ j\in \left[p+1,q\right]\hfill \end{array} $$

where x is the vector of design variables, f(x= [f 1(x), f 2(x),…,f k (x)] is the vector of k objective functions which are to be minimized simultaneously, p is the number of inequality constraints and q is the total number of equality and inequality constraints. Evidently, by putting p and q equal to zero the above definition will represent an unconstrained multi-objective optimization problem. Furthermore, any solution to the above problem should lie in the search space domain, i.e., L  x *  U where L = [L 1 ,…,L s ] and U = [U 1 ,…,U s ] are the vectors which define the lower and upper bounds of the search space domain.

Generally, there is no single point x * which minimizes all k objective functions simultaneously; instead there is a set of solutions which are incomparable and is known as the Pareto set. The basic definitions concerning Pareto optimality are as follows (Simon 2013).

  • Domination: A point x * is said to dominate x if the following two conditions hold: (1) f i (x *≤ f i (x) for all i∊[1,k], and (2) f j (x *< f j (x) for at least one j∊[1,k].

  • Weak Domination: A point x * is said to weakly dominate x if f j (x *≤ f j (x) for all j∊[1,k].

  • Nondominated: A point x * is said to be nondominated if there is no x that dominates x *.

  • Pareto optimal points: A Pareto optimal point x * is a point that is not dominated by any other point in the search space.

  • Pareto set: Pareto set (P s ) is the set of nondominated points in the search space.

  • Pareto front (or Pareto): Pareto front (P f ) is the set of all function vectors f(x) corresponding to the Pareto set.

3.2 Extending single-objective PSO to multi-objective PSO

Several approaches can be found in the literature for applying PSO to MOPs. Some of them such as aggregating approaches and lexicographic ordering methods try to solve a given MOP by defining and solving new SOPs. In aggregating approaches, MOPs are converted to SOPs by combining all objective functions into one objective function. In lexicographic ordering methods, objective functions are ranked based on their importance to the user. Minimization of each objective function is subsequently considered as a new SOP. See (Reyes-Sierra and Coello Coello 2006) and (Coello Coello et al. 2007) for more information.

In this paper, however, a Pareto-based approach is presented in which the single-objective PSO formulation is extended to a multi-objective formulation. Therefore, (1) must be modified so that it can be applied to MOPs because there is no pbest or lbest available in a multi-objective approach. The reason is that in MOPs there are Pareto fronts instead of one minimum point. Researchers have defined a variety of methodologies to deal with this problem. See, for example, (Coello Coello and Lechuga 2002); (Reyes-Sierra and Coello Coello 2006); (Sierra and Coello Coello 2005); (Nebro et al. 2009).

Coello Coello and Lechuga (2002) proposed one of the earliest unconstrained continuous multi-objective PSOs called MOPSO. The flight stage of the PSO formulation is based on an inertia weight formulation where inertia weight equal to 0.4 is used. A repository is defined in MOPSO where all nondominated solutions are stored. Leaders are randomly selected from the same repository. OMOPSO (Sierra and Coello Coello 2005) and SMPSO (Nebro et al. 2009) are two highly competitive Pareto-based multi-objective PSO algorithms that can be applied to continuous MOPs. OMOPSO uses one archive of leaders for guiding the swarm. A binary tournament based on the crowding values is used for selecting one leader for each particle. Furthermore, the swarm is divided into three sub-swarms of equal sizes. Uniform and non-uniform mutation operators are applied to the first two sub-swarms, respectively. No mutation operator is applied to the third sub-swarm. OMOPSO utilizes a PSO formulation based on inertia weight for updating the velocities at each generation. Furthermore, random cognition and social learning factors are used in OMOPSO. If any particle gets out of bounds of the search space, it will be put on the boundaries and the direction of its velocity will be reversed. The same mechanism that was used in OMOPSO for selection of the leaders is also used in SMPSO. However, a constriction factor approach with random social and cognition learning factors are used in SMPSO. If any particle gets out of bounds of the search space, it will be put on the boundaries and its velocity will be reduced by multiplying a factor between 0.001 and 0.1. 15% of particles are mutated by a polynomial mutation operator while the rest of particles are not mutated.

In this paper, a novel approach is proposed to ensure faster convergence while the diversity of the Pareto front is preserved. Most researchers rename the lbest particle to the leader particle for a multi-objective problem. Selection of the leader and the local best particle has a direct impact on the behavior of the swarm both in the rate of its convergence and in the diversity of the Pareto front. According to the authors’ experience, even minor changes in selection of the leader will end up in a totally different behavior. To ensure faster convergence we define the leader in a way that takes advantage of the information available from all positions in the search space that the particles have experienced so far. However, this strategy may result in premature convergence to some local minima. Therefore, other procedures will be incorporated for improving diversity of the archives and enhancing the selection procedure.

In the original PSO, pbest i represents the best position found so far by the i th particle. For a multi-objective problem, we generalize this concept to P i set which is defined as the set of nondominated solutions found so far by the i th particle. Subsequently, a particle called psel i will be selected from P i for performing the flight stage. Moreover, by assuming that n is the population size and N ≤ n − 1 is the number of neighbors randomly selected for each particle at each generation, the concept of lbest i is also generalized so that it represents the leader of the i th particle (leader i ). At each generation and for each individual i, N neighbor P j sets (j ∈ [1, N]) are selected randomly and a nondominated set called Q i is extracted from the union of the selected P j sets. Leader i will be a particle selected from Q i by some procedure which will be explained later. Therefore, for a multi-objective problem, the following equations shall be used for the flight stage.

$$ \begin{array}{l}{\boldsymbol{v}}_{\boldsymbol{i}}^{\boldsymbol{t}+1}=\chi \left[{\boldsymbol{v}}_{\boldsymbol{i}}^{\boldsymbol{t}}+{\phi}_1{\boldsymbol{r}}_1\circledast \left({\boldsymbol{psel}}_i-{\boldsymbol{x}}_i\right)+{\phi}_2{\boldsymbol{r}}_2\circledast \right.\\ {}\left.\left({\boldsymbol{leader}}_i-{\boldsymbol{x}}_i\right)\right]\end{array} $$
(3)
$$ {\boldsymbol{x}}_{\boldsymbol{i}}^{\boldsymbol{t}+1}={\boldsymbol{x}}_{\boldsymbol{i}}^{\boldsymbol{t}}+{\boldsymbol{v}}_{\boldsymbol{i}}^{\boldsymbol{t}+1} $$
(4)

As mentioned before, the constriction factor is meant to prevent velocity blowups. However, the authors have observed better performance for the algorithm proposed in this paper by applying further limits to velocity vectors through (5).

$$ {v}_{i_j}^{t+1}\leftarrow \left\{\begin{array}{c}{v}_{i_j}^{t+1},\kern4em \mathrm{if}\kern0.5em \left|{v}_{i_j}^{t+1}\right|\le {v}_m^j\\ {}{v}_m^j\mathit{\operatorname{sign}}\left({v}_{i_j}^{t+1}\right),\kern1.25em \mathrm{otherwise}\end{array}\right.,j\in \left[1,s\right] $$
(5)

where \( {\boldsymbol{v}}_{\boldsymbol{m}}=\left[{v}_m^1,\dots, {v}_m^s\right] \) is the vector of maximum allowable velocities. PSO systems usually start the search with random or zero valued velocities. Starting the search with zero velocities would indicate that the method should start finding the correct path based on only the cognition and social learning mechanisms. It is not the best choice especially if the population size is small. Because, little data about the search space is provided with small population sizes and the method would have little chance to continue the search with appropriate velocities. It can subsequently result in stagnation in early generations. On the other hand, starting the search with large random velocities is not recommended for solving problems with limited number of function evaluations either. Because, it would need too many generations before the initial random velocities could be corrected by cognition and social learning mechanisms. Therefore, the authors suggest that FC-MOPSO should be started with random velocities not larger than v m  = 0.1(U  L). In other words, we prevent the method from starting with large velocities while we also give it some initial velocities for starting the search. The value of 0.1 is selected to represent a small value and to prevent each dimension of each particle from displacing more than 10% of its corresponding search space size at the initial generation. A small value like 0.1 would let the particles escape from stagnation in early generations and correct their paths based on cognition and social learning mechanisms in later generations. For next generations, however, v m  = rand()(U  L) is recommended where rand() is a random number between zero and one. A random value, rather than a constant parameter, is used here so that parameter tuning would not be required. (5) does not apply to discrete variable problems (see Section 5). Besides, if a component of an updated particle from (4) gets outside the search space it will be put on the boundary of the search space. In other words, for all j∊[1, s] if it happens that \( {x}_{i_j}^{t+1}<{L}_j \) or \( {x}_{i_j}^{t+1}>{U}_j \) then \( {x}_{i_j}^{t+1}={L}_j \) or \( {x}_{i_j}^{t+1}={U}_j \) will be used, respectively.

In addition to the procedures discussed in the next section for preventing premature convergence, mutation operators will be incorporated for increasing the chance of finding global optimum solutions. Three mutation operators, (1) the polynomial mutation operator (Deb and Deb 2014), (2) the non-uniform mutation operator (Sierra and Coello Coello 2005) and (3) the binary mutation operator presented in Fig. 1 were checked. It was found that the binary mutation operator works the best for our proposed algorithm. In this method, at each generation, after updating the position of each particle, a binary mutation operator is applied to obtain the final position of the particle at that generation. For continuous MOPs, binary representations of each particle’s position are used so that the binary mutation operator can be applied. Throughout this paper we accomplish this conversion with a precision of 0.01. That is, the length of strings used for converting continuous variables in the j th dimension of the search space to the binary format is calculated from (6).

$$ {B}_j=\mathrm{floor}\left(\mathit{\log}\left(\frac{U_j-{L}_j}{0.01}+1\right)/\mathit{\log}(2)\right)+1 $$
(6)
Fig. 1
figure 1

Pseudo-code of binary mutation operator of the i th particle

Following this procedure, the binary representation used for U j is a string of length B j with all its bits equal to one while the binary representation for L j is a string of length B j with all its bits equal to zero. Therefore, there will be no chance for a particle to get outside the search space once the mutation is applied. Moreover, the difference between two consecutive binary representations will be equal to \( \left({U}_j-{L}_j\right)/\left({2}^{B_j}-1\right) \). Probabilities equal to p m  = 1/s and ρ j  = 1/B j are used for mutating each element of each particle and each bit of that element, respectively.

3.3 Selection procedure and preservation of diversity

The following aspects are considered for preventing premature convergence and preserving the diversity of the Pareto front. Firstly, the size of P i archives is restricted. Based on a vast investigation, the number of elements in each P i set is limited to nine elements in this paper. That is, after updating the position of i th particle at each generation, if the particle is not dominated by any members of P i set, the particle is added to P i set. Moreover, if this new member dominates some members in the set, those members will be excluded from P i . Finally, if the number of members of the set has grown to be more than nine, some members with the least crowding distances will be eliminated to limit the archive size. In a rare case that there is more than one particle in P i with the least crowding distance, ε-dominance is incorporated to eliminate one of the ten members available in P i .

A similar procedure is applied to each Q i set. In other words, after generating Q i from N selected P i sets, Q i will be pruned using crowding distances or ε-dominance concept to limit their size to 100. The concepts of crowding and ε-dominance are defined in Sections 3.3.1 and 3.3.2.

Secondly, the following procedure is suggested for selecting psel i from P i and for selecting lsel i from Q i . The crowding distances of all members in P i (or Q i ) are evaluated and the one with the largest crowding distance value is the selected psel i (or lsel i ). If there are more than one member holding the largest crowding distance, a roulette-wheel selection will be employed for preferring one of them. One typical case that leads to equal crowding distances in a set is the situation in which the set consists of only two members. Since the problem is multi-objective, one has to assign a fitness value to each member before applying a roulette-wheel selection. This can be accomplished by the following equation.

$$ \overline{f}\left({\boldsymbol{x}}_i\right)=\left({\sum}_{j=1}^k{w}_j{\widehat{f}}_j\left({\boldsymbol{x}}_i\right)\right)/\left(\sum_{j=1}^k{w}_j\right) $$
(7)

where \( \overline{f}\left({\boldsymbol{x}}_i\right) \) represents the fitness value assigned to particle x i and w j refers to the weight factor assigned to the j th objective function. The same w j ’s are used in this paper. The hat symbol on objective function values in (7) indicates that normalized objective function values evaluated from (8) are used.

$$ {\widehat{f}}_j\left({\boldsymbol{x}}_i\right)=\frac{f_j\left({\boldsymbol{x}}_i\right)-{ \min}_{i\in \left[1,{m}_i\right]}\left({f}_j\left({\boldsymbol{x}}_i\right)\right)}{{ \max}_{i\in \left[1,{m}_i\right]}\left({f}_j\left({\boldsymbol{x}}_i\right)\right)-{ \min}_{i\in \left[1,{m}_i\right]}\left({f}_j\left({\boldsymbol{x}}_i\right)\right)},j\in \left[1,k\right] $$
(8)

where m i is the number of particles that are incorporated for normalizing the values of objective functions for the i th particle. For a constrained problem, objective functions will be normalized based on the minimum and maximum values of objective functions in current generation, i.e. m i  = n for i∊[1, n]. For unconstrained problems normalization will be restricted to situations that either evaluation of crowding distances or the selection procedure is encountered. Thus, normalization for unconstrained problems will be done only based on the minimum and maximum values available in each P i (or Q i ) set. In other words, m i will be the number of particles in each P i (or Q i ) set for unconstrained problems.

3.3.1 Crowding

Crowding metrics have been used successfully for maintaining diversity of the Pareto front in many EAs. Deb et al. (2002) defined an efficient crowding distance for modifying the fitness values in their NSGA-II algorithm. Their approach does not require parameter tunings unlike other density estimator concepts such as Kernel density estimator (Goldberg and Richardson 1987). This crowding metric is incorporated in the present paper after making some changes to its original definition given by Deb et al. (2002). Moreover, unlike NSGA-II, crowding distances will not be used for modifying the fitness values. The metric is rather utilized in selection procedure and refining P i and Q i archives.

In this paper, the crowding distance for each particle, C(x i ), is defined as the Euclidian norm of d(x i ):

$$ C\left({\boldsymbol{x}}_i\right)=\left|d\left({\boldsymbol{x}}_i\right)\right| $$
(9)

where

$$ d\left({\boldsymbol{x}}_i\right)=\left[{d}_1\left({\boldsymbol{x}}_i\right),\dots, {d}_j\left({\boldsymbol{x}}_i\right),\dots, {d}_k\left({\boldsymbol{x}}_i\right)\right] $$
(10)

and

$$ \begin{array}{l}{d}_j\left({\boldsymbol{x}}_i\right)=1/2\left[\mathrm{abs}\left({\widehat{f}}_j\left({\boldsymbol{x}}_i\right)-{\widehat{f}}_j^{-}\left({\boldsymbol{x}}_i\right)\right)\right.\\ {}\left.\kern3.5em +\mathrm{abs}\left({\widehat{f}}_j^{+}\left({\boldsymbol{x}}_i\right)-{\widehat{f}}_j\left({\boldsymbol{x}}_i\right)\right)\right],j\varepsilon \left[1,k\right]\end{array} $$
(11)

\( {\widehat{f}}_j^{+}\left({\boldsymbol{x}}_i\right) \) and \( {\widehat{f}}_j^{-}\left({\boldsymbol{x}}_i\right) \) are respectively the closest larger value and the closest smaller value to \( {\widehat{f}}_j\left({\boldsymbol{x}}_i\right) \) among m particles. For extreme positions, however, only a larger or a smaller value is available. In such cases \( {\widehat{f}}_j^{+}\left({\boldsymbol{x}}_i\right)={\widehat{f}}_j^{-}\left({\boldsymbol{x}}_i\right) \) will be used. Regarding the above definitions, it can be concluded that a particle with a smaller crowding distance lies in a more crowded region of the objective function space and vice versa.

3.3.2 ε-dominance

Another concept that is incorporated for preserving diversity and preventing premature convergence is the ε-dominance idea. In this method, the normalized objective function space is divided into n e hyperboxes. For a minimization problem, if there is more than one particle in a hyperbox the one closest to the lowest left corner of the hyperbox will be maintained while other particles are eliminated.

Hyperboxes can be formed by dividing each axis of the coordinate system into segments with a size equal to ε j .

$$ {\varepsilon}_j=\frac{\underset{i\in \left[1,{m}_i\right]}{ \max}\left({f}_j\left({\boldsymbol{x}}_i\right)\right)-\underset{i\in \left[1,{m}_i\right]}{ \min}\left({f}_j\left({\boldsymbol{x}}_i\right)\right)}{{\left({n}_e\right)}^{1/k}},j\in \left[1,k\right] $$
(12)

If hyperboxes are generated in the normalized objective function space, (12) can be simplified to ε j  = 1/(n e )1/k for j∊[1, k]. In this paper, n e equal to 9 and 100 is used for P i and Q i sets, respectively. Fig. 2 shows a schematic for hyperboxes in 2D objective function space. Particles shown with filled circles are maintained while the ones represented by unfilled circles are thrown away.

Fig. 2
figure 2

Schematic of hyperboxes and the concept of ε-dominance in 2D objective function space

A pseudo-code for unconstrained continuous multi-objective problems is presented in Fig. 3.

Fig. 3
figure 3

Pseudo-code of the unconstrained continuous FC-MOPSO

4 Constrained continuous FC-MOPSO

In this section, a penalty-based approach is introduced to extend the algorithm proposed in Section 3 to constrained continuous MOPs. In a penalty-based approach, penalized objective functions are frequently minimized instead of the original objective functions. However, a new approach is proposed in this paper which will benefit from the penalized and non-penalized objective functions simultaneously. There are a variety of penalty function definitions available in the literature that can be applied to the method presented here. The authors have found out that the adaptive penalty function suggested by Yen (2009) works very well for the FC-MOPSO algorithm. This adaptive penalty function has the advantage that it takes into account both the amount of constraint violations and the number of constraints violated at each generation. Moreover, with the aid of this penalty function, the information available from both feasible and infeasible regions will be employed to encourage the population to move toward a feasible Pareto front. Let φ(x i = [φ 1 (x i ),…, φ j (x i ),…, φ k (x i )] represent the vector of k penalized objective functions for the i th particle. Then, at each generation t, each component of the penalized objective function is defined as sum of two parts:

$$ {\varphi}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)={\varLambda}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)+{\varOmega}_j\left({\boldsymbol{x}}_i\right) $$
(13)

where Λ j (x) is called the distance value of particle x and is defined in (14).

$$ {\varLambda}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)=\left\{\begin{array}{c}\psi \left({\boldsymbol{x}}_i\right),\kern8.75em \mathrm{if}\ {r}_f=0\ \\ {}\sqrt{{\widehat{f}}_j{\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)}^2+\psi {\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)}^2},\kern2.75em \mathrm{otherwise}\end{array}\right. $$
(14)

where ψ(x) represents the average normalized constraint violation and is defined as below.

$$ \psi \left({\boldsymbol{x}}_{\boldsymbol{i}}\right)=\frac{1}{q}\sum_{j=1}^q\frac{c_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)}{c_{max}^j} $$
(15)

c j (x i ) is the constraint violation of the j th constraint for the i th particle and c j max is the maximum constraint violation of the j th constraint in the current population:

$$ {c}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)=\left\{\begin{array}{c}\mathit{\max}\left(0,{g}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)\right),\kern4.25em if\ j\in \left[1,p\right]\\ {}\mathit{\max}\left(0,\left|{h}_j\left({\boldsymbol{x}}_i\right)\right|-\delta \right), if\ j\in \left[p+1,q\right]\end{array}\right. $$
(16)
$$ {c}_{max}^j=\underset{i\in \left[1,\kern0.5em n\right]}{\mathit{\max}}{c}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right) $$
(17)

where δ is the threshold used for accepting that an equality constraint is satisfied. δ = 0.0001 is usually used in benchmark problems.

The second part of the penalty function is defined as follows.

$$ {\varOmega}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)=\left(1-{r}_f\right){X}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)+{r}_f{Y}_j\left({\boldsymbol{x}}_i\right) $$
(18)

where r f  = n f /n is the ratio of feasible solutions in current population and n f is the number of feasible particles among current population. In the above equation, X j (x i ) and Y j (x i ) are two penalty functions as defined in (19) and (20).

$$ {X}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)=\left\{\begin{array}{c}0,\kern4.5em \mathrm{if}\ {r}_f=0\ \\ {}\psi \left({\boldsymbol{x}}_{\boldsymbol{i}}\right),\kern2.75em \mathrm{otherwise}\end{array}\right. $$
(19)
$$ {Y}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right)=\left\{\begin{array}{c}0,\kern2.75em \mathrm{if}\ {\boldsymbol{x}}_i\ \mathrm{is}\ \mathrm{feasible}\ \\ {}{\widehat{f}}_j\left({\boldsymbol{x}}_{\boldsymbol{i}}\right),\kern3em \mathrm{otherwise}\end{array}\right. $$
(20)

At each generation, after evaluating fitness values (f-values) for the population, normalized fitness values (\( \widehat{f} \) - values) are obtained from (8). Finally, having checked the feasibility of each particle in the population, penalized objective function values (φ-values) can be computed for each individual in the population. Fig. 4 outlines the present FC-MOPSO algorithm for handling continuous constrained MOPs. In this figure, P f represents the Pareto front obtained by constructing a nondominated set from feasible individuals in the space of f-values. The Pareto set corresponding to P f is called P s .

Fig. 4
figure 4

Pseudo-code of the constrained continuous FC-MOPSO

5 A discrete version of FC-MOPSO

There are very limited studies that tried to address discrete multi-objective PSO algorithms. One of the most recent of such studies was accomplished by Tong et al. (2016) where an extension of MDPSO (Chowdhury et al. 2013) was recommended for solving problems with mixed discrete design variables. The flight stage in MDPSO is performed using a continuous PSO formulation. The Nearest Vertex Approach (NVA) is subsequently used for approximating the discrete-domain position of a particle. Another example of such studies is the discrete PSO developed by Chen et al. (2009) that utilizes an inertia-based formulation of PSO. The method incorporates crossover and mutation parameters in genetic algorithm so that the flight stage formulation of PSO can be applied to discrete problems.

Here, the algorithm proposed in previous sections is modified so that constrained/unconstrained MOPs with discrete or mixed discrete design variables can also be solved. In fact, only minor modifications is required before FC-MOPSO can be applied to such problems. A discrete or mixed discrete MOP is the case that occurs in design of most real-world civil structures. Steel buildings, trusses and some types of fixed offshore platforms are examples where hot rolled steel sections are used. Therefore, sizing variables in steel structures are discrete in practice. Sizing variables in concrete structures, however, deal with mixed design variables. For example, dimensions of beam and column sections can be represented by discrete values that are practical and easy for construction purposes while the ratio of longitudinal steel reinforcements can be regarded as continuous design variables. In a more practical approach, like the one performed by Paya et al. (2008), these two types of design variables, namely the ratio of longitudinal reinforcements and cross sectional dimensions, can be merged together to constitute a discrete design variable. Consequently, a set of predefined sections with predefined dimensions and steel ratios, which meet the requirements of building design codes, will be available for the optimization process. A similar approach can be adopted for other design variables such as lateral reinforcements. The above discussion reveals that adopting discrete design variables in civil structures has two main advantages: (1) a discrete problem definition is more realistic (2) the search space becomes smaller which is easier to search while impractical solutions are already eliminated from the set of solutions.

Soon after introducing the continuous version of PSO, Kennedy and Eberhart (1997) proposed a discrete binary version of the continuous counterpart which will be employed in this paper. Let \( {\boldsymbol{D}}_{\mathrm{j}}=\left[{D}_j^1,\dots, {D}_j^{\mu_j}\right] \), j∊[1, s] be the vector including the μj discrete values available for the j th design variable and μj refer to the number of discrete variables in the j th dimension of the search space. Therefore, the number of bits required for a binary representation of design variables in the j th dimension will be equal to B j  = floor (log(μ j )/ log(2)) + 1. This way, D j 1 will be defined by a string of length B j where all its bits are zeroes and D j 2 is defined with a string that its first bit from the right side is equal to one while other bits are zeroes and so on. It should be noted that \( {D}_j^{\mu_j} \) is not defined by a string with all its bits equal to one. Hence, particles shall be put on the boundaries if they go outside the search space during mutation. Moreover, velocities will be defined in terms of changes of probabilities. In other words, if \( {x}_{i_j,b} \) (j∊[1, s] and b∊[1, B j ]) refers to the b th bit of the j th component of the i th particle, then \( {v}_{i_j,b} \) is the corresponding bit of the velocity and represents the probability that \( {x}_{i_j,b} \) takes a value equal to one. However, \( {v}_{i_j,b} \) can also take values outside the interval [0, 1]. Therefore, to accomplish the interpretation in which velocities are treated as probabilities, an appropriate mapping such as sigmoidal function should be employed so that the probability remains in the interval [0, 1]. The variant of sigmoidal function that returns values restricted to the interval [0, 1] is defined as S(x) = 1/(1 + exp(−x)). Therefore, the FC-MOPSO algorithm defined in Figs. 3 and 4 is still applicable with two modifications. Firstly, at each generation (3) and (4) shall be replaced with (21) and (22), respectively. That is, for each bit b∊[1, B j ] of the j th component of the i th particle we have:

$$ \begin{array}{l}{v}_{i_j,b}^{t+1}=\chi \left[{v}_{i_j,b}^t+{\phi}_1{r}_1\left({psel}_{i_j,b}-{x}_{i_j,b}^t\right)\right.\\ {}\kern3em \left.+{\phi}_2{r}_2\left({leader}_{i_j,b}-{x}_{i_j,b}^t\right)\right]\end{array} $$
(21)
$$ {x}_{i_j,b}^{t+1}=\left\{\begin{array}{c}1,\kern1em if\ \mathit{\operatorname{rand}}\left(\ \right)<S\left({v}_{i_j,b}^{t+1}\right)\ \\ {}0,\kern2.25em \mathrm{otherwise}\kern4.75em \end{array}\right. $$
(22)

Secondly, limitations on maximum velocity will not be applied except for starting the algorithm. That is, random velocity vectors with their bits limited to the interval [−4, 4] are used for initialization of the algorithm while no v m will be applied to limit the velocities afterwards.

The probability that a bit will be changed is equal to \( p\left({x}_{i_j,b}\right)=S\left({v}_{i_j,b}\right)\left[1-S\left({v}_{i_j,b}\right)\right] \) and is depicted in Fig. 5.

Fig. 5
figure 5

Probability of bit changing selected for the discrete FC-MOPSO

6 Test problems and discussion of results

6.1 Benchmark MOPs

15 well-known benchmark problems are employed for validating the performance of continuous FC-MOPSO. These problems include five unconstrained and ten constrained benchmark MOPs. The unconstrained MOPs consist of ZDT1, ZDT2, ZDT3, ZDT4 and ZDT6. These problems have been designed by Zitzler et al. (2000) to assess performance of optimization algorithms in dealing with different types of difficulties in finding the true Pareto front. ZDT1 has a convex Pareto front while ZDT2 is the nonconvex counterpart of ZDT1. ZDT3 represents a problem with discreteness feature because its Pareto front consists of several noncontiguous convex parts. ZDT4 introduces a problem that contains 219 local Pareto front. Therefore, it is suitable for validating the ability of an algorithm to deal with multimodality. As noted by Zitzler et al. (2000), ZDT6 is a test problem with nonuniformity in the search space. The Pareto fronts of this problem are nonuniformly distributed along the global Pareto front. Furthermore, the density of solutions is minimum near the Pareto front and is maximum away from it. The constrained MOPs considered in this section consist of the TNK problem (Tanaka et al. 1995), the speed reducer design problem (Kurpati et al. 2002) which is a multi-objective variant of the SOP originally defined by Golinski (1970), CONSTR (Deb et al. 2002), Srinivas problem (problem F3 in (Srinivas and Deb 1994)), water resource planning problem defined by Ray et al. (2001) and CTP2, CTP3, CTP4, CTP5 and CTP6 (Deb et al. 2001). The CTP test suite consists of problems that cause difficulties either in the entire search space or near the Pareto front.

The 15 benchmark problems are solved by FC-MOPSO and are verified against four well-known algorithms available in jMetal (Durillo and Nebro 2011). There are a variety of algorithms available in jMetal. However, the authors found out that four of them yield the best results within the limited number of function evaluations focused in this paper. Therefore, OMOPSO (Sierra and Coello Coello 2005), SMPSO (Nebro et al. 2009), NSGA-II (Deb et al. 2002) and SPEA2 (Zitzler et al. 2002) are to be employed.

Since EAs are random-based, the results are commonly presented after performing some Monte Carlo simulations. Hence, the results presented here are those obtained after 30 runs (M = 30) with the same starting random numbers for all optimizers. For all fifteen benchmark problems, the algorithms were run with a population size of 10 (n = 10) and the termination criteria was set to be 400 function evaluations or 40 generations equivalently (g = 40). Other parameters for the algorithms in jMetal such as mutation and crossover probability are reasonably left to be their default values in jMetal. Throughout this paper our FC-MOPSO algorithm uses a neighborhood size equal to N = 9 for cases with n = 10 and N = 4 for cases with n = 30.

There are many metrics available in the literature that can be employed for comparing two Pareto fronts but it should be noted that none of them is always reliable (Yan et al. 2007). Therefore, more than one metric is commonly used for comparing the performance of algorithms. Here, the hypervolume indicator (I H ) is used for estimating the diversity of a Pareto and its closeness to the true Pareto. The additive unary epsilon indicator (I ε+ ) is also utilized as another indicator of closeness to the true Pareto. A definition for hypervolume of a Pareto front is given in (23) where n p is the number of points in the Pareto front and r = [r 1 , r 2 ,…, r k ] is a reference point such that it is dominated by all points in the Pareto front. Other parameters in (23) have already been defined in previous sections.

$$ {I}_H\left({P}_f\right)=\frac{1}{n_p}\sum_{j=1}^{n_p}\prod_{i=1}^k\left({r}_i-{f}_i\left({\boldsymbol{x}}_j\right)\right) $$
(23)

In this paper, for each problem, all Pareto fronts from all M simulations besides the true Pareto of the problem are put together. This new set is normalized for each of its objective functions by using (8). These normalized Pareto fronts are utilized for evaluation through (23) with a reference point with all its elements set equal to 1.01. At each simulation, the Pareto fronts are normalized based on the maximum hypervolume in that simulation and the results are reported. Generally speaking, a higher value of I H indicates a more preferred Pareto regarding its diversity and closeness to the true Pareto. For the case of I ε+ indicator, on the other hand, the normalized Pareto fronts are employed to evaluate the metric for each Pareto in comparison to the true Pareto. The reader may refer to (Fonseca et al. 2005) for a definition of I ε+ . The smaller is the value of I ε+ , the closer it is to the true Pareto. With the above mentioned procedure, I H and I ε+ can take values between zero and one. Table 1 represents the mean and standard deviations for the problems discussed. Bolded values in Table 1 show the best result obtained among all five algorithms for each problem. Figs. 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19 illustrate the best Pareto fronts achieved after 30 runs for all problems except for the Water problem that has five objective functions and cannot be plotted. From Table 1, it can be verified that FC-MOPSO has the highest number of best results among all algorithms. Moreover, Figs. 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, and 19 along with Table 1 show that FC-MOPSO absolutely outperforms other optimizers in ZDT test suite problems except for ZDT4 that deals with multimodality. ZDT problems are of unconstrained type. Therefore, the excellent performance of FC-MOPSO in ZDT problems indicates that extending the original single-objective PSO to FC- MOPSO in Section 3 has been quite successful. Note that none of the optimizers considered in this paper except for SMPSO could converge to the true Pareto front of ZDT4 within 400 function evaluations. The results of these optimizers have been so far away from the true Pareto front that it was not possible to illustrate them in Fig. 9. Fortunately, structural engineering optimization problems do not include high multimodality.

Table 1 Results for benchmark MOPs after 30 runs
Fig. 6
figure 6

Pareto fronts for ZDT1 problem

Fig. 7
figure 7

Pareto fronts for ZDT2 problem

Fig. 8
figure 8

Pareto fronts for ZDT3

Fig. 9
figure 9

Pareto fronts for ZDT4

Fig. 10
figure 10

Pareto fronts for ZDT6 problem

Fig. 11
figure 11

Pareto fronts for TNK problem

Fig. 12
figure 12

Pareto fronts for Golinski problem

Fig. 13
figure 13

Pareto fronts for CONSTR problem

Fig. 14
figure 14

Pareto fronts for Srinivas problem

Fig. 15
figure 15

Pareto fronts for CTP2 problem

Fig. 16
figure 16

Pareto fronts for CTP3 problem

Fig. 17
figure 17

Pareto fronts for CTP4 problem

Fig. 18
figure 18

Pareto fronts for CTP5 problem

Fig. 19
figure 19

Pareto fronts for CTP6 problem

6.2 Truss design examples

Five truss design examples are solved to analyze the performance of FC-MOPSO in structural optimization problems. Both discrete and continuous variants of FC- MOPSO are adopted for solving these problems. Furthermore, a problem with mixed discrete and continuous design variables is solved in Section 6.2.5. All continuous problems are also solved by the following algorithms: OMOPSO, SMPSO, NSGA-II and SPEA2. Discrete variants of the problems are solved by NSGA-II. Pareto fronts obtained from FC-MOPSO and other optimizers are reported after 20 runs for problems with two objective functions. For problems with more than two objective functions, the results are presented after 10 runs. All problems are solved for two cases: (1) n = 10, g = 10 and (2) n = 30, g = 100. It should be reminded that the discrete search space is a subdomain of the continuous search space. Hence, potentially, the solutions of a continuous optimization problem can outperform discrete solutions since some parts of the true Pareto front might correspond to regions of the search space that are not available in discrete search space.

6.2.1 Spatial 25-bar truss

The spatial 25-bar truss shown in Fig. 20 was studied as an SOP in many researches such as (Lee and Geem 2004) and (Talatahari et al. 2012). In this paper, a discrete multi-objective approach to this problem that was defined by Luh and Chueh (2004) is considered. It is desired to minimize two objective functions simultaneously: (1) the weight of the structure and (2) the displacement of node 1 in Y-direction. The constraints are defined such that the principal stress in no element exceeds an allowable stress equal to σ a  = ±40 ksi. Furthermore, available discrete cross sectional areas are assumed to be D = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.8, 3.0, 3.2, 3.4] (in 2). The problem is also solved by considering continuous cross-sectional areas larger than 0.01 in 2 and less than 3.4 in 2. There are eight sizing variables deriving from the following member grouping: (1) A 1, (2) A 2 ~ A 5, (3) A 6 ~ A 9, (4) A 10 ~ A 11, (5) A 12 ~ A 13, (6) A 14 ~ A 17, (7) A 18 ~ A 21, and (8) A 22 ~ A 25. A modulus of elasticity equal to E = 10 Msi and a density equal to ρ = 0.1 lb./in 3 is used for all members. Loading conditions for this truss are summarized in Table 2.

Fig. 20
figure 20

Schematic of the spatial 25-bar truss structure

Table 2 Load conditions for 25-bar truss

Fig. 21 compares the results obtained by FC-MOPSO with those from the other optimizers. I H and I ε+ indicators for 3000 and 100 function evaluations are reported in Tables 3 and 4, respectively. Similarly, comparisons between discrete solutions found by FC-MOPSO with those from NSGA-II are reported in Table 5.

Fig. 21
figure 21

Comparison of the Pareto fronts obtained by different algorithms in the 25-bar truss problem

Table 3 Performance comparison of continuous solutions from FC-MOPSO with other algorithms for truss examples (n = 30, g = 100)
Table 4 Performance comparison of continuous solutions from FC-MOPSO with other algorithms for truss examples (n = 10, g = 10)
Table 5 Performance comparison of discrete solutions from FC-MOPSO with NSGA-II for truss examples

For the case of 3000 function evaluations, Table 3 shows that the continuous FC-MOPSO outperforms all other optimizers as far as it concerns I H indicator. From this table, it can also be confirmed that FC-MOPSO is the second best optimizer regarding the I ε+ metric. The difference between I ε+ for FC-MOPSO and the best I ε+ , which was found by SPEA2, is very small. Table 4 and Fig. 21a show that for very small number of function evaluations (namely, one hundred evaluations) FC-MOPSO is the best optimizer regarding both I H and I ε+ indicators. In other words, the relative performance of FC-MOPSO, compared with other optimizers studied, increases when a smaller number of function evaluations is used.

As it can be seen from Table 5 and Fig. 21c, for both cases of 100 and 3000 function evaluations, discrete solutions found by FC-MOPSO outperform those from NSGA-II with a remarkable difference. It is noted that the value of I H relating to NSGA-II for 100 evaluations is larger than I H for 3000 evaluations which may seem counterintuitive. The reason is that, as it can be seen from Fig. 21c, most points found by NSGA-II in 100 function evaluations are gathered in a small central portion of the true Pareto front. Such situations can result in delusively high values of I H . That is why, as it was already discussed in Section 6.1, the results should always be justified with more than one metric along with visual inspections whenever it is possible. For this case, the I ε+ indicator does not yield counterintuitive results since its value for 100 function evaluations is smaller than its value for 3000 function evaluations.

The small standard deviation values of FC-MOSO reported in Tables 3, 4, and 5 confirm that the algorithm have a consistent behavior in 20 runs.

6.2.2 Spatial 56-bar truss

The spatial 56-bar truss shown in Fig. 22 is considered as an example of structural MOPs with more than two objective functions. As a discrete optimization problem, practical equal leg double angles as listed in Table 6 are used. Accordingly, cross sectional areas between 200 mm 2 and 2000 mm 2 should be used for a continuous problem. Loading conditions for this truss consist of 4 kN in the Y-direction and 30 kN in the Z- direction at node 1 while all other nodes are loaded with 4 kN in the Y-direction and 10 kN in the Z-direction. The objective functions are to minimize the weight of the structure, the total displacement of node 1 and the first natural frequency of the truss. Note that these objective functions are in conflict with one another. The mass matrix of the truss is assembled by adopting lumped mass formulation for truss elements. The vertical displacements of nodes 4, 5, 6, 12, 13 and 14 is limited to ±40 mm while the displacement in the Y-direction for node 8 is not allowed to exceed ±20 mm. The modulus of elasticity and material density are assumed to be E = 210 GPa and ρ = 7800 kg/m 3, respectively.

Fig. 22
figure 22

Schematic of the spatial 56-bar truss structure

Table 6 List of equal leg double angles

Fig. 23 illustrates the results obtained from FC-MOPSO along with the optimum solutions obtained by other optimizers. Fig. 23a and b show the best continuous Pareto fronts obtained by different optimizers for 100 and 3000 function evaluations, respectively. It can be observed that FC-MOPSO could find an acceptable and diverse approximation of the Pareto front while other optimizers have failed to converge to such solutions.

Fig. 23
figure 23

Comparisons of optimal solutions obtained by different optimizers for the 56 bar truss problem

Fig. 23b shows that the continuous FC-MOPSO almost approximated the whole Pareto front while the results of other optimizers cover only a small portion of the Pareto front. Similarly, Fig. 23a shows that for 100 function evaluations and in comparison to other optimizers, FC-MOPSO has managed to find better approximation of the Pareto front.

Results from discrete solutions are illustrated in Fig. 23c. This figure shows that FC-MOPSO could yield much better solutions than NSGA-II. Especially for 100 function evaluations, NSGA-II approximates only a very small portion of the Pareto front in comparison to FC-MOPSO where the whole Pareto front is well approximated.

FC-MOPSO preserves the best experiences of each particle in P i sets. This preservation along with the techniques that are carried out for pruning the archives and proper selection of the leaders gives FC-MOPSO a better chance for exploring the search space in comparison to other optimizers. Note that using the I H and I ε+ indicators for comparing two Pareto fronts can be counter-intuitive if there is a great difference in the number of the Pareto optimal points. The number of Pareto optimal points found by FC-MOPSO at each run is much larger than those from other optimizers for this problem as well as the optimization problem defined in Section 6.2.5. Hence, the I H and I ε+ indicators are not evaluated for these problems.

6.2.3 Spatial 72-bar truss

Another well-known optimization test problem often considered in the literature is the weight optimization of the spatial 4-story 72-bar truss shown in Fig. 24. See, for example, (Adeli and Kamal 1986); (Erbatur et al. 2000); (Lee and Geem 2004); (Jansen and Perez 2011). In this study, discrete sizing optimization is carried out by using the first 31 sections defined in Table 6. For a continuous solution, cross sectional areas between 0.1 in 2 and 2.5 in 2 are utilized. Four objective functions are considered for simultaneous minimization: (1) the weight of the structure, (2) the maximum value of displacements of the uppermost nodes in X and Y directions, (3) the first natural frequency and (4) the total potential energy of the structure. The constraints imposed on the problem are to limit the stress of elements to ±25 ksi. The modulus of elasticity and density are assumed to be 10 Msi and 0.1 lb./in 3, respectively. Two load cases are considered. The first one is to apply P X  = 5.0 kips, P Y  = 5.0 kips and P Z  = −5.0 kips at node 17. The second load case consists of P Z  = −5 kips, applied to nodes 17, 18, 19 and 20. Cross-sectional areas of elements are linked into the following 16 groups: (1) A 1 ~ A 4, (2) A 5 ~ A 12, (3) A 13 ~ A 16, (4) A 17 ~ A 18, (5) A 19 ~ A 22, (6) A 23 ~ A 30, (7) A 31 ~ A 34, (8) A 35 ~ A 36, (9) A 37 ~ A 40, (10) A 41 ~ A 48, (11) A 49 ~ A 52, (12) A 53 ~ A 54, (13) A 55 ~ A 58, (14) A 59 ~ A 66, (15) A 67 ~ A 70, and (16) A 71 ~ A 72.

Fig. 24
figure 24

Schematic of the spatial 72-bar truss structure

It is not possible to illustrate the 4D Pareto fronts obtained by the optimizers. Tables 7, and 8 show the extreme points of Pareto fronts found by different algorithms. However, they cannot be used for drawing a final conclusion since it can be observed that most extreme points of the Pareto fronts obtained from each optimizer is a nondominated point with respect to the corresponding extreme points found by other optimizers. Therefore, I H and I ε+ should be used for comparing the performance of the optimizers. Table 3 shows that the continuous FC-MOPSO outperforms other optimizers in terms of hypervolume. Standard deviations are also reasonably small and confirm the stability of FC-MOPSO. From Table 5, it can be confirmed that FC-MOPSO absolutely outperforms NSGA-II in discrete solutions. It is noted that the focus of FC-MOPSO is not solving problems with many objective functions (say, four and more numbers of objective functions) since these kinds of problems are not conventional in design of civil structures.

Table 7 Extreme points of Pareto fronts found by different algorithms for the 72-bar truss problem with n = 30, g = 100
Table 8 Extreme points of Pareto fronts found by different algorithms for the 72-bar truss problem with n = 10, g = 10

6.2.4 Spatial 120-bar truss

The spatial 120-bar truss shown in Fig. 25 was defined by Soh and Yang (1996) and is taken here as an example of medium-scale structure to further analyze the performance of the FC-MOPSO algorithm. This truss was optimized for its weight by Lee and Geem (2004) by considering continuous tubular cross sections between 0.775 in 2 and 20.0 in 2. In this study, two objective functions are minimized: (1) the total weight of the structure (2) the maximum displacements of all nodes in all coordinate directions. Practical discrete tubular cross-sectional areas are defined in Table 9.

Fig. 25
figure 25

Schematic of the spatial 120-bar truss structure

Table 9 List of tubular sections employed for the 120-bar truss example

Furthermore, cross sections between 0.775 in 2 and 20.0 in 2 are considered for a continuous problem definition. Yield stress, modulus of elasticity and density are set equal to σ y  = 58 ksi, E = 30,450 ksi and ρ = 0.288 lb./in 3, respectively. Stress constraints based on AISC-ASD (1989) are applied, namely tensile stresses are limited to \( {\sigma}_{all,i}^{-}=0.6{\sigma}_y \) for each element while compressive stresses are limited to \( {\sigma}_{all,i}^{+} \) as follows.

$$ {\sigma}_{all,i}^{+}=\left\{\begin{array}{c}\ \left[\left(1-\frac{\lambda_i}{2{C}_c^2}\right){\sigma}_y\right]/\left(\frac{5}{3}+\frac{3{\lambda}_i}{8{C}_c}-\frac{\lambda_i^3}{8{C}_c^3}\right),\mathrm{if}\ {\lambda}_i<{C}_c\kern0.5em \\ {}\frac{12{\pi}^2E}{23{\lambda}_i^2},\kern9em \mathrm{otherwise}\kern4em \end{array}\right. $$
(24)

where C c  = (2π2 E/σ y )0.5 and λ i  = k i L i /r i . k i , L i and r i represent the effective length factor (equal to 1 for all members in this example), length of element and radius of gyration for each element, respectively. All elements are clustered into seven groups as defined in Fig. 25. All loads are applied in Z-direction and consist of −13.49 kips at node 1, −6.744 kips at nodes 2 through 14 and −2.247 kips on all other unsupported nodes.

Continuous solutions for 3000 function evaluations are presented in Fig. 26b. From this figure along with Table 3, it can be confirmed that all optimizers have yielded good approximations of the Pareto front. Especially, I H and I ε+ values for FC-MOPSO, SPEA2 and NSGA-II are very close to unity and zero, respectively. As it can be seen from Fig. 26a and Table 4, FC-MOPSO outperforms all other optimizers for 100 function evaluations. These results imply that the relative performance of FC-MOPSO, compared with other optimizers, is improved when a smaller number of function evaluations is used.

Fig. 26
figure 26

Comparisons of optimal solutions obtained by different optimizers for the 120 bar truss problem

The best Pareto fronts for the discrete problem are shown in Fig. 26c from which the efficiency of FC- MOPSO over NSGA-II can be clearly confirmed. The superiority of FC-MOPSO is also confirmed from Table 5 where the I ε+ indicators for FC-MOPSO are much better than those from NSGA-II. Hypervolume values for FC-MOPSO are very close to the values related to NSGA-II.

6.2.5 Sizing and geometric optimization of the spatial 25-bar truss with mixed design variables

The spatial 25-bar truss problem defined in Section 6.2.1 has been solved as a single-objective weight optimization problem with mixed design variables in the literature. See, for example, (Gholizadeh 2013) and (Mortazavi and Toğan 2016). In this paper, three objective functions (1) the weight of the structure (2) the maximum of all nodal displacements and (3) the first natural frequency of the structure will be minimized simultaneously. Material properties, load conditions, constraints on sizing variables and available discrete cross sectional areas are already defined in Section 6.2.1. The problem includes thirteen design variables: (1) A 1, (2) A 2 ~ A 5, (3) A 6 ~ A 9, (4) A 10 ~ A 11, (5) A 12 ~ A 13, (6) A 14 ~ A 17, (7) A 18 ~ A 21, (8) A 22 ~ A 25 (9) x 4 = x 5 = − x 3 = − x 6, (10) y 3 = y 4 = − y 5 = − y 6, (11) z 3 = z 4 = z 5 = z 6, (12) x 8 = x 9 = −x 7 = −x 10, and (13) y 7 = y 8 = −y 9 = −y 10. The first eight variables are discrete sizing variables and the latter five variables represent nodal coordinates that are considered as continuous geometric variables. All other nodal coordinates are fixed at the coordinates shown in Fig. 20. Furthermore, the following constraints are applied on geometric variables:

  • 20 in. ≤ x 4 ≤ 60 in.

  • 40 in. ≤ y 4 ≤ 80 in.

  • 90 in. ≤ z 4 ≤ 130 in.

  • 40 in. ≤ x 8 ≤ 80 in.

  • 100 in. ≤ y 8 ≤ 140 in.

Continuous and discrete FC-MOPSO should be utilized simultaneously for solving the problem. In other words, the algorithm of Section 4 is used for dealing with continuous geometric variables while the algorithm proposed in Section 5 is used for dealing with discrete sizing variables. This method shall be called mixed FC-MOPSO.

The results obtained from FC-MOPSO and NSGA-II are illustrated in Fig. 27. An approximation of true Pareto front is also illustrated. The true Pareto front was approximated by considering all Pareto fronts that were obtained from all simulations of the two optimizers.

Fig. 27
figure 27

Comparisons of optimal solutions obtained by FC-MOPSO and NSGA-II for sizing and geometric optimization of the 25 bar truss problem

Fig. 27a and b show that FC-MOPSO clearly outperforms NSGA-II in solving the problem with mixed continuous and discrete design variables. In fact, NSGA-II could approximate a small portion of the Pareto front. Fig. 27b shows that almost all points corresponding to the approximation of true Pareto belong to FC-MOPSO.

Extreme points of the Pareto fronts are also listed in Table 10. However, as it was discussed in Section 6.2.3, this table cannot be used for preferring one algorithm over another.

Table 10 Extreme points of Pareto fronts for sizing and geometric optimization of the 25-bar truss problem

7 Conclusions

A new PSO-based approach called FC-MOPSO was developed for handling multi-objective optimization problems with continuous or discrete variables and including different types of constraints. The extension of the original single-objective PSO to a multi-objective PSO was carefully accomplished to preserve the basic rationale in PSO.

FC-MOPSO reserves the best positions that each individual has experienced as a nondominated set. These non-dominated sets are subsequently used for selecting a leader and a local best particle for each individual and updating its position. This feature supports the swarm with the most possible information about the best regions of the search space. Such procedure can result in a fast rate of convergence but may also lead to converging to local minima. However, FC-MOPSO avoids premature convergence by restricting the size of archives, utilizing appropriate selection procedures and applying mutation operators. Restricting the size of archives and the selection procedure is done in a way that the swarm is encouraged to move to less explored regions of the search space. Such characteristics increase the chance of finding an acceptable approximation of the Pareto in limited number of function evaluations. Furthermore, extension of the single-objective PSO to FC-MOPSO was accomplished in a way that it is also possible to use any desired neighborhood topology other than the random one that was recommended in this paper. In other words, any neighborhood topology defined for single objective PSO methods can be readily applied to FC-MOPSO as well.

The methodologies proposed in this paper are effective, consistent with each other and easy to implement. The continuous and discrete FC-MOPSO can be readily mixed with each other to solve problems with mixed design variables. Therefore, FC-MOPSO can be applied to unconstrained/constrained continuous/ discrete/mixed multi-objective problems while most other PSO-based algorithms available in the literature are not meant to address all these scenarios.

The performance of FC-MOPSO was validated by considering a wide range of well-known benchmark problems. It was shown that the proposed method performs very well in finding acceptable approximations of Pareto fronts within limited number of function evaluations. This characteristic along with the capability of handling discrete and mixed search spaces makes FC-MOPSO an appropriate tool for structural optimization problems. It is an effective method because most structural MOPs deal with discrete or mixed discrete and continuous design variables on one hand and suffer from high costs in function evaluations on the other hand. Function evaluations can be quite costly especially for cases where more rigorous analyses such as nonlinear time history analyses are to be adopted.

The proposed method avoids defining parameters that need to be tuned. Two types of parameters could be distinguished in FC-MOPSO: (1) the three parameters inherited from its ancestor (i.e. ϕ 1, ϕ 2 and α) and (2) all other parameters specifically defined for FC-MOPSO such as p m , ρ j and C(x i ). The user is not asked to tune type 1 parameters because of the theoretical background for stability of PSO algorithms. Type 2 parameters are not required to be tuned either. Because, they were defined in a way that parameter tuning would not be required.

The number of design variables of the problems considered in this paper was at most sixteen variables for truss examples and thirty variables for continuous benchmark MOPs. FC-MOPSO could find satisfactory results for these problems within 100 and 400 function evaluations. Naturally, for larger problems it may be necessary to use higher number of function evaluations. It can be done by considering larger populations or larger number of generations or both. If the population size is very small (say for example, n = 10), the swarm will receive little information about the search space. Moreover, it is likely that the number of function evaluations is small in such cases (for example, large number of evaluations like n = 10, g = 1000 is an unlikely scenario because n = 100, g = 100 is probably preferred) and the swarm will have to converge to a solution more rapidly. Therefore, using neighborhood sizes equal to n - 1 is recommended in such cases. N = n - 1 somehow represents a fully informed PSO. For larger population sizes it is recommended to adopt small neighborhood sizes to restrict the amount of information passed to each particle and thereby, avoiding premature convergence. For example, N = 4 and N = 8 can be used for n = 30 and n = 100, respectively.