1 Introduction

Many problems in engineering and other fields of research can be posed as optimization problems aimed at finding an optimum design solution. To model a real-life situation faithfully, a designer usually has to consider several objectives at once, and the model approaches the real problem more closely as the number of objectives increases.

Accordingly, researchers now tend to formulate their problems with various objectives that are optimized simultaneously.

In contrast to an ordinary optimization problem (having only a single objective), multi-objective problems (MOPs) do not have a single solution (Glover and Kochenberger 2003). Depending on the designer’s decision, an optimum design solution may be extracted from the set of Pareto front solutions (Deb 2001; Coello et al. 2002).

Among optimization algorithms, metaheuristic methods have shown their potential for finding the near-optimal solution to the numerical real-valued test problems (Osman and Laporte 1996; Blum and Andrea 2003). Over the last decades, numerous algorithms have been widely used to solve MOPs (Wang et al. 2012), since MOPs are widely observed in the domains of science and engineering (Lin and Chen 2013).

Numerous metaheuristic algorithms provide designers with flexible means for solving optimization problems. Such methods are usually based on mathematical rules and models that imitate natural phenomena or real-life events in order to search the domain space and optimize the given problem.

The concepts of metaheuristic algorithms are inspired by various natural and social phenomena: natural selection and evolution in genetic algorithms (GAs) (Holland 1975), the food-searching behavior of animals in particle swarm optimization (PSO) (Kennedy and Eberhart 1995), and socio-political evolution in the imperialist competitive algorithm (ICA) (Atashpaz and Lucas 2007).

Considering multi-objective approaches, a new version of the non-dominated sorting genetic algorithm (NSGA) (Srinivas and Deb 1995) alleviated the shortcomings (i.e., high computational effort, non-elitist approach, and specifying sharing parameters) of the NSGA. The improved method is known as NSGA-II (Deb et al. 2002a). Knowles and Corne (2000) came up with a new method called the Pareto archived evolution strategy (PAES). The PAES employs a local search approach for creating new generations using the population information from its selection process.

The PSO has also attracted the attention of researchers as a multi-objective optimizer for handling several objective functions (Coello and Lechuga 2002; Mostaghim and Teich 2003; Sierra and Coello 2005). For instance, Kaveh and Laknejadi (2011) combined the concept of the PSO with their own method, the charge system search (CSS), to solve MOPs; the resulting method is called the CSS-MOPSO.

In addition, many optimizers have recently been proposed in the literature for tackling MOPs, with the aim of improving the exploration of non-dominated solutions (Zitzler and Thiele 1999; Zitzler et al. 2001; Gao and Wang 2010; Pradhan and Panda 2012; Wang et al. 2012; Mahmoodabadi et al. 2013).

In this paper, a recently proposed metaheuristic method based on the water cycle process is used to tackle MOPs. The water cycle algorithm (WCA) was first suggested by Eskandar et al. (2012), who applied and validated it on constrained optimization problems. The main purpose of this paper is to show the potential and performance of the WCA for solving multi-objective functions.

The remainder of this paper is organized as follows: definitions of standard MOPs are given in Sect. 2, together with the performance criteria used for a quantitative assessment of MOPs. In Sect. 3, the WCA and the multi-objective water cycle algorithm (MOWCA) and their concepts are described in detail. Section 4 presents, in the form of tables and figures, comparisons of the statistical optimization results obtained by the MOWCA with those of other optimizers for the reported problems. The numerical examples and benchmark functions considered in this paper, along with their mathematical formulations, are provided in Appendix A. Finally, conclusions are drawn in Sect. 5.

2 Multi-objective problems

Many real-life problems are, by nature, a form of MOP. In many fields of science and engineering, multiple objective functions should be considered and optimized simultaneously. A MOP can be formulated as follows:

$$\begin{aligned} F(X)=\left[ f_1 (X),f_2 (X),\dots ,f_N (X)\right] ^\mathrm{T}, \end{aligned}$$
(1)

where \(X=[x_{1}, x_{2}, x_{3},\dots ]\) is a vector of design variables. The simplest approach to MOPs is to assign a weighting factor to each objective function and add the weighted functions together (Haupt and Haupt 2004):

$$\begin{aligned} F=\sum \limits _{n=1}^N {w_n \, f_n }, \end{aligned}$$
(2)

where \(N\) is the number of objective functions, and \(w_{n}\) and \(f_{n}\) are the weighting factors and objective functions, respectively. The major drawback of this technique (Eq. 2) is the selection of suitable values for the weighting factors \(w_{n}\): different values of \(w_{n}\) give different optimal solutions for the same \(f_{n}\).
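The sensitivity of Eq. (2) to the weights can be illustrated with a minimal sketch. The two quadratic objectives `f1` and `f2`, the candidate grid, and the helper names are hypothetical choices made only for this illustration; they are not taken from the benchmark set of this paper.

```python
from typing import Callable, List, Sequence

Objective = Callable[[Sequence[float]], float]


def weighted_sum(x: Sequence[float],
                 objectives: Sequence[Objective],
                 weights: Sequence[float]) -> float:
    """Scalarize several objectives as in Eq. (2): F = sum_n w_n * f_n(x)."""
    if len(objectives) != len(weights):
        raise ValueError("one weight is needed per objective")
    return sum(w * f(x) for w, f in zip(weights, objectives))


# Hypothetical bi-objective toy problem with a single design variable.
def f1(x: Sequence[float]) -> float:
    return x[0] ** 2


def f2(x: Sequence[float]) -> float:
    return (x[0] - 2.0) ** 2


# Coarse grid of candidate designs on [0, 2] (an assumption for the demo).
CANDIDATES: List[float] = [i / 100.0 for i in range(201)]


def best_candidate(weights: Sequence[float]) -> float:
    """Return the grid point that minimizes the weighted sum."""
    return min(CANDIDATES,
               key=lambda x0: weighted_sum([x0], [f1, f2], weights))


if __name__ == "__main__":
    # Different weights select different "optimal" designs.
    print(best_candidate((0.5, 0.5)))
    print(best_candidate((0.9, 0.1)))
```

Each weight vector selects a single compromise design, so recovering a whole trade-off curve in this way would require repeated runs with different weights.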

Alternatively, the Pareto front approach can be used to solve MOPs. A MOP usually has a set of solutions known as Pareto optimal solutions or non-dominated solutions (Coello 2000). The main purpose of multi-objective optimization is to find as many non-dominated solutions as possible. The non-dominated solutions are defined as follows (Wang et al. 2012):

(a) Pareto dominance: a vector \(U=(u_{1}, u_{2}, u_{3},\dots ,u_{n})\) dominates \(V=(v_{1}, v_{2}, v_{3},\dots ,v_{n})\), written \(U<V\), if and only if \(U\) is partially less than \(V\) in the objective space, which means:

$$\begin{aligned} \left\{ \begin{array}{l} f_i (U) \le f_i (V)\quad \forall i \\ f_i (U) < f_i (V)\quad \exists i \\ \end{array} \right. \;\; i = 1,2,3,\ldots ,N, \end{aligned}$$
(3)

where \(N\) is the number of objective functions.

(b) Pareto optimal solution: a vector \(U\) is a Pareto optimal solution if and only if no other solution dominates \(U\). The set of Pareto optimal solutions is called the Pareto optimal front (PF\(_{\mathrm{optimal}}\)).

Figure 1 illustrates the concept of Pareto optimality for bi-objective problems. As can be seen in Fig. 1, solutions \(A\) and \(B\) are non-dominated solutions because neither of them dominates the other for the given objectives.

Fig. 1 Optimal Pareto solutions (\(A\) and \(B\)) for the two-dimensional domain

To clarify further, solution \(A\) has a lower value of \(f_{1}\) than solution \(B\). However, the value of \(f_{2}\) for solution \(A\) is higher than that for solution \(B\) (see Fig. 1). In contrast, solution \(C\) is dominated by solutions \(A\) and \(B\), which have lower values for both objective functions (\(f_{1}\) and \(f_{2}\)), as shown in Fig. 1. Solution \(C\) is called a dominated solution, and solutions \(A\) and \(B\) are known as Pareto optimal solutions (non-dominated solutions).
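The dominance test of Eq. (3) and the resulting non-dominated filter can be sketched as follows for minimization. The coordinates assigned to \(A\), \(B\), and \(C\) are hypothetical values chosen only to mirror the situation described above; they are not read from the figure.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """True if objective vector u Pareto-dominates v (minimization, Eq. 3)."""
    no_worse = all(a <= b for a, b in zip(u, v))       # f_i(U) <= f_i(V) for all i
    strictly_better = any(a < b for a, b in zip(u, v))  # f_i(U) <  f_i(V) for some i
    return no_worse and strictly_better


def non_dominated(points: Sequence[Point]) -> List[Point]:
    """Keep the points that no other point in the set dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (f1, f2) values: A is best in f1, B is best in f2,
# and C is worse than both A and B in both objectives.
A: Point = (1.0, 4.0)
B: Point = (3.0, 2.0)
C: Point = (4.0, 5.0)

if __name__ == "__main__":
    print(non_dominated([A, B, C]))  # A and B remain, C is filtered out
```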

2.1 Performance metrics

In order to have an accurate evaluation for the proposed MOWCA to solve MOPs, three factors are usually taken into consideration (Zitzler et al. 2000). These three criteria are given in the following subsections.

2.1.1 Generational distance metric

The generational distance (GD) metric is a criterion for the convergence of the generated (calculated) Pareto front (PF\(_{\mathrm{g}}\)) towards the Pareto optimal front (PF\(_{\mathrm{optimal}}\)). It is based on the Euclidean distance between each resulting non-dominated solution and PF\(_{\mathrm{optimal}}\) (Kaveh and Laknejadi 2011).

Based on this definition, the algorithm with the smallest GD has the best convergence among those compared. Different variants of the GD have been reported in the literature (Kaveh and Laknejadi 2011; Coello 2004):

$$\begin{aligned}&\mathrm{GD1}=\left( {\frac{1}{n_{\mathrm{pf}} }\sum \limits _{i=1}^{n_{\mathrm{pf}} } {d_i^2 } }\right) ^{1/2}, \end{aligned}$$
(4)
$$\begin{aligned}&\mathrm{GD2}=\frac{1}{n_{\mathrm{pf}} }\left( {\sum \limits _{i=1}^{n_{\mathrm{pf}} } {d_i^2 } }\right) ^{1/2}, \end{aligned}$$
(5)

where \(n_{\mathrm{pf}}\) is the number of members in PF\(_{\mathrm{g}}\) and \(d_i\) is the Euclidean distance between the \(i\)th member of PF\(_{\mathrm{g}}\) and the nearest member of PF\(_{\mathrm{optimal}}\). The Euclidean distance \(d\) is obtained from the following equation:

$$\begin{aligned} d(p,q)=d(q,p)=\left[ {\sum \limits _{i=1}^n {(f_{iq} -f_{ip} )^2} } \right] ^{1/2}, \end{aligned}$$
(6)

where \(q=(f_{1q},f_{2q},f_{3q}, {\ldots }, f_{nq})\) is a point on PF\(_{\mathrm{g}}\) and \(p=(f_{1p},f_{2p},f_{3p}, {\ldots }, f_{np})\) is the member of PF\(_{\mathrm{optimal}}\) nearest to \(q\). Figure 2 shows a schematic view of this performance metric in two-dimensional space. The best possible value of the GD metric is zero, which means that PF\(_{\mathrm{g}}\) lies exactly on PF\(_{\mathrm{optimal}}\).

Fig. 2 Schematic view of GD criterion for the MOPs
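Equations (4) to (6) can be sketched in a few lines of code. The function names and the representation of each front as a list of objective vectors are assumptions of this sketch; in practice PF\(_{\mathrm{optimal}}\) would be a dense sample of the known optimal front.

```python
import math
from typing import List, Sequence

Front = Sequence[Sequence[float]]


def euclidean(p: Sequence[float], q: Sequence[float]) -> float:
    """Eq. (6): Euclidean distance between two objective vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_distances(pf_g: Front, pf_optimal: Front) -> List[float]:
    """d_i: distance from each member of PF_g to its nearest member of PF_optimal."""
    return [min(euclidean(q, p) for p in pf_optimal) for q in pf_g]


def gd1(pf_g: Front, pf_optimal: Front) -> float:
    """Eq. (4): root of the mean squared distance."""
    d = nearest_distances(pf_g, pf_optimal)
    return math.sqrt(sum(di ** 2 for di in d) / len(d))


def gd2(pf_g: Front, pf_optimal: Front) -> float:
    """Eq. (5): root of the summed squared distances, divided by n_pf."""
    d = nearest_distances(pf_g, pf_optimal)
    return math.sqrt(sum(di ** 2 for di in d)) / len(d)


if __name__ == "__main__":
    # Hypothetical sampled fronts, used only to exercise the functions.
    optimal = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
    generated = [(0.1, 1.0), (0.5, 0.6), (1.0, 0.1)]
    print(gd1(generated, optimal), gd2(generated, optimal))
```

For the same set of distances, GD2 equals GD1 divided by \(\sqrt{n_{\mathrm{pf}}}\), so values obtained with the two variants are not directly comparable.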

2.1.2 Metric of spacing

The metric of spacing \((S)\) describes the distribution of non-dominated solutions along the generated Pareto front (Kaveh and Laknejadi 2011). In other words, its main objective is to show how the non-dominated solutions are distributed in the objective space. As with the GD metric, different formulations of the \(S\) metric have been suggested (Kaveh and Laknejadi 2011; Coello 2004):

$$\begin{aligned}&S1=\frac{\left[ {\frac{1}{n_{\mathrm{pf}} }\sum \limits _{i=1}^{n_{\mathrm{pf}} } {(d_i -\bar{d})^2} } \right] ^{1/2}}{\bar{d}}, \end{aligned}$$
(7)
$$\begin{aligned}&S2=\left[ {\frac{1}{n_{\mathrm{pf}} -1}\sum \limits _{i=1}^{n_{\mathrm{pf}} } {(d_i -\bar{d})^2} } \right] ^{1/2}, \end{aligned}$$
(8)

where \(\bar{d}\) is the mean value of all \(d_{i}\). The smaller the value of \(S\), the more uniform the distribution on PF\(_{\mathrm{g}}\). If all non-dominated solutions are uniformly distributed in PF\(_{\mathrm{g}}\), every \(d_{i}\) equals \(\bar{d}\) and the \(S\) metric is therefore zero.
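A minimal sketch of Eqs. (7) and (8) is given below. Both functions take the list of distances \(d_i\) as input. The helper `neighbour_distances`, which takes \(d_i\) to be the Euclidean distance from each member of PF\(_{\mathrm{g}}\) to its nearest neighbour in the same front, is an assumption of this sketch, since other definitions of \(d_i\) are also used in the literature.

```python
import math
from typing import List, Sequence


def s1(d: Sequence[float]) -> float:
    """Eq. (7): population standard deviation of d_i divided by the mean."""
    n = len(d)
    d_bar = sum(d) / n
    return math.sqrt(sum((di - d_bar) ** 2 for di in d) / n) / d_bar


def s2(d: Sequence[float]) -> float:
    """Eq. (8): sample standard deviation of d_i (divisor n_pf - 1)."""
    n = len(d)
    d_bar = sum(d) / n
    return math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))


def neighbour_distances(front: Sequence[Sequence[float]]) -> List[float]:
    """Assumed d_i: distance from each point to its nearest neighbour in the front."""
    return [
        min(math.dist(p, q) for j, q in enumerate(front) if j != i)
        for i, p in enumerate(front)
    ]


if __name__ == "__main__":
    # Hypothetical, evenly spaced front: both metrics are (numerically) zero.
    even_front = [(0.0, 3.0), (1.0, 2.0), (2.0, 1.0), (3.0, 0.0)]
    d = neighbour_distances(even_front)
    print(s1(d), s2(d))
```

S1 is normalized by \(\bar{d}\) and is therefore scale-free, whereas S2 keeps the units of the objective space.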

3 Multi-objective water cycle algorithm

3.1 Water cycle algorithm

The WCA mimics the flow of rivers and streams towards the sea and is derived from the observation of the water cycle process. Assume that some rain or precipitation occurs. An initial population of design variables (the population of streams) is randomly generated after the raining process. The best individual (i.e., the best stream), that is, the one with the minimum cost function value (for a minimization problem), is chosen as the sea (Eskandar et al. 2012).

Then, a number of good streams (i.e., cost function values close to the current best record) are chosen as rivers, while all other streams flow to the rivers and sea. In an \(N\) dimensional optimization problem, a stream is an array of \(1 \times N\). This array is defined as follows:

$$\begin{aligned} \mathrm{A}\;\mathrm{Stream}\;\mathrm{Candidate}=[x_1 ,x_2 ,x_3 ,\ldots ,x_N ], \end{aligned}$$
(9)

where \(N\) is the number of design variables (problem dimension). To start the optimization algorithm, an initial population, represented by a matrix of streams of size \(N_{\mathrm{pop}} \times N\), is generated randomly. The matrix of the initial population is given as follows (the numbers of rows and columns are the population size and the number of design variables, respectively):

$$\begin{aligned}&\mathrm{Total}\;\mathrm{Population}=\left[ {\begin{array}{l} \mathrm{Sea} \\ \mathrm{River_1} \\ \mathrm{River_2} \\ \mathrm{River_3} \\ \;\;\;\;\;\;\;\;\vdots \\ \mathrm{Stream}_{\mathrm{Nsr}+1} \\ \mathrm{Stream}_{\mathrm{Nsr}+2} \\ \mathrm{Stream}_{\mathrm{Nsr}+3} \\ \;\quad \;\vdots \\ \mathrm{Stream}_{N_{\mathrm{pop}} } \\ \end{array}} \right] \nonumber \\&\quad =\left[ {{\begin{array}{*{20}c} {x_1^1 } &{}\quad {x_2^1 } &{}\quad {x_3^1 } &{}\quad \cdots &{}\quad {x_N^1 } \\ {x_1^2 } &{}\quad {x_2^2 } &{}\quad {x_3^2 } &{}\quad \cdots &{}\quad {x_N^2 } \\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ {x_1^{N_{\mathrm{pop}} } } &{}\quad {x_2^{N_{\mathrm{pop}} } } &{}\quad {x_3^{N_{\mathrm{pop}} } } &{}\quad \cdots &{}\quad {x_N^{N_{\mathrm{pop}} } } \\ \end{array} }} \right] , \end{aligned}$$
(10)

where \(N_{\mathrm{pop}}\) and \(N\) are the population size and the number of design variables, respectively. Each decision variable (\(x_{1}, x_{2},\dots , x_{N}\)) can be represented as a floating point number (real value) for continuous problems or as a member of a predefined set for discrete problems. The cost of a stream is obtained by evaluating the cost function (\(C\)) as follows:

$$\begin{aligned} C_i =\mathrm{Cost}_i =f(x_1^i ,x_2^i ,\dots ,x_N^i )\quad \quad i=1,2,3,\dots ,N_{\mathrm{pop}}. \end{aligned}$$
(11)

In the first step, \(N_{\mathrm{pop}}\) streams are created. The \(N_{\mathrm{sr}}\) best individuals (those with the minimum cost values) are selected as the sea and rivers, and the stream with the minimum value among them is considered as the sea. In fact, \(N_{\mathrm{sr}}\) is the sum of the number of rivers (which is defined by the user) and a single sea (Eq. 12). The size of the rest of the population (i.e., the streams that flow to the rivers or directly to the sea) is calculated using Eq. (13):

$$\begin{aligned}&N_{\mathrm{sr}} =\mathrm{Number}{}\;\mathrm{of}\;\mathrm{Rivers}+\;\underbrace{\;\;\;\quad \;1\quad \quad \;\;}_{\;\mathrm{Sea}},\end{aligned}$$
(12)
$$\begin{aligned}&N_{\mathrm{Stream}} =N_{\mathrm{pop}} -N_{\mathrm{sr}}. \end{aligned}$$
(13)

Equation (14) shows the population of streams which flow to the rivers or sea. Indeed, Eq. (14) is a part of Eq. (10) (i.e., of the total population of individuals):

$$\begin{aligned}&\mathrm{Population}\;\mathrm{of}\;\mathrm{Streams}=\left[ {\begin{array}{l} \mathrm{Stream}_1 \\ \mathrm{Stream}_2 \\ \mathrm{Stream}_3 \\ \;\quad \;\vdots \\ \mathrm{Stream}_{N_{\mathrm{Stream}} } \\ \end{array}} \right] \nonumber \\&\quad =\left[ {{\begin{array}{*{20}c} {x_1^1 } &{}\quad {x_2^1 } &{}\quad {x_3^1 } &{}\quad \cdots &{}\quad {x_N^1 } \\ {x_1^2 } &{}\quad {x_2^2 } &{}\quad {x_3^2 } &{}\quad \cdots &{}\quad {x_N^2 } \\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ {x_1^{N_{_{\mathrm{Stream}} } } } &{}\quad {x_2^{N_{_{\mathrm{Stream}} } } } &{}\quad {x_3^{N_{_{\mathrm{Stream}} } } } &{}\quad \cdots &{}\quad {x_N^{N_{_{\mathrm{Stream}} } } } \\ \end{array} }} \right] .\nonumber \\ \end{aligned}$$
(14)

Depending on flow magnitude, each river absorbs water from streams. Hence, the amount of water entering a river and/or the sea varies from stream to stream. In addition, rivers flow to the sea, which is the most downhill location. The number of streams designated for each river and for the sea is calculated using the following equation (Eskandar et al. 2012):

$$\begin{aligned} \mathrm{NS}_n =\mathrm{round}\left\{ \left| {\frac{\mathrm{Cost}_n }{\sum \limits _{i=1}^{N_{\mathrm{sr}} } {\mathrm{Cost}_i } }} \right| \times N_{\mathrm{Stream}} \right\} ,\quad n=1,2,\ldots ,N_{\mathrm{sr}},\nonumber \\ \end{aligned}$$
(15)

where NS\(_{n}\) is the number of streams which flow to the specific river or the sea. As happens in nature, streams are created from raindrops and join each other to generate new rivers. Some streams may even flow directly to the sea. All rivers and streams end up in the sea, which corresponds to the current best solution.
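A minimal sketch of Eqs. (12), (13), and (15) follows. The function name and the example costs are hypothetical. Note that Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding used in other environments, and that the rounded counts are not guaranteed to sum to \(N_{\mathrm{Stream}}\); how any remainder is handled is left open here.

```python
from typing import List, Sequence


def assign_streams(costs_sr: Sequence[float], n_pop: int) -> List[int]:
    """Number of streams NS_n designated to the sea and each river (Eq. 15).

    costs_sr holds the costs of the sea and rivers, so N_sr = len(costs_sr)
    (Eq. 12) and N_Stream = N_pop - N_sr (Eq. 13).
    """
    n_sr = len(costs_sr)
    n_stream = n_pop - n_sr
    total = sum(costs_sr)
    if total == 0:
        raise ValueError("sum of costs is zero; Eq. (15) is undefined")
    return [round(abs(c / total) * n_stream) for c in costs_sr]


if __name__ == "__main__":
    # Hypothetical costs for one sea and three rivers, N_pop = 24.
    counts = assign_streams([1.0, 2.0, 3.0, 4.0], n_pop=24)
    print(counts, sum(counts))
```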

Let us assume that there are \(N_{\mathrm{pop}}\) streams of which \(N_{\mathrm{sr}}-1\) are selected as rivers and one is selected as the sea. Figure 3a shows the schematic view of a stream flowing towards a specific river along their connecting line.

Fig. 3 a Schematic description of the stream’s flow to a specific river; b schematic of the WCA optimization process

The distance \(X\) between the stream and the river may be randomly updated according to the following relation:

$$\begin{aligned} X\in (0,C\times d), C>1, \end{aligned}$$
(16)

where \(C\) is a value between 1 and 2, and \(C=2\) may be chosen as the best value; \(d\) is the current distance between the stream and the river. The value of \(X\) in relation (16) is a random number (uniformly distributed or drawn from any appropriate distribution) between 0 and \(C\times d\).

Setting \(C>1\) allows streams to flow in different directions towards rivers. This concept may also be used to describe rivers flowing to the sea. Therefore, for the exploitation phase of the WCA, the new positions of streams and rivers are given as follows (Eskandar et al. 2012):

$$\begin{aligned}&\!\!\!\overrightarrow{X}_{\mathrm{Stream}}^{i+1} =\overrightarrow{X}_{\mathrm{Stream}}^i +\mathrm{rand}\times C\times \left( \overrightarrow{X}_{\mathrm{River}}^i -\overrightarrow{X}_{\mathrm{Stream}}^i\right) ,\nonumber \\ \end{aligned}$$
(17)
$$\begin{aligned}&\!\!\!\overrightarrow{X}_{\mathrm{Stream}}^{i+1} =\overrightarrow{X}_{\mathrm{Stream}}^i +\mathrm{rand}\times C\times \left( \overrightarrow{X}_{\mathrm{Sea}}^i -\overrightarrow{X}_{\mathrm{Stream}}^i \right) ,\nonumber \\ \end{aligned}$$
(18)
$$\begin{aligned}&\!\!\!\overrightarrow{X}_{\mathrm{River}}^{i+1} =\overrightarrow{X}_{\mathrm{River}}^i +\mathrm{rand}\times C\times \left( \overrightarrow{X}_{\mathrm{Sea}}^i -\overrightarrow{X}_{\mathrm{River}}^i \right) ,\nonumber \\ \end{aligned}$$
(19)

where rand is a uniformly distributed random number between zero and one. Equations (17) and (18) apply to streams which flow to their corresponding rivers and to the sea, respectively. Symbols with a vector sign denote vector quantities; all other symbols and parameters are scalars. If the solution given by a stream is better than that of its connecting river, the positions of the river and the stream are exchanged (i.e., the stream becomes a river and the river becomes a stream). A similar exchange can be performed for a river and the sea.
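The position update of Eqs. (17) to (19) and the subsequent exchange can be sketched as follows. Following the convention stated above, `rand` is drawn once per move as a scalar. The function names, the `sphere` cost function, and the `rng` argument are hypothetical and serve only to make the sketch self-contained.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def flow_towards(position: Sequence[float], target: Sequence[float],
                 c: float = 2.0, rng=random) -> Vector:
    """Eqs. (17)-(19): X_new = X + rand * C * (X_target - X), rand scalar in [0, 1)."""
    r = rng.random()
    return [x + r * c * (t - x) for x, t in zip(position, target)]


def move_and_exchange(stream: Sequence[float], river: Sequence[float],
                      cost: Callable[[Sequence[float]], float],
                      c: float = 2.0, rng=random) -> Tuple[Vector, Vector]:
    """Move a stream towards its river, then swap roles if the stream is better.

    Returns the pair (stream, river) after the move and the possible exchange.
    The same routine can be applied to a (river, sea) pair.
    """
    new_stream = flow_towards(stream, river, c, rng)
    if cost(new_stream) < cost(river):
        return list(river), new_stream  # the stream becomes the river
    return new_stream, list(river)


def sphere(x: Sequence[float]) -> float:
    """Hypothetical single-objective cost used only for the demonstration."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    rng = random.Random(0)
    print(move_and_exchange([4.0, 4.0], [1.0, 1.0], sphere, rng=rng))
```

With \(C=2\), the moved stream can land anywhere between its old position and the mirror image of that position about the river, which is how a stream may overshoot its river and become the better solution of the pair.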

The evaporation operator is also introduced to avoid premature convergence to local optima during the exploitation phase. Basically, evaporation causes sea water to evaporate as rivers and streams flow to the sea, which leads to new precipitation. Therefore, it must be checked whether a river or stream is close enough to the sea for the evaporation process to occur. For that purpose, the following criterion is used as the evaporation condition:

$$\begin{aligned} \begin{array}{l} \mathbf{if}\;\left\| {\overrightarrow{X}_{\mathrm{Sea}}^i -\overrightarrow{X}_{\mathrm{River}}^i } \right\| <d_{\max } \;\;\mathrm{or}\;\;\mathrm{rand}<0.1,\quad i=1,2,3,\ldots ,N_{\mathrm{sr}} -1 \\ \quad \mathrm{Perform}\;\mathrm{raining}\;\mathrm{process}\;\mathrm{using}\;\mathrm{Eq}.~(20) \\ \mathbf{end} \\ \end{array} \end{aligned}$$

where \(d_{\mathrm{max}}\) is a small number close to zero. After evaporation, the raining process is applied and new streams are formed in different locations (similar to mutation in GAs). To clarify further, if the evaporation condition is satisfied for any river, that river is removed (i.e., evaporated) together with its streams. Afterwards, new streams, equal in number to the removed streams plus the removed river, are generated in new positions using Eq. (20). In the newly generated sub-population, the best stream acts as the new river and the other streams move toward it.

Indeed, the evaporation operator is responsible for the exploration phase in the WCA. The following equation is used to specify the new locations of the newly formed streams:

$$\begin{aligned} \overrightarrow{X}_{\mathrm{Stream}}^{\mathrm{new}} =\mathrm{L}\overrightarrow{\mathrm{B}}+\mathrm{rand}\times (\mathrm{U}\overrightarrow{\mathrm{B}}-\mathrm{L}\overrightarrow{\mathrm{B}}), \end{aligned}$$
(20)

where LB and UB are lower and upper bounds defined by the given problem, respectively. Similarly, the best newly formed stream is considered as a river flowing to the sea. The rest of new streams are assumed to flow into the rivers or may directly flow into the sea.
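The evaporation check and the raining process of Eq. (20) can be sketched as follows. Two points are assumptions of this sketch rather than statements of the source: the random number in `rain` is drawn independently for each design variable, so that new streams are spread over the whole box defined by the bounds, and the probability 0.1 in the evaporation condition is exposed as the hypothetical parameter `p_rain`.

```python
import math
import random
from typing import List, Sequence


def should_evaporate(sea: Sequence[float], river: Sequence[float],
                     d_max: float, rng=random, p_rain: float = 0.1) -> bool:
    """Evaporation condition: ||X_sea - X_river|| < d_max or rand < p_rain."""
    return math.dist(sea, river) < d_max or rng.random() < p_rain


def rain(lower: Sequence[float], upper: Sequence[float], rng=random) -> List[float]:
    """Eq. (20): X_new = LB + rand * (UB - LB).

    Assumption: rand is drawn per design variable.
    """
    return [lb + rng.random() * (ub - lb) for lb, ub in zip(lower, upper)]


if __name__ == "__main__":
    rng = random.Random(0)
    sea, river = [0.0, 0.0], [0.0, 1e-6]
    if should_evaporate(sea, river, d_max=1e-5, rng=rng):
        # Hypothetical bounds for a two-variable problem.
        print(rain([-5.0, 0.0], [5.0, 10.0], rng))
```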

A large value for \(d_{\mathrm{max}}\) prevents extra searches and small values encourage the search intensity near the sea. Therefore, \(d_{\mathrm{max}}\) controls the search intensity near the sea (i.e., best obtained solution). The value of \(d_{\mathrm{max}}\) adaptively decreases as follows:

$$\begin{aligned} d_{\max }^{i+1} =d_{\max }^i -\frac{d_{\max }^i }{\text{ Max }\;\mathrm{Iteration}} \end{aligned}$$
(21)
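Applying Eq. (21) repeatedly gives the geometric sequence \(d_{\max }^{k}=d_{\max }^{0}\,(1-1/T)^{k}\), where \(T\) denotes the maximum number of iterations. The short sketch below, with a hypothetical function name, generates this schedule; \(d_{\max }\) never reaches zero, and for \(T=100\) it ends at roughly 37 % of its initial value.

```python
from typing import List


def d_max_schedule(d_max0: float, max_iteration: int) -> List[float]:
    """Values of d_max over the iterations according to Eq. (21)."""
    values = [d_max0]
    for _ in range(max_iteration):
        current = values[-1]
        values.append(current - current / max_iteration)
    return values


if __name__ == "__main__":
    schedule = d_max_schedule(1e-5, 100)
    print(schedule[0], schedule[-1])
```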

Infiltration and transpiration are considered as two important steps in the water cycle process seen in nature. Infiltration is an important process where rain water is absorbed into the ground, through the soil and underlying rock layers. For the transpiration step, as plants absorb water from the soil, the water moves from the roots through the stems to the leaves. Once the water reaches the leaves, some of it evaporates from the leaves adding to the amount of water vapor in the air.

However, the standard WCA (in its current version) does not consider the loss of water through groundwater infiltration or plant absorption. In other words, these two steps of the water cycle process (i.e., infiltration and transpiration) are not included in the standard WCA.

The development of the WCA optimization process is illustrated by Fig. 3b where circles, stars, and the diamond correspond to streams, rivers, and sea, respectively. The white (empty) shapes denote the new positions taken by streams and rivers. In addition, Table 1 shows the pseudo-code and step-by-step processes of the WCA in detail.

Table 1 Pseudo-code of the WCA

3.1.1 Similarities and differences with other optimizers

In this subsection, similarities and differences of WCA with other optimization techniques are highlighted. The PSO (Kennedy and Eberhart 1995) and ICA (Atashpaz and Lucas 2007) as two common metaheuristic optimizers are selected for comparison purposes with the WCA. Indeed, every metaheuristic algorithm has its own approach and methodology in finding global optimum solution.

The WCA, PSO, and ICA are similar in that all three are population-based metaheuristic algorithms: a population of particles in the PSO, a population of countries in the ICA, and a population of streams in the WCA. Like the ICA, the WCA groups individuals, although it uses a different strategy to do so.

Apart from this similarity, their concepts, parameters, and operators differ from each other. The PSO’s concept is based on the movement of particles (e.g., fish, birds, etc.) and their personal and best individual experiences (Kennedy and Eberhart 1995). The WCA’s notions are derived from the water cycle process in nature and the observation of how streams and rivers flow to the sea, while the ICA is inspired by imperialistic competition and socio-political phenomena in the world.

The updating formulations for the positions of rivers and streams differ from the updating formulations used in the PSO and ICA. The WCA does not use the concept of moving directly to the best solution (global best) as used in the PSO. In fact, the WCA utilizes the concept of moving indirectly from streams to the rivers and from rivers to the sea (i.e., the temporal obtained optimum solution).

In contrast, in the ICA, colony countries move toward their relevant imperialist country; however, the imperialist countries do not have any movement toward the best solution (i.e., the best imperialist country).

In the WCA, rivers [a number of best selected solutions except the best one (sea), (Eq. 12)] act as guidance points for guiding other individuals in the population (streams) towards better positions (see Fig. 3b) and to avoid the search in inappropriate regions (see Eq. 17).

It is worth pointing out that the rivers themselves move towards the sea (i.e., the best obtained solution). Unlike the imperialist countries in the ICA, they are not fixed points (see Eq. 19). This procedure (moving streams to the rivers and then moving rivers to the sea) leads to indirect movements towards the best solution in the WCA. The third movement (moving rivers to the sea, Eq. 19) is not defined in the ICA (Atashpaz and Lucas 2007).

On the other hand, in the PSO, individuals (particles) attempt to find the best solution based on their personal and best experiences, and the search moves directly towards the best solution found. In addition, in the WCA, a number of near-best to best selected solutions (rivers and sea) attract other individuals of the population (streams) according to the goodness of their function values (i.e., intensity of flow) using Eq. (15). In the classical PSO, this process is not used.

Another difference among the WCA, PSO, and ICA is the existence of evaporation condition and raining process in the WCA which corresponds to the exploration phase. The evaporation condition and raining process provide an escape mechanism for the WCA to avoid getting trapped in local optima, while in the PSO, the exploration mechanism (formulation) is different.

In the PSO, inertia weight (\(w)\) (i.e., a user parameter) in the updating equation (movement equation) is responsible for the exploration phase and reduces at each iteration, while in the ICA, based on the revolution probability (i.e., defined by user), revolution phase is in charge of exploration task. Table 2 summarizes the differences of three reported optimizers in terms of applied strategies.

Table 2 Differences among three optimization methods in terms of their approaches for finding global optimum solution

3.2 Proposed MOWCA

In order to convert the WCA into an efficient multi-objective optimization algorithm, it is crucial to define the predominant features of the WCA (i.e., the sea and rivers) correctly. In standard optimization problems solved by the WCA, only one objective function is minimized; in this case, a number of the best solutions in the population are selected as the sea (the best obtained solution) and rivers.

For MOPs, however, there is more than one function to be minimized (or maximized). Therefore, the standard WCA must be modified in the way it selects the sea and rivers in the multi-objective space. To select the most efficient (best) solutions in the population as the sea and rivers, the crowding-distance mechanism is used. The concept of the crowding-distance mechanism was first defined by Deb et al. (2002a).

This parameter measures how the non-dominated solutions are distributed around a particular non-dominated solution. Figure 4 illustrates how to calculate the crowding-distance for point \(i\), which is the average side length of the cuboid (Deb et al. 2002a). A lower crowding-distance value indicates a denser concentration of solutions in a specific region. In MOPs, this parameter is calculated in the objective space. Hence, to compute it for each non-dominated solution, all non-dominated solutions should be sorted in terms of the values of one of the objective functions.
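A minimal sketch of the crowding-distance calculation is given below. It follows the commonly used NSGA-II formulation, in which the front is sorted along each objective in turn, boundary solutions receive an infinite distance, and each interior solution accumulates the normalized gap between its two neighbours. Whether the MOWCA implementation normalizes or averages the side lengths in exactly this way is an assumption of this sketch.

```python
import math
from typing import List, Sequence


def crowding_distance(front: Sequence[Sequence[float]]) -> List[float]:
    """Crowding distance of each objective vector in a non-dominated front."""
    n = len(front)
    if n == 0:
        return []
    n_obj = len(front[0])
    distance = [0.0] * n
    for k in range(n_obj):
        # Indices of the solutions sorted by the k-th objective.
        order = sorted(range(n), key=lambda i: front[i][k])
        f_min, f_max = front[order[0]][k], front[order[-1]][k]
        # Boundary solutions are always preferred.
        distance[order[0]] = distance[order[-1]] = math.inf
        if f_max == f_min:
            continue  # no spread along this objective
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][k] - front[order[pos - 1]][k]
            distance[order[pos]] += gap / (f_max - f_min)
    return distance


if __name__ == "__main__":
    # Hypothetical bi-objective front.
    print(crowding_distance([(0.0, 4.0), (1.0, 3.0), (2.0, 2.0), (4.0, 0.0)]))
```

Solutions with larger values lie in sparser regions of the front, which is why they are preferred when diversity is to be maintained.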

Selection of the sea and rivers from the obtained population as the best guide solution for other solutions at each iteration is a vital step in the MOWCA. This affects both the convergence capability of the MOWCA as well as maintaining a good distribution of non-dominated solutions. Therefore, for all iterations, crowding-distance for all non-dominated solutions should be calculated to determine which solutions have the highest crowding-distance values.

Afterwards, the selected non-dominated solutions are designated as the sea and rivers, and the intensity of flow for the rivers and sea is calculated from the crowding-distance values. As a result, new non-dominated solutions are likely to be created around the sea and rivers in the following iterations, which reduces their crowding-distance values.

Moreover, it is significantly important to save the non-dominated solutions in an archive to generate the Pareto front sets. This archive is updated at each iteration and dominated solutions are eliminated from the archive and all non-dominated solutions are added to the Pareto archive.

However, the size of the Pareto archive (the number of non-dominated solutions it holds) varies in the literature. Whenever the number of members in the Pareto archive exceeds the archive size, the crowding-distance is applied again to eliminate as many non-dominated solutions as necessary, starting with those that have the lowest crowding-distance values among the archive members.
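The archive update described above can be sketched as follows. The sketch merges the archive with new candidates, discards dominated members, and then prunes the most crowded members until the size limit is met. Removing one member at a time and recomputing the crowding-distance after each removal is a design choice of this sketch, as is the NSGA-II style crowding-distance helper; the source only states that the members with the lowest crowding-distance values are eliminated.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Pareto dominance for minimization."""
    return (all(a <= b for a, b in zip(u, v))
            and any(a < b for a, b in zip(u, v)))


def crowding_distance(front: Sequence[Point]) -> List[float]:
    """NSGA-II style crowding distance (assumed formulation)."""
    n = len(front)
    distance = [0.0] * n
    for k in range(len(front[0]) if n else 0):
        order = sorted(range(n), key=lambda i: front[i][k])
        f_min, f_max = front[order[0]][k], front[order[-1]][k]
        distance[order[0]] = distance[order[-1]] = math.inf
        if f_max == f_min:
            continue
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][k] - front[order[pos - 1]][k]
            distance[order[pos]] += gap / (f_max - f_min)
    return distance


def update_archive(archive: Sequence[Point], candidates: Sequence[Point],
                   max_size: int) -> List[Point]:
    """Merge, keep non-dominated members, then prune by crowding distance."""
    merged: List[Point] = []
    for p in list(archive) + list(candidates):
        if p not in merged:  # drop exact duplicates
            merged.append(p)
    kept = [p for p in merged if not any(dominates(q, p) for q in merged)]
    while len(kept) > max_size:
        cd = crowding_distance(kept)
        most_crowded = min(range(len(kept)), key=lambda i: cd[i])
        kept.pop(most_crowded)
    return kept


if __name__ == "__main__":
    # Hypothetical objective vectors.
    old = [(0.0, 4.0), (2.0, 2.0), (4.0, 0.0)]
    new = [(1.0, 3.0), (3.0, 3.0), (1.1, 2.95)]
    print(update_archive(old, new, max_size=4))
```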

3.3 Steps and flowchart of MOWCA

The steps of the MOWCA are summarized as follows:

Step 1: Choose the initial parameters of MOWCA: \(N_{\mathrm{sr}}\), \(d_{\mathrm{max}}\), \(N_{\mathrm{pop}}\), Max_Iteration, and Pareto archive size.

Step 2: Generate random initial population and form the initial streams, rivers, and sea using Eqs. (10), (12), and (13).

Step 3: Calculate the value of multi-objective functions for each stream using Eq. (11).

Step 4: Determine the non-dominated solutions in the initial population and save them in the Pareto archive.

Step 5: Calculate crowding-distance for each Pareto archive member.

Step 6: Select a sea and rivers based on the crowding-distance value.

Step 7: Determine the intensity of the flow for rivers and sea based on the crowding distance values using Eq. (15).

Step 8: Streams flow into the rivers using Eq. (17).

Step 9: Streams flow into the sea using Eq. (18).

Step 10: If a stream gives a better solution than its river or the sea, exchange the position of that river or the sea with the stream.

Step 11: Rivers flow into the sea using Eq. (19).

Step 12: Similar to Step 10, if a river finds better solution than the sea, the position of river is exchanged with the sea.

Step 13: Check the evaporation condition using the pseudo-code given in Subsect. 3.1.

Step 14: If the evaporation condition is satisfied, the raining process occurs using Eq. (20).

Step 15: Reduce the value of \(d_{\mathrm{max}}\) which is a user-defined parameter using Eq. (21).

Step 16: Determine the new non-dominated solutions in the population and save them in the Pareto archive.

Step 17: Eliminate any dominated solutions in the Pareto archive.

Step 18: If the number of members in the Pareto archive is more than the determined Pareto archive size, go to Step 19; otherwise, go to Step 20.

Step 19: Calculate the crowding-distance value for each Pareto archive member and remove as many members as necessary with the lowest crowding-distance value.

Step 20: Calculate the crowding-distance value for each Pareto archive member to select new sea and rivers.

Step 21: Check the convergence criteria. If the stopping criterion is satisfied, the algorithm will be stopped, otherwise return to the Step 8.

Fig. 4 Schematic view of crowding-distance calculation

4 Optimization results and discussions

In this section, 12 MOPs are considered for validating the performance of the proposed MOWCA. These benchmark problems are selected from a set of significant past studies in this area (Fonseca and Fleming 1993; Deb 2002; Freschi and Repetto 2006; Gao and Wang 2010; Kaveh and Laknejadi 2011). The problems include various types of objective functions (quadratic, cubic, polynomial, and nonlinear) with different numbers of design variables. The mathematical formulations of all considered MOPs, together with their optimal Pareto fronts, are listed in Appendix A.

The proposed MOWCA was coded in MATLAB, and each optimization task was executed in 30 independent runs. For all benchmark problems, the initial parameters of the MOWCA (\(N_{\mathrm{total}}\), \(N_{\mathrm{sr}}\), and \(d_{\mathrm{max}}\)) were set to 50, 10, and \(10^{-5}\), respectively.

Additionally, the maximum number of iterations varies from problem to problem so that the comparisons are fair: the maximum number of function evaluations (NFEs) is taken as the stopping condition, as for the other methods in this paper.
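One simple way to enforce an NFEs-based stopping condition is to wrap the objective function in a counter. The sketch below is illustrative only; the class name `BudgetedObjective` and its attributes are hypothetical and do not come from the paper.

```python
class BudgetedObjective:
    """Wrap an objective function and count its evaluations (NFEs)."""

    def __init__(self, func, max_nfes):
        self.func = func          # callable returning the objective values
        self.max_nfes = max_nfes  # evaluation budget used as stopping condition
        self.nfes = 0             # evaluations consumed so far

    @property
    def exhausted(self):
        """True once the evaluation budget has been used up."""
        return self.nfes >= self.max_nfes

    def __call__(self, x):
        if self.exhausted:
            raise RuntimeError("function-evaluation budget exhausted")
        self.nfes += 1
        return self.func(x)
```

An optimizer loop would then run `while not objective.exhausted:` instead of iterating a fixed number of times, so that algorithms with different population sizes consume the same number of evaluations.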

Meanwhile, based on the previous studies (Deb et al. 2002a; Freschi and Repetto 2006; Gao and Wang 2010; Kaveh and Laknejadi 2011; Pradhan and Panda 2012), the Pareto archive size is set to 100 for all reported MOPs.

Moreover, Eqs. (4) and (7) are used to calculate the performance parameters (i.e., GD1 and S1) for test problems 2, 3, 6, 7, 8, and 9. Similarly, Eqs. (5) and (8) (i.e., GD2 and S2) are used for test problems 1, 4, 5, and 10. The comparison set adopted for our study consists of state-of-the-art techniques, including the NSGA-II, PAES, MOPSO, charged system search and particle swarm optimization (CSS-MOPSO), and the immune system multi-objective optimization algorithm (ISMOA) (Knowles and Corne 2000; Deb et al. 2002a; Deb et al. 2002b; Zhang et al. 2009; Kaveh and Laknejadi 2011).
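For orientation, the two metrics can be sketched in code. The definitions below are commonly used textbook forms and are an assumption here: generational distance as \(\sqrt{\sum_i d_i^2}/n\) with \(d_i\) the Euclidean distance from the \(i\)th obtained point to the nearest point of a sampled optimal front, and spacing as the sample standard deviation of nearest-neighbor L1 distances within the obtained front. They may differ in normalization from the GD1/GD2 and S1/S2 variants of Eqs. (4), (5), (7), and (8). The function names are hypothetical.

```python
import numpy as np

def generational_distance(front, reference):
    """GD: closeness of the obtained front to a sampled optimal front."""
    front = np.asarray(front, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Pairwise Euclidean distances, then nearest reference point per solution.
    diff = front[:, None, :] - reference[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    return np.sqrt((d ** 2).sum()) / len(front)

def spacing(front):
    """S: spread uniformity; zero when all nearest-neighbor gaps are equal."""
    front = np.asarray(front, dtype=float)
    n = len(front)
    # Pairwise L1 distances in objective space, ignoring self-distances.
    l1 = np.abs(front[:, None, :] - front[None, :, :]).sum(axis=2)
    np.fill_diagonal(l1, np.inf)
    d = l1.min(axis=1)
    return np.sqrt(((d.mean() - d) ** 2).sum() / (n - 1))
```

Under both definitions smaller values are better: a GD of zero means every obtained point lies on the sampled optimal front, and an \(S\) of zero means the obtained points are equidistantly spaced.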

For quantitative and qualitative evaluations, these algorithms are assessed based on the values obtained for the performance parameters (i.e., GD and \(S\)) and on the Pareto fronts generated using the MOWCA. Table 3 shows the statistical optimization results obtained using the MOWCA, including the best, mean, worst, standard deviation (SD), and NFEs, for all MOPs reported in this paper.
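The per-problem statistics of this kind can be collected from the metric values of the independent runs as in the short sketch below. It assumes that smaller metric values are better (as for GD and \(S\)) and that the SD is the sample standard deviation; the paper does not state which SD convention was used, and the function name `summarize_runs` is hypothetical.

```python
import numpy as np

def summarize_runs(values):
    """Best, mean, worst, and SD of a metric over independent runs."""
    v = np.asarray(values, dtype=float)
    return {
        "best": float(v.min()),       # smaller is better for GD and S
        "mean": float(v.mean()),
        "worst": float(v.max()),
        "sd": float(v.std(ddof=1)),   # sample SD (assumed convention)
    }
```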

Table 3 Statistical optimization results obtained by the MOWCA for all reported MOPs given in Appendix A

For the DTLZ problems, two cases are considered in this paper: bi-objective and three-objective functions. Accordingly, Table 3 provides two sets of results for each DTLZ problem; the first and second rows correspond to the bi-objective and three-objective cases, respectively.

Tables 4 and 5 present the statistical results for the GD performance metric obtained by different optimizers for the MOPs given in Appendix A. Table 4 shows that the MOWCA attains the smallest GD for the DEB, POL, KUR, ZDT3, ZDT4, ZDT6, and VNT functions, while the MOPSO and CSS-MOPSO (Kaveh and Laknejadi 2011) give better GD values for the FON and ZDT1 functions, respectively.

Table 4 Mean and SD for the GD criterion
Table 5 Mean and SD for the GD criterion for the DTLZ series

Table 5 extends the comparison to the DTLZ series, comparing the MOWCA with rank-based multi-objective artificial physics optimization (RMOAPO), simple multi-objective artificial physics optimization (SMOAPO), and multi-objective particle swarm optimization (MOPSO) (Wang and Zeng 2013). The optimization results given in Table 5 are based on 10,000 function evaluations.

As in Table 4, the MOWCA surpasses the other reported optimizers in Table 5, obtaining better statistical results for the GD. The best statistical results (i.e., mean and SD) are highlighted in bold in Tables 4 and 5. Overall, the MOWCA obtains the lowest GD for most MOPs in this paper (10 out of 12 in Tables 4 and 5) and ranks first for the GD. In contrast, the PAES and NSGA-II (Deb et al. 2002a) have the worst statistical results in terms of the GD (see Table 4).

To extend the comparison, the results of the MOWCA for the VNT and DEB functions are compared with those of other optimizers reported in the literature. The statistical results found by the MOWCA for the GD are given in Table 3. For the VNT function, the vector immune system (VIS) (Freschi and Repetto 2006) obtained a mean GD of 0.0033 with an SD of 0.00171, while the multi-objective immune system algorithm (MISA) (Coello and Cruz Cortés 2005) attained values of 0.00338 and 0.00215, respectively.

Recently, multi-objective cat swarm optimization (MOCSO) (Pradhan and Panda 2012) was applied to the Deb benchmark problem under the same conditions and gave mean and SD values of 0.000769 and 0.000057, respectively. In summary, Table 3 shows that the MOWCA outperforms the VIS, MISA, and MOCSO in terms of the mean and SD values of the GD metric.

Tables 6, 7, and 8 present the metric of spacing (\(S\)) for the reported MOPs. To allow a fair comparison with the corresponding optimizers, the NFEs for Tables 6 and 7 are set to 10,000, while for Table 8 the NFEs are set to 5,000.

Table 6 Mean and SD for the \(S\) metric
Table 7 Mean and SD for the \(S\) metric for the DTLZ problems
Table 8 Mean and SD for the \(S\) metric for the DTLZ problems

Table 6 shows that the MOWCA (along with the CSS-MOPSO in some cases) obtained the best results with respect to the mean spacing metric for most MOPs in this paper, while the PAES (Knowles and Corne 2000) has the weakest performance of all. Compared with the other optimizers for the DTLZ series, namely the MOPSO, SMOAPO, and RMOAPO, using 10,000 NFEs (see Table 7 for the bi-objective DTLZ problems), the MOWCA found a wide variety of solutions with a uniform spread and the smallest value of the \(S\) metric.

In addition, Table 8 compares the statistical results obtained by different optimizers with different NFEs for the three-objective DTLZ problems. As can be seen in Table 8, the MOWCA ranks first in terms of the \(S\) metric, offering the minimum value of the spacing metric.

However, the SD values obtained by the ISMOA (Zhang et al. 2009) are slightly better than those of the MOWCA. The maximum NFEs for the MOWCA is set to 5,000, which is one fifth of the 25,000 NFEs considered for the NSGA-II and ISMOA. Hence, the better SD obtained by the ISMOA can be attributed to its larger budget of function evaluations. More precisely, the ISMOA required more time for its solutions to stabilize (Zhang et al. 2009).

Nevertheless, the ISMOA and NSGA-II could not find non-dominated solutions as well distributed as those of the MOWCA (Deb et al. 2002a; Zhang et al. 2009). The best statistical results are highlighted in bold in Tables 6, 7, and 8. Moreover, the VIS (Freschi and Repetto 2006), MISA (Coello and Cruz Cortés 2005), and MOCSO (Pradhan and Panda 2012) have been applied to the Deb and VNT benchmark functions. Using the MOCSO for the Deb test problem, the mean and SD of the spacing metric are 0.009 and 0.0007, respectively.

Likewise, for the VNT function, the mean \(S\) and its SD obtained by the VIS were 0.0589 and 0.00950, respectively, whereas the MISA reached values of 0.0710 and 0.00962 (Coello and Cruz Cortés 2005). Tables 6, 7, and 8 show that, compared with the VIS, MISA, and MOCSO, the MOWCA has the advantage of smaller statistical values for the spacing metric.

It is worth pointing out that the MOWCA offers acceptable statistical results for all performance parameters, whereas the GD results of the NSGA-II (Deb et al. 2002a) are considerably less accurate than those of the MOWCA. In general, a suitable optimization algorithm should offer reasonable statistical results for all performance metrics given in the literature.

Figure 5 compares the exact and computed Pareto fronts obtained using the proposed optimizer for the MOPs given in Table 3. Further, Figs. 6 and 7 show the final non-dominated solutions obtained by the MOWCA and the optimal Pareto fronts for the DTLZ series problems (i.e., DTLZ 2, 4, and 7) with two and three objective functions, respectively. The fronts shown in Figs. 5, 6, and 7 are consistent with the small values of the GD and \(S\) metrics obtained by the MOWCA for the given MOPs.

Fig. 5 Comparisons of optimal Pareto fronts and generated Pareto fronts using the MOWCA for: a DEB, b FON, c KUR, d POL, e ZDT1, f ZDT3, g ZDT4, h ZDT6, and i VNT (solid lines and dot points represent the optimal and generated (obtained) Pareto fronts, respectively)

Fig. 6 Comparisons of optimal Pareto fronts and generated Pareto fronts using the MOWCA for the bi-objective function: a DTLZ 2, b DTLZ 4, and c DTLZ 7 (solid lines and dot points represent the optimal and generated (obtained) Pareto fronts, respectively)

Fig. 7 Comparisons of optimal Pareto fronts and generated Pareto fronts using the MOWCA for the three-objective function: a DTLZ 2, b DTLZ 4, and c DTLZ 7 (left and right sides represent the optimal and generated (obtained) Pareto fronts, respectively)

5 Conclusions

This paper presented an optimization technique for solving MOPs, called the MOWCA. The basic concepts of the WCA are inspired by observation of the water cycle process in the real world. In this paper, the MOWCA was used for solving 12 well-known MOPs. The efficiency and performance of the MOWCA were evaluated using two popular criteria (i.e., the generational distance and spacing metrics). The statistical results obtained for these performance metrics indicate that the MOWCA was able to offer solutions close to the full optimal Pareto front, in addition to providing superior quality of solutions in comparison with the other state-of-the-art algorithms considered in this paper. In general, the MOWCA offers competitive solutions compared with other population-based algorithms, based on the numerical results reported in this research. Although the robustness and exploratory capability of the MOWCA depend on the nature and complexity of the problems, the obtained optimization results show that the MOWCA can be considered a suitable and efficient alternative method, with a comparable degree of accuracy in finding the optimal Pareto fronts for different scales of MOPs.