1 Introduction

Many optimization problems have more than one objective, with at least two objectives in conflict with one another; improving on one objective therefore leads to a deterioration in another. Such problems are referred to as multi-objective optimization problems (MOPs). Evolutionary algorithms (EAs) are computational intelligence algorithms that are based on the natural model of evolution and are commonly used to solve MOPs [3]. Constrained multi-objective optimization problems (CMOPs) are MOPs with inequality and/or equality constraints, in addition to the boundary constraints of unconstrained MOPs.

Constrained MOPs are more difficult to solve, since the constraints limit the number of valid solutions in the search space, i.e. the feasible space. Two approaches that are frequently used to solve CMOPs with a multi-objective evolutionary algorithm (MOEA) are:

  • Adding a penalty term (function) to each objective function and then optimizing the adapted objective functions [21]

  • Adapting the Pareto-dominance [8, 16] equations to take into account the violation of constraints [5]

The problem with the penalty approach is that the objective function is changed by adding an additional term, the penalty term. Since the various objective functions of a CMOP can have different orders of magnitude, the order of magnitude of the penalty term should also differ per objective; for example, a penalty contribution of 10 may dominate an objective whose values are of order 1, but have almost no effect on an objective whose values are of order \(10^3\). The value selected for the penalty term also influences the quality of the solutions that are found, since the penalty term balances the influence of constraint violation on the fitness of a specific solution. A penalty term that contributes too little to the fitness value may lead to a set of infeasible solutions being found, whereas a penalty term that is too high may lead to a poor distribution of the found solutions [5].

This paper investigates the influence of these two constraint-handling approaches on the steady-state non-dominated sorting genetic algorithm II (SNSGA-II), the Pareto-archived evolution strategy (PAES), the multi-objective evolutionary algorithm based on decomposition (MOEA/D) and the cultural algorithm (CA) when solving CMOPs with various characteristics.

The rest of the paper is organized as follows: Sect. 2 provides an overview of CMOPs, the algorithms and the constraint-handling approaches. The experimental setup is discussed in Sect. 3. Section 4 presents the results of the study, and conclusions are highlighted in Sect. 5.

2 Background

This section discusses CMOPs, as well as the algorithms and constraint-handling approaches that were used in the study.

2.1 Constrained Multi-objective Optimization

A CMOP can be defined as follows:

$$\begin{aligned} \text {minimize} \,\,&f_k(\mathbf {x}), \,\, k=1,\ldots ,n_k\\ \text {subject to} \,\,&g_m(\mathbf {x}) \ge 0, \,\, m=1,\ldots ,n_g\\&h_m(\mathbf {x}) = 0, \,\, m=1,\ldots ,n_h\\&\mathbf {x}_i^L \le \mathbf {x}_i \le \mathbf {x}_i^U, \,\, i=1,\ldots ,n_x \end{aligned}$$
(1)

where \(f_k\) represents the k-th objective function, and \(g_m\) and \(h_m\) represent the m-th inequality and equality constraint respectively. The i-th component of the upper boundary value of \(\mathbf {x}\) is indicated by \(\mathbf {x}_i^U\) and the lower boundary value by \(\mathbf {x}_i^L\).

Due to the inequality and equality constraints in Eq. (1), not all solutions in the search space are feasible. Various approaches exist to deal with CMOPs [5, 11, 18] and to guide the search of the algorithm towards feasible solutions. Two approaches that are simple to implement are the following:

  • Adding a penalty term to the objective function

  • Modifying the Pareto-dominance equations to consider constraint violations

Assuming that all objective functions are minimized (a maximization objective can be converted to a minimization one using the duality principle), all constraints are first normalized before the constraint violation is calculated. Let the normalized constraint functions be represented by \(g^*_m (\mathbf {x}_i)\ge 0\) for \(m=1,2,\ldots ,n_g\). Then, for each solution \(\mathbf {x}_i\), the violation of each constraint is calculated as follows [5]:

$$\begin{aligned} \omega _m(\mathbf {x}_i)= {\left\{ \begin{array}{ll} |g^*_m(\mathbf {x}_i)| &{} \text {if } g^*_m(\mathbf {x}_i) < 0\\ 0 &{} \text {otherwise} \end{array}\right. } \end{aligned}$$

All constraint violations are then added together:

$$\begin{aligned} \varOmega (\mathbf {x}_i)=\sum ^{n_g}_{m=1} \omega _m(\mathbf {x}_i) \end{aligned}$$

The combined constraint violation value (\(\varOmega \)) is multiplied by a penalty parameter (\(P_k\)), and this product is then added to each of the objective function values as follows:

$$\begin{aligned} f_k^*(\mathbf {x}_i)=f_k(\mathbf {x}_i) + P_k \varOmega (\mathbf {x}_i) \end{aligned}$$
(2)

For a feasible solution the \(\varOmega \) term in Eq. (2) will be zero and \(f_k^*=f_k\).
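To make the penalty approach concrete, the following is a minimal Python sketch of Eq. (2), assuming the objectives and the normalized inequality constraints \(g^*_m(\mathbf {x})\ge 0\) are available as callables; all function names and the small example problem are illustrative and are not taken from any specific library or from the benchmarks used in this study.

```python
import numpy as np

def constraint_violation(x, norm_constraints):
    """Combined violation Omega(x) of the normalized constraints g*_m(x) >= 0.

    A constraint contributes |g*_m(x)| when it is violated (g*_m(x) < 0)
    and zero otherwise.
    """
    values = np.array([g(x) for g in norm_constraints])
    return np.sum(np.abs(np.minimum(values, 0.0)))

def penalized_objectives(x, objectives, norm_constraints, penalties):
    """Apply Eq. (2): f*_k(x) = f_k(x) + P_k * Omega(x)."""
    omega = constraint_violation(x, norm_constraints)
    return np.array([f(x) + p * omega for f, p in zip(objectives, penalties)])

# Illustrative two-objective, one-constraint problem (not from the paper).
objectives = [lambda x: x[0] ** 2 + x[1] ** 2,
              lambda x: (x[0] - 1.0) ** 2 + x[1] ** 2]
norm_constraints = [lambda x: x[0] + x[1] - 0.5]   # feasible when x0 + x1 >= 0.5
penalties = [10.0, 10.0]                            # P_k values, problem dependent

print(penalized_objectives(np.array([0.1, 0.1]), objectives,
                           norm_constraints, penalties))
```

A separate penalty parameter \(P_k\) is passed per objective, reflecting the earlier observation that the appropriate order of magnitude of the penalty may differ between objectives.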

The second approach is to keep the objective functions (\(f_k\)) unchanged, but to modify the Pareto-dominance comparison of vectors. When solving a MOP a single fitness value does not exist. Therefore, a different approach is required to compare the quality of two solutions. Typically, Pareto-dominance is used:

Definition 1

Assuming minimization, a solution \(\mathbf {x}_i\) is better in quality than another solution \(\mathbf {x}_j\) if:

$$\begin{aligned}&f_a(\mathbf {x}_i) \le f_a(\mathbf {x}_j) \,\,\forall a=1,\ldots ,A \end{aligned}$$
(3)
$$\begin{aligned}&\exists \, b \in \{1,\ldots ,A\}: f_b(\mathbf {x}_i) < f_b(\mathbf {x}_j) \end{aligned}$$
(4)

If \(\mathbf {x}_i\) is better than \(\mathbf {x}_j\), then \(\mathbf {x}_i\) dominates \(\mathbf {x}_j\), written as \(\mathbf {x}_i \prec \mathbf {x}_j\). If neither solution is better in quality than the other, the solutions are non-dominated. The set of non-dominated solutions in the objective space is referred to as the Pareto-optimal front (POF).
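As a concrete illustration of Definition 1, the following is a minimal sketch of the Pareto-dominance test on two objective vectors, assuming minimization; the function name and example vectors are illustrative.

```python
from typing import Sequence

def dominates(fi: Sequence[float], fj: Sequence[float]) -> bool:
    """Return True if objective vector fi Pareto-dominates fj (minimization).

    fi dominates fj if it is no worse in every objective and strictly
    better in at least one objective (Definition 1).
    """
    no_worse_everywhere = all(a <= b for a, b in zip(fi, fj))
    strictly_better_somewhere = any(a < b for a, b in zip(fi, fj))
    return no_worse_everywhere and strictly_better_somewhere

# Example: the first vector dominates the second; the last pair is non-dominated.
print(dominates([1.0, 2.0], [1.0, 3.0]))   # True
print(dominates([1.0, 3.0], [2.0, 2.0]))   # False
```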

Definition 1 does not take constraint violations into consideration. Therefore, when comparing solutions of a CMOP, Definition 1 has to be adapted as follows:

Definition 2

Assuming minimization, a solution \(\mathbf {x}_i\) constraint-dominates another solution \(\mathbf {x}_j\) if one of the following conditions holds (a code sketch of this comparison is given after the definition):

  1. \(\mathbf {x}_i\) is feasible and \(\mathbf {x}_j\) is infeasible;

  2. \(\mathbf {x}_i\) and \(\mathbf {x}_j\) are both infeasible, but \(\mathbf {x}_i\) has a smaller constraint violation; or

  3. \(\mathbf {x}_i\) and \(\mathbf {x}_j\) are both feasible and \(\mathbf {x}_i \prec \mathbf {x}_j\) according to Definition 1.
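The constraint-dominance comparison of Definition 2 can be sketched as follows, assuming the combined constraint violations \(\varOmega (\mathbf {x}_i)\) and \(\varOmega (\mathbf {x}_j)\) have already been computed and that a plain Pareto-dominance test (such as the earlier sketch) is supplied; the function signature is an assumption of the sketch.

```python
def constraint_dominates(fi, fj, omega_i, omega_j, dominates):
    """Constraint-dominance of Definition 2 (minimization).

    fi, fj    : objective vectors of x_i and x_j
    omega_i/j : combined constraint violations Omega(x_i), Omega(x_j)
    dominates : a plain Pareto-dominance test (Definition 1)
    """
    feasible_i, feasible_j = omega_i == 0.0, omega_j == 0.0
    if feasible_i and not feasible_j:          # condition 1
        return True
    if not feasible_i and not feasible_j:      # condition 2
        return omega_i < omega_j
    if feasible_i and feasible_j:              # condition 3
        return dominates(fi, fj)
    return False                               # x_i infeasible, x_j feasible

# Using a simple Pareto-dominance test (e.g. the earlier sketch):
pareto = lambda fi, fj: (all(a <= b for a, b in zip(fi, fj))
                         and any(a < b for a, b in zip(fi, fj)))
print(constraint_dominates([1.0, 2.0], [0.5, 0.5], 0.0, 0.3, pareto))  # True (condition 1)
```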

2.2 Algorithms

The non-dominated sorting genetic algorithm II (NSGA-II) is a multi-objective genetic algorithm (MOGA) modelled on genotypic behaviour and is still the standard benchmark algorithm in multi-objective optimization (MOO) studies [7, 14]. The algorithm improves on the \(O(MN^3)\) complexity of the earlier non-dominated sorting genetic algorithm [21], adds elitism, and removes the need for a sharing parameter [7].
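The non-dominated sorting that gives NSGA-II its name can be sketched as follows; this is only a compact illustration of the front-construction idea, not the full NSGA-II procedure (crowding distance and selection are omitted), and the function name is illustrative.

```python
def fast_non_dominated_sort(objectives):
    """Sort a list of objective vectors into non-dominated fronts.

    Returns a list of fronts, each a list of indices into `objectives`,
    following the front-construction idea of NSGA-II [7].
    """
    def dominates(a, b):
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))

    n = len(objectives)
    dominated_by = [[] for _ in range(n)]   # solutions that i dominates
    domination_count = [0] * n              # number of solutions dominating i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(objectives[i], objectives[j]):
                dominated_by[i].append(j)
            elif dominates(objectives[j], objectives[i]):
                domination_count[i] += 1
        if domination_count[i] == 0:
            fronts[0].append(i)
    # Peel off successive fronts.
    current = 0
    while fronts[current]:
        next_front = []
        for i in fronts[current]:
            for j in dominated_by[i]:
                domination_count[j] -= 1
                if domination_count[j] == 0:
                    next_front.append(j)
        fronts.append(next_front)
        current += 1
    return fronts[:-1]  # drop the trailing empty front

# Example: the first two solutions are mutually non-dominated.
print(fast_non_dominated_sort([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]))  # [[0, 1], [2]]
```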

PAES is a MOEA based on a (1+1)-ES, applying only mutation to a single parent to create a single offspring; PAES is therefore a local search strategy [13]. It uses an archive to store the non-dominated solutions found so far and maintains diversity through a crowding procedure that recursively divides the objective space into a grid.

MOEA/D is an EA based on the concept of decomposition [22]. It decomposes a MOP into a number of sub-problems and then optimizes these sub-problems simultaneously. Each sub-problem is optimized using only information from its neighbouring sub-problems, and the population contains the best solution found so far for each of the sub-problems.
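As an illustration of the decomposition idea, the following is a minimal sketch of the weighted Tchebycheff scalarization, one of the aggregation functions commonly used with MOEA/D [22]; the ideal point and weight vectors shown are illustrative values, not settings from this study.

```python
import numpy as np

def tchebycheff(f, weights, z_star):
    """Weighted Tchebycheff scalarization of objective vector f.

    The sub-problem with weight vector w minimizes
    max_k w_k * |f_k(x) - z*_k|, where z* is the ideal point.
    """
    return np.max(np.asarray(weights) * np.abs(np.asarray(f) - np.asarray(z_star)))

# Example: two sub-problems (weight vectors) evaluated on one solution.
f = [0.4, 0.7]
z_star = [0.0, 0.0]
for w in ([0.9, 0.1], [0.1, 0.9]):
    print(w, tchebycheff(f, w, z_star))
```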

EAs mainly focus on genetic concepts and natural selection. CAs, in contrast, model cultural evolution and are based on theories from sociology and archaeology [19]. Individuals are described through behavioural traits and can generate generalized descriptions of their experience. The CA was extended to solve MOPs by Coello and Becerra [4].

3 Experimental Setup

This section discusses the experimental setup used for this study. The algorithms are discussed in Sect. 3.1. Sections 3.2 and 3.3 present the benchmarks and performance measures that were used to evaluate the performance of the algorithms on CMOPs. The statistical analyses conducted on the obtained results are discussed in Sect. 3.4.

3.1 Algorithms

The following algorithms were used in this study: SNSGA-II [7], PAES [13], MOEA/D [22] and CA [4]. For each of these algorithms two configurations were used in the study, namely one configuration that implemented the penalty function approach and one configuration that implemented the modified Pareto-dominance approach (refer to Sect. 2.1).

Each algorithm was run 30 times for a maximum of 25 000 fitness evaluations. Each algorithm had a population size of 100, and PAES used 3 bi-sections. The crossover probability was set to 0.75 and the mutation probability to 0.1, since these values promote a balance between exploration and exploitation of the search space [9]. Simulated binary crossover (SBX) [6] and polynomial mutation [5] were used to generate the offspring, where applicable. For MOEA/D, the neighbourhood size was set to 20, the neighbour selection probability to 0.9, and the maximum number of replaced solutions to 2.
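For reference, the following is a minimal sketch of the two variation operators mentioned above in their basic forms; boundary-aware refinements and per-variable crossover probabilities used by full implementations [5, 6] are omitted, and the distribution indices are illustrative values rather than the settings of this study.

```python
import numpy as np

rng = np.random.default_rng()

def sbx_crossover(p1, p2, eta_c=20.0):
    """Simulated binary crossover (basic form) on two real-valued parents."""
    u = rng.random(p1.shape)
    beta = np.where(u <= 0.5,
                    (2.0 * u) ** (1.0 / (eta_c + 1.0)),
                    (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta_c + 1.0)))
    c1 = 0.5 * ((1.0 + beta) * p1 + (1.0 - beta) * p2)
    c2 = 0.5 * ((1.0 - beta) * p1 + (1.0 + beta) * p2)
    return c1, c2

def polynomial_mutation(x, lower, upper, eta_m=20.0, pm=0.1):
    """Polynomial mutation (basic form) with per-variable probability pm."""
    u = rng.random(x.shape)
    delta = np.where(u < 0.5,
                     (2.0 * u) ** (1.0 / (eta_m + 1.0)) - 1.0,
                     1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta_m + 1.0)))
    mutate = rng.random(x.shape) < pm
    y = x + mutate * delta * (upper - lower)
    return np.clip(y, lower, upper)

# Example usage on a two-variable problem in [0, 1]^2.
p1, p2 = np.array([0.2, 0.8]), np.array([0.6, 0.3])
c1, c2 = sbx_crossover(p1, p2)
print(polynomial_mutation(c1, lower=np.zeros(2), upper=np.ones(2)))
```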

3.2 Benchmark Functions

The following benchmark functions were used to evaluate the performance of the various algorithms:

  • Binh2 with 2 decision variables and 2 inequality constraints [1], referred to as \(f_1\)

  • Osyczka2 that has 6 decision variables and 6 inequality constraints [15], referred to as \(f_2\)

  • Srinivas with 2 decision variables and 2 inequality constraints [21], referred to as \(f_3\)

  • Two Bar Truss with 3 decision variables, 4 inequality constraints and a boundary constraint [12], referred to as \(f_4\)

  • Welded Beam that has 4 decision variables and 4 constraints [17], referred to as \(f_5\)

3.3 Performance Measures

This section discusses the performance measures that were used to evaluate the performance of the MOEAs.

Inverted Generational Distance (IGD). Inverted generational distance (IGD) measures the distance between each solution in the optimal POF and the closest solution in the approximated POF [20]. A small value indicates that an algorithm performed well with regard to both convergence and diversity.
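A minimal sketch of the IGD calculation described above, assuming both fronts are given as arrays of objective vectors and that the Euclidean distances are averaged (one common convention [20]):

```python
import numpy as np

def igd(true_pof, approx_pof):
    """Mean distance from each point of the true POF to its nearest
    point in the approximated POF (smaller is better)."""
    true_pof = np.asarray(true_pof, dtype=float)
    approx_pof = np.asarray(approx_pof, dtype=float)
    # Pairwise Euclidean distances: one row per true-POF point.
    dists = np.linalg.norm(true_pof[:, None, :] - approx_pof[None, :, :], axis=2)
    return dists.min(axis=1).mean()
```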

Hypervolume (HV). The hypervolume (HV) or S-metric measures how much of the objective space is dominated by a non-dominated set [23, 24]. The reference point used in the HV calculation consisted of the worst value for each objective over the union of the non-dominated solutions of all approximated POFs that were compared against each other. A high HV value indicates good performance.
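The reference point construction described above can be sketched as follows; the HV computation itself is omitted, since exact hypervolume algorithms [23, 24] are considerably more involved.

```python
import numpy as np

def hv_reference_point(fronts):
    """Worst (largest, for minimization) value per objective over the
    union of all approximated POFs that are compared against each other."""
    union = np.vstack([np.asarray(f, dtype=float) for f in fronts])
    return union.max(axis=0)

# Example with two approximated fronts of a bi-objective problem.
print(hv_reference_point([[[1.0, 4.0], [2.0, 3.0]],
                          [[1.5, 3.5], [3.0, 1.0]]]))   # -> [3. 4.]
```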

\(\varvec{\epsilon }\) Metric (\(\varvec{\epsilon _m}\)). Zitzler et al. presented the \(\epsilon \)-metric (\(\epsilon _m\)) to compare approximation sets [25]. It measures the factor by which one approximation set is worse than another with respect to all objectives, i.e. it provides the smallest factor \(\epsilon \) such that for any solution in set B there is at least one solution in set A that is not worse by more than a factor \(\epsilon \) in all objectives. A small \(\epsilon _m\) value indicates good performance.
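A minimal sketch of the multiplicative \(\epsilon \)-indicator as described above, assuming minimization and strictly positive objective values; the array-based formulation is an implementation choice of the sketch.

```python
import numpy as np

def multiplicative_epsilon(A, B):
    """Multiplicative epsilon indicator I_eps(A, B) [25] for minimization.

    Smallest factor eps such that every solution in B is covered by at
    least one solution in A when A is scaled by eps in every objective.
    Assumes strictly positive objective values.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # ratios[b, a, k] = A[a, k] / B[b, k]
    ratios = A[None, :, :] / B[:, None, :]
    return ratios.max(axis=2).min(axis=1).max()

# Example: A needs a factor of 1.5 to cover the single solution in B.
print(multiplicative_epsilon([[1.0, 3.0], [3.0, 1.0]], [[2.0, 2.0]]))  # -> 1.5
```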

3.4 Statistical Analysis

For each performance measure a Friedman test was performed to determine whether there was a statistically significant difference over the entire set of results. If the Friedman test indicated a statistically significant difference, pair-wise Mann-Whitney U tests were conducted between the various algorithm configurations. If a Mann-Whitney U test indicated a statistically significant difference, the algorithm with the better average for the specific measure was awarded a win and the other algorithm a loss. All statistical tests were conducted with a confidence level of 95%.
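The testing procedure can be sketched with SciPy as follows; the result arrays below are placeholders, and the exact structure of the samples (e.g. per-benchmark means versus per-run values) as well as the win/loss bookkeeping are assumptions of the sketch.

```python
import numpy as np
from scipy.stats import friedmanchisquare, mannwhitneyu

ALPHA = 0.05  # 95% confidence level
rng = np.random.default_rng(0)

# One entry per algorithm configuration; each array holds the measure
# values (e.g. IGD) to be compared. The values here are placeholders.
results = {
    "SNSGA-II-D": rng.random(5),
    "SNSGA-II-P": rng.random(5),
    "MOEA/D-D":   rng.random(5),
    "MOEA/D-P":   rng.random(5),
}

# Friedman test over all configurations.
stat, p = friedmanchisquare(*results.values())
if p < ALPHA:
    # Pair-wise Mann-Whitney U tests between configurations.
    names = list(results)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            _, p_ij = mannwhitneyu(results[names[i]], results[names[j]],
                                   alternative="two-sided")
            if p_ij < ALPHA:
                print(names[i], "vs", names[j], "differ significantly")
```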

4 Results

This section discusses the results that were obtained from the experiments.

4.1 Inverted Generational Distance

This section discusses the IGD values that were obtained by the various algorithms. The IGD mean and standard deviation values are presented in Table 1, where D refers to the modified Pareto-dominance approach and P to the penalty function approach.

Table 1. IGD mean and standard deviation values

From Table 1 it can be seen that for SNSGA-II both constraint-handling approaches performed equally well: neither approach obtained a better average and standard deviation value for any of the functions. For MOEA/D the penalty function approach obtained the best IGD average and standard deviation values for 3 of the functions (indicated in bold), while the modified Pareto-dominance approach obtained the best values for none of the functions. In contrast, for the CA and PAES, the modified Pareto-dominance approach obtained the best IGD average and standard deviation values for 2 and 1 of the 5 functions respectively, and the penalty function approach for none of the functions.

Applying a Friedman test to the means of the IGD values revealed that the null hypothesis had to be rejected, and hence there was a statistically significant difference in the performance of the algorithms. The results of the Mann-Whitney U tests are presented in Table 2, where < indicates that the left-hand algorithm performed statistically significantly better than the right-hand algorithm, > indicates that the left-hand algorithm performed statistically significantly worse, and \(=\) indicates equivalent performance. The results indicate that, for each of the MOEAs, there was no statistically significant difference in performance between the configurations that incorporated the two constraint-handling approaches. However, there were statistically significant differences in performance when the various MOEAs were compared against one another. Both SNSGA-II configurations outperformed both configurations of PAES and CA. MOEA/D using the penalty function also outperformed both configurations of PAES and CA. The modified Pareto-dominance MOEA/D outperformed both configurations of CA, but only the modified Pareto-dominance configuration of PAES; it showed no statistically significant difference in performance with the penalty function configuration of PAES. There was also no statistically significant difference in performance between the SNSGA-II and MOEA/D configurations.

The percentage of wins obtained by each of the two constraint-handling approaches for IGD is presented in Table 3. The results indicate that the penalty function approach marginally outperformed the modified Pareto-dominance approach.

Table 2. Comparison of algorithms’ performance using Mann-Whitney U tests
Table 3. Wins obtained by the two constraint-handling approaches

4.2 Hypervolume and \(\epsilon \)-metric

Friedman tests on the mean HV values and the mean \(\epsilon _m\) values indicated that there was no statistically significant difference in the performance of the various algorithms. The HV and \(\epsilon _m\) values are presented in Tables 4 and 5.

Table 4. HV Mean and standard deviation values

The penalty-based SNSGA-II outperformed the modified Pareto-dominance SNSGA-II by obtaining the better HV mean and standard deviation for 2 functions. A similar trend was observed for MOEA/D. For the CA, the modified Pareto-dominance approach outperformed the penalty-based approach on 3 of the 5 functions, while both approaches performed equally well for PAES. A similar trend was observed for \(\epsilon _m\).

Table 5. \(\epsilon _m\) mean and standard deviation values

5 Conclusion

This study investigated the effect of two constraint-handling approaches on the performance of the steady-state non-dominated sorting genetic algorithm II (SNSGA-II), the Pareto-archived evolution strategy (PAES), the multi-objective evolutionary algorithm based on decomposition (MOEA/D) and a cultural algorithm (CA). The two constraint-handling approaches are: adding a penalty term to each objective function, and using a modified Pareto-dominance relation that incorporates constraint violations. Each multi-objective evolutionary algorithm (MOEA) had two configurations, each incorporating one of these approaches to deal with constraints.

The results indicated that there was no statistically significant difference between the two constraint-handling approaches. However, the penalty function approach did slightly outperform the modified Pareto-dominance approach based on the wins and losses. Furthermore, for the CA the modified Pareto-dominance approach performed better on all performance measures, whereas the penalty-based MOEA/D outperformed the modified Pareto-dominance version on all performance measures. For PAES, both approaches performed equally well, except on the inverted generational distance (IGD) measure, where the modified Pareto-dominance approach slightly outperformed the penalty function approach. The penalty function SNSGA-II slightly outperformed the modified Pareto-dominance SNSGA-II on the hypervolume (HV) and \(\epsilon _m\) measures, while both approaches performed equally well on IGD.

A penalty function modifies the objective space and can change the location of the optima, whereas the modified Pareto-dominance approach leaves the objective functions unchanged. Since no substantial statistically significant difference in performance was found between these two approaches, it is simpler to use the modified Pareto-dominance approach than to add a penalty function, for which ideal, problem-dependent penalty parameter values have to be found.

Future work will include extending the study to additional constrained multi-objective optimization problems [2], as well as adaptive penalty approaches [10].