1 Introduction

In evolutionary multiobjective optimization (EMO), nature-inspired numerical methods known as evolutionary algorithms (EAs) are applied to solve optimization problems with multiple, conflicting objectives. Unlike a single-objective optimization problem, a multiobjective optimization problem (MOP) does not, in general, have a unique optimal solution. Instead, the optimal solutions to a MOP constitute a possibly infinite set of compromise solutions, known as Pareto optimal solutions, which can be ordered only by subjective preferences. Intuitively, the aim in EMO is to represent the range of tradeoffs within the Pareto optimal solutions using a fixed-size population of nondominated solutions.

The two main concerns in designing an EMO method are the optimality of the obtained solutions (convergence) and their representativeness (diversity). Thus, an EMO method must maintain diversity in the population not only to prevent premature convergence, but also because the final population should represent a wide range of different nondominated solutions. Ensuring diversity in the objective space is a nontrivial task, and perhaps for this reason research has emphasized selection operators, which operate in the objective space, over the mutation and crossover operators. In many cases, the mutation and crossover operators used in an EMO method were originally developed for single-objective optimization (see, for example, Coello et al. 2007; Deb 2001). This fact has already been pointed out in Jin and Sendhoff (2003) and Zhang et al. (2008). In Zhang et al. (2008), it is used, together with an assumed regularity of the solution set in the decision space, to motivate a multiobjective optimization method that explicitly models the interdependencies between decision variables.

In single-objective optimization, differential evolution (DE) has become a widely used method (Price et al. 2005; Storn and Price 1996). An appealing feature of DE is its extremely uncomplicated, self-adapting mutation operator which is based on random perturbations obtained by sampling difference vectors from the population of decision vectors maintained by the algorithm. The mutation operator of DE is capable of implicitly detecting and exploiting linear interdependencies between decision variables, but does not cope as well with more complicated, nonlinear interdependencies (Ruuska and Aittokoski 2008). We say that a MOP involves nonlinear interdependencies between decision variables if the Pareto-optimal solutions in the decision space do not fall on a plane of lower dimension than the space itself.

Several variants of DE have been proposed in the literature also for multiobjective optimization, by many authors including Abbass (2002), Babu and Jehan (2003), Kukkonen and Lampinen (2005), Robic and Filipic (2005), Santana-Quintero and Coello (2005). For a review, see, for example, Price et al. (2005), Santana-Quintero and Coello (2005). The existing multiobjective variants of DE have also proven successful in solving many of the multiobjective test problems available (Kukkonen and Lampinen 2006; Zhang et al. 2009a). However, as pointed out in Jin and Sendhoff (2003), Okabe et al. (2004), in many widely used multiobjective test problems, the solution set in the decision space can be defined by piecewise linear functions even though the objective functions as such are nonlinear. This raises the question whether part of the success of the DE-based EMO methods could be due to the capability of their mutation operator to exploit linear interdependencies between decision variables in a problem, and whether the efficiency of these methods in solving problems with nonlinear interdependencies could be improved by changes in the mutation operator only.

Different mutation operators have been proposed in the literature. Among others, Fan and Lampinen (2003) proposed a trigonometric mutation operator, in which a hypertriangle formed by three vectors is used to bias the perturbation of a target vector towards the vector providing the lowest function value. In Cauchy mutation (Ali and Pant 2010), a decision vector is perturbed by a Cauchy distributed random variable, with the hope of pulling it out of a local basin of attraction. Liu and Lampinen (2005) proposed a fuzzy adaptive differential evolution algorithm, which uses fuzzy logic controllers to adapt the search parameters (scaling factor, crossover ratio and population size). In PDE-PEDA, proposed by Wang et al. (2009), an estimation of distribution algorithm (global operator) and a linear DE crossover (local operator) are combined with a self-adaptive probability, which balances global and local information. Lara et al. (2010) proposed a local search strategy (HCS) that uses the geometry of directional cones to generate solutions both towards and along the Pareto set. Finally, let us mention the hybrid operator proposed by Kaelo and Ali (2007), which applies, with a pre-fixed probability, either the linear mutation operator of DE or an electromagnetism-like algorithm (an attraction-repulsion technique for moving closer to global minima). Overall, one can say that little attention has been paid in the DE literature to handling both linear and nonlinear interdependencies between decision variables.

In this paper, we propose a new hybrid mutation operator which can robustly handle both linear and nonlinear interdependencies between decision variables, while retaining the simplicity of the linear mutation operator of DE. In practice, no a priori knowledge exists about the interdependencies between decision variables. Hence, the proposed operator combines DE's linear mutation operator with a new polynomial part that handles nonlinear interdependencies, and it can be used as a drop-in replacement for DE's mutation operator.

The so-called curvature detection in the polynomial part is based on polynomial approximation which is used to guide the generation of new trial vectors. One can say that there is nothing new in using polynomials in optimization; especially quadratic polynomials are frequently used for line search and other purposes (see e.g., Bazaraa et al. 2006; Nocedal and Wright 1999). Polynomials have also been used for trial vector generation in population-based algorithms before (see e.g., Ali et al. 1997; Schütze et al. 2007). Our use of polynomials, however, differs from the current practice in that we do not use polynomials to model objective functions, but the interdependencies between decision variables of the problem. In particular, new trial points are not determined by the extreme points of the approximations as is common in line search procedures.

The rest of the paper is structured as follows: In Sect. 2, we discuss the basics of multiobjective optimization and DE as well as introduce the notation used in this paper. Then, in Sect. 3 we propose a new polynomial part and, finally, a new hybrid operator which utilizes both the original operator of DE as well as the new polynomial part. In Sect. 4 we demonstrate the potential and usefulness of the new hybrid operator when solving multiobjective optimization problems. With a versatile set of test problems we show how the performance of a linear mutation based MOEA/D algorithm can be improved with our hybrid operator. This should encourage further research in this direction. Finally, the paper is concluded in Sect. 5.

2 Notation and background

In this section, we present the notation and background necessary for the rest of the paper. First, we formulate a MOP with the relevant definitions. Then, we provide a brief review of the functioning and properties of the DE.

2.1 Basics of multiobjective optimization

We consider a MOP

$$ \begin{array}{ll} {\hbox{minimize}}& {\mathbf f}({\mathbf x})=(f_1({\mathbf x}),f_2({\mathbf x}),\ldots,f_k({\mathbf x}))\\ {\hbox{subject to}}& {\mathbf x}\in S \end{array} $$
(1)

with a feasible set \({S\subset{\mathbb{R}}^{n}}\) and k ≥ 2 conflicting objective functions \({{f_i}:S\rightarrow\mathbb{R}}\). The n-dimensional vectors \({\mathbf x}\in S\) are called decision vectors and their images \({\mathbf z}={\mathbf f}({\mathbf x})\) objective vectors. The set of attainable objective vectors is denoted by Z, \(Z={\mathbf f}(S)\).

An objective vector \({\mathbf z}\in Z\) is called Pareto optimal if there does not exist a vector \(\bar{{\mathbf z}}\in Z\) such that \(\bar{{z}}_i\le {z}_i\) for all \(i=1,2,\ldots,k\) with at least one strict inequality. Naturally, a decision vector \({\mathbf x}\in S\) is Pareto optimal if its image \({\mathbf f}({\mathbf x})\) is Pareto optimal. In what follows, the set of Pareto-optimal decision vectors is called the Pareto set, and the set of Pareto-optimal objective vectors the Pareto front. For two decision vectors \({\mathbf x}^1,{\mathbf x}^2\in S\), \({\mathbf x}^1\) is said to dominate \({\mathbf x}^2\) if \(f_{i}({\mathbf x}^1)\le f_{i}({\mathbf x}^2)\) for all \(i = 1,\ldots,k\) and \(f_{j}({\mathbf x}^1) < f_{j}({\mathbf x}^2)\) for at least one index j. In the context of EMO methods, the subset of vectors in a population not dominated by any other vector in the population is called the nondominated set. Typically, EMO algorithms aim at generating nondominated solutions representing the Pareto front as well as possible (i.e., both being close to the Pareto front and reflecting different tradeoffs).
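For illustration (our own sketch, not part of the paper), the dominance relation and the nondominated subset of a population can be expressed as follows, assuming minimization:

```python
import numpy as np

def dominates(f1, f2):
    """True if objective vector f1 dominates f2 under minimization:
    no worse in every objective, strictly better in at least one."""
    f1, f2 = np.asarray(f1), np.asarray(f2)
    return bool(np.all(f1 <= f2) and np.any(f1 < f2))

def nondominated(F):
    """Indices of the nondominated vectors in a list of objective vectors F."""
    return [i for i, fi in enumerate(F)
            if not any(dominates(fj, fi) for j, fj in enumerate(F) if j != i)]
```

For example, in the population of objective vectors {(1, 3), (2, 2), (3, 1), (3, 3)}, the first three are mutually nondominated, while (3, 3) is dominated by (2, 2).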

2.2 Basics of differential evolution

DE is a stochastic, population-based direct search method introduced by Storn and Price for optimizing a real-valued function of continuous variables (Price et al. 2005; Storn and Price 1996). It is distinguished from other EAs mostly by its remarkably simple trial vector generation scheme in which a scaled difference of vectors originating from a fixed-size population of decision vectors is used. In spite of its simplicity, DE has demonstrated its capability to find good solutions to a wide variety of different benchmark problems as well as practical optimization problems while comparing favourably to other EAs in terms of convergence speed (Price et al. 2005).

The basic step of DE consists of updating a population of p + 1 decision vectors \(P=\{{\mathbf x}^0,{\mathbf x}^1,\ldots,{{\mathbf x}^p}\}\). For each vector \({{\mathbf x}^t}\in P\) in turn, a trial vector \(\hat{{\mathbf x}}^t\) is generated. The vector \({{\mathbf x}^t}\) is called a target vector because it will be replaced in P by the trial vector \(\hat{{\mathbf x}}^t\) for the next iteration if \(\hat{{\mathbf x}}^t\) yields a better objective function value than \({\mathbf x}^t\). First, a subset of three decision vectors \(Q =\{{\mathbf x}^{r_1},{\mathbf x}^{r_2}, {\mathbf x}^{r_3}\}\subset P\) is selected using three distinct random indices \(r_i\), \(0 \le r_i \le p\), i = 1, 2, 3. In the original DE mutation, a trial vector \(\hat{{\mathbf x}}^t\) is obtained as a crossover of \({\mathbf x}^t\) and a mutated vector \(\bar{{\mathbf x}}^t={\mathbf x}^{r_1}+F({\mathbf x}^{r_3}-{\mathbf x}^{r_2})\), where F > 0 is a scaling factor. In what follows, we refer to this mutation operator as LIMO (linear mutation operator). For different crossover operators and many variants of DE, see, for example, Price et al. (2005).
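A minimal sketch of this trial vector generation (ours, not the authors' code), using binomial crossover as one common choice; the paper does not fix a particular crossover, and F and CR are standard DE notation:

```python
import numpy as np

rng = np.random.default_rng(0)

def de_trial(pop, t, F=0.5, CR=1.0):
    """One DE trial vector for target index t: LIMO mutation
    x^{r1} + F (x^{r3} - x^{r2}) followed by binomial crossover."""
    p, n = pop.shape
    # three distinct random indices, all different from the target index
    r1, r2, r3 = rng.choice([i for i in range(p) if i != t], size=3, replace=False)
    mutant = pop[r1] + F * (pop[r3] - pop[r2])
    # binomial crossover: each coordinate comes from the mutant with
    # probability CR; one randomly chosen coordinate always does
    mask = rng.random(n) < CR
    mask[rng.integers(n)] = True
    return np.where(mask, mutant, pop[t])
```

With CR = 1.0, as used later in the experiments, the trial vector equals the mutated vector.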

The use of perturbations, the difference vectors in this case, derived from the current population instead of utilizing an external probability distribution makes the mutation operator of DE self-adaptive (Price et al. 2005). That is, both the scale and orientation of the search are adapted to the extent of the current population. The self-adaptation in DE works especially well if all the interdependencies between decision variables in the problem are linear, but can fail to extract information of any nonlinear interdependencies between decision variables based on the relative positioning of the points in the population (Ruuska and Aittokoski 2008).

3 New hybrid operator and its elements

In this section, we first introduce a polynomial-based operator and then a new hybrid operator which utilizes both the polynomial-based operator and the original LIMO operator of DE. We begin with a brief introduction to interpolation with polynomials.

3.1 Brief background of interpolation with polynomials

In general, a polynomial

$$ p(x) = c_dx^d + c_{d-1}x^{d-1} + \cdots + c_1x + c_0 $$
(2)

can be fitted to data consisting of pairs \({(x_i, y_i) \in {\mathbb R} \times {\mathbb R}}\), \(i = 1, \ldots, d+1\), so that it interpolates the pairs, that is, \(p(x_i) = y_i\) for each i. As is commonly known, the polynomial which interpolates the given pairs and whose degree is at most the number of pairs minus one is always unique (see, for example, Kincaid and Cheney 2002). Here, for deriving an integral element of our hybrid operator, the polynomial operator, we use the Vandermonde matrix (Kincaid and Cheney 2002) to determine the coefficients \(c_j\). This choice has been made because of its simplicity. However, the Vandermonde matrix may be ill-conditioned and therefore the coefficients \(c_j\) may be inaccurate (Kincaid and Cheney 2002). In what follows, we only consider second-degree polynomials (d = 2 in Eq. 2) to avoid ill-conditioned Vandermonde matrices. Another benefit of second-degree polynomials is that we can express the formulas of the coefficients \(c_j\) using only the values \(x_i\) and \(y_i\), i = 1, 2, 3.

For given pairs \((x_1, y_1)\), \((x_2, y_2)\) and \((x_3, y_3)\), the coefficients \(c_2\), \(c_1\) and \(c_0\) of a second-degree polynomial p in Eq. 2, obtained via the Vandermonde matrix, are the following:

$$ \begin{aligned} c_2 &= \frac{x_3(y_2 - y_1) + x_2 (y_1 - y_3) + x_1 (y_3 - y_2)}{(x_1 - x_2) (x_1 - x_3) (x_2 - x_3)}, \\ c_1 &= \frac{x_3^2 (y_1 - y_2) + x_1^2 (y_2 - y_3) + x_2^2 (y_3 - y_1)}{(x_1 - x_2) (x_1 - x_3) (x_2 - x_3)} {\hbox{and}} \\ c_0 &= \frac{x_2 x_3 (x_2 - x_3) y_1 + x_1 x_3(x_3 - x_1) y_2 + x_1 x_2 (x_1 - x_2) y_3}{(x_1 - x_2) (x_1 - x_3) (x_2 - x_3)}. \end{aligned} $$

Note that the degree of the polynomial p can be smaller than the number of the given pairs. For example, if \(y_1 = y_2 = y_3\), then we get \(c_2 = c_1 = 0\) and \(c_0 = y_1\) (generally, this holds also for any d > 2 in Eq. 2, see Kincaid and Cheney 2002).

It is well known that interpolation methods based on polynomials are not restricted to the above type of interpolation. However, here we consider this simple approach, because we want to use a small number of data pairs. Other types of polynomial-based interpolations like splines can be found, for example, in Kincaid and Cheney (2002) with further discussions.
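To make the formulas above concrete, the following sketch (ours, not part of the paper) evaluates them directly and checks two cases: a genuine quadratic and the degenerate case \(y_1 = y_2 = y_3\):

```python
def quad_coeffs(x, y):
    """Coefficients (c2, c1, c0) of the unique polynomial of degree at
    most two through (x1,y1), (x2,y2), (x3,y3), using the closed forms
    derived from the Vandermonde system."""
    (x1, x2, x3), (y1, y2, y3) = x, y
    den = (x1 - x2) * (x1 - x3) * (x2 - x3)
    c2 = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / den
    c1 = (x3**2 * (y1 - y2) + x1**2 * (y2 - y3) + x2**2 * (y3 - y1)) / den
    c0 = (x2 * x3 * (x2 - x3) * y1 + x1 * x3 * (x3 - x1) * y2
          + x1 * x2 * (x1 - x2) * y3) / den
    return c2, c1, c0
```

For example, the points (0, 3), (1, 4) and (2, 9) lie on \(2x^2 - x + 3\), so the formulas return (2, -1, 3); for three equal y values the quadratic and linear coefficients vanish, as noted above.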

3.2 Polynomial-based point tracking in a higher dimension

In this subsection, we fit a polynomial-based curve to three randomly selected decision vectors \({\mathbf x}^0, {\mathbf x}^1, {\mathbf x}^2 \in P\) so that the curve interpolates the vectors. In the previous subsection, the polynomial p in Eq. 2 was constructed only for real values \(y_1\), \(y_2\) and \(y_3\). To fit a curve to the n-dimensional vectors \({\mathbf x}^0\), \({\mathbf x}^1\) and \({\mathbf x}^2\), we use n second-degree polynomials.

A curve \({\mathbf p}\) created for \(Q = \{{\mathbf x}^0, {\mathbf x}^1, {\mathbf x}^2\}\) is a function from \({{\mathbb R}}\) into \({{\mathbb R}^n},\)

$$ {\mathbf p}(t) = (p_1(t), p_2(t), \ldots, p_n(t))^{T}, $$
(3)

where \(p_i\) is a polynomial from \({{\mathbb R}}\) into \({{\mathbb R}}\) for each \(i = 1, \ldots, n\). The polynomials \(p_i\), stated as

$$ p_i(t) = c_2^it^2 + c_1^it^1 + c_0^i, $$
(4)

are selected so that the polynomial \(p_i\) interpolates the pairs \((0, x_i^0)\), \((1, x_i^1)\) and \((2, x_i^2)\) for all \(i = 1, \ldots, n\). In this case, the coefficients \(c_2^i\), \(c_1^i\) and \(c_0^i\) are the following:

$$ \begin{aligned} c_2^i &= \frac{x_i^0 - 2x_i^1 + x_i^2}{2}, \\ c_1^i &= \frac{4x_i^1 - 3x_i^0 - x_i^2}{2} \, {\hbox{and}} \\ c_0^i &= x^0_i. \end{aligned} $$
(5)
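As an illustration (our own sketch, not the authors' code), the coefficients of Eq. 5 define the curve \({\mathbf p}\) componentwise, and the interpolation conditions can be verified numerically:

```python
import numpy as np

def pomo_curve(x0, x1, x2):
    """Quadratic curve p: R -> R^n through x0 at t=0, x1 at t=1 and
    x2 at t=2, built componentwise from the coefficients of Eq. 5."""
    x0, x1, x2 = (np.asarray(v, dtype=float) for v in (x0, x1, x2))
    c2 = (x0 - 2.0 * x1 + x2) / 2.0
    c1 = (4.0 * x1 - 3.0 * x0 - x2) / 2.0
    c0 = x0
    return lambda t: c2 * t**2 + c1 * t + c0
```

For example, the curve through (0, 0), (1, 2) and (2, 0) in the plane returns exactly those points at t = 0, 1 and 2 and bends between them.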

3.3 A polynomial part for the new operator

In the previous subsection, we have constructed a curve \({\mathbf p}\) to trace the path of \({\mathbf x}^0, {\mathbf x}^1, {\mathbf x}^2 \in Q\), and next we propose to use this path information to generate new trial vectors. Once the polynomials \(p_1, p_2, \ldots, p_n\) have been found based on Q, we obtain a trial vector as \({\mathbf p}(t)\) with an appropriate t value. In the context of DE, we use \({\mathbf p}\) defined in Eqs. 3–5 as a new POlynomial based Mutation Operator, POMO, which will be used in our hybrid operator.

As said, the POMO operator uses t as a parameter to generate new trial vectors, and depending on the t value POMO generates them in different ways. According to the previous subsection, the polynomials \(p_i\) are fitted to Q with t values 0, 1 and 2. Once the formula for \(\mathbf p\) as a function of t has been obtained, it can be used to generate a new trial vector with any value of t. With a t value between 0 and 2, the curve \({\mathbf p}\) interpolates, and with a t value higher than 2, it extrapolates. We limit our consideration to values between 2 and 3 because we do not want to move too far from the original vectors. In the case of interpolation, POMO produces a new trial vector between \({\mathbf x}^0\) and \({\mathbf x}^1\) or between \({\mathbf x}^1\) and \({\mathbf x}^2\), depending on the selected interval. In this way, POMO aims to distribute trial vectors more uniformly between \({\mathbf x}^0\), \({\mathbf x}^1\) and \({\mathbf x}^2\). To generate a new trial vector beyond \({\mathbf x}^0\), \({\mathbf x}^1\) and \({\mathbf x}^2\), one can use the extrapolation version of POMO.

Let us consider three decision vectors \({\mathbf x}^0\), \({\mathbf x}^1\) and \({\mathbf x}^2\) in the decision space as shown in Fig. 1. If we apply LIMO at \({\mathbf x}^0\), we generate a new trial vector, denoted by \({\mathbf x}^L\), based on the vectors \({\mathbf x}^1\) and \({\mathbf x}^2\). When using POMO instead, we can exploit the point tracking property of \({\mathbf p}\) and extract the curvature information between the decision vectors \({\mathbf x}^0\), \({\mathbf x}^1\) and \({\mathbf x}^2\) to obtain vector \({\mathbf x}^i\) or \({\mathbf x}^e\). In Fig. 1, vector \({\mathbf x}^i\) is produced by POMO with a t value of 1.5. In this case, vector \({\mathbf x}^i\) lies between \({\mathbf x}^1\) and \({\mathbf x}^2\) as a result of the interpolation property of POMO. On the other hand, vector \({\mathbf x}^e\) has been generated with a t value of 2.5 and it aims at expanding the search area, guided by the vectors \({\mathbf x}^0\), \({\mathbf x}^1\) and \({\mathbf x}^2\).

Fig. 1 Trial vector generation in the decision space using POMO and LIMO

As seen in the fitting phase of the polynomials \(p_i\), the coefficients \(c_j\) require an order for the vectors \({\mathbf x}^0\), \({\mathbf x}^1\) and \({\mathbf x}^2\). Note that these vectors are randomly selected, and the superscripts are used here only to identify them. Three randomly selected vectors can be ordered in six different ways, and the coefficients \(c_j\) can be calculated for any of these orders. However, it is very difficult to say beforehand which order is the best one for the polynomials \(p_i\). For simplicity, we propose to use the order in which the vectors \({\mathbf x}^0\), \({\mathbf x}^1\) and \({\mathbf x}^2\) were selected. In other words, \({\mathbf x}^0\) is the first randomly selected vector and is assigned the fitting t value 0, and the vectors \({\mathbf x}^1\) and \({\mathbf x}^2\) are assigned the t values 1 and 2, respectively.

3.4 New hybrid mutation operator

The variable interdependencies in any practical problem are a priori unknown. Here, we propose a hybrid mutation operator, referred to as HOP (Hybrid OPerator), comprising both a POMO and a LIMO part. For any DE-based EMO algorithm, we suggest using HOP so that LIMO is applied with a probability \(P_c\) and POMO otherwise. HOP is a versatile operator for an EMO algorithm because it generates trial vectors in different, yet simple ways.

As mentioned earlier, POMO can act as an extrapolation or an interpolation operator depending on the value of t. We suggest using both extrapolation and interpolation with a prefixed probability in POMO. In what follows, the extrapolation probability is denoted by \(P_{extra}\) and the interpolation probability by \(P_{inter}\). Naturally, we need only one of them because \(P_{extra} + P_{inter} = 1\). The extrapolation operator behaves like an exploration operator and has a similar role as LIMO. The interpolation operator, on the other hand, exploits local nonlinear interdependencies.

A balance between extrapolation and interpolation is essential, and it can be explained with the concepts of exploration and exploitation. Although these concepts are usually used in the literature to characterize optimization methods as a whole, they apply equally to POMO. In this paper, we aim at striking a balance between exploration of the decision space and exploitation of the curvature information in the sample of decision vectors. POMO is constructed based on the principle of interpolation, and hence it may not capture the exact nonlinear behaviour when trial vectors are generated by extrapolation. In other words, if POMO is used in HOP with a probability \(P_{extra} = 1.0\), it shows exploration behaviour. On the other hand, if POMO is used in HOP with \(P_{extra} = 0\), that is, \(P_{inter} = 1.0\), meaning only interpolation, then it exploits the curvature information between the chosen decision vectors for trial vector generation. Using only extrapolation (exploration) can lead to slow convergence, whereas performing only interpolation (exploitation) can lead to premature convergence; as a result, POMO can be trapped in locally Pareto optimal fronts. Thus, instead of using either extrapolation or interpolation alone, we recommend using them both. A detailed parametric study is performed in the next section for the MOEA/D algorithm to suggest a range for choosing \(P_{inter}\) (and \(P_{extra}\)).

We propose to replace the trial point generation of the LIMO operator by the hybrid operator HOP which can be described as follows:

  • Generate a random number r between 0 and 1.

  • If r ≤ \(P_c\) (= 0.75 in our experiments), set \(\hat{\mathbf x}^t={\mathbf x}^0+F({\mathbf x}^2-{\mathbf x}^1)\)

  • Else set \(\hat{{\mathbf x}}^t={\mathbf p}(t)\), where t is randomly selected

    • between 0 and 2 (interpolation) if a random number drawn for the choice is below \(P_{inter}\), and

    • between 2 and 3 (extrapolation) otherwise.

Because of its simple structure, LIMO can be easily replaced with HOP in any appropriate EMO algorithm.
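The steps above can be sketched as follows; this is our own illustration, with the random selection of \({\mathbf x}^0\), \({\mathbf x}^1\), \({\mathbf x}^2\) and the function signature as assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def hop_trial(pop, t_idx, F=0.5, Pc=0.75, Pinter=0.75):
    """Sketch of HOP: with probability Pc apply the linear DE mutation
    (LIMO); otherwise evaluate the interpolating quadratic curve (POMO)
    at a random t in (0, 2) (interpolation, probability Pinter) or in
    (2, 3) (extrapolation)."""
    candidates = [i for i in range(len(pop)) if i != t_idx]
    x0, x1, x2 = pop[rng.choice(candidates, size=3, replace=False)]
    if rng.random() <= Pc:                        # LIMO branch
        return x0 + F * (x2 - x1)
    c2 = (x0 - 2.0 * x1 + x2) / 2.0               # POMO branch: fit p(t)
    c1 = (4.0 * x1 - 3.0 * x0 - x2) / 2.0
    t = rng.uniform(0.0, 2.0) if rng.random() < Pinter else rng.uniform(2.0, 3.0)
    return c2 * t**2 + c1 * t + x0
```

Because both branches share the same three parent vectors, LIMO can simply be swapped for HOP wherever DE's trial generation is called.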

4 Numerical experiments

We have proposed a new hybrid operator HOP, and in this section we demonstrate how it can be used with the MOEA/D algorithm (Zhang et al. 2009a) and what advantage can be obtained with the new operator when compared with the original algorithm. We have selected MOEA/D for the tests because of its superior performance in the CEC 2009 competition. Since MOEA/D is a high-performance algorithm, one cannot expect drastic improvements in its performance; if we nevertheless show a statistically significant improvement, the strength of the new operator is supported. We first present a brief summary of the EMO algorithm used, that is, MOEA/D, and subsequently present the results of the parameter study together with suggestions for choosing the parameter values. Finally, we present the results of numerical tests comparing HOP with the original LIMO-based MOEA/D. Here, we consider all the test problems of the CEC 2007 and CEC 2009 (Huang et al. 2007; Zhang et al. 2009a, b) EMO competitions, as these test problems were designed to represent various complicated problems and are accepted for comparing different EMO algorithms. These sets contain 14 bi-objective problems, nine problems with three objectives and eight problems with five objectives. In Table 1, we present the test problems with their corresponding numbers of variables and objectives and the type of dependency in their Pareto sets.

Table 1 Test problem instances

Zhang and Li (2007) proposed MOEA/D, a high-performance EMO algorithm based on scalarizing functions. Subsequently, a new version of the MOEA/D algorithm was proposed in Zhang et al. (2009a). The fundamental principle behind MOEA/D is that a multiobjective problem is scalarized into a number of single-objective subproblems with distinct weight vectors, which are solved simultaneously. This is motivated by the fact that any Pareto optimal solution can be obtained by minimizing an appropriate scalarizing function (Miettinen 1999). A weighted Chebyshev approach (Miettinen 1999) is used in the MOEA/D algorithm for scalarization.
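The weighted Chebyshev scalarizing function takes the form \(g({\mathbf x}\,|\,\lambda, {\mathbf z}^*) = \max_i \lambda_i |f_i({\mathbf x}) - z_i^*|\) for a weight vector λ and an ideal point \({\mathbf z}^*\); a minimal sketch (MOEA/D's exact variant may differ in details):

```python
import numpy as np

def chebyshev(fx, weights, z_star):
    """Weighted Chebyshev scalarizing value of an objective vector fx
    for a weight vector and an ideal point z_star (minimization)."""
    fx = np.asarray(fx, dtype=float)
    return float(np.max(np.asarray(weights) * np.abs(fx - np.asarray(z_star))))
```

Minimizing this value for many well-spread weight vectors yields solutions in different parts of the Pareto front.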

MOEA/D employs a set of N individuals and N uniformly distributed weight vectors, where N is the population size. Each of these weight vectors is used to formulate a single-objective problem, that is, a subproblem. A neighbourhood is defined for each individual in the population based on the distances between the weight vectors. Genetic operations (i.e., selection, crossover and mutation) are performed within each of the neighbourhoods. An offspring is generated by crossover and mutation with randomly selected parents in the neighbourhood. This offspring is then compared with the parents and each of the neighbours based on the value of the scalarizing function, and if it is better than an individual used in the comparison, it replaces that individual. The algorithm terminates when a suitable termination criterion is met. The MOEA/D variant used in this paper applies dynamic resource allocation to assign different amounts of computational effort to different subproblems. A complete description of the algorithm is provided in Zhang et al. (2009a).

The MOEA/D algorithm employs the linear DE mutation operator LIMO as a crossover operator. The offspring generated by crossover is subsequently mutated using a mutation operator (Zhang et al. 2009a) with a given probability to produce a new offspring. For our comparisons, we replace the crossover operator of MOEA/D with HOP; the rest of the algorithm remains unchanged. In this paper, we refer to the modified MOEA/D algorithm with HOP as MOEA/D-HOP. As described in the previous section, inside MOEA/D-HOP we have a probability of using either POMO or LIMO.

4.1 Test setting

The MOEA/D algorithm as suggested in Zhang et al. (2009a) is used as the EMO algorithm in the comparisons with MOEA/D-HOP. The implementation of the MOEA/D algorithm available at http://dces.essex.ac.uk/staff/qzhang/moeacompetition09.htm is used in the tests. The test problems include all the 31 box-constrained test problems of the CEC 2007/2009 competitions. To have a fair comparison with the competition results, we adopt the performance metric and parameter settings of the CEC 2009 competition. Next, we summarize the parameter settings used for all subsequent tests:

  1.

    Performance metric: Inverted generational distance (IGD) is used as the performance metric. If P* is a set of uniformly distributed Pareto optimal solutions in the objective space and P the approximation set of nondominated solutions in the objective space obtained by the EMO algorithm, the IGD value of the approximation set is defined by

    $${\text{ IGD}}(P,P^{*}) = \frac{\sum_{{\mathbf{v}}\in P^{*}} d({\mathbf{v}},P)}{|P^{*}|} $$
    (6)

    where \(d(\mathbf{v},P)\) is the minimum Euclidean distance between \(\mathbf{v}\) and the points in P, and |P*| is the number of points in P*.

  2.

    The maximal number of approximate solutions (|P|) for the IGD value calculation: 100 for bi-objective problems, 150 for three objectives, and 800 for five objectives.

  3.

    Algorithm stopping criterion is the maximal number of function evaluations = 300,000.

  4.

    The number of independent runs for statistical tests = 30.
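The IGD metric of Eq. 6 above can be computed as in the following sketch (our illustration, not the competition code):

```python
import numpy as np

def igd(P, P_star):
    """Inverted generational distance (Eq. 6): the average, over the
    reference set P*, of the distance to the nearest point of the
    approximation set P."""
    P = np.asarray(P, dtype=float)
    P_star = np.asarray(P_star, dtype=float)
    # pairwise Euclidean distances, shape (|P*|, |P|)
    d = np.linalg.norm(P_star[:, None, :] - P[None, :, :], axis=2)
    return float(d.min(axis=1).mean())
```

For instance, with P = {(0, 0), (1, 1)} and P* = {(0, 0), (1, 1), (2, 2)}, only the point (2, 2) contributes, giving IGD = √2/3.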

In what follows, we present the parameter settings specific to the MOEA/D algorithm. We want to point out that the parameter settings for both MOEA/D and MOEA/D-HOP are the same as in Zhang et al. (2009a).

  1.

    Population size (N): 600 for bi-objective problems, 1000 for three objectives, and 1500 for five objectives.

  2.

    Number of weight vectors in the neighbourhood of each weight vector (T) = 0.1N and \(n_r\) = 0.01N.

  3.

    Probability for selection of mating/update range δ = 0.9.

  4.

    Crossover ratio (CR) = 1.0 and scale factor (F) = 0.5. Mutation probability \(P_m\) = 1/n and mutation distribution index η = 20.

The parameters for HOP are the following.

  1.

    Probability of the LIMO operator \(P_c\) = 0.75 (heuristically set based on experiments).

  2.

    Probability of interpolation \(P_{inter}\) and of extrapolation (\(P_{extra} = 1 - P_{inter}\)) set based on an experimental study, which is described later in this section.

  3.

    Parameter t chosen randomly in the interval 0–2 for interpolation and 2–3 for extrapolation.

It can be seen from the above parameter settings that the final population (denoted by Π) may contain more individuals than allowed for the calculation of IGD values. Hence, we prune the population Π to match the competition specifications using the procedure proposed in Zhang et al. (2009a) and obtain the final approximation set P.

For each test problem, when comparing the MOEA/D and MOEA/D-HOP algorithms, we record the best, median and worst IGD values from 30 independent runs (note that the smaller the IGD value, the better). The IGD values of both algorithms are subsequently used to perform the Wilcoxon rank sum test (Gibbons and Chakraborti 2003), a non-parametric statistical test, at the 5% significance level. The median IGD value of the MOEA/D algorithm is used as the reference value when declaring a result a significant success or a significant failure of MOEA/D-HOP against MOEA/D. The p-value is the probability of observing the given sample result under the assumption that the null hypothesis is true. The null hypothesis here is \(H_0: A = B\), where A and B are the median IGD values of the MOEA/D and MOEA/D-HOP algorithms, respectively. The Wilcoxon rank sum test is applied separately to each test problem to determine whether the observed difference in performance is statistically significant. A significant test result is declared a significant success if the median IGD value of MOEA/D-HOP is less than the reference value, and a significant failure if it is greater. The overall performance of the MOEA/D-HOP algorithm is judged based on the total number of significant successes achieved over the MOEA/D algorithm.
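As an illustration of the testing procedure (ours, not the authors' code), the rank sum statistic and a two-sided p-value can be obtained with the large-sample normal approximation, which is adequate for 30 runs per algorithm; in practice a statistics library would typically be used:

```python
import math

def ranksum_test(a, b):
    """Two-sided Wilcoxon rank-sum test via the large-sample normal
    approximation (midranks for ties, no tie correction in the variance).
    Returns (W, p), where W is the rank sum of sample a."""
    n1, n2 = len(a), len(b)
    pooled = sorted(list(a) + list(b))

    def midrank(v):
        below = sum(x < v for x in pooled)
        equal = sum(x == v for x in pooled)
        return below + (equal + 1) / 2.0

    W = sum(midrank(v) for v in a)                     # rank sum of sample a
    mu = n1 * (n1 + n2 + 1) / 2.0                      # mean of W under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # std of W under H0
    z = (W - mu) / sigma
    # two-sided p-value from the standard normal distribution
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return W, p
```

A result is declared significant when the returned p-value falls below the 0.05 level used above.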

4.2 Test results

The POMO part of HOP is used for interpolation with a fixed probability \(P_{inter}\) and for extrapolation otherwise. Statistical tests were performed with MOEA/D-HOP and \(P_{inter}\) equal to 0.00, 0.25, 0.50, 0.75 and 1.00 for all test problems. For each value of \(P_{inter}\), we count the number of significant successes \(N_{win}\). The particular \(P_{inter}\) for which \(N_{win}\) is maximal is considered an ideal setting for \(P_{inter}\). In Tables 2, 3, 4, 5 and 6 we summarize the test results obtained for the different values of \(P_{inter}\). They contain the best, median and worst IGD values for both MOEA/D and MOEA/D-HOP and the p-values for all 31 test problems. A significant success of either algorithm by the Wilcoxon rank sum test is marked in bold face.

Table 2 MOEA/D-HOP with 0% interpolation and 100% extrapolation
Table 3 MOEA/D-HOP with 25% interpolation and 75% extrapolation
Table 4 MOEA/D-HOP with 50% interpolation and 50% extrapolation
Table 5 MOEA/D-HOP with 75% interpolation and 25% extrapolation
Table 6 MOEA/D-HOP with 100% interpolation and 0% extrapolation

In Table 2, the \(P_{inter}\) value is 0.00; in other words, we only extrapolate to generate trial vectors in the POMO part. It can be observed that the numbers of significant successes of MOEA/D and MOEA/D-HOP are 10 and 6, respectively (denoted in bold face). The IGD value difference between the algorithms is insignificant for 15 test problems.

In what follows, we consider using interpolation in POMO as well. The results for P_inter = 0.25 are shown in Table 3. A p-value above the 5% significance level, indicating an insignificant difference, is observed for 19 test problems, and the number of significant failures of MOEA/D-HOP drops to 5, as against 10 in Table 2. This shows the importance of interpolation in POMO. However, the IGD values for this parameter setting do not yet show any significant performance enhancement for the MOEA/D-HOP algorithm. Next, we further increase P_inter to 0.50, so that extrapolation and interpolation in POMO are used with equal probability. The results in Table 4 show a further increase in the number of significant successes of MOEA/D-HOP, to 12, while the performance difference remains insignificant for 14 test problems.

Next, P_inter is increased to 0.75, giving a higher probability of performing interpolation in POMO. In this setting, the number of significant successes of MOEA/D-HOP increases to 14 and the number of insignificant differences reduces to 12, as shown in Table 5. One can also see that for P_inter values of 0.25, 0.50 and 0.75, the number of significant failures of MOEA/D-HOP remains 5; hence, as the share of interpolation increases, so does the number of significant successes. Finally, when P_inter is 1.00, in other words, when trial vectors in the POMO part are generated by interpolation only, the number of significant successes of MOEA/D-HOP decreases to 9, as can be seen in Table 6, and the number of significant failures increases to 7. Thus, it can be concluded from these studies that the extremes of pure extrapolation or pure interpolation in POMO degrade the performance of the MOEA/D-HOP algorithm, and that both extrapolation and interpolation are necessary for its best performance. Based on the experiments, we suggest setting P_inter to 0.75.

Furthermore, out of curiosity, we repeated the tests with P_inter = 0.67. From Table 7 we can observe that the number of significant successes of MOEA/D-HOP is 13, so its performance deteriorates only marginally. Hence, the algorithm is rather stable for values of P_inter between 0.50 and 0.75. In the discussion and conclusions that follow, we consider only Table 5 (with P_inter = 0.75).
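The parameter sweep above can be summarized programmatically; the N_win counts below are taken directly from the results just discussed (Table 3 reports no explicit success count, so P_inter = 0.25 is omitted):

```python
# Number of significant successes N_win of MOEA/D-HOP over MOEA/D for
# the P_inter values whose counts are reported in the text.
n_win = {0.00: 6, 0.50: 12, 0.67: 13, 0.75: 14, 1.00: 9}

# The suggested setting is the P_inter value maximizing N_win.
best_p_inter = max(n_win, key=n_win.get)   # -> 0.75
```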

Table 7 MOEA/D-HOP with 67% interpolation and 33% extrapolation

It can be seen from Table 5 that MOEA/D-HOP performs equally well on problems with linear and nonlinear interdependencies between decision variables. Typically, MOEA/D-HOP performs better than or on par with the LIMO-based MOEA/D algorithm, which gives us confidence in using the hybrid operator HOP: by employing HOP, the solutions typically do not get worse than the original ones. In the case of more than two objectives, Table 5 shows that MOEA/D-HOP outperforms MOEA/D. Even for the worst IGD values of the 30 runs, MOEA/D-HOP is better than MOEA/D on 22 of the 31 test problems. This shows that the HOP-based algorithm can be a safe and more reliable choice for solving multiobjective optimization problems.

In the CEC 2009 competition, the MOEA/D algorithm with the LIMO operator performed poorly on the UF2, UF4, UF5, UF9 and UF10 test problems compared with some of its competitors. Notably, the MOEA/D-HOP algorithm shows significantly improved performance on the UF2, UF4 and UF5 test problems and insignificant improvement on the UF9 and UF10 test problems. Moreover, with the MOEA/D-HOP algorithm, the median IGD value for the UF2 test problem is 0.0060, which is even better than the best value in the competition (0.00615). It must be noted that UF2 has a nonlinear Pareto set, and for this reason it may be very difficult for any linear operator to generate trial vectors covering the entire Pareto set. This observation supports our idea of the importance of having operators that are robust and not purely linear.

In Fig. 2, we show the nondominated sets of the UF2 test problem corresponding to the median IGD values of both the MOEA/D-HOP and MOEA/D algorithms. Both algorithms seem to provide a good spread of solutions on the Pareto front. However, on closer inspection one can spot a hole in the nondominated set produced by the MOEA/D algorithm: the LIMO operator has failed to generate any trial vectors in that region. A magnified illustration of this hole is also shown in Fig. 2. The nondominated set of the MOEA/D-HOP algorithm has no holes, as the HOP operator has generated trial vectors throughout the region.

Fig. 2

Nondominated sets of the UF2 test problem for MOEA/D and MOEA/D-HOP representing median IGD values

In addition, Fig. 3 depicts the nondominated sets corresponding to the worst IGD values of both the MOEA/D and MOEA/D-HOP algorithms on the UF2 test problem. The plot shows a much smaller discontinuity in the nondominated set of MOEA/D-HOP than in that of MOEA/D. Figures 2 and 3 demonstrate the reliability of the MOEA/D-HOP algorithm in handling Pareto sets as compared to the MOEA/D algorithm.

Fig. 3

Nondominated sets of the UF2 test problem for MOEA/D and MOEA/D-HOP representing worst IGD values

5 Conclusions and future research directions

In this paper, we have developed a new hybrid operator, HOP, consisting of the standard mutation operator LIMO of DE and a polynomial-based mutation operator POMO. Together, these two elements form a mechanism that uses the curvature information contained in the presently known vectors when calculating new trial vectors, and the resulting operator is robust across different types of problems. Yet the proposed operator is simple to implement.
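As a minimal illustration of the curvature idea, the sketch below evaluates the componentwise quadratic through three decision vectors placed at parameter values t = 0, 1, 2; the exact construction used by POMO (the choice of vectors and of t) is as described earlier in the paper, and `quadratic_trial` is an illustrative name:

```python
def quadratic_trial(x0, x1, x2, t):
    """Componentwise Lagrange quadratic through (0, x0), (1, x1), (2, x2),
    evaluated at parameter t: values of t inside (0, 2) interpolate
    between the three vectors, values outside [0, 2] extrapolate along
    the detected curvature."""
    return [a * (t - 1) * (t - 2) / 2 - b * t * (t - 2) + c * t * (t - 1) / 2
            for a, b, c in zip(x0, x1, x2)]

# Three vectors sampled from the curve (u, u**2): extrapolating with
# t = 3 yields (3, 9), which again lies on the curve, whereas a linear
# difference-vector step from the same points would leave it.
trial = quadratic_trial([0.0, 0.0], [1.0, 1.0], [2.0, 4.0], 3.0)  # [3.0, 9.0]
```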

In HOP, we have proposed to use both the POMO and LIMO operators, and both interpolation and extrapolation inside POMO, for trial vector generation. We demonstrated the efficacy of the new operator through comparisons with MOEA/D, a winning algorithm of a recent CEC competition, which uses the linear operator LIMO. With the new operator, we observed significant improvements over the original MOEA/D algorithm on various types of problems. Although drastic improvements could not be expected, since MOEA/D is already a high-performance algorithm, the new operator provided robust performance and better results on many test problems.

In future research, the choice of a proper t value can be considered further, including the possibility of ordering the vectors used in the construction phase of the polynomial. In addition, a self-adaptive way of using the LIMO and POMO operators for multiobjective optimization may be formulated instead of a pre-fixed probability. By adaptively using different operators, we can better handle demanding real-life multiobjective optimization problems, for which the properties of the Pareto optimal set are typically not known in advance. Studying the performance of the operator with other EMO algorithms is also part of our future work.