1 Introduction

Optimization is a key topic in computer science, artificial intelligence, operations research and several other related fields (Corne et al. 1999). In these fields, optimization is the process of trying to find the best possible solution to a problem. Mathematically, an optimization problem with constraints can be formulated as the process of:

(1)

where n is the number of objectives to optimize, S is the set of potential solutions, J is the number of inequality constraints expressed in the form \(g_j(x) \ge 0\), and K is the number of equality constraints expressed in the form \(h_k(x) = 0\). Both the goal of the process as well as the design of the optimizers are highly influenced by the use of one or several objectives. Thus, most taxonomies distinguish between single-objective optimization \((n = 1)\) and multi-objective optimization \((n > 1)\). However, since the ability of many multi-objective approaches is severely deteriorated by an increase in the number of objectives (Khare et al. 2003; Knowles and Corne 2007), a further distinction is made to refer to problems with four or more objectives. Such multi-objective problems having more than three objectives are often referred to as many-objective problems (Purshouse and Fleming 2007; Ishibuchi et al. 2008).

Several exact approaches have been designed to deal with optimization problems. However, exact approaches are unaffordable for many real world applications, resulting in the development of a wide variety of heuristics and metaheuristics. Their main aim is to obtain good quality solutions in a limited amount of time (Glover and Kochenberger 2003; Talbi 2009). Among them, Evolutionary Algorithms (eas) (Eiben and Smith 2008) have become a popular choice for solving different types of optimization problems. These eas involve a set of population-based methods which draw their inspiration from biological evolution. eas have shown great promise for yielding solutions for large and difficult optimization problems.

eas were initially developed in an effort to tackle unconstrained single-objective optimization problems. However, since their inception, a great deal of research has been conducted to adapt them to other types of problems. For instance, Multi-Objective Evolutionary Algorithms (moeas) adapt eas for dealing with multi-objective optimization problems (Deb 2001; Coello and Lamont 2004; Coello et al. 2007). This has been a very active research area in recent decades, as a result of which several moeas have been proposed in the literature (Zhou et al. 2011). In multi-objective optimization the aim is to obtain a set of trade-off solutions, rather than a single (best overall) solution, as in single-objective optimization. The optimization goal of multi-objective solvers involves several challenges (Zitzler et al. 2000). First, the distance of the resulting non-dominated set to the true Pareto Front should be minimized. A good distribution of the solutions found is also desirable. Finally, the extent of the non-dominated front should also be maximized. In order to fulfill these requirements, most moeas try to maintain a proper diversity in their population. Most moeas emphasize diversity in the objective function space (Coello et al. 2007), and in some cases in the space of the variables, so a number of mechanisms have been proposed to this end [e.g., fitness sharing (Deb and Goldberg 1989), the crowding operator (Deb et al. 2002), clustering (Toscano Pulido and Coello 2004), adaptive grids (Knowles and Corne 2003) and entropy (Wang et al. 2010) among others]. Such diversity maintenance schemes are generically called “density estimators” and are one of the main components of most modern moeas.

Considering these intrinsic properties of most moeas, several authors have claimed that the use of multi-objective solvers might be helpful for single-objective optimization as well (Abbass and Deb 2003). For this reason, moeas have been applied—with different guidelines—to solve single-objective optimization problems. The application of moeas to single-objective optimization can be mainly grouped into three different types of methods:

  • Methods that transform a constrained single-objective optimization problem into an unconstrained multi-objective optimization problem.

  • Methods that consider diversity as an objective.

  • Schemes termed as “multiobjectivization” whose aim is to transform a single-objective problem into a multi-objective problem by transforming its fitness landscape.

The main aim of this paper is to provide a comprehensive survey (see also Segura et al. (2013a)) of the application of moeas to single-objective optimization. Some papers on related population-based metaheuristics are also included. In addition, some lines of future work, as well as several open research topics, will be enumerated. The remainder of this paper is organized as follows. Section 2 describes the main proposals that use multi-objective concepts to solve single-objective problems with constraints. Section 3 is devoted to the methods that include diversity as an objective. The foundations of multiobjectivization and a review of the most important proposals are offered in Sect. 4. Finally, some possible future trends, as well as several open topics, are described in Sect. 5.

2 Constrained optimization

2.1 Foundations

Constrained optimization is the process of finding a feasible solution that optimizes one or several mathematical functions in a constrained search space. eas, in their original versions, lack a mechanism for incorporating constraints into their search process (Mezura-Montes and Coello 2008). However, many real-world optimization problems involve constraints (Venkatraman and Yen 2005). As a result, several proposals for dealing with constrained optimization problems have been devised. In fact, some comprehensive surveys (Coello 2002; Mezura-Montes and Coello 2011) and books (Mezura-Montes 2009) have already been published on this topic.

The most popular method for dealing with constrained search spaces in eas is the use of penalty functions. Penalty functions were originally devised by Courant in the 1940s (Courant 1943). The basic idea is to transform a constrained optimization problem into an unconstrained one by modifying the fitness function on the basis of the constraint violations present in each individual. Constraint violations are measured and are then used to penalize infeasible solutions, with the aim of favoring feasible solutions. The main drawback of penalty functions is the difficulty involved in finding a penalty function that is both effective and efficient. Penalty functions usually have several parameters that must be carefully tuned to adapt the scheme to a particular optimization problem. Thus, the use of penalty functions increases the number of free parameters that need to be tuned. It has been empirically demonstrated that the behavior of a penalty function may be extremely sensitive to its parameter values (Surry and Radcliffe 1997). Moreover, in some cases, no value for the parameters is adequate, which makes evident that some alternative (and more general) methods are desirable.

As a result, several other constraint-handling schemes have been proposed in the literature. Among them, some of the most well-known are the following:

  • Reject infeasible solutions (Back et al. 1997). This is probably the easiest way to allow the use of eas in constrained optimization. It can be considered as a particular case of penalty functions, where a zero fitness is assigned to any infeasible solution.

  • Apply repairing methods with the aim of transforming infeasible solutions into feasible ones (Liepins et al. 1990). The schemes are problem-dependent and it is not always easy to define such methods, so the major inconvenience of this approach is its lack of generality. Moreover, repair methods could considerably worsen the original function, failing to yield efficient results, or they might introduce a systematic bias into the search (Back et al. 1997).

  • Use a combination of evolutionary operators and encoding that never produce infeasible solutions (Esbensen 1995). This kind of scheme is highly dependent on the optimization problem. However, in those cases in which it can be applied, it might offer a great improvement. These methods are also referred to as greedy decoders.

  • Apply multi-objective methods, including as objectives the original function to be optimized and the constraints or a measure of their assessment (Mezura-Montes and Coello 2011). The application of multi-objective methods has the advantage of being more general. Usually, the number of additional parameters that they require in comparison with the other schemes is minimal. Therefore, they are a promising kind of scheme which certainly requires further research.

2.2 Multi-objective methods for constrained optimization

One of the most promising ways of dealing with constrained optimization problems is to apply a multi-objective scheme. One of its main purpose is to avoid the requirement of setting several additional parameters, as happens with penalty functions. Several schemes based on applying a moea or some multi-objective concepts have been published. A taxonomy for these schemes was proposed in Mezura-Montes and Coello (2008). The following kinds of techniques are identified:

  • Schemes that transform the original constrained single-objective problem into an unconstrained bi-objective problem by considering a measure of the constraint violations as the second objective.

  • Schemes that transform the problem into an unconstrained multi-objective problem having the original objective function and its constraints as separate objectives. In this case, the constrained single-objective problem is converted into a multi-objective problem with N objectives, where the number of constraints is \(N - 1\).

In Segura et al. (2013a), the original taxonomy—of bi-objective and N-objective approaches—was extended to incorporate an additional dimension. Specifically, it distinguishes between those methods that once that a feasible solution is found, tend to lose diversity because all the members of the population might be quickly attracted to the feasible region, and those schemes that incorporate mechanisms to at least partially avoid this. The first methods are termed “feasible-compliant” methods and the second are termed “non-feasible-compliant”. Since in some cases this property might depend on the parameterization of the methods, this extension is not used in this paper. However, since it is quite important to analyze this feature because feasible-compliant methods might have convergence drawbacks in problems that have disconnected feasible regions (Venkatraman and Yen 2005), in some cases some comments about these properties are included.

2.2.1 Bi-objective methods

Most of the bi-objective methods incorporate mechanisms to allow the exploration of several disconnected feasible regions. However, two of the methods where this situation might me problematic are Wang et al. (2005a, 2007b). In the proposal presented in Wang et al. (2005a), the second objective is defined as the maximum constraint violation. The survivor selection operator sorts individuals considering the second objective. Ties are broken by taking into account the first objective value. Then, the best individuals are selected. In addition, a novel crossover operator is proposed. Another feature is the application of different mutation operators for feasible and infeasible individuals. A more complex approach was proposed in Wang et al. (2007b). In this case, the second objective is defined as the sum of the constraint violations. In the normal operation, a parent individual can only be replaced by an individual which dominates it. Alternatively, if there are no feasible individuals in the offspring population, the selection considers solely the degree of constraint violation. In addition, the scheme introduces a mechanism in some generations to ensure that any feasible individual is selected prior to any infeasible individual.

Probably, one of the most popular bi-objective methods (Surry et al. 1995; Surry and Radcliffe 1997) is the Constrained Optimization by Multi-Objective Genetic Algorithms (comoga). In this method, the second objective is defined as the non-domination rank of each individual considering the constraint violations as objectives. Then, solutions are selected with a binary tournament involving the original objective or the newly defined objective. This decision is based on a parameter called \({{P}_{cost}}\), whose value is dynamically modified.

In the line search algorithm proposed in Camponogara and Talukdar (1997), the second objective is the sum of the constraint violations. First, the Pareto fronts are calculated. Then, two individuals \(x_i\) and \(x_j\), where the individual \(x_i\) dominates \(x_j\), are randomly selected. Considering these two points, the following search direction is generated:

$$\begin{aligned} d = {{(x_i - x_j)} \over {\left| x_i - x_j\right| }} \end{aligned}$$
(2)

Then, a line search through the line defined by point \(x_i\) and direction d is conducted. The aim is to find a solution that dominates both \(x_i\) and \(x_j\). In addition, a mechanism for preserving diversity, based on randomly changing one half of the population, is used.

The technique proposed in Zhou et al. (2003) also considers the sum of the constraint violations as the second objective to optimize. New individuals are generated following the minimal generational map model. First, C individuals are generated. Then, two individuals are selected to be part of the offspring. The first one is selected considering the second objective. The second one is selected considering the Pareto strength. These steps are repeated until N offspring are selected. Finally, these offspring substitute the current population.

In Mezura Montes and Coello (2005) the self-adaptive mutation mechanism of a multimembered evolution strategy is applied to explore constrained search spaces. This is combined with a comparison mechanism which uses three feasibility-based rules to guide the search towards the global optima of constrained optimization problems. To avoid a high selection pressure and maintain infeasible solutions in the population, a simple diversity mechanism is added.

The proposal in Cai and Wang (2006) aims to focus the search on the boundary of the feasible region. As in some of the previous schemes, the second objective is defined as the sum of the constraint violations. The non-dominated individuals of the offspring replace dominated individuals of the current population. In addition, an archive stores infeasible solutions with a low sum of constraint violations. Such infeasible solutions are used to replace some randomly selected solutions of the current population. Such a step promotes the search in the boundary of the feasible region. In Wang and Cai (2012a), an improved version of Cai and Wang (2006) is proposed. The method—called cmode—has two main differences when compared to its previous version: instead of using the simplex crossover, cmode exploits differential evolution as the search engine and a novel solution replacement mechanism is included. In the Hybrid Constrained Optimization Evolutionary Algorithm (hcoea) (Wang et al. 2007a), the second objective is also defined as the sum of the constraint violations. It combines a global search with a local search scheme. The aim of the local search is to accelerate the convergence. Finally, in Wang et al. (2008) the optimization is divided into three stages, with the Pareto dominance only being used in the first optimization stage. Some variants of these ideas have been applied to differential evolution (Wang and Cai 2012b) and to other types of population-based metaheuristics (Venter and Haftka 2010). Wang and Cai (2012b) proposes a dynamic hybrid framework—called DyHF—hich consists of two major steps: global and local search models. In both search models, differential evolution serves as the search engine, and Pareto dominance is employed to compare the individuals in the population. In some cases, some randomly selected individuals of the population are replaced by promising infeasible individuals. The two steps above are executed dynamically according to the feasibility proportion of the current population in an effort to reasonably distribute the computational resources among the global and local searches during the evolution.

A method that also divides the process into phases is presented in Venkatraman and Yen (2005). In the first phase, the sum of the normalized constraint violations is used as the fitness value. A single-objective optimization approach is used in such a phase. The second phase starts when a feasible solution is found. Then, a version of the Non-Dominated Sorting Genetic Algorithm (nsga-ii) (Deb et al. 2002) is used to assign the fitness of the individuals. This version considers the sum of the normalized constraint violations as the second objective to be optimized. A small change is carried out in nsga-ii. Specifically, the algorithm assigns any feasible solution to the first rank regardless of its first objective value. However, some infeasible individuals might also be assigned to the first rank. In the second phase, the best feasible individual and all of the offspring survive to the next generation.

An alternative method that also tries to focus the search on the boundary of the feasible region is proposed in Deb et al. (2007). First, the problem is transformed into a bi-objective one by considering the sum of the constraint violations as the second objective. Then, a version of nsga-ii that includes the definition of a reference point is used (Deb and Sundar 2006). This version tries to find a set of solutions close to the supplied reference point. The reference point is dynamically changed. Specifically, at each generation the best feasible solution found is considered as the reference point. In order to ensure that diversity is maintained, the \(\epsilon \)-dominance concept is used (Laumanns et al. 2002). Moreover, the method is integrated with the classical sequential quadratic programming (sqp) method.

The Infeasibility Driven Evolutionary Algorithm (idea) proposed in Ray et al. (2009) requires a user-defined parameter which specifies the desired proportion of infeasible solutions in the population. The ranking procedure is executed independently for feasible and infeasible solutions. The replacement scheme considers the calculated ranks and the desire of retaining the given proportion of infeasible solutions. The scheme has been extended to incorporate the use of local-search (Singh et al. 2010), and it has also been applied to a practical optimization problem (Singh et al. 2013).

Masuda and Kurihara (2012) includes several changes over a standard Multi-Objective Particle Swarm Optimizer. First, the number of Pareto optimal candidate solutions propagating into the next generation is limited so that solutions with larger constraint violations are removed. Second, the best global solution—which is likely to be the closest to the global optimum for the original objective—is introduced in order to improve the Pareto optimal candidate set. Finally, in the interest of preserving and recovering the diversity of the search, when the number of optimal solution found is below a certain predefined value, particles are reinitialized and their velocity set to 0, so that the known optima can be approached from different directions with the aim of finding new Pareto optimal candidates.

In Li and Zhang (2014), a concept of b-dominance (biased-dominance) is applied to compare or select individuals during the search process. This dominance concept is combined with differential evolution to produce the Biased Multiobjective Optimization (bmo) Algorithm. In order to apply the bmo algorithm, the constrained optimization problem is reformulated as a bi-objective optimization problem, with one objective for the original objective function and the other for the constraint violation, which is regarded as the biased objective. Considering b as the biased threshold value, when the biased objective function values of the two individuals are both below the b value, b-dominance is equivalent to Pareto dominance. In the case where either of the two biased objective function values surpasses the b value, only the biased objective function is taken into consideration.

Biasing mechanisms are also evaluated in Garza-Fabre et al. (2015a). Starting from a previous work (Garza-Fabre et al. 2013), the hydrophobic-polar model for protein structure prediction is reformulated as an unconstrained multi-objective problem by treating constraints as an additional objective function. Rather than discriminating feasible from infeasible solutions, the multi-objective strategy defines trade-offs between quality (original objective) and feasibility. This gives infeasible solutions the opportunity to be considered and exploited during optimization. It was found that a significant portion of the infeasibility translates into landscape neutrality. However, an excessive increase in the neutrality may also prevent a search algorithm from moving in the correct direction. Without a proper search bias, therefore, the computational resources can be exhausted by exploring uninteresting areas of the solution space. Therefore, the second part of this work studied the effectiveness of different mechanisms for biasing the search towards the feasible region, which can be coupled to the multi-objective constraint-handling strategy. Three different biasing mechanisms were evaluated: the use of an archiving strategy, the incorporation of a secondary discrimination criterion (use of feasibility rules), and the application of a proportional bias dependent on the degree of constraint violation.

As concerns biasing mechanisms, Runarsson and Yao (2005) presents a detailed study that shows the importance of search bias in constrained optimization. The work—probably motivated by the authors’ previous findings (Runarsson and Yao 2000)—analyzes how different constraint handling methods and search distributions create different search biases for constrained evolutionary optimization. As a result, infeasible individuals may enter a feasible region from very different points depending on this bias. In Dong and Wang (2014) an unbiased model is proposed and the relationship between the existing biased bi-objective model and the proposed unbiased one is analyzed in detail. In the unbiased model, both objective functions are equally treated and Pareto ranking is employed as the unique selection criterion.

Finally, in Gao et al. (2015) a dual-population differential evolution—named dpde—is proposed. At each generation during the evolution process, the whole population is divided into two subpopulations based on their feasibility so that both objectives (actual function to be optimized and degree of constraint violations) are treated separately and each subpopulation focuses only on optimizing the corresponding objective. One subpopulation consists of the infeasible solutions to minimize the degree of constraint violations, while the other subpopulation consists of the feasible solutions to optimize the objective function value. When applying volutionary operators like selection, the fitness value of a solution in each subpopulation is assigned by the corresponding objective function. dpde makes use of an information-sharing strategy to exchange search information between the different subpopulations, similar to team cooperation. This way, both subpopulations cooperate to approximate the feasible optimal solution.

2.2.2 N-objective methods

The number of N-Objective methods is also quite large. Even in those cases where the original problem is transformed into an N-objective problem, some methods where disconnected feasible regions might be problematic have been devised. One of the most popular schemes (Parmee and Purchase 1994) is based on the Vector Evaluated Genetic Algorithm (vega) (Schaffer 1985). This method is a combination of a multi-objective approach with a greedy decoder. First, vega is used to guide the search to the feasible region. The set of objectives considered in vega is the set of constraints. Once a feasible solution is generated, the use of vega is discarded. Instead, the authors use a tailor-made operator that preserves the feasibility of solutions.

A version based on the Niched–Pareto Genetic Algorithm (npga) (Horn et al. 1994) is proposed in Coello et al. (2002a), Coello and Mezura-Montes (2002b). In npga, parents are selected through a tournament based on Pareto dominance. In order to save computing resources, only a sample of the population is used to estimate the Pareto dominance. Two main changes are performed with respect to the original version of npga. First, the use of niches is avoided. Instead, a simple method based on performing random selections with a low probability is used. Second, dominance checking is only considered when comparing infeasible individuals. When comparing a feasible with an infeasible individual, the feasible is preferred, while when comparing two feasible individuals, the objective function is considered. If the comparison with the sample of individuals does not reveal any information, direct comparisons between pair of individuals considering the feasibility rules are used. It is important to note that in this scheme the use of random selection to promote diversity might provide the survival of some infeasible individuals. In this sense, it might be considered as a non-feasible-compliant scheme. However, by performing a random selection with a low probability it is unlikely to avoid the drawbacks of feasible-compliant schemes.

A method based on the use of goals and priorities is proposed in Jiménez et al. (2002). In this approach, the objectives generated from the constraints are assigned a higher priority than the original objective. Thus, feasible individuals are better than infeasible individuals, and the comparisons between infeasible individuals completely disregard the original objective function value. The algorithm uses a pre-selection scheme to favor the generation of individuals close to their parents and to promote implicit niching.

The method proposed in Oyama et al. (2005) uses the same concept of domination as the one applied in Coello et al. (2002a). However, a complete ranking is established considering the Pareto-based ranking scheme proposed in Fonseca and Fleming (1993). In addition, a standard fitness sharing scheme is applied to the infeasible individuals based on their constraint violations.

Finally, differential evolution (de) has also been used considering every constraint as an objective (Kukkonen and Lampinen 2006). The de/rand/1/bin scheme (Price et al. 2005) is applied and the selection rule of the survivor selection scheme prefers feasible solutions to infeasible solutions. If both solutions are feasible, or both solutions are infeasible, then the selection scheme considers the concept of weak dominance in the space of the objectives or in the space of the constraints, respectively. Specifically, if the original solution is weakly dominated by the new generated solution, then the original solution is replaced. An extension of such scheme was proposed in Gong and Cai (2008), which includes an external archive with the best found solutions. Such an archive is maintained considering the concept of \(\epsilon \)-dominance (Laumanns et al. 2002). The definition of dominance considers only the space of the constraints. In case of a tie, the extended space is used taking into account the constraints and the objective function. The variation scheme is guided by the individuals of the archive.

Most of the N-objective methods described above might have important drawbacks when facing disconnected feasible regions. However, many other schemes introduce some action to better deal with this situation. Several methods in this group are based on the use of vega (Schaffer 1985). The application of vega considering \(J + K + 1\) subpopulations is proposed in Coello (2000b). The first \(J + K\) subpopulations consider as fitness values the violation of each constraint. The last subpopulation considers the original objective as the fitness value. The idea behind the approach is that by combining individuals of the different populations, a feasible solution with a high value of the original objective might be generated. The main drawback is that the number of sub-populations increases linearly with the number of constraints. Moreover, some constraints might be easier than others, but this is not considered in the approach. An extension of the scheme is proposed in Liang and Suganthan (2006). In this new proposal, the objectives are dynamically assigned to the subpopulations by considering the difficulty of each constraint.

The scheme proposed in Ray et al. (2000) calculates the non-domination ranks considering three different spaces: objective space, constraint space, and a combination of the two. The selection probability of an individual is based on the three calculated ranks. In addition, the scheme incorporates mating restrictions and a niche mechanism based on Euclidean distances. This work was extended to improve the diversity maintenance (Ray and Liew 2003). The new scheme is based on simulating the behavior in societies and civilizations. The individuals of a given society are identified by applying clustering algorithms. As in the authors’ previous work, the selection of the best individuals is based on using non-domination ranks. In this case, two different spaces are considered: objective space and constraint space.

The Inverted Shrinkable Pareto Archived Evolution Strategy (is-paes) is proposed in Hernández-Aguirre et al. (2004). It is an extension of the paes method. The main concept introduced is the use of a shrinking mechanism to reduce the search space. At the beginning, the entire search space is considered. Then, as the evolution progresses, the search space is shrunk to focus on the feasible region. The reduction of the search space is performed by considering the solutions in the archive with the lowest amount of constraint violation.

A method that promotes the oscillation between the search in feasible and infeasible regions is proposed in Angantyr et al. (2003). It does so by calculating the fitness value considering two different ranks. The first rank is calculated considering only the objectives. The second rank is calculated considering only the constraints. These ranks are added considering adaptive weights. The weights depend on the proportion of feasible individuals in the population. The weights assign a greater importance to the rank based on constraints when the proportion of feasible individuals is low.

Finally, an alternative method for promoting the search in the boundary regions is proposed in Churchill et al. (2013). Searching in the infeasible regions with the direct use of nsga-ii calls for long search times. As a result, two new proposals are considered. One involves the use of reference points, and the other applies a guided elitism scheme where some selections are carried out by considering the original objective with penalties. Both approaches yield better results than the original version of nsga-ii. However, the one with reference points is very sensitive to the parameters being considered.

2.2.3 Other methods

Some schemes cannot be classified as bi-objective or as N-objective, in the sense in which such features are defined in this paper. As a result, these kinds of methods have been included in this section.

In the scheme proposed in Schoenauer and Xanthakis (1993), the constraints are handled in a particular order. First, the technique focuses on optimizing one constraint. Then, when a percentage of the population is feasible for this constraint, the next constraint is considered. The idea is to satisfy, sequentially, the constraints imposed on the problem while still satisfying those previously considered. Although a multi-objective scheme is not applied, several objectives are simultaneously considered in this scheme. In the last stages of the optimization, the scheme behaves as a death penalty scheme where infeasible individuals are erased from the population.

In the method proposed in Coello (2000a) every individual is compared (in a pairwise manner) against every other individual in the population in order to determine the number of elements that are better than a given individual. In order to carry out the comparisons, any feasible individual is considered better than any infeasible individual. In the case of comparisons among infeasible individuals, they are first compared considering the number of violated constraints, and, in case of a tie, considering the sum of constraints violations. Finally, for feasible solutions, the fitness is obtained as the normalized original objective value plus one, while the fitness for an infeasible solution I is \({{1}\over {{\textit{countBetter}}(I) + 1}}\), where \({\textit{countBetter}}(I)\) is the number of individuals that are better than I. This ensures that the rank of feasible individuals is always higher than the rank of infeasible ones.

A non-feasible-compliant method based on relaxing one of the constraints was proposed in Watanabe and Sakakibara (2005). The scheme transforms the original problem into a bi-objective problem. However, the second objective is not a measure of the violation of the constraints. Instead, it is equal to the original objective but considering relaxed constraints. Moreover, a penalty function is applied to the first objective. Then, nsga-ii is applied, the aim being to concentrate the search on the boundary of the feasible region.

Also in Murugan et al. (2009) a version of nsga-ii with modifications is applied to solve the Transmission Constrained Generation Expansion Planning Problem. The first objective is to minimize the cost and the second objective is to minimize the sum of the normalized soft constraint violations. The hard constraints (must-satisfy constraints) are treated as constraints only.

Some proposals have been devised that combine bi-objective evolutionary approaches with the classical penalty function methodology in a way that they complement each other (Deb and Datta 2010, 2013). In these cases, the evolutionary approach provides an appropriate estimate of the penalty parameter, while the solution of an unconstrained penalized function using a classical method induces a convergence property in the overall hybrid algorithm. Uniform adaptive scaling of equality and inequality constraints has been also hybridized with evolutionary approaches (Datta and Deb 2015).

A method based on a multi-objective de is proposed in Reynoso-Meza et al. (2010). Three objectives are considered: the original one, the sum of constraint violations for inequality constraints, and the sum of constraint violations for equality constraints. The maintenance of diversity is encouraged with the use of a spherical pruning scheme. Another method which also considers three objectives is proposed in Chowdhury and Dulikravich (2010). In this case, a predatory-prey ea is used. The first and second objectives are equal to the original objective. The third objective is the sum of the constraint violations. This creates a two-thirds bias towards the original objective. The proposed scheme does not scale to problems with several constraints, where most of the time is spent in the infeasible region. In Jia et al. (2011), a de scheme that considers two objectives is defined. The second objective represents the amount of constraint violations. However, it is defined in a dynamic way because the constraints boundaries are gradually tightened to the original boundaries.

A two-phase approach is proposed in Echeverri et al. (2009). The objective function is completely disregarded in the first phase to push the search effort towards searching for a single feasible solution. The problem is converted into a bi-objective optimization in the second phase, where each objective is a weight function of the objectives and constraint.

Finally, note that some theoretical studies on the application of these kinds of transformations have also been developed (Kumar and Banerjee 2006).

2.3 Discussion

As we have shown, the number of proposals that consider multi-objective concepts is vast. In fact, several proposals that are minor variants of the schemes described above have not been included in this survey due to space constraints. The reason for the existence of such a large number of proposals is that none of them has been found to be significantly superior to the others. The No-Free-Lunch theorem by Wolpert and Macready (1997) might be considered as a reason for this. However, some studies have concluded that the use of multi-objective concepts is not adequate for some single-objective problems (Mezura-Montes and Coello 2011). For instance, the only method inspired by multi-objective concepts presented at the 2010 cec competition on constrained optimization (Reynoso-Meza et al. 2010), obtained much worse results than those yielded by other schemes. Thus, careful consideration must be given to the kind of method chosen. In any event, only one method inspired by multi-objective concepts was applied, so it would be of great interest to test related schemes with such benchmark problems. In contrast, several multi-objective schemes have provided high-quality solutions to difficult benchmark and real world constrained problems, showing their usefulness in other cases (Coello 2000a; Wang et al. 2008)

The direct application of moeas to a constrained problem might lead to a compromise between objectives and constraints in some cases. It is also worth noticing that the whole set of solutions is usually not of interest to the user (i.e., the decision maker). In fact, in such cases, the method might be trying to solve both the constrained and unconstrained problems at the same time. If no action is taken, too much time might be spent searching in the infeasible region. In Runarsson and Sarker (1999) an analysis is carried out using a very simple problem. The analysis shows that using Pareto Ranking might lead to a bias-free search where most of the time is spent searching in the infeasible region. The likelihood of wasting evaluations depends of the fitness landscape. This is why some multi-objective schemes that produce a certain amount of bias in the search have been devised. A promising approach is the method proposed in Deb et al. (2007), where a dynamic reference point is used to guide the search. The advantages of introducing a bias in the search are clear. In fact, such a method has obtained better results than any of the schemes presented at the 2006 cec competition on constrained optimization. To the best of our knowledge, the results with such an algorithm for the 2010 cec benchmark tests have not been published, so its performance with new problems is unknown.

The use of several optimization phases where different rankings are considered has also yielded several benefits (Wang et al. 2008). Thus, it seems that the direct use of Pareto dominance concepts might provide benefits in some stages of the optimization, while it might increase the convergence time if it is applied over the entire optimization process. In other cases (Parmee and Purchase 1994), the phases distinguish between the search of a feasible solution and the optimization of such a solution. These types of schemes might encounter difficulties with problems involving unconnected feasible regions (Venkatraman and Yen 2005) and should, therefore, be carefully applied.

Finally, it is also worth noting that many of the proposals described herein have only been tested on a few real world applications or on a reduced number of benchmark problems. Thus, it is very difficult to predict what will be their behavior when dealing with different problems. For a comparison of several multi-objective schemes, see (Mezura-Montes and Coello 2005). Note however, that this comparative study (as well as the others studies already cited in this paper) disregards several methods, which certainly complicates the task of deciding which multi-objective method to apply in which case.

3 Diversity-based schemes

3.1 Foundations

Maintaining a proper diversity is an important issue for the correct behavior of eas (Črepinšek et al. 2013). A loss of diversity might lead to stagnation in suboptimal regions, producing the effect known as “premature convergence”. Premature convergence is one of the most frequent drawbacks that must be faced when using evolutionary approaches. One of the main reasons behind premature convergence is the use of finite population sizes, leading to the phenomenon known as genetic drift (Eiben and Smith 2008).

Several theoretical and empirical studies have analyzed the impact of promoting diversity in evolutionary schemes (Friedrich et al. 2008). Diversity can help the optimization mainly in two ways. First, there is a relationship between diversity and the capabilities of exploration and exploitation in eas (Črepinšek et al. 2013). Among other benefits, a proper balance between exploration and exploitation might allow exploring several hills simultaneously in multimodal problems. In addition, maintaining proper diversity might allow combining different building blocks in crossover operations (Jansen and Wegener 2005). However, maintaining a larger diversity does not necessarily imply a proper balance between exploration and exploitation, so there is not always a positive correlation between diversity and fitness (Burke et al. 2004). This is why the term useful diversity was introduced in Mahfoud (1992) to refer to the diversity that helps to find high-quality individuals.

Considering the importance of maintaining proper diversity in several complex optimization problems, several diversity preservation schemes have been devised. The reader is referred to Črepinšek et al. (2013) for an extensive survey of diversity preservation mechanisms. Among them, some of the most well-known are the following:

  • Restart the approach when stagnation is detected (Eiben and Smith 2008).

  • Increase the population size with the aim of avoiding genetic drift (Eiben and Smith 2008).

  • Apply mating restrictions such as incest prevention (Simões and Costa 2011), i.e., avoid the mating of individuals that are very similar. This is also known as speciation.

  • Perform cataclysmic mutation (Eshelman 1990).

  • Perform selection applying fitness sharing (Nguyen et al. 2012). In this case, highly similar individuals are clustered and penalized by sharing the resulting fitness values among the members of the group that lie in the same niche (i.e., those that are very close to each other either in the decision or objective function space).

  • Apply crowding-based selection where each offspring replaces similar individuals in the parents population (Mahfoud 1992).

  • Use structured populations such as the island-based model (Alba 2005) or cellular approaches (Nebro et al. 2007).

  • Apply a multi-objective scheme that considers diversity as an objective (de Jong et al. 2001).

3.2 Multi-objective methods for promoting diversity

Many authors have remarked that diversity should be considered in some way as an objective of the optimization. One promising approach is offered by using multi-objective methods to ensure proper diversity for single-objective optimization. While only a limited number of such schemes exists, other, closely related mechanisms have also been proposed. In some methods (Matsui 1999), a measure of diversity is combined with the original objective to calculate the fitness value of each individual. However, since the two measures are not entirely compatible, such a combination is complex and problem-dependent. In order to alleviate this problem, other ways of combining them have been devised. One alternative is to use the ideas proposed in Vidal et al. (2013), where the individuals are sorted by the original cost and by their contribution to diversity. Then, the rankings of the individuals are combined to generate the fitness value. Another interesting approach is to alternate between optimizing the population for diversity and for objective values (Ulrich and Thiele 2011).

In the case of applying a multi-objective scheme, both objectives—the original and the diversity-based one—are not combined, but are used simultaneously. This methodology avoids some of the drawbacks of the aforementioned methods. Note that in these schemes, a measure of population diversity is not required. Instead, the objective must be a measure of the diversity introduced by the individual considered in the population. The same principles have been used to promote diversity in multi-objective optimization problems. Most of these schemes can also be applied to single-objective optimization problems. Thus, this section also considers the schemes that can be applied to single-objective schemes, even if they have only been applied to multi-objective optimization problems. In the rest of this section, the original objectives are referred to as fitness objectives, while the additional objective is referred to as the diversity objective.

Several diversity objectives have been devised. In this paper, we propose a taxonomy that classifies diversity objectives into the following groups:

  • Encoding-independent measures that do not depend on the chromosome or the problem.

  • Genotypic and phenotypic measures that consider the values of the genes.

  • Behavioral measures that consider the behavior of the individuals.

3.2.1 Encoding-independent measures

In this kind of scheme, since the encoding is not considered, the diversity objectives are not explicit measures of diversity. They do, however, promote the maintenance of proper diversity in the population. Three different encoding-independent diversity objectives were proposed in Abbass and Deb (2003). All of them must be minimized:

  • Random: a random value is assigned as the diversity objective. Smaller random values may be assigned to some low-quality individuals that thus have a chance to survive.

  • Inversion: in this case, the optimization direction of the objective function is inverted and used as the diversity objective. This approach highly decreases the selection pressure. In fact, under this scheme, every member is non-dominated, so it must be carefully applied.

  • Time stamp: the diversity objective is calculated as a time stamp for each individual. Each individual in the initial population is marked with a different time stamp represented by a counter which is increased every time a new individual is created. Starting with the second population, all newly generated individuals are assigned the same time stamp, which is set as the population size plus the generation index. This time stamp must be minimized.

The previous diversity objectives were used with a moea that considers a fixed population size. If the number of non-dominated solutions in a generation is greater than the previously specified maximum (defined by the user), then the average distance to the two closest individuals is calculated. Then, the individual with the minimal distance is discarded. This distance considers the contents of the chromosomes.

A scheme related to the one that uses time stamps was devised in Schmidt and Lipson (2011). In this scheme, the age of individuals is considered as the diversity objective. However, in this case the aim is to minimize the age, so this scheme induces the survival of young individuals instead of old individuals, as it was done with the use of time stamps. In addition, infusion techniques are also included.

3.3 Genotypic and phenotypic measures

The first scheme that considered diversity as an explicit objective and integrated it into a moea was probably proposed in de Jong et al. (2001). In this case, a genetic programming scheme was executed considering three objectives: maximize the accuracy of the tree, minimize its size, and maximize diversity. The following distance measure between trees was defined. First, the trees are overlaid. Then, the nodes that overlap and are distinct are counted. Finally, the number of distinct nodes is normalized by dividing by the size of the smallest tree. The diversity of each tree is calculated as the mean distance to the rest of the trees in the population. The survivor selection mechanism selects non-dominated individuals. In addition, duplicate individuals are erased. Thus, a population with a variable size is considered. Note that since this scheme considers both a metric of diversity and a function that depends on the individual itself (the tree size), this method should be regarded as a hybrid between diversity-based moeas and multiobjectivization.

Another scheme for multi-objective problems is proposed in Toffolo and Benini (2003). The new diversity objective assumes an encoding based on real values. Specifically, the diversity objective is calculated as the mean Euclidean distance in the genotype space to the remaining individuals in the population. This is usually known as adi (average distance to all individuals). In this case, the original objectives are not directly considered. Instead, the non-domination ranks considering the fitness objectives are calculated. The domination rank and the diversity value are then considered as the objectives.

Based on these objectives, a new non-domination rank is calculated and used to rank the individuals. Based on the ideas in Toffolo and Benini (2003), two new diversity objectives are defined in Bui et al. (2005). These are the dcn (Distance to Closest Neighbor) and the dbi (Distance to Best Individual). The fitness objective is used to identify the best individual in dbi. These schemes were applied to dynamic single-objective optimization problems. The moea used was the well-known nsga-ii. Minor variants of these schemes have been used to tackle different problems. For instance, in Segura et al. (2011a) the Antenna Positioning Problem was addressed, while in Tran et al. (2013) the recent Black Box Optimization Benchmarking (bbob) testbed was considered. In the latter, the authors note that the application of dcn is beneficial when carefully combined with an appropriate multi-objective algorithm. However, the results are not competitive when compared with other state-of-the-art approaches, meaning that in order to show the potential of dcn for continuous optimization, it should be integrated with more complex schemes.

Extensions of the previous schemes were also proposed in Segura et al. (2011b, 2013c). dcn was modified with the aim of penalizing the individuals having a very low quality (Segura et al. 2013c). The newly defined objective was referred to as dcn_thr. In order to perform the penalization, the user must establish a threshold ratio. A threshold value (v) is generated considering the threshold ratio and the best fitness objective achieved. The diversity objective of individuals whose fitness value is higher—for a minimization problem—than v is set to 0. For the remaining individuals, dcn is used. As a result, individuals that cannot achieve the fixed threshold are penalized. The same ideas can also be applied with the dbi and adi diversity objectives (Segura et al. 2011b). A quite similar scheme was also proposed in Nielsen et al. (2015). In this case, individuals are sorted and the worst quantile of the population is penalized. In Segura et al. (2013c), the use of diversity objectives and hyperheuristics were combined. The user can specify a set of different diversity objectives and their corresponding parameters. Then, a hyperheuristic is used to automatically select the objective to use at each stage of the optimization process. A different extension of the dcn is based on calculating the distance to the nearest better individual (Wessing et al. 2013). Different ways of integrating the original objective with this novel objective were tested in multi-modal optimization and it was shown that multi-objective schemes induced a proper trade-off between exploration and exploitation. Different variants to multimodal optimization include Basak et al. (2013); Bandaru and Deb 2013). In Bandaru and Deb (2013) explicit niching is also integrated in order to ensure greater diversity. In addition, some adaptive constraints are included to eliminate local optima from the population, the goal being to detect different global optima.

In Segura et al. (2013b), nsga-ii is used with a new survivor selection scheme that considers the diversity objective dcn_thr. The diversity objective is calculated considering as reference the individuals that are selected to survive, instead of the entire population. After each selection, the diversity objective is recalculated. The parent selection scheme is kept intact. The previous survivor selection scheme was further extended in Segura et al. (2015) and used with the dcn objective to address large instances of the Traveling Salesman Problem (tsp). Specifically, it was combined with the idea of adapting the balance induced between exploration and exploitation to the various optimization stages. This was done by using the stopping criterion, as well as the elapsed time, as inputs to the replacement strategy. These inputs were used to define a dynamic penalization in which, differently from the dcn_thr scheme, the individuals that are penalized are those that contribute less in terms of diversity. This scheme was the first evolutionary scheme to solve a tsp instance with more than 30,000 cities to optimality, meaning that, at least in the field of combinatorial optimization, it should be regarded as a state-of-the-art scheme.

Finally, a diversity objective specifically tailored for a multi-objective version of the Vehicle Routing Problem is proposed in Garcia-Najera (2009). The distance between two individuals is calculated as the number of shared edges. Then, the mean distance to the remaining individuals in the population is used as the diversity objective. A traditional moea is not used. The mating scheme selects one parent considering the fitness objective and the other considering the diversity objective. The survivor scheme only considers the original objectives. Specifically, the fitness is established using Goldberg’s Pareto ranking (Goldberg 1989) and the fittest individuals from the previous population and offspring survive.

3.3.1 Behavioral measures

The field of Evolutionary Robotics (er) also makes use of this type of scheme. In this field, eas are usually applied with the aim of evolving neural networks that act as robot controllers. Calculating proper distances between neural networks in the genotypic or phenotypic space is a difficult task, which is why Mouret and Doncieux (2009a) propose the use of behavioral diversity. In these schemes, the distances among the behaviors of neural networks are considered. Specifically, for the mobile robot problem in question, the robots try to solve the given problem—usually by simulation—considering the evolved neural networks. Then, distances among individuals are calculated considering the status of the environment at the end of the simulation. As an example, if the problem to solve involves moving a set of objects in an arena, the differences between the vectors that indicate the position of each object at the end of the simulation might be used to calculate the distances between individuals. In Mouret and Doncieux (2009a) the nsga-ii with well-known mutation operators of this field is used.

The above research is expanded in Mouret and Doncieux (2009b) to include the concept of behavioral novelty (Lehman and Stanley 2008). In this case, all the evaluated individuals are stored in an archive. Then, distances are calculated considering the members of the archive instead of the members of the population. The novelty distance is also calculated considering both the archive and the population (Mouret 2011). Additionally, a scheme considering three objectives is also proposed: the fitness objective, behavioral diversity and behavioral novelty. It has been shown that selecting the proper objectives is highly dependent on the degree of deceptiveness of the problems at hand (Lehman et al. 2013).

A different research line in er is proposed in Doncieux and Mouret (2010), where several distances that can be applied to any er problem are defined based on calculating distances among the values coming from the sensors and the actions being sent to the effectors. Four different distances are defined. They are based on Hamming distances, Fourier coefficients, trajectory similarities, and on counting the number of times that the robot is in a particular state.

Finally, the original objective can be considered to be a more direct measure of the individual’s behavior. In order to promote greater levels of diversity, maintaining different fitness levels seems promising. This is the aim in Luerssen (2005), where different diversity metrics that take into account the objective values attained by the different individuals are defined.

3.4 Discussion

The maintenance of diversity using moeas has been successfully applied in different fields. For instance, it has been shown to be a proper scheme for reducing the bloat in genetic programming. It has also been used to overcome the bootstrap problem in er. As has been shown, various different schemes have been proposed. In general, they clearly outperform the corresponding single-objective schemes that do not consider any diversity preservation mechanism. However, some schemes offer more promising solutions than others.

The use of the encoding-independent measures proposed in Abbass and Deb (2003) is clearly outperformed by the use of genotypic and phenotypic measures. Bui et al. (2005) carried out a study considering the encoding-independent measures and the dcn, adi and dbi objectives. Their computational results clearly show the superiority of the Euclidean-based distances. The study was done with benchmark optimization problems. The same conclusions were drawn in Segura et al. (2011b); Segredo et al. 2011a). In these cases, the authors considered the Two-Dimensional Packing Problem and the Frequency Assignment Problem. It is important to note that in these last two studies, comparisons with single-objective schemes not considering diversity preservation were also carried out. The experimental study showed that, depending on the instance, the use of the diversity preservation scheme might be beneficial or counterproductive. Finally, the use of hyperheuristics to automatically select the diversity objectives and their parameters has proven effective with benchmark problems (Segura et al. 2013c) and practical applications (Segura et al. 2012). The novel survivor selection scheme proposed in Segura et al. (2013b) shows a clear superiority in terms of premature convergence avoidance. Thus, higher-quality results were achieved in the worst-case. However, the better ability to deal with premature convergence produces a reduction in the convergence speed in the average case for several of the benchmark problems analyzed. Finally, it is important to note that the extension presented in Segura et al. (2015) improved on the results obtained by any other ea to date in the tsp, showing the important benefits contributed by this kind of schemes.

It is also important to note that studies considering several diversity preservation schemes are scarce. In Bui et al. (2005), multi-objective schemes are compared against MutateHigh, a method that preserves diversity by performing highly-disruptive mutations. In Snijders et al. (2006) the adi scheme is compared against a fitness sharing scheme. In Segura et al. (2015), the ability of this kind of schemes is tested against the diversity preservation scheme presented in Lozano et al. (2008). Finally, in Mouret and Doncieux (2012) behavioral diversity and behavioral novelty are compared against fitness sharing. In every case, the multi-objective schemes exhibit better performance. However, further experimental studies of this sort are still needed.

Considering the field of er, the advantages provided by multi-objective schemes are noteworthy. Several studies in this field have compared behavioral diversity with behavioral novelty (Mouret 2011). The use of behavioral novelty usually produces a reduction in the number of generations required to converge to high-quality solutions. However, the computational burden involved is much higher than that associated with behavioral diversity. Thus, the most suitable scheme might well vary depending on the computational cost of the evaluation functions. The metrics that can be used with any problem of the field of er has shown to be very effective (Doncieux and Mouret 2010; Mouret and Doncieux 2012). They have been tested with several different problems, and have provided benefits in every case.

4 Multiobjectivization

4.1 Foundations

The simultaneous use of several objectives has a positive influence on the optimization process of certain single-objective optimization problems (Louis and Rawlins 1993). In this case, the additional objectives are not diversity measures that take into account the rest of the population, but rather objectives that depend solely on each individual’s chromosome. The exclusive dependency on the genotypic values is the main difference with respect to the diversity-based schemes previously presented. The transformation of single-objective problems into multi-objective problems using this methodology has been termed multiobjectivization (Knowles et al. 2001).

The principles behind multiobjectivization were first discussed in Louis and Rawlins (1993). In this paper, a deceptive function that is the sum of two components is multiobjectivized by considering each component as an objective. Pareto selection provides an implicit niching mechanism that facilitates the maintenance of proper diversity. It also favors the combination of good building blocks in the crossover operations, facilitating the achievement of higher-quality solutions. It is worth noting that not much attention was paid to this idea for almost a decade. The term multiobjectivization was first used in Knowles et al. (2001), where the authors distinguished between two types of multiobjectivization: decomposition and aggregation. The first one is based on decomposing the original or target objective into several components in such a way that the original optimum is a Pareto optimum in the new formulation. The second one is based on considering some additional objectives that are used in combination with the original objective. The proposals in Knowles et al. (2001) focus on decomposition-based multiobjectivization. The positive effect of multiobjectivization was shown for two different optimization problems: the tsp and a benchmark problem. Since then, several authors have conducted several theoretical and empirical studies on this topic. Such studies can be divided into two groups:

  • Studies of the principles of multiobjectivization.

  • Applications of multiobjectivization to specific optimization problems.

4.2 Studies of the principles of multiobjectivization

The studies of the principles of multiobjectivization can be divided into three main groups:

  • Analyses that explore the characteristics of the search process when multiobjectivization is used.

  • Guidelines for the proper use of multiobjectivization.

  • Studies of the computational complexity of multiobjectivized schemes.

Several theoretical studies have analyzed the way in which the search space is transformed with the use of multiobjectivization, as well as their implications in the optimization process. In the first papers published on this topic (Louis and Rawlins 1993; Knowles et al. 2001) it was shown that multiobjectivization by decomposition could remove some local optima from the original formulation. Moreover, it was also shown that some plateaus could be added. These plateau regions were useful for destroying deceptive regions, enabling the escape of low-quality regions. A more in-depth analysis of the effects of multiobjectivization by decomposition was carried out in Handl et al. (2008b), which showed that the use of Pareto selection in a decomposed problem has only one possible effect, which is to introduce plateaus of incomparable solutions. On the one hand, the increase in the number and size of plateaus might negatively influence the search. On the other hand, the introduction of plateaus might yield a reduction in the number of local optima, thus possibly mitigating the difficulty of the search. The authors show several decompositions that introduce positive and negative effects in the optimization process. Recently, the effects on the incomparability of the solutions were quantified by considering a decomposition of an energy function for the protein structure prediction problem (Garza-Fabre et al. 2015b). The authors show that multiobjectivization leads to the formation of paths connecting different plateaus of the single-objective fitness landscape, and measure several properties of the fitness landscape, such as the incomparability ratio or the sizes of the neutral networks. They also show that modifying these properties has important effects on the balance between exploration and exploitation, which might be one of the reasons for the good multiobjectivization performance in the cases studied. However, they also show that useful gradient information can be lost, so they hypothesize that by combining phases having multiobjectivization with stages based solely on the single-objective function, additional advantages might be obtained. It is also important to note that in multiobjectivization by decomposition, the best found solution might not survive with traditional moeas if the non-dominated front size of all candidate solutions becomes larger than the population size (Lochtefeld and Ciarallo 2014). As a result, it is quite important to maintain such an individual in an archive or introduce some other mechanism to address this drawback.

Similar analyses have been performed for multiobjectivization by aggregation. The added objectives were referred to as “helper-objectives” in Jensen (2003), and since then, this term has been widely used. Usually, these helper-objectives are somewhat aligned with the original objective, i.e. they should be considered as complementary, not completely conflicting, objectives (Yao et al. 2010). In fact, in some cases the helper-objectives are decompositions of the original objective (Lochtefeld and Ciarallo 2015), while in other cases they are just the original objective with some noise (Watanabe and Sakakibara 2005). Note that Lochtefeld et al. classify the schemes where the helper-objectives are decompositions of the original objective as multiobjectivization by decomposition. However, in our opinion, the main difference between aggregation and decomposition is that in aggregation the first objective is the original one, meaning that these schemes should be considered as multiobjectivization by aggregation. In the analysis presented in Jensen (2004), the main reasons that helper objectives can provide benefits in multiobjectivization were enumerated. These include: (i) avoiding local optimal, (ii) keeping diversity at a reasonable level, and (iii) making the algorithm to identify good building blocks that can later be assembled by crossover. The effects were analyzed considering two different problems using helper-objectives. However, most of these principles are also valid for multiobjectivization by decomposition (Lochtefeld and Ciarallo 2014). A detailed analysis of the effects of multiobjectivization with helper-objectives was conducted in Brockhoff et al. (2007, 2009), where it was shown that the use of Pareto selection might have two effects:

  • Comparable solutions can become incomparable, turning a region with a given search space direction into a plateau.

  • An indifferent relationship between solutions can become a comparable one, turning a plateau of indifferent solutions into a region where the Pareto dominance indicates a direction.

It was shown that both kinds of conversion can have a positive or negative influence on the search. In the first case, the removed direction can be deceptive or not. In the same way, in the second case, the generated direction can be deceptive or it can guide the search to the global optimum.

Several different ways of using the principles of multiobjectivization have been proposed. The work by Jensen (2004) shows that for a given optimization problem, helper-objectives can be generated in several ways. Thus, the use of dynamic helper-objectives, where the helper objective applied is changed during the optimization process, is proposed. Moreover, the use of several helper-objectives simultaneously is tested. The use of a dynamic helper-objective benefits the search because the changes in the structure of the search space can facilitate escaping of local optima. However, using too many helper-objectives removes the selection pressure from the algorithm. In this first approach, the helper-objectives are used considering a random order. Subsequently, the importance of the sequence in which the helper-objectives are applied was studied in Lochtefeld and Ciarallo (2011). It was shown that the order used has a significant impact on the results. In addition, they show that for the Job-Shop Scheduling Problem (jsp), a method for obtaining a proper order can be defined. The defined order was statistically superior to a random order for a large number of instances. A different alternative for sequencing the objectives was proposed in Buzdalova and Buzdalov (2012), where reinforcement learning is used to automatically select the most effective helper-objectives and ignore the ineffective ones. Finally, a substantial analysis considering benchmark problems has also been carried out (Lochtefeld and Ciarallo 2012). It shows that helper-objectives should be sequenced considering their contribution to the fitness and that helper-objectives should have different local optima than the target objective. Finally, for the cases in which several helper-objectives are used simultaneously, more benefits can be obtained if they have complementary properties.

Similar studies have also been carried out involving multiobjectivization by decomposition. In (Garza-Fabre et al. 2012a) several decompositions are proposed and changed randomly. Adaptive decompositions to better balance the trade-off between the objectives generated are proposed in Jähne et al. (2009) and further extended in Lochtefeld and Ciarallo (2014).

Multiobjectivization has also been applied for the optimization of scalarizing fitness functions. Scalarizing functions can be used to transform a multi-objective optimization problem into a single-objective optimization problem. Some multi-objective optimization schemes solve different scalarizing functions to yield an approximation of the Pareto Front (Ishibuchi and Murata 1998). Some authors have proposed solving the scalarizing functions that emerge in multi-objective optimization considering the principles of multiobjectivization. In some problems, the direct use of each objective in a moea is successful only for some weight values. The reason is that in many cases, moeas find solutions with a good convergence to the Pareto Front, but they focus on the “knee” of the Pareto Front. In addition, it has been shown that considering too many objectives is not promising. In particular, the application of moeas provided clear benefits for scalarizing functions of two components in Ishibuchi et al. (2006b). However, the search ability of moeas is deteriorated by the increase in the number of objectives. In order to avoid the previously mentioned drawbacks, both the parent selection and the survivor selection schemes were modified in Ishibuchi et al. (2006a). The scheme considers two probabilities to specify how often the scalarizing fitness function is used for parent selection and for replacement selection. In the remaining cases, multi-objective parent selection and replacement schemes are used. The main drawback of the scheme is the requirement of having to fine tune two additional parameters. A scheme that avoids the use of additional parameters is presented in Ishibuchi and Nojima (2007) where, instead of using the original objectives, certain linear combinations of them are used as objectives. The weights are fixed in such a way that the desired single-objective solution is found in the “knee” of the Pareto Front. The scheme is successfully applied with up to four objectives. As the authors of previous papers have pointed out, these schemes are not only valid for the scalarizing functions that emerge in multi-objective optimization, but also for other single-objective optimization problems.

The use of multiobjectivization for multi-objective problems is even more challenging. The reason is that, since a larger number of objectives are considered, the selection pressure of the new scheme might be too low. A successful approach to a multi-objective problem is presented in (Ishibuchi et al. 2010). Specifically, a problem with two objectives is transformed into a problem with four objectives. In order to avoid the excessive reduction of the selection pressure, the objectives used are linear combinations of the original objectives. Since the objectives are not independent but correlated, the typical problems that emerge in moeas when applied to problems with many objectives do not arise.

Another field where multiobjectivization has been successfully applied is multimodal optimization. Note that in this case the similarities are clearer because both multi-objective optimization and multimodal optimization involve the finding of multiple optimal solutions. As a result, several ways have been devised to transform a multimodal optimization into a multi-objective optimization. Two of the first applications of multiobjectivization in this field were proposed in Yao et al. (2010) and Deb and Saha (2012). In these cases, the aim was to detect both local and global optima, so the second objective involved minimizing the gradient of the original function. Note that with such a definition, the objectives are not totally in conflict (Wang et al. 2015), meaning some of the desired solutions might be dominated. As a result, non-traditional optimizers are applied. For instance, in Yao et al. (2010) multi-population, clustering and alternation between the defined objectives was used, while in Deb and Saha (2012), the notion of domination was modified to allow the application of nsga-ii. Moreover, since using the gradient has several practical difficulties, some ways of calculating the helper-objective based on a definition of neighborhood were also proposed in Deb and Saha (2012). Some instances where traditional moeas can be readily applied have also been defined. In Song et al. (2015), it was possible to locate multiple optimal solutions of equation systems by transforming this problem into a multi-objective one. Specifically, this task is transformed into a bi-objective optimization problem where both objectives combine information on the variables and objectives. Certain weaknesses of this transformation were enumerated and some of them partially resolved in the extension presented in Wang et al. (2015).

Finally, some studies that consider the computational complexity of eas with multiobjectivization have also been carried out. To our knowledge, the first complexity-based study was done for the single-source shortest-path problem (Scharnow et al. 2005). The analysis of the computational complexity showed that an ea with a specific single-objective fitness function has an exponential complexity. However, a polynomial complexity could be obtained by decomposing such an objective into several components (one for each distance considered). The authors conclude that for this case, the multi-objective optimization scheme better reflects the structure of the problem, so the fitness vector reveals enough information to direct the search to promising regions. Neumann and Wegener (2006) performed a similar analysis for the computation of the minimum spanning trees. The authors showed the superiority of decomposition-based multiobjectivized schemes for calculating the minimum spanning tree in dense graphs. In the case of sparsely connected graphs, the use of the single-objective variant is preferred. Thus, considering such theoretical studies, it has been shown that the suitability of using multi-objective schemes is highly dependent on the optimization problem, and even on the type of instance to be solved. Similar studies have also been carried out with benchmark optimization problems. These studies have been carried out for both multiobjectivization by decomposition (Handl et al. 2008b) and multiobjectivization by aggregation (Brockhoff et al. 2009). It was shown that the running time behavior could be improved or worsened by using multiobjectivization. Finally, in the case of two variants of covering problems, it has been shown that some multiobjectivization models can lead to better approximations than single-objective approaches within the same bounded time (Friedrich et al. 2010). However, the complexity analysis shows that a different multiobjectivization compares unfavorably with the single-objective models due to the improper behavior of the scheme with plateaus (Friedrich et al. 2007). Other problems where additional theoretical analyses have been carried out include the minimum cut (Neumann et al. 2011) and minimum multicut problem (Neumann and Reichel 2008).

4.3 Applications of multiobjectivization

Multiobjectivization has been used to tackle problems in several fields. This section is devoted to list several problems that have been addressed considering the principles of multiobjectivization. In order to avoid an exceedingly large section, some internal details of these schemes are omitted.

One of the first applications of multiobjectivization was to reduce bloat in genetic programming (de Jong et al. 2001; Ekárt and Németh 2001; Bleuler et al. 2001, 2008). In this problem, the minimization of the size of the trees is used as a helper-objective. Note that this is a special case of multiobjectivization because this alternative objective is not aligned with the original objective. Instead, the idea is that maintaining trees with different sizes promotes a larger diversity in the population and allows for small trees with proper functionality. This is not the only case where the helper-objective is not aligned with the original one. For instance, in Preuss et al. (2007) multiobjectivization was successfully applied to the phase-equilibrium detection of chemical plants by including an alternative objective that helps to avoid a fitted region that is of no interest to the search. Similarly, in Sharma et al. (2014) a single-objective ea is first applied to obtain a promising solution. Then, a moea is used by setting as the helper-objective the dissimilarity between a given individual and the solution found by the single-objective ea. In this way, the moea finds good solutions in different regions. Note that these variants might be considered as extensions of guided local search (Voudouris and Tsang 2003), where instead of including penalty terms in the fitness function, an additional objective is included to promote the exploration of different regions. However, in this case, in contrast to guided local search, the definition of the helper-objective is not adaptive.

Multiobjectivization has also been used to tackle the protein structure prediction (psp) problem in several studies (Garza-Fabre et al. 2012b). The psp usually involves the minimization of an energy function and, in most cases, multiobjectivized schemes are based on the decomposition of this energy function (Handl et al. 2008a). Different ways of decomposing the energy function have been explored (Garza-Fabre et al. 2015b). The proposal found in Day et al. (2002) was probably the first attempt to multiobjectivize the psp. In this first study, a multi-objective messy genetic algorithm was used. Several other schemes, like nsga-ii, paes or hybrid schemes including the use of local search and immune-inspired concepts, were subsequently devised (Cutello et al. 2005, 2006; Becerra et al. 2010; Olson and Shehu 2013). Also note that while in most cases two objectives are considered, attempts with up to nine objectives have been proposed (Handl et al. 2007).

Multiobjectivization has also been applied in the field of classification. Deb and Reddy used it to improve on the identification of small subsets of genes, sufficient to correctly classify different samples (Deb and Reddy 2003). The use of multiobjectivization allowed the identification of smaller sets. More recently, multiobjectivization has been used to create better classification rules (Jacques et al. 2013). In this last case, three objectives are considered: two of them are aligned with the metrics usually applied in single-objective case, while the other one is to minimize the number of terms in the rules, so it is quite similar to the cases previously described where the number of nodes in a tree is minimized. The tuning of the parameters of neural networks has also been multiobjectivized with the aim of minimizing the classification error (Pilát and Neruda 2013). Specifically, two helper-objectives that are somewhat correlated to the classification error were considered, with the resulting final errors being lower than in the cases where single-objective optimizers are applied.

We previously mentioned that multimodal optimization has been used not only for typical benchmark problems, but also to solve systems of equations (Song et al. 2015). Note that this last problem has also been considered in other cases, where the aim is to locate one solution of the system of equations, and not to identify multiple optima (Grosan and Abraham 2008).

Some traditional np-hard problems have also been examined. The Traveling Salesman Problem (tsp) has been widely analyzed both with multiobjectivization by decomposition and with helper-objectives (Knowles et al. 2001; Jensen 2004; Jähne et al. 2009; Lochtefeld and Ciarallo 2014). A closely related problem—the vehicle routing problem—was successfully multiobjectivized by considering several different helper-objectives in Watanabe and Sakakibara (2007). Multiobjectivizations for the traditional 0-1 knapsack problem were studied in He et al. (2014). The Job-Shop Scheduling Problem (jsp) has also been studied. In this case, different proposals based on helper-objectives have been devised (Jensen 2004; Lochtefeld and Ciarallo 2011). Finally, helper-objectives were also used to address problems in the communication field, such as the Antenna Positioning Problem (Segura et al. 2010) and the Frequency Assignment Problem (Segura et al. 2012).

Some typical graph problems have also been studied. For instance, the shortest path problem was analyzed in Scharnow et al. (2005), while the minimum spanning tree was analyzed in Neumann and Wegener (2006). In both cases, multiobjectivization by decomposition was applied.

Finally, other problems where multiobjectivization has been successfully applied include the following. In Greiner et al. (2007) the structure design problem is optimized by considering the number of different cross-section types as a helper-objective. The addition of a helper-objective not only yields better solutions, but also provides a more robust behavior considering the variation in the mutation rate. The short-term unit commitment problem is multiobjectivized by considering the reliability as a helper-objective in Trivedi et al. (2012). Specifically, the reliability is calculated as the expected unserved energy. Note that while it is not the most usual, in some studies this energy is considered to be a constraint, so this case might be considered to be in the border between multiobjectivization and multi-objective methods for single-objective constrained problems. In fact, in the experimental validation some comments about the definitions where the reliability is a constraint are also included. In Segredo et al. (2014), a packing problem was multiobjectivized. The triangulation problem in 3-D computer vision was multiobjectivized in Vite-Silva et al. (2007) by decomposing the error function into two different components. The development of proper strategies for playing the game of Nim was successfully addressed in Jong and Bucci (2008). Finally, the generation of tests for programming challenges was successfully multiobjectivized by taking into account three different helper-objectives and reinforcement learning (Buzdalov et al. 2013).

4.4 Discussion

Multiobjectivization has been successfully applied to several complex optimization problems. It has been shown for several cases that multiobjectivized schemes provide much better solutions than similar single-objective schemes. Studies based on both empirical and theoretical analyses have been carried out. Studies with benchmark problems have shown that multiobjectivization might be beneficial for several reasons: diversity maintenance, creation of proper building blocks, etc. However, analyses with practical applications have shown that the main benefits stem from the maintenance of proper diversity. In most cases, the schemes are not compared against other diversity preservation schemes. Only in references such as Handl et al. (2008a) and Jähne et al. (2009), different schemes for promoting diversity were adopted. In these cases, the advantages of multiobjectivization are less impressive than when compared to the simpler versions of single-objective eas. In any case, advantages in terms of solution quality and increased robustness are reported.

In addition, several guidelines for the proper design and use of decompositions and helper-objectives have been proposed. This kind of research has been very helpful for the successful use of multiobjectivization with different optimization problems. One of the principles that have yielded the most benefits is the use of dynamic multiobjectivization. In these cases, different helper-objectives are used in the different optimization stages. This helps to promote diversity and to avoid premature convergence.

In the above description, the details of the multi-objective approaches applied have been omitted. The reason is that, in general, the studies that have been reported have focused on the features of multiobjectivization and not on the characteristics of the multi-objective schemes. However, it is worth mentioning that some of the best-known moeas have been applied in such studies: Non-Dominated Sorting Genetic Algorithm 2 (nsga-ii), Strength Pareto Evolutionary Algorithm II (spea2) and Multi-objective Evolutionary Algorithm Based on Decomposition (moea/d), among others. In some works (Greiner et al. 2007), the use of additional diversity preservation techniques in the moeas has provided further improvements. In addition, some research has been based on hill-climbing schemes, which are mainly used to facilitate the analysis of the transformations produced by multiobjectivization (Louis and Rawlins 1993; Brockhoff et al. 2009). However, for the most complex optimization problems, typical moeas have been applied.

5 Future trends

The amount of research that has been conducted into the three types of schemes explored in this paper is very large. However, in each area there are several research issues that remain to be solved. This section identifies some possible fields of future work for each area.

The use of multi-objective methods for constrained single-objective optimization problems is the area that has been most widely explored among those analyzed in this paper. The number of different proposals is huge and several successful proposals have been developed, so one of the main difficulties is the selection of the technique to be applied. Since they arose with the aim of avoiding the tuning of parameters in penalty-based schemes, this is a large drawback to its use. Thus, in our opinion, there should be an effort to apply these techniques using a common framework with the aim of better analyzing their performance. For instance, the benchmark problems proposed in the Congress on Evolutionary Computation might be used. This would provide fair comparison among the different techniques, providing a better insight into the performance of each scheme. Considering some of the published results, a completely superior algorithm is unlikely to be found. However, it would be of great value to identify the types of problems that can be successfully solved with the various optimization schemes. Taking into account this information, a set of solvers might be picked and integrated with adaptive selection mechanisms as hyperheuristics. If successful, this might facilitate the solving of new problems where there is not much information on the fitness landscape.

As we have shown, for some constrained optimization problems, some single-objective schemes are superior to multi-objective schemes. It would be very interesting to identify those properties that hamper optimization for multi-objective methods. In addition, exploring the properties of the best single-objective schemes to understand the differences might give some insight into possible areas to explore. For instance, many successful single-objective schemes incorporate the use of a local search. This area of research has also been explored in some multi-objective schemes, but the number of proposals is very scarce. In addition, in keeping with the idea of applying hyperheuristics, these might be used to combine single-objective and multi-objective methods.

Finally, it is important to note that in recent years, several advances have been made in the field of many-optimization. Since in some optimization problems a large number of constraints arise, the application of some of the latest advances in this field is very promising. Considering the importance in constrained problems of producing some bias in the search so as to avoid the over-exploration of non-promising regions, the direct use of many-optimization is probably not helpful. However, some of the ideas explored in this area might be successfully adapted to constrained optimization.

The number of methods that consider diversity as an objective is more limited. In any case, several different schemes have been devised, and a large number of different optimization problems have been tackled. As was mentioned earlier, some of the schemes proposed have not been compared against some traditional diversity preservation techniques, such as fitness sharing or crowding. Thus, developing a comparison among the different proposals with some of the most recent published benchmark problems would be of great value. It is also important to note that some of the currently used schemes have limited their use to some specific areas. For instance, the multi-objective novelty-based approaches have only been used in the field of evolutionary robotics. Since they have obtained very promising results, it would be very interesting to test them, for example, in real-parameter optimization environments. In addition, some other diversity preservation techniques might inspire new innovations. For instance, the proposal presented in Landa Silva and Burke (2004) is highly related to the diversity preservation techniques explored herein. In this work, an additional objective is used to promote diversity in multi-objective problems. The objective of each individual is calculated as the contribution to a global diversity measure of the Pareto optimal set. Since this metric is calculated considering the Pareto optimal set and the space of the objectives, it cannot be applied to single-objective optimization. However, adapting it to single-objective optimization should not be too difficult. Since solutions of high-quality were obtained with this proposal, developing an adaptation seems very promising.

Finally, in the case of multiobjectivization, several topics require further research. The use of dynamic objectives is a very promising approach that has yielded high-quality solutions in several schemes. The importance of the order in which they are used has been shown, but so far, proper ordering mechanisms have only been provided for the Job-Shop Scheduling Problem. It would be interesting to conduct similar studies for other typical optimization problems. One of the inconveniences of the above method is the dependency between the ordering mechanism and the optimization problem. It would also be interesting to further analyze whether the ordering might be automatically selected using adaptive mechanisms. In addition, research with scalarizing functions has shown the importance of the location of the original optimum in the Pareto Front. Specifically, clear improvements have been obtained when locating the single-objective optimum in the knee of the Pareto Front. This fact has not been considered in most of the currently available decomposition-based approaches. Therefore, it would be interesting to analyze whether these ideas provide any additional benefits with other optimization problems. Some authors have remarked that multiobjectivization might be useful for test-based problems, where a single-objective function usually aggregates the scores obtained on the tests (Jong and Bucci 2008). Instead of aggregating the results, moeas could be applied. However, in those cases where the number of tests is large, significant drawbacks might appear. For this reason, the development of approximate schemes to properly reduce the number of dimensions seems highly promising. Finally, some authors have noted the relationship between multiobjectivization and the balance between exploration and exploitation. In our opinion, the effects on this balance should be further explored. In addition, multiobjectivization might be combined with those scheme that try to explicitly control the balance between exploration and exploitation.