
1 Introduction

The search for a well-spread non-dominated front in multi- and many-objective optimisation is still an ongoing challenge. This is especially true in large-scale optimisation, where problems contain a very large number of decision variables or many objective functions. Previous methods balance the trade-off between convergence and diversity in different ways. Research in recent years has led to a variety of many-objective optimisation methods (e.g. [2, 6, 10]), as well as a number of methods that can deal with hundreds or thousands of decision variables (e.g. [11, 17, 18, 20]). Concepts found in this area include dimensionality reduction (e.g. [14]), variable interaction analyses, and the division of design variables into convergence-related and diversity-related parameters (e.g. [11, 17]), some of which involve increased computational costs.

In this work, we propose a method to increase exploration in multi- and many-objective optimisation by searching in subspaces defined by the current population's design variables. The approach is based on the assumption that the evolutionary process finds promising areas of the search space (i.e. areas where good solutions are located) and adjusts the variables in the population accordingly to cover and explore these areas. Once promising solutions have been found, a recombination of these solutions through linear combinations is itself subject to optimisation in order to create new solutions which benefit from the information inherent in the population. Our newly proposed exploration method can easily be included in any metaheuristic optimisation algorithm and can also lead to a dimensionality reduction without the need to divide variables into subcomponents. In this work, the mathematical concept is introduced and analysed, and an experimental evaluation shows its benefits when embedded into multi- and many-objective algorithms. The equipped algorithms are tested on a total of 60 different benchmark function instances from the literature with 2 to 5 objective functions and 30 to 514 decision variables.

The remainder of this article is structured as follows. In Sect. 2 the basic principles of multi-objective optimisation are outlined briefly and a short overview of related work on multi-objective and large-scale approaches is given. In Sect. 3 the proposed linear-combination approach is introduced. The mathematical concept is explained first before describing the inclusion of the concept into existing algorithms. The experimental evaluation in Sect. 4 equips a number of well-known metaheuristics with the proposed exploration method and compares their performance on a variety of benchmark functions. Finally, a summary and an outlook on future work directions are given in Sect. 5.

2 Multi-objective Optimisation and Related Work

Problems in nature and science often contain multiple conflicting goals which need to be optimised simultaneously. These problems are called multi-objective problems (MOPs) and can be formulated as:

$$\begin{aligned} \begin{aligned} Z: \quad min \quad&\varvec{f}(\varvec{x}) = (f_1(\varvec{x}), f_2(\varvec{x}), ..., f_m(\varvec{x}))^T\\ \quad s.t. \quad&\varvec{x} \in \varOmega \subseteq \mathbb {R}^n \\ \end{aligned} \end{aligned}$$
(1)

where \(m \ge 2\). This kind of MOP maps the decision space \(\varOmega = \{ \varvec{ x} \in \mathbb {R}^n |\, \varvec{ g}(\varvec{x}) \le 0 \}\) of dimension n to the objective space of dimension m. In most problems, some of the constraints define a domain for each variable with lower and upper bounds, i.e. \(x_i \in [x_{i,min},x_{i,max}]\), \(i = 1, ..., n\). For such problems, a single optimal solution often cannot be determined, since there is usually a trade-off between the objective functions. Modern problem-solving methods instead concentrate on finding an approximation of a Pareto-optimal solution set [4, 6, 9].
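As a minimal illustration of a problem in the form of Eq. 1 (a toy sketch for exposition, not one of the benchmarks used in this paper), consider two conflicting objectives over the box \([0, 1]^n\):

```python
import numpy as np

def f(x):
    # Toy bi-objective problem: the two goals conflict, since no point can
    # be close to the origin and to the all-ones point at the same time.
    f1 = float(np.sum(x ** 2))            # squared distance to the origin
    f2 = float(np.sum((x - 1.0) ** 2))    # squared distance to the all-ones point
    return np.array([f1, f2])             # the m = 2 objective values
```

Every point on the segment between the origin and the all-ones vector is Pareto-optimal here, which is why an approximation *set* rather than a single solution is sought.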

Two key challenges in finding such a set of solutions are convergence and diversity of the solution set. Convergence refers to the search for better, non-dominated solutions that improve upon the current solution set and therefore bring the whole set closer to the true Pareto-optimal solutions. Diversity, on the other hand, is necessary to obtain a widely spread set of solutions. The output of a metaheuristic algorithm should provide a diverse set of solutions which represent different trade-offs among the objective functions and cover the whole Pareto-optimal front as well as possible. Finding a non-dominated set of solutions that is as close to the true Pareto-optimal set and as diverse as possible is an ongoing challenge, especially when many-objective and large-scale problems are involved [11, 17, 18, 20].

In the area of large-scale (i.e. many-variable) optimisation, the topic of reducing the dimensionality of a problem is often of importance. Concepts like Cooperative Coevolution [1] aim to optimise smaller subspaces of the n-dimensional search space by dividing the variables into groups or subcomponents based on different criteria. Approaches to achieve a better balance between convergence and diversity have been used e.g. in [11, 17]. These works carry out an analysis to identify variables which influence the diversity of the solution set before starting the optimisation process. In addition, both methods use interaction detection to form groups of variables. A major drawback of these and similar approaches is that the analysis of variables and the formation of variable groups require an additional and very large computational budget for this preprocessing step, while the actual benefit compared to a random assignment of variables to the groups or to less expensive methods is not always guaranteed [13]. Another method, called WOF [19, 20], shows superior convergence behaviour on large-scale problems with up to thousands of variables [21]. WOF aims to balance diversity and convergence through the selection of certain solutions from the population, which are used in a fast-converging transformation and optimisation step of the algorithm.

In the area of many-objective optimisation, a variety of algorithms has been developed in recent years, among them many which adopt the concept of reference vectors, such as NSGA-III [6], MOEA/DD [10] or RVEA [2]. Reference vectors or directions are a common concept used to counter the decreasing selection pressure that Pareto-dominance-based approaches suffer from on many-objective problems. Such methods have increased the capabilities of metaheuristics to solve many-objective problems. Consequently, an area that might draw increased focus in the future is solving problems with many objectives and a large number of decision variables at the same time, while keeping the computational budget as low as possible. This work therefore proposes a mechanism that can be used to reduce the dimensionality of such problems and thereby improve the solution quality of existing many-objective methods.

3 Proposed Approach

In this section, we propose a search strategy that can be used to enhance exploration of the search space and at the same time reduce the dimensionality of a problem without using variable groups. In metaheuristic evolutionary optimisation, one general assumption is that the encoding of the problem ensures that promising solutions can be generated from combinations of other good solutions. By extension, Pareto-optimal solutions might share certain characteristics, i.e. decision variable values which might be similar throughout the whole Pareto-optimal set (as, for instance, the convergence-related variables of common benchmark families [3, 8]), and which might be approximated by an optimiser. The main goal of this approach is to exploit the information that is inherent in the population of an evolutionary algorithm at a given time, i.e. the information about which (sub-)vector space of the n-dimensional search space contains the (at that point) best or most promising solutions. Based on this, a search inside this subspace is conducted. This concept is also related to that of “innovization” from the literature ([5, 7]), which aims to extract information or design principles from the outcome of optimisation or during its process. In the following, the concept of the proposed search strategy is explained.

3.1 Concept

Suppose we have an optimisation problem with n real-valued decision variables and m objectives as given in Eq. 1. Let the population of an algorithm be P and its size be \(s := |P|\). At each given time of the optimisation process the population consists of s solution vectors each of dimensionality n: \(P = \{ \varvec{ x}^{(1)}, \varvec{x}^{(2)}, ..., \varvec{ x}^{(s)} \}\). Each solution is a vector \(\in \mathbb {R}^n\):

$$\begin{aligned} \begin{aligned} \varvec{ x}^{(i)} = (x^{(i)}_1\; x^{(i)}_2\; ...\;\; x^{(i)}_n) \end{aligned} \end{aligned}$$
(2)

and the set of solutions P defines a vector (sub)space. The dimensionality of this subspace is given by the rank of the matrix of the spanning vectors. We therefore compose the matrix \(\hat{X} \in \mathbb {R}^{s \times n}\) which contains in each row one solution of the population.

$$\begin{aligned} \hat{X} = \begin{pmatrix} x^{(1)}_1 &{} x^{(1)}_2 &{} \cdots &{} x^{(1)}_n \\ x^{(2)}_1 &{} x^{(2)}_2 &{} \cdots &{} x^{(2)}_n \\ \vdots &{} \vdots &{} \ddots &{} \vdots \\ x^{(s)}_1 &{} x^{(s)}_2 &{} \cdots &{} x^{(s)}_n \end{pmatrix} \end{aligned}$$
(3)

An evolutionary algorithm (EA) combines the solutions in the current population by using crossover operators. However, instead of classical crossover methods, it is also possible to combine the existing solutions linearly. This can be done through convex, conical or arbitrary linear combinations. In the following we focus on general linear combinations as they include the convex and conical combinations as subsets. A linear combination of the solutions in the population is defined as follows:

$$\begin{aligned} \varvec{x}' = \varvec{ y} \hat{X} = y_1 \varvec{ x}^{(1)} + y_2 \varvec{ x}^{(2)} + ... + y_s \varvec{ x}^{(s)} \end{aligned}$$
(4)

where \(\varvec{y}\) is the vector of coefficients of the combination:

$$\begin{aligned} \varvec{y} = (y_1\; y_2\; ...\; y_s) \end{aligned}$$
(5)

With this concept it is possible to search the subspace spanned by the s vectors in the population, and thereby find improved solutions as combinations of the existing ones. The actual dimension of this subspace is given by the rank of the matrix \(\hat{X}\), which is bounded by \(1 \le rank(\hat{X}) \le min\{ n, s \}\).
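The construction of Eqs. 3–5 can be sketched in a few lines of NumPy; the population matrix and coefficient vector below are random stand-ins for illustration, not data from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 30, 10                         # decision variables, population size
X_hat = rng.random((s, n))            # population matrix, one solution per row (Eq. 3)
y = rng.uniform(-1.0, 1.0, size=s)    # coefficients of the linear combination (Eq. 5)

x_new = y @ X_hat                     # x' = y X_hat, a new solution in R^n (Eq. 4)

# The dimension of the searchable subspace is the rank of X_hat,
# bounded between 1 and min(n, s).
r = np.linalg.matrix_rank(X_hat)
```

For a randomly initialised population with \(s < n\), the rank is almost surely equal to s, i.e. the population spans an s-dimensional subspace of the n-dimensional search space.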

In a way, this method can be seen as an s-parent crossover method. In contrast to a crossover, in our proposed method the parameters of the vector \(\varvec{y}\) are subject to an optimisation process. While a multi-parent crossover would produce a random combination and let the evolutionary process judge whether it was a good one, the proposed procedure uses an evolutionary process to find “optimal” combinations.

In order to find a good linear combination, the values of the vector \(\varvec{y}\) are optimised instead of the original variables. As a consequence, this newly formed optimisation problem has only s decision variables compared to the original one with n variables. This means that if a problem with \(n = 30\) variables is optimised with a population size of \(s = 100\), there might be redundancy in the newly formed linear-combination problem: the algorithm now searches in 100 dimensions, even though the actual space in which the solutions are created is still 30-dimensional, and some of the vectors of the linear combinations are not independent in this case. However, the situation differs when the method is applied to a high-dimensional problem with, for instance, \(n = 500\) variables. The s solutions can at most define an s-dimensional subspace. If all solutions are randomly created in the beginning, it is not guaranteed that good solutions actually lie in the defined subspace. However, when the algorithm is allowed a certain progress to find a preliminary approximation of the optimal areas, we can assume that promising parameter combinations have already been found, and the spanned subspace might contain additional good solutions. In that case, optimising the linear-combination solutions can also be regarded as a dimensionality reduction technique, as it enables the algorithm to search in a 100-dimensional subspace instead of the 500-dimensional original search space. This makes the method promising not only for multi- and many-objective problems, but also for the area of large-scale optimisation.

3.2 Inclusion into Other Algorithms

The proposed concept can be used inside arbitrary metaheuristic optimisation algorithms. To do so, we define a population Q of \( \varvec{y}\)-vectors, where each vector in the population defines one linear combination of the members of P as described above. By this, we can use any metaheuristic optimiser on this newly formed population to find suitable linear combinations of the underlying original solutions in the population P. Since the optimisation of the population Q relies on the assumption (see above) that the population P defines a promising subspace of \(\varOmega \), the proposed method is included in other metaheuristics as an additional search step. In particular, we apply the original (arbitrary) metaheuristic in alternation with the proposed linear-combination search. As a further step to concentrate on promising solutions, the linear combinations are only performed on the non-dominated solutions in the population.

Let \(\hat{X}\) be the matrix of the decision variable values of all non-dominated solutions in P as seen above, where each row in \(\hat{X}\) corresponds to one non-dominated solution in P. As a result, \(\hat{X}\) is an \(s' \times n\) matrix, where \(s'\) is the number of non-dominated solutions. In the same way, let \(\hat{Y}\) be the matrix of the decision variable values (i.e. coefficients of linear combinations) of the solutions in Q. The population size of Q is t, therefore \(\hat{Y} \in \mathbb {R}^{t \times s'}\). The original objective function evaluation can be applied to the new population by simply multiplying \(\hat{X}\) with \(\hat{Y}\) and computing \(\varvec{f}(\hat{Y}\hat{X})\), i.e. applying \(\varvec{f}\) to each row in \(\hat{Y}\hat{X}\). For practical reasons and to limit the search space of the newly formed problem, the variables \(y_i\) are also equipped with lower and upper bounds, i.e. \(y_i \in [y_{i,min},y_{i,max}]\), \(i = 1, ..., s'\).
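The batched evaluation \(\varvec{f}(\hat{Y}\hat{X})\) reduces to a single matrix product followed by a row-wise objective call. A minimal sketch (the helper name and the placeholder objective are assumptions for illustration):

```python
import numpy as np

def evaluate_combinations(Y_hat, X_hat, f):
    # Each row of Y_hat is one coefficient vector from Q; each row of the
    # product Y_hat @ X_hat is one candidate solution in R^n, on which the
    # original objective function f is evaluated.
    candidates = Y_hat @ X_hat                  # (t x s') @ (s' x n) -> (t x n)
    return np.array([f(x) for x in candidates])
```

Note that evaluating t combination vectors costs exactly t calls to the original objective function, the same as evaluating a population of size t in the original search space.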

The outline of the resulting algorithm looks as follows:

  1. Optimise the population P with any multi-objective method for a specified time.

  2. Use the first non-dominated front of the current population P to build the matrix \(\hat{X}\) out of its decision variables’ values.

  3. Create a random population Q of linear-combination vectors.

  4. Optimise Q for a certain time using an arbitrary optimisation method. Store all evaluated solutions in an archive A.

  5. Merge the population P with A and proceed with the normal optimisation process (Step 1).
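The five steps above can be sketched as a runnable loop. This is a deliberately simplified illustration: the base and inner optimisers are replaced by placeholders, the objective is a toy function, and the selection is a crude scalarisation, none of which is the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    # Toy bi-objective function (illustrative, not one of the benchmarks).
    return np.array([np.sum(x ** 2), np.sum((x - 1.0) ** 2)])

def nondominated_mask(F):
    # Boolean mask marking rows of F that no other row Pareto-dominates.
    keep = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        for j in range(len(F)):
            if i != j and np.all(F[j] <= F[i]) and np.any(F[j] < F[i]):
                keep[i] = False
                break
    return keep

n, s, t = 30, 20, 20
P = rng.random((s, n))                         # initial population in [0, 1]^n

for _ in range(3):
    # Step 1 (placeholder): a multi-objective optimiser would refine P here.
    # Step 2: matrix of the non-dominated solutions' decision variables.
    F = np.array([f(x) for x in P])
    X_hat = P[nondominated_mask(F)]
    # Step 3: random population Q of coefficient vectors in [-10, 10].
    Q = rng.uniform(-10.0, 10.0, (t, X_hat.shape[0]))
    # Step 4 (placeholder optimisation of Q): evaluate the combinations and
    # keep every evaluated solution in an archive A.
    A = Q @ X_hat
    # Step 5: merge P with the archive, clip to the variable bounds, and
    # select s survivors (crude sum-of-objectives ranking as a stand-in for
    # the algorithm's usual environmental selection).
    merged = np.clip(np.vstack([P, A]), 0.0, 1.0)
    Fm = np.array([f(x) for x in merged])
    P = merged[np.argsort(Fm.sum(axis=1))[:s]]
```

In a real embedding, Steps 1 and 4 each run for a fixed number of generations of the host metaheuristic and of the inner optimiser, respectively, and Step 5 uses the host algorithm's own selection mechanism.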

4 Evaluation

To evaluate the proposed method, we have included it into several well-known optimisation algorithms from the areas of multi- and many-objective optimisation. These algorithms are NSGA-II [4] and GDE3 [9] as representatives of traditional evolutionary methods, both classical and differential evolution, and NSGA-III [6] and RVEA [2] to represent many-objective methods. The aim of the experiments is not to show the superiority of one of these methods over another, but rather to show that the proposed exploration method has a positive influence when applied to existing algorithms. Due to space limitations, an inclusion into dedicated large-scale methods like LMEA [17], MOEA/DVA [11] or WOF [20], and the analysis of this method's capabilities as a dimensionality reduction mechanism, are subject to future work. Each of the four algorithms has been equipped with the proposed method by applying, in alternation, 100 generations of the original algorithm and afterwards 100 generations of the search in the formed subspace as described above. The created solutions are then merged back into the original population using the usual selection method of the respective algorithm. For the optimisation of the population Q, the NSGA-II algorithm is used in all cases. This is done so that all algorithms use the same exploration mechanism for searching the formed subspace. Future research might deal with different mechanisms in this regard, as the NSGA-II mechanism might not be the optimal choice for many-objective problems. This procedure is repeated until the maximum number of function evaluations is reached. The modified versions of the algorithms are denoted with an “x” in front of their names, i.e. xNSGA-II, xGDE3, xNSGA-III and xRVEA. To test the performance, we use a total of 60 test problem instances from three common benchmark families with 2 to 5 objectives and 30 to 514 decision variables. The used problems are as follows:

  • Six problems from the LSMOP (large-scale multi- and many-objective test problems) family [3]: LSMOP1-6. Each of them is tested with 2, 3, 4 and 5 objective functions, resulting in 206, 307, 413 and 514 decision variables.

  • Six problems from the popular WFG family [8]: WFG2-5, WFG7 and WFG8. All of them are tested with 2 and 3 objectives, in combination with both 40 and 400 decision variables. The WFG problems were chosen based on an analysis done in [11], where WFG2 and 3 represent problems with a sparse number of interacting variables, WFG4 and 5 have no interactions and WFG7 and 8 have a high number of interacting variables.

  • Six problems from the CEC 2009 unconstrained benchmarks: UF1-3 are 2-objective problems, UF8-10 are 3-objective problems. All of them are tested with 30 and 300 variables.

For implementation, the PlatEMO framework [15] version 1.5 is used. For each experiment we perform 31 independent runs and report the median and interquartile range (IQR) values of the relative hypervolume (HV) indicator [16]. The relative HV is the hypervolume obtained by a solution set in relation to the hypervolume obtained by a sample of the Pareto-front of the problem, consisting of 10,000 solutions as provided by the PlatEMO framework. The reference point for the indicator is obtained by taking the nadir point of our Pareto-front sample (i.e. the point in the objective space containing the worst value in each dimension throughout the sample) and multiplying it by 2.0 in each dimension. Statistical significance is tested using a two-sided Mann-Whitney U test with the null hypothesis that the tested samples have equal medians. Statistical significance is assumed for \(p < 0.01\).
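The reference-point construction just described reduces to a column-wise maximum and a scaling step. A small sketch with an assumed three-point stand-in for the 10,000-point Pareto-front sample:

```python
import numpy as np

# One objective vector per row; stand-in for the PlatEMO Pareto-front sample.
front_sample = np.array([[0.0, 1.0],
                         [0.5, 0.5],
                         [1.0, 0.0]])

nadir = front_sample.max(axis=0)   # worst value in each objective dimension
ref_point = 2.0 * nadir            # multiplied by 2.0 in each dimension
```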

4.1 Parameter Settings

The maximum number of function evaluations for all algorithms and problem instances is set to \(100{,}000\). The number of position-related variables for the WFG problems has been set to n/4 and the parameter \(n_k\) in the LSMOP benchmarks was set to 5. The population size is set to 40 in all instances of NSGA-II and GDE3. The population sizes of NSGA-III and RVEA are set to 40, 36, 35 and 40 for \(m=2, 3, 4\) and 5 objectives respectively, due to the uniform generation of reference vectors. All algorithms use polynomial mutation with a distribution index of 20.0 and a probability of 1/n. All algorithms, except GDE3, use simulated binary crossover with a distribution index of 20.0 and a probability of 1.0. In GDE3, \(CR = 1\) and \(F = 0.5\) are used. In RVEA, \(\alpha = 2\) and \(f_r = 0.1\) as in the original work. The bounds of the coefficients for the linear combination are set to \(y_{i,min} = -10.0\) and \(y_{i,max} = 10.0\).

4.2 Results

The results of the experiments are shown in Tables 1 and 2, where each algorithm is compared with its respective linear-combination-enhanced counterpart. Overall, the proposed method is beneficial for the performance of all algorithms in most problem instances.

Table 1. Obtained median and IQR values of the relative Hypervolume for NSGA-II and xNSGA-II as well as GDE3 and xGDE3. An asterisk in the column of an x-Algorithm indicates statistical significance to the respective original version of that algorithm. Best performances are marked in bold and shaded where significant.
Table 2. Obtained median and IQR values of the relative Hypervolume for NSGA-III and xNSGA-III as well as RVEA and xRVEA. An asterisk in the column of an x-Algorithm indicates statistical significance to the respective original version of that algorithm. Best performances are marked in bold and shaded where significant.

First, we take a look at the two traditional evolutionary algorithms. xNSGA-II performs significantly better (based on the Mann-Whitney U test) than the original NSGA-II in 44 out of 60 problem instances, and achieves an equal result in 9 cases. xGDE3 is significantly better than or equal to its counterpart in 52 out of 60 cases. It is notable that both original algorithms perform better only in a few 2- and 3-objective instances, while in all higher-dimensional problems with 4 and 5 objectives they fail to achieve even a single solution beyond the reference point for the HV calculation, resulting in a HV of zero (denoted as dashes in the tables). The linear-combination technique enables these algorithms to achieve significantly better results even in high-dimensional problems with 5 objectives and over 500 decision variables.

Next, we examine the two many-objective algorithms NSGA-III and RVEA. For these methods, too, the proposed approach is able to improve the performance of both algorithms significantly. For NSGA-III, the modified version with linear combination performs significantly better in 49 problem instances and performs equally well in another 6. xRVEA outperforms its original version in 49 instances as well, with 5 more draws. An interesting observation is that even though both algorithms are originally designed to work with many-objective problems, their enhanced versions increase their performance even on these instances to a great extent. It is worth noting that the performance on the many-objective instances is significantly increased, even though the subspace of linear combinations is searched with the NSGA-II mechanism, which is usually not designed for many-objective problems. A possible explanation for this might be that NSGA-III and RVEA do not possess a mechanism for dealing with high-dimensional search spaces. Since the LSMOP problems contain not only many objective functions but also high-dimensional search spaces, this might pose a challenge for these algorithms. The positive influence of the linear-combination search might be, at least partly, due to the inherent reduction of dimensionality. This is also supported by the fact that, for all four algorithms, the original version only performed better than the x-version on low-dimensional problems, almost exclusively UF and WFG problems with only 30 or 40 variables.

Another interesting observation concerns the type of problem where the linear-combination-search seems to work less effectively. Among the few instances where the modified algorithms do not perform best are the (low-dimensional) WFG4 and WFG5 problems, both with 2 and with 3 objectives. WFG4 and WFG5 are both fully separable. Furthermore, NSGA-II and NSGA-III outperform their modified counterparts on the 2-objective LSMOP5 problem, which is also fully separable. The separability of variables suggests that an algorithm can reach optimal solutions by altering variables completely independent of each other. These results imply that for such problems, at least in low-dimensional search spaces, a combination of solutions, which actually alters all variables at the same time through the linear coefficients, might not be suitable.

In summary, we conclude that the proposed approach of optimising linear combinations of the population members is able to increase the performance of multi- and many-objective algorithms in most cases. This is especially true for higher numbers of decision variables and higher numbers of objective functions. The authors further tested the method on the remaining problems from the WFG, UF and LSMOP families, which are not reported here due to page limitations, and obtained similarly superior performance of the proposed method.

5 Conclusion and Future Work

This article proposed a new mechanism for exploration and solution creation in multi- and many-objective optimisation. The mathematical concept is able to focus the search on relevant areas and at the same time reduce the dimensionality of the original problem without using (possibly expensive) variable grouping methods. After we introduced the mathematical concept, we described how this approach can be incorporated into existing metaheuristic algorithms and explored its capabilities on a variety of benchmark functions with different characteristics and dimensionality. The results indicate that this linear-combination approach can improve the performance of existing methods in both large-scale and many-objective optimisation.

Future work in this area involves exploring the possibilities of this approach further. It can be included in specific large-scale metaheuristics like WOF as a dimensionality reduction technique. Another possible application might be constrained problems. Linear combinations have been applied to particle swarm optimisation in [12] to preserve the feasibility of individuals. The approach described in this article can easily be adapted to allow only certain linear combinations, for instance convex ones. If the search is restricted in this way to convex combinations, the algorithm can by definition only create feasible solutions out of existing feasible ones, provided that the constraints are linear. Therefore, this can be a promising direction for constraint handling in (large-scale) many-objective optimisation.