Improving evolutionary algorithms with information feedback model for large-scale many-objective optimization

Wang, Yong; Zhang, Qian; Wang, Gai-Ge

doi:10.1007/s10489-022-03964-9

Published: 06 September 2022

Volume 53, pages 11439–11473, (2023)

Applied Intelligence

Abstract

Recently, many evolutionary algorithms have been proposed. Compared to other algorithms, the core of the many-objective evolutionary algorithm using a one-by-one selection strategy is to select offspring one by one in environmental selection. However, it does not perform well in resolving large-scale many-objective optimization problems. In addition, a large amount of meaningful information in the population of the previous iteration is not retained. The information feedback model is an effective strategy to reuse the information from previous populations and integrate it into the update process of the offspring. Based on the original algorithm, this paper proposes a series of many-objective evolutionary algorithms, including six new algorithms. Experiments were carried out in three different aspects. Using the same nine benchmark problems, we compared the original algorithm with six new algorithms. Algorithms with excellent performance were selected and compared with the latest studies using the information feedback model from two aspects. Then, the best one was selected for comparison with six state-of-the-art many-objective evolutionary algorithms. Additionally, non-parametric statistical tests were conducted to evaluate the different algorithms. The comparison, with up to 15 objectives and 1500 decision variables, showed that the proposed algorithm achieved the best performance, indicating its strong competitiveness.

1 Introduction

Owing to the needs of engineering applications and scientific experimental research, multi-objective optimization problems (MOPs) are common in real life. The main optimization goal is to maximize or minimize the final value with as few resources as possible [1]. Unlike a single-objective optimization problem, an MOP has two or three objective functions [2]. However, many optimization problems involve more than three objectives, which gives rise to many-objective optimization problems (MaOPs) [3]. There is often no single solution that achieves the optimal value for every objective. At present, methods for solving optimization problems are generally divided into two categories: traditional mathematical optimization methods (e.g., gradient descent and Newton's method) and evolutionary algorithms [4].

Traditional optimization algorithms, which rely on first- and second-derivative information of the related functions, are often unsuitable for practical problems and cannot guarantee that the final approximate optimal solution set is both well converged and evenly distributed [5]. To address this, researchers have proposed many evolutionary algorithms. These algorithms draw inspiration from biological evolution and guide the search for solutions by simulating the evolutionary process, following the principle of survival of the fittest [6, 7]. Examples include the genetic algorithm (GA) [8], particle swarm optimization (PSO) [9], artificial bee colony (ABC) [10], ant colony optimization (ACO) [11], and differential evolution (DE) [12].

Many scholars have proposed multi-objective evolutionary algorithms (MOEAs) to solve MOPs. The most famous ones are the decomposition-based multi-objective evolutionary algorithm (MOEA/D) [13], the genetic algorithm based on non-dominated sorting and elite selection (NSGA-II) [14], and the evolutionary algorithm using neighborhood environment selection (SPEA2) [15]. With the advent of the 5G era and the rise of deep learning, evolutionary algorithms have found applications in many emerging fields [16]. They are currently widely used in economic dispatching [17], code detection [18], engineering optimization [19], face reconstruction [20], job-shop scheduling [21,22,23], high performance computing [24], computer networks [25], and other fields. These MOEAs have proved successful in solving most MOPs. However, the non-dominated solution set obtained by these algorithms often cannot effectively approximate the real Pareto front of MaOPs, which have more than three objectives [26]. Therefore, with the aim of improving both convergence and diversity, this paper examines the many-objective evolutionary algorithm using a one-by-one selection strategy (1by1EA) [27] and seeks to improve its optimization performance on large-scale MaOPs. 1by1EA uses a convergence index to select the solutions in the current population one by one, adopts a distribution index based on cosine similarity to evaluate the similarity between solutions, and uses a boundary maintenance mechanism for corner solutions to maintain diversity.

However, few researchers have focused on the utility of prior knowledge and historical information. Many methods retain only the best individuals in each iteration and discard all others, which is not conducive to population diversity or to the preservation of information. More importantly, most MOEAs focus on optimization problems with fewer than 100 decision variables, whereas problems with more than a thousand decision variables may arise in engineering applications. 1by1EA therefore needs to be enhanced. To this end, this paper introduces the information feedback model (IFM) [28] into the framework of 1by1EA and proposes a new optimization algorithm, called 1by1EA-IFM, for solving large-scale MaOPs. Unlike most existing algorithms, this algorithm retains the historical information of individuals in the population before selecting offspring individuals. The experiments were carried out in three stages. First, the performance of the six improved algorithms was compared with that of the original algorithm. Second, the algorithms with excellent performance were selected and compared with current IFM research from two aspects. Finally, we chose the one with the best performance and compared it with six other state-of-the-art algorithms.

The main contributions of this work can be summarized as follows:

(1) A novel framework, called 1by1EA-IFM, is proposed, in which individuals from historical populations are retained in one of two ways: at a fixed location or at random. The performance of an individual is determined by its fitness function value. In each retaining approach, k individuals are selected (k ∈{1,2,3}). Depending on the retaining approach and the value of k, IFM comprises six models (i.e., M-F1, M-F2, M-F3, M-R1, M-R2, and M-R3), which are combined with 1by1EA to give six algorithms, namely 1by1EA-F1, 1by1EA-F2, 1by1EA-F3, 1by1EA-R1, 1by1EA-R2, and 1by1EA-R3. An advantage of IFM is that it can efficiently enhance the diversity of the population by reusing some information from historical populations.
(2) Experiments were performed on large-scale MaOPs in which the number of decision variables reaches up to 1500. By controlling the number of decision variables, we can increase the scale of the problem and thereby test whether the algorithms remain effective on large-scale problems. In all the validation experiments, the indicator values of the different algorithms were compared using the Mann-Whitney-Wilcoxon rank-sum test, which determines whether one algorithm differs significantly from another. In addition, to obtain rigorous conclusions, two further statistical tests, Friedman’s non-parametric test and Holm’s post-hoc test, were conducted.

The rest of this paper is organized as follows. In Section 2, we introduce the work related to this research. Section 3 provides some basic concepts about multi/many-objective optimization, as well as the original 1by1EA and IFM. The proposed 1by1EA-IFM is described in detail in Section 4. The settings and results related to the experiment are described in Section 5. Section 6 provides a summary of this paper.

2 Related work

Optimization problems with more than three objectives are common in practice, and many-objective evolutionary algorithms (MaOEAs) are the corresponding solution methods. For most evolutionary algorithms that rely on Pareto dominance, the non-dominated solution sets often cannot effectively approximate the real Pareto front of MaOPs. Therefore, redefining the dominance relationship is an effective way for MOEAs to solve MaOPs. The main studies include ε domination [29] and fuzzy Pareto domination [30]. Chhabra et al. [31] proposed a fuzzy Pareto-driven artificial bee colony algorithm (FP-ABC) to better solve MaOPs. In FP-ABC, two external archives are incorporated into the artificial bee colony algorithm to improve its performance. In 2022, Wu et al. [29] introduced a new ε dominance relation and proposed a new MaOEA (ε-Two_Arch2) to update the individuals in the diversity archive. The common problem with these algorithms is that they may cause the population to converge to a sub-region of the Pareto optimal front.

Unlike the above algorithms, which modify the dominance relationship, decomposition-based MaOEAs decompose an MaOP into several sub-problems that are optimized simultaneously, with the objectives of the problem aggregated into different scalar functions. These algorithms guide individuals to search toward the Pareto front by minimizing these scalar function values. In 2018, Zheng et al. [32] redesigned the weight vectors used in the sub-problems and proposed a new weighted mixture-style method to enhance MOEA/D. Lucas et al. [33] introduced a new MOEA/D with uniformly randomly adaptive weights (MOEA/D-URAW) to address the limitation that the algorithm does not adapt to the problem geometry. However, when the Pareto front of the problem is too complex and inconsistent with the distribution of the weight vectors, the distribution of the population obtained by MOEAs is often poor.

Different from the above algorithms, MaOEA/IGD [34] is a many-objective evolutionary algorithm based on the IGD index. This algorithm first used a single-objective evolutionary algorithm to estimate the range of Pareto front, and then established a hyper-plane in the target space according to this range. Bader et al. [35] used Monte Carlo approximation based hypervolume indicators for environmental selection, and proposed a hypervolume estimation algorithm (HypE). One of the biggest problems with the above indicator-based methods is that the computational complexity of the indicator can be too high, thus, the algorithm speed may be generally slower than that of ordinary algorithms.

In addition to the MaOEAs covered in the above classifications, many-objective evolutionary methods have shown some new characteristics in recent years. The selection phase plays a key role in most algorithms, taking into account both the convergence and the distribution of the solutions. However, the increase in the number of objectives and decision variables has brought huge challenges to MaOEAs, such as the ineffectiveness of Pareto dominance and the conflict between convergence and diversity [36].

2.1 Selection strategy

Many scholars have studied MaOEAs based on selection strategies. In the early years of evolutionary algorithm development, in order to avoid the loss of external solutions and the resulting reduction in diversity, Zitzler et al. [15] proposed a strength Pareto evolutionary algorithm (SPEA2), which used density estimation, archive truncation, and fine-grained fitness allocation for environmental selection. Deb et al. [37] proposed a reference point-based non-dominated sorting algorithm (NSGA-III), which improves the selection process of NSGA-II by using a set of pre-defined reference points as the standard for choosing individuals with good convergence. Many novel studies have emerged recently. In 2021, Liu et al. [38] proposed a many-objective evolutionary algorithm based on decomposition with a correlative selection mechanism (MOEA/D-CSM), which finds the correlative individuals for each reference point as early as possible to maintain the diversity of the population. In the same year, Palakonda et al. [39] proposed an ensemble framework (ENMOEA) in which the mating and environmental selections of diverse MOEAs are combined. The framework demonstrates the scalability of the algorithm with the addition of a selection strategy. More recently, in order to solve discrete MaOPs, Zhao et al. [40] adopted an adaptive selection strategy that uses different reference points to improve the convergence performance of decomposition-based ACO, which effectively improves the quality of optimization.

Although improving the selection strategy makes the overall performance of an algorithm better, trying more strategies can further improve its generality. However, most researchers have not considered the scale of the problem and have only addressed problems with a small or medium number of decision variables. Therefore, an in-depth study of large-scale problems is needed.

2.2 Large-scale optimization

To deal with the large number of decision variables encountered in practice, many researchers have carried out relevant studies. Chen et al. [41] introduced a scalable small subpopulations based covariance matrix adaptation evolution strategy (S3-CMA-ES). The algorithm uses a series of small subpopulations to approximate the Pareto optimal solutions and introduces several diversity improvement strategies to solve MaOPs with large-scale decision variables. He et al. [42] embedded an adaptive offspring generation method in an MOEA framework (DGEA) and proposed a pre-selection strategy that selects parents and uses them to construct direction vectors in the decision space for generating offspring. Tian et al. [43] applied a competitive swarm optimizer to large-scale multi-objective optimization (LMOCSO). The proposed algorithm uses a new two-stage strategy to update the particles.

The focus of previous research has been on the scale of the problem, but the operation of the specific details of the algorithm has not been studied in-depth, such as how to better coordinate the environmental selection and mating selection, or how to use historical information more effectively. However, the emergence of the above algorithms provides an idea for the development of this paper.

Considering the influence of selection strategy, this paper introduces IFM into 1by1EA. Instead of selecting the best individual each time, IFM selects individuals from the previous iteration in a fixed or random manner, and then, they are used for the generation of offspring, thus increasing the diversity of the population. Prior to this, researchers have conducted research on IFM. For example, Gu et al. [44] and Zhang et al. [45] used IFM to improve NSGA-III and MOEA/D, respectively. The proposed algorithms are used to solve large-scale MaOPs. Therefore, the new algorithm is compared with those in the two above mentioned studies. For large-scale research, we chose DGEA and LMOCSO which are also used to solve large-scale problems, and the remaining four state-of-the-art algorithms, which have been proposed in the last 3 years. IFM improves the competitiveness and persuasiveness of 1by1EA to solve problems with large-scale decision variables.

3 Preliminaries

3.1 Basic definitions

MOPs are common in many areas of the real world. Assuming a minimization problem, the optimization goal is to minimize all objective functions as far as possible. An MOP can be defined as [46]:

$$ \begin{array}{@{}rcl@{}} &&\min F(x) = [f_{1}(x), f_{2}(x), {\dots} , f_{m}(x)]^{T}\\ &&\text{s.t. } x \in {\Omega}, F(x) \in R^{m} \end{array} $$

(1)

where m is the number of objective functions and Ω is the decision space. x = [x₁,x₂,…,x_n]^T ∈Ω is an n-dimensional decision vector that represents a candidate solution to the problem. The vector of m objective functions, $F: {\Omega} \rightarrow R^{m}$, maps the n-dimensional decision space to the m-dimensional objective space. When m > 3, the problem is called an MaOP.

The following concepts have been widely popularized:

Pareto dominance: For any two solutions x₁, x₂ in (1), x₁ Pareto dominates x₂ if they satisfy the following conditions, denoted as x₁ ≺ x₂.
$$ \begin{array}{@{}rcl@{}} &&f_{i}(x_{1}) \leq f_{i}(x_{2}), \forall i \in \{1, 2, \ldots, m\}\\ &&f_{j}(x_{1}) < f_{j}(x_{2}), \exists j \in \{1, 2, \ldots, m\} \end{array} $$
(2)
Pareto optimal set: For x^∗∈Ω in (1), if there is no solution x₁ ∈Ω satisfying x₁ ≺ x^∗, then x^∗ is known as the Pareto optimal solution. All of these solutions come together to form the Pareto optimal set (PS).
Pareto optimal front: The set of objective value vectors corresponding to each solution in PS is called Pareto optimal front (PF).
$$ PF = \left\{F(x) \mid x \in PS\right\} $$
(3)
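
The definitions in (2) and (3) can be illustrated with a minimal Python sketch, assuming minimization and objective vectors stored as plain tuples. The function names `dominates` and `non_dominated` are illustrative choices and do not come from the paper.

```python
from typing import List, Sequence


def dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """Return True if objective vector fa Pareto-dominates fb (minimization), as in (2)."""
    no_worse = all(a <= b for a, b in zip(fa, fb))       # f_i(x1) <= f_i(x2) for all i
    strictly_better = any(a < b for a, b in zip(fa, fb))  # f_j(x1) < f_j(x2) for some j
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the objective vectors that no other vector in `points` dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Four candidate objective vectors for a two-objective problem.
objs = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
front = non_dominated(objs)  # (3.0, 3.0) is dominated by (2.0, 2.0)
```

The filter is quadratic in the number of points, which is adequate for illustration but not for large populations.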

3.2 1by1EA

In order to balance the convergence and diversity of solutions in the high-dimensional target space, 1by1EA was proposed [27] and used to solve MaOPs.

The main contribution of this algorithm is that it uses a convergence index to select the solutions in the current population one by one, and proposes a distribution index, used in place of the Euclidean distance, to evaluate the distance between solutions in the high-dimensional space. The distribution index is used to identify the neighbors of a selected solution so that they can be de-emphasized by a niche technique. Unlike most MaOEAs, 1by1EA includes a boundary maintenance mechanism that ensures that corner solutions are not easily discarded. A corner solution is defined as the individual with the smallest scalar value aggregated over k objectives in the current population, where k is less than the total number of objectives. The general operating framework of the algorithm is given in Algorithm 1. First, N individuals are randomly generated to form the initial parent population (P). Then, mating selection is performed on P to select parents for producing offspring, and this is repeated until N offspring are produced. The offspring form the offspring population Q. After the variation operations are performed on Q, Q₁ is obtained, and P and Q₁ are combined to form K. The convergence index and distribution index of each solution in K are calculated, and then the one-by-one selection method is used to select N individuals from K to constitute the parent population of the next generation.

When calculating the convergence index, each target in the high-dimensional space is treated equally and aggregated into a scalar. The equation is as follows [27]:

$$ c(x) = agg(f_{1}(x), f_{2}(x), {\ldots} , f_{m}(x)) $$

(4)

The distribution index takes the form of a vector, in which each element represents the distance between a solution and another solution in the population. Because Euclidean distance is unsuitable for measuring distribution in a high-dimensional space, cosine similarity is used instead: it measures similarity by the cosine of the angle between two vectors. The distribution index can be expressed as follows [47]:

$$ d(x_{i}) = (d_{1}(x_{i}), \ldots, d_{\lvert K \rvert} (x_{i})), i = 1, \ldots, \lvert K \rvert $$

(5)

where

$$ d_{j}(x_{i}) = 1 - \cos(\theta_{ij}), j = 1, \ldots, \lvert K \rvert $$

(6)

is the distance between x_i and x_j, θ_ij is the angle between their objective vectors, and |K| is the number of solutions in the combined population K. The smaller the value, the more similar the distribution of the two solutions. This indicator can effectively remove points that lie close to a coordinate axis rather than on the PF.
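
As a rough illustration of (4) to (6), the sketch below computes both indices in plain Python. It makes two assumptions that are not stated in the text: the aggregation function agg is taken to be the plain sum of the objective values, and a zero objective vector is treated as maximally dissimilar to everything. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def convergence_index(f: Sequence[float]) -> float:
    """c(x) in (4). Assumption: agg is the plain sum of the objective values."""
    return sum(f)


def cosine_distance(fa: Sequence[float], fb: Sequence[float]) -> float:
    """d_j(x_i) = 1 - cos(theta_ij) in (6), computed on objective vectors."""
    dot = sum(a * b for a, b in zip(fa, fb))
    norm_a = math.sqrt(sum(a * a for a in fa))
    norm_b = math.sqrt(sum(b * b for b in fb))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: a zero vector has no direction, so treat it as dissimilar
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # clamp rounding error
    return 1.0 - cos_theta


def distribution_index(objs: List[Sequence[float]], i: int) -> List[float]:
    """d(x_i) in (5): distances from solution i to every solution in the population."""
    return [cosine_distance(objs[i], fj) for fj in objs]


objs = [(1.0, 0.0), (0.0, 1.0), (2.0, 0.0)]
d0 = distribution_index(objs, 0)
```

Here the first and third vectors point in the same direction, so their cosine distance is zero even though their Euclidean distance is not. This is the property that makes the index insensitive to scale.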

As the key operation of 1by1EA, the execution process of the one-by-one selection strategy in environment selection can be described as follows:

Step 1: Boundary maintenance. Corner individuals are selected from K and placed in the set K_s.
Step 2: Determine the set K_s.
Step 2.1: The convergence index is calculated for the remaining individuals in K, and they are put into K_s one by one according to the one-by-one selection strategy.
Step 2.2: The distribution index is used to identify the solutions that are close to the individuals in K_s; these solutions, and the solutions dominated by them, are de-emphasized and removed from K. Thus, the non-dominated solutions are retained.
Step 2.3: Stopping criterion. If K is not empty, repeat Step 2.1 and Step 2.2; if K is empty, go to Step 2.4.
Step 2.4: Selection. If the number of solutions in K_s is greater than N, the first N individuals become the parents of the next generation. If the number of solutions in K_s is less than N, de-emphasized and dominated solutions are selected according to the principle of survival of the fittest and added to K_s until it contains N solutions.
Step 3: P = K_s. K_s is the next generation of parents.
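
The steps above can be sketched in Python as follows. This is a simplified reading and not the implementation from [27]. It assumes that the objectives are already normalized, that corner solutions are the minimizers of each single objective (k = 1), that the convergence index is the sum of objectives, that solutions dominated by the newly selected solution are de-emphasized together with its neighbors, and that de-emphasized solutions are refilled in order of convergence. The threshold name `zeta` and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def _cos_dist(fa: Sequence[float], fb: Sequence[float]) -> float:
    """Cosine distance 1 - cos(theta) between two objective vectors."""
    dot = sum(a * b for a, b in zip(fa, fb))
    na = math.sqrt(sum(a * a for a in fa))
    nb = math.sqrt(sum(b * b for b in fb))
    if na == 0.0 or nb == 0.0:
        return 1.0  # assumption: zero vectors are treated as dissimilar
    return 1.0 - max(-1.0, min(1.0, dot / (na * nb)))


def _dominates(fa: Sequence[float], fb: Sequence[float]) -> bool:
    """Pareto dominance for minimization."""
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))


def one_by_one_select(objs: List[Sequence[float]], n: int, zeta: float) -> List[int]:
    """Return the indices of n solutions chosen from objs (simplified sketch)."""
    m = len(objs[0])
    selected: List[int] = []
    # Step 1: boundary maintenance (assumed k = 1: best solution on each objective).
    for j in range(m):
        corner = min(range(len(objs)), key=lambda i: objs[i][j])
        if corner not in selected:
            selected.append(corner)
    remaining = [i for i in range(len(objs)) if i not in selected]
    deemphasized: List[int] = []
    # Steps 2.1-2.3: pick the best-converged solution, then de-emphasize its neighbors.
    while remaining:
        best = min(remaining, key=lambda i: sum(objs[i]))
        remaining.remove(best)
        selected.append(best)
        keep = []
        for i in remaining:
            if _cos_dist(objs[best], objs[i]) < zeta or _dominates(objs[best], objs[i]):
                deemphasized.append(i)
            else:
                keep.append(i)
        remaining = keep
    # Step 2.4: refill from de-emphasized solutions if needed, then truncate to n.
    if len(selected) < n:
        deemphasized.sort(key=lambda i: sum(objs[i]))
        selected.extend(deemphasized[: n - len(selected)])
    return selected[:n]


objs = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5), (0.52, 0.5), (0.9, 0.9)]
chosen3 = one_by_one_select(objs, n=3, zeta=0.01)
chosen4 = one_by_one_select(objs, n=4, zeta=0.01)
```

In this toy population the two corner solutions are kept first, then (0.5, 0.5) is selected and its near-duplicates in direction are de-emphasized. A de-emphasized solution re-enters only when fewer than n solutions have been selected.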

Before calculating the two index values, the normalization of each individual in K ensures that only corner solutions are retained in K. Previous studies have shown that the normalization operation can effectively mitigate the curse of dimensionality [37]. For other details about the algorithm, please refer to the original paper [27].

3.3 Information feedback model

As mentioned before, IFM uses a simple fitness weighting method to extract and reuse the information from previous iterations. From the previous iterations, k individuals can be selected from the population either at a fixed position or at random. Models that select individuals in a fixed manner are denoted F, and models that select individuals in a random manner are denoted R. Combining the two selection manners with k ∈{1,2,3} yields six IFMs; for example, fixed selection with k = 1 gives M-F1.

In M-F, individuals are selected in a fixed manner, which means that for an individual to be updated at the current generation, the individuals at the same position in the previous generations are selected. This gives three models based on fixed selection: M-F1, M-F2, and M-F3. In M-R, individuals are selected in a random manner, which means that for an individual to be updated at the current generation, the positions of the individuals selected from the previous generations are random. This gives three models based on random selection: M-R1, M-R2, and M-R3.

Suppose that the current generation is t and the position of the next-generation individual is i. ${x_{i}^{t}}$ is the i-th individual of the t-th generation, ${f_{i}^{t}}$ is the fitness value of the i-th individual of the t-th generation, and $y_{i}^{t+1}$ is the individual generated by the basic algorithm, with corresponding fitness value $F^{t+1}$. λ and μ_k are weight coefficients that satisfy λ + Σμ_k = 1, λ > 0, and μ_k > 0, where k takes the values 1, 2, or 3. j_m is a randomly selected individual position between 1 and the population size N, and in the random models it is not equal to i. The six models can be defined as follows:

For the individual to be generated at the (t + 1)-th generation, we select an individual at a random position j from the t-th generation and combine it with the individual $y_{i}^{t+1}$ produced by the basic algorithm to obtain the updated individual of the next generation. The model is expressed as follows [28]:
$$ \begin{array}{@{}rcl@{}} x_{i}^{t+1} &=& \lambda y_{i}^{t+1} + \mu {x_{j}^{t}}\\ \lambda &=& \frac{{f_{j}^{t}}}{F^{t+1}+{f_{j}^{t}}}\\ \mu &=& \frac{F^{t+1}}{F^{t+1}+{f_{j}^{t}}} \end{array} $$
(7)
When i = j, this model is M-F1; otherwise, it is M-R1.
We randomly select one individual from the t-th generation and one from the (t - 1)-th generation, so that a total of two individuals are used to update the next-generation individual. The model is expressed as follows [44]:
$$ \begin{array}{@{}rcl@{}} & x_{i}^{t+1} = \lambda y_{i}^{t+1} + \mu_{1} x_{j_{1}}^{t} + \mu_{2} x_{j_{2}}^{t-1}\\ & \lambda = \frac{1}{2} \cdot \frac{f_{j_{2}}^{t-1}+f_{j_{1}}^{t}}{F^{t+1}+f_{j_{1}}^{t}+f_{j_{2}}^{t-1}}\\ & \mu_{1} = \frac{1}{2} \cdot \frac{F^{t+1}+f_{j_{2}}^{t-1}}{F^{t+1}+f_{j_{1}}^{t}+f_{j_{2}}^{t-1}}\\ & \mu_{2} = \frac{1}{2} \cdot \frac{F^{t+1}+f_{j_{1}}^{t}}{F^{t+1}+f_{j_{1}}^{t}+f_{j_{2}}^{t-1}} \end{array} $$
(8)
where $x_{j_{1}}^{t}$ and $x_{j_{2}}^{t-1}$ are individuals selected from random positions at the t-th and (t - 1)-th generations. The corresponding fitness function values are $f_{j_{1}}^{t}$ and $f_{j_{2}}^{t-1}$. When i = j₁ = j₂, this model is M-F2; otherwise, it is M-R2.
We randomly select one individual from each of the t-th, (t - 1)-th, and (t - 2)-th generations, three in total, to update the next-generation individual. The model is expressed as follows [45]:
$$ \begin{array}{@{}rcl@{}} & x_{i}^{t+1} = \lambda y_{i}^{t+1} + \mu_{1} x_{j_{1}}^{t} + \mu_{2} x_{j_{2}}^{t-1} + \mu_{3} x_{j_{3}}^{t-2}\\ & \lambda = \frac{1}{3} \cdot \frac{f_{j_{3}}^{t-2}+f_{j_{2}}^{t-1}+f_{j_{1}}^{t}}{F^{t+1}+f_{j_{1}}^{t}+f_{j_{2}}^{t-1}+f_{j_{3}}^{t-2}}\\ & \mu_{1} = \frac{1}{3} \cdot \frac{F^{t+1}+f_{j_{3}}^{t-2}+f_{j_{2}}^{t-1}}{F^{t+1}+f_{j_{1}}^{t}+f_{j_{2}}^{t-1}+f_{j_{3}}^{t-2}}\\ & \mu_{2} = \frac{1}{3} \cdot \frac{F^{t+1}+f_{j_{3}}^{t-2}+f_{j_{1}}^{t}}{F^{t+1}+f_{j_{1}}^{t}+f_{j_{2}}^{t-1}+f_{j_{3}}^{t-2}}\\ & \mu_{3} = \frac{1}{3} \cdot \frac{f_{j_{2}}^{t-1}+f_{j_{1}}^{t}+F^{t+1}}{F^{t+1}+f_{j_{1}}^{t}+f_{j_{2}}^{t-1}+f_{j_{3}}^{t-2}} \end{array} $$
(9)
where $x_{j_{1}}^{t}$, $x_{j_{2}}^{t-1}$, and $x_{j_{3}}^{t-2}$ are individuals selected from random positions at the t-th, (t - 1)-th and (t - 2)-th generations. The corresponding fitness function values are $f_{j_{1}}^{t}$, $f_{j_{2}}^{t-1}$, and $f_{j_{3}}^{t-2}$. When i = j₁ = j₂ = j₃, this model is M-F3; otherwise, it is M-R3.
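
The weights in (7) to (9) share one pattern: each term's weight is the sum of all the other fitness values, divided by k times the total, so the weights always add up to 1. A minimal Python sketch of that pattern follows. It assumes scalar, strictly positive fitness values and decision vectors stored as lists; the names `ifm_weights` and `ifm_update` are illustrative and not from the paper.

```python
from typing import List, Sequence, Tuple


def ifm_weights(f_new: float, f_hist: Sequence[float]) -> Tuple[float, List[float]]:
    """Return (lambda, [mu_1, ..., mu_k]) following the pattern of (7)-(9).

    f_new  : fitness F^{t+1} of the individual y produced by the basic algorithm
    f_hist : fitness values of the k historical individuals (k = 1, 2, or 3)
    Assumption: all fitness values are positive, so the total is non-zero.
    """
    k = len(f_hist)
    total = f_new + sum(f_hist)
    lam = (total - f_new) / (k * total)                # weight of y
    mus = [(total - f) / (k * total) for f in f_hist]  # weights of historical individuals
    return lam, mus


def ifm_update(y: Sequence[float], f_new: float,
               hist: Sequence[Sequence[float]], f_hist: Sequence[float]) -> List[float]:
    """x_i^{t+1} = lambda * y + sum_m mu_m * x_{j_m}, applied per decision variable."""
    lam, mus = ifm_weights(f_new, f_hist)
    return [lam * y[d] + sum(mu * x[d] for mu, x in zip(mus, hist))
            for d in range(len(y))]


# k = 1 (M-F1 or M-R1, depending on how the position j was chosen):
x_next = ifm_update(y=[1.0, 2.0], f_new=1.0, hist=[[3.0, 4.0]], f_hist=[3.0])

# k = 2 weights for F = 1, f_{j1} = 2, f_{j2} = 3:
lam2, mus2 = ifm_weights(1.0, [2.0, 3.0])
```

For the k = 1 example, λ = 3/4 and μ = 1/4, so the updated individual is 0.75·[1, 2] + 0.25·[3, 4] = [1.5, 2.5]. Whether the model is of type F or R depends only on how the caller picks the historical positions, not on the weighting itself.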

3.4 Motivation

As mentioned earlier, MOEAs have difficulty in dealing with MaOPs. The main reason is that the sharp increase in the number of non-dominated solutions in high-dimensional objective spaces weakens the selection pressure toward the Pareto front, so traditional MOEAs tend to stagnate in terms of convergence. Moreover, as the number of dimensions grows, the scale of MaOPs has become large. Although many researchers have tried to adopt MOEAs such as NSGA-II and MOEA/D to solve MaOPs, experiments show that, owing to the curse of dimensionality, MOEAs cannot balance convergence and diversity. Recently, many studies have modified the dominance relationship so as to enhance the ability to distinguish between solutions [29,30,31]. Convergence is an important property of solutions, and many researchers have used this property to make solutions easier to distinguish by improving a specific stage of the evolutionary process. In this approach, the excellent solutions are selected one by one, thus increasing the selection pressure toward the Pareto front. The convergence indicator used in 1by1EA is driven by this idea; it logically resembles the aggregation function in MOEA/D, but without a pre-defined weight vector. MOEA/D optimizes N scalar sub-problems simultaneously rather than solving the MOP directly as a whole. The same idea applies to solving MaOPs, with 1by1EA being an example.

However, if only convergence is considered, the algorithm is more likely to fall into local optima. To avoid this, the distribution of solutions in MaOPs also needs to be considered; density estimation and niche techniques are the most popular methods. The density estimation method estimates the neighborhood density value of each individual, keeps individuals with a smaller density value for the next generation, and deletes individuals with a larger density value. Niche techniques form several stable sub-groups, namely niches, and then let individuals evolve in a specific environment. Jiang et al. [48] improved SPEA by introducing an efficient reference direction-based density estimator, which can maintain the distribution of the population. The NSGA series proposed by Deb et al. is a concrete embodiment of this technique, but the difficulty lies in determining the scope of the niche. Motivated by the above ideas, a distribution threshold is defined in 1by1EA. Once a solution is selected by the convergence indicator, all solutions whose distance to the selected one is less than the threshold are abandoned.

Based on the above description, 1by1EA can effectively balance the convergence and diversity of the solution, and it is theoretically feasible to select this algorithm for research in this paper. But this algorithm still has the following two limitations:

1by1EA is rigid in its iteration process: it retains the best individuals at each iteration and discards all remaining individuals.
1by1EA does not perform as well when dealing with large-scale problems.

One of the main reasons why 1by1EA has difficulty in solving large-scale many-objective problems is that it only considers the relationship between different solutions. In the one-by-one selection strategy of the algorithm, once a solution is selected, all solutions whose distribution distance from the selected solution is less than the specified threshold are de-emphasized. When the number of decision variables increases significantly, the method can still select solutions one by one according to the convergence and distribution indicators, but because the distribution of the solutions is highly random, removing many solutions within the threshold range of each selected solution may still have a relatively large effect.

Therefore, to resolve the issues discussed above, this paper uses IFM to enhance 1by1EA and applies the enhanced algorithm to large-scale MaOPs. We preserve part of the historical information in the population in a fixed or random way, so that the transfer of information between generations helps 1by1EA to inherit population diversity. Although the preserved individuals are not necessarily the best, they can be used in the next update of the population, which allows useful information to be reused continuously. Furthermore, this inheritance is of great significance for driving the solution set toward the Pareto optimal front of large-scale MaOPs. The follow-up experiments support this theoretical analysis. Thus, while ensuring good coverage of the Pareto optimal front, 1by1EA-IFM enhances the ability of the algorithm to solve large-scale problems. In addition, compared with the NSGA-III-IFM and MOEA/D-IFM proposed by Gu et al. [44] and Zhang et al. [45], the role of IFM is reflected more effectively in our results. These algorithms are similar in that they all adopt the information retention mechanism of IFM; the difference is that the proposed 1by1EA-IFM is more convincing in its selection of candidate solutions. The details of 1by1EA-IFM are introduced in the next section.

4 1by1EA-IFM

In this section, we integrate IFM into 1by1EA in order to retain the information from historical iterations. First, we introduce the overall framework of the proposed 1by1EA-IFM and then explain the details. Finally, we choose one of the six algorithms as an example to describe the execution flow of the new algorithm in detail.

4.1 General framework of 1by1EA-IFM

To use the historical information from the previous iteration, we apply the information feedback mechanism after new offspring have been produced by the mutation operator and before environmental selection, and use it to update these individuals. The approximate process of the algorithm based on 1by1EA is as follows. First, the entire population is initialized randomly and the ideal point is set. Then the individuals in the population are updated in three steps: mating selection of parents to produce new offspring, variation by the crossover and mutation operators, and IFM to update the individuals after variation. Finally, in the environmental selection stage, the next-generation parent population (P) is selected according to the one-by-one selection strategy. For more details about the algorithm, see Section 4.3.

4.2 Operators

A binary tournament selection strategy is used in mating selection. Density estimation based on the distribution indicator and ranking information based on one-by-one selection serve as the tournament criteria, so that promising individuals can be chosen to form a mating pool. First, two individuals are randomly selected from the parent population. The individual with the lower rank is chosen, where the ranking value rank(x) is obtained from the one-by-one selection results of previous generations. If the two individuals are equally ranked, the individual with the lower density estimate is preferred; density values can be calculated in conjunction with (5) and (6) according to the following formula [27]:

$$ d_{k}(x) = \frac{1}{\sum_{i=1}^{k}d_{i}^{min}(x)+1} $$

(10)

where $d^{min}_{i}(x)$, i = 1, …, k, is one of the k smallest values in $\{d_{1}(x),\dots,d_{|k|}(x)\}$. Finally, if the density estimates cannot distinguish between the two individuals, a random selection is made between them.
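The tournament described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, `distances` is assumed to already hold the distribution distances of one solution to the others (computed elsewhere from (5) and (6)), and `rank` and `dens` are assumed to be precomputed lists indexed like the population.

```python
import random


def density(distances, k):
    """Eq. (10): 1 / (sum of the k smallest distances + 1)."""
    smallest = sorted(distances)[:k]
    return 1.0 / (sum(smallest) + 1.0)


def binary_tournament(population, rank, dens, rng=random):
    """Pick one parent: lower rank wins, then lower density, then a coin flip."""
    a, b = rng.sample(range(len(population)), 2)
    if rank[a] != rank[b]:
        winner = a if rank[a] < rank[b] else b
    elif dens[a] != dens[b]:
        winner = a if dens[a] < dens[b] else b
    else:
        # Neither criterion separates the two candidates.
        winner = rng.choice((a, b))
    return population[winner]
```

Calling `binary_tournament` N times would fill a mating pool of size N under these assumptions.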

The crossover operator used in this algorithm is simulated binary crossover (SBX), which simulates the behavior of single-point crossover on binary strings and passes useful genetic information from two parents to two offspring [49]. In this operator, p₁ and p₂ are the two parents and c₁ and c₂ are the two children, which satisfy (p₁ + p₂)/2 = (c₁ + c₂)/2. β is a spread factor, defined as $\lvert (c_{2} - c_{1})/(p_{2} - p_{1}) \rvert $. A random number ε is drawn from the range [0,1], and its value determines which of the following formulas is used to solve for β.

$$ \begin{array}{@{}rcl@{}} c_{1} &=& 0.5(p_{1} + p_{2})-0.5 \beta(p_{2}-p_{1})\\ c_{2} &=& 0.5(p_{1} + p_{2})+0.5 \beta(p_{2}-p_{1})\\ c(\beta) &=& \left\{\begin{array}{ll} 0.5 (\varepsilon + 1) \beta^{\varepsilon}, & \varepsilon \leq 0.5\\ 0.5 (\varepsilon + 1) \frac{1}{\beta^{\varepsilon+2}}, & \varepsilon > 0.5 \end{array}\right. \end{array} $$

(11)
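A per-variable sketch of simulated binary crossover may help make the role of the spread factor concrete. It is written under assumptions that go beyond this section: β is sampled with the commonly used inverse-transform form that depends on a uniform random number and a distribution index (called `eta` here, with a default of 20 chosen only for illustration), and the function name is hypothetical. The children are built so that their mean equals the parents' mean, as required by (p₁ + p₂)/2 = (c₁ + c₂)/2.

```python
import random


def sbx_pair(p1, p2, eta=20.0, rng=random):
    """Simulated binary crossover on one real-valued variable.

    eta is an assumed distribution index; larger values keep the
    children closer to the parents.
    """
    u = rng.random()  # uniform in [0, 1)
    if u <= 0.5:
        beta = (2.0 * u) ** (1.0 / (eta + 1.0))
    else:
        beta = (1.0 / (2.0 * (1.0 - u))) ** (1.0 / (eta + 1.0))
    mid = 0.5 * (p1 + p2)
    half_spread = 0.5 * beta * (p2 - p1)
    # Children are symmetric around the parents' midpoint.
    return mid - half_spread, mid + half_spread
```

By construction, |c₂ − c₁| = β|p₂ − p₁|, which matches the definition of the spread factor given in the text.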

After the crossover operator is executed, the individual is updated by polynomial mutation [50]. In this operator, p_m is the parent and q_m is the offspring produced by the mutation operator. σ is a random number in the range [0,1], and η is a distribution index. u_m and l_m are the upper and lower bounds of the m-th decision variable. The mutation operator takes the form q_m = p_m + δ(u_m − l_m), where δ is calculated as follows:

$$ \begin{array}{@{}rcl@{}} \delta &=& \left\{\begin{array}{ll} [2\sigma+(1-2\sigma)(1-\delta_{1})^{\eta+1}]^{\frac{1}{\eta+1}}-1, & \sigma \leq 0.5\\ 1-[2(1-\sigma)+2(\sigma-0.5)(1-\delta_{2})^{\eta+1}]^{\frac{1}{\eta+1}}, & \sigma > 0.5 \end{array}\right.\\ \delta_{1}&=&(p_{m}-l_{m})/(u_{m}-l_{m})\\ \delta_{2}&=&(u_{m}-p_{m})/(u_{m}-l_{m}) \end{array} $$

(12)
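The mutation step can be illustrated for a single decision variable as follows. The sketch uses the widely used bounded polynomial mutation, in which both branches share the outer exponent 1/(η+1); treat this as an assumption wherever it differs from the printed form. The function name, the default `eta=20.0`, and the final clamping step are illustrative choices, not taken from the paper.

```python
import random


def polynomial_mutation(p, lower, upper, eta=20.0, rng=random):
    """Bounded polynomial mutation of one variable p in [lower, upper]."""
    span = upper - lower
    d1 = (p - lower) / span  # normalized distance to the lower bound
    d2 = (upper - p) / span  # normalized distance to the upper bound
    s = rng.random()         # plays the role of sigma in the text
    power = 1.0 / (eta + 1.0)
    if s <= 0.5:
        base = 2.0 * s + (1.0 - 2.0 * s) * (1.0 - d1) ** (eta + 1.0)
        delta = base ** power - 1.0
    else:
        base = 2.0 * (1.0 - s) + 2.0 * (s - 0.5) * (1.0 - d2) ** (eta + 1.0)
        delta = 1.0 - base ** power
    q = p + delta * span
    # Clamp to guard against floating-point drift past the bounds.
    return min(max(q, lower), upper)
```

With this form, δ stays within [−δ₁, δ₂], so the mutated value remains inside the variable's bounds.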

4.3 1by1EA-F1

Here, we describe in detail how 1by1EA is combined with IFM and give the detailed workflow. As an example, we chose the model that selects one individual from the previous generation in a fixed way. The algorithm resulting from this combination is named 1by1EA-F1. The remaining five algorithms (1by1EA-F2, 1by1EA-F3, 1by1EA-R1, 1by1EA-R2, and 1by1EA-R3) follow the same procedure as this algorithm. The main execution of 1by1EA-F1 proceeds as follows:

Step 1: Initialization. Generate a random population P and the ideal point Z_min. Initialize the rank of each solution in one-by-one selection and the distribution threshold.
Step 2: Update. Suppose the current generation is t and i = 0 (i is the index of the individual).
Step 2.1: Generate offspring $Y^{t+1}$ by using the crossover and mutation operators of the original 1by1EA. The individuals $y^{t+1}_{i}$ produced by this process constitute $Y^{t+1}$, and their fitness values are then calculated.
Step 2.2: Calculate $x^{t+1}_{i}$ by combining $y^{t+1}_{i}$ with ${x^{t}_{i}}$ according to (7). $K^{t}$ is the set of $x^{t+1}_{i}$.
Step 2.3: Determine whether the calculation has been completed for every individual. If so, perform Step 2.4. If not, set i = i + 1 and repeat Step 2.1 and Step 2.2.
Step 2.4: Environmental selection.
Step 2.4.1: Normalize each solution in $K^{t}$.
Step 2.4.2: Calculate the convergence indicator c(x) and distribution indicator d(x) of each solution $x \in K^{t}$, select promising solutions one by one, and put them into K_s.
Step 2.4.3: The corner solutions are chosen and placed in K_s.
Step 2.4.4: Ensure that the number of individuals in K_s equals N, the population size.
Step 2.4.5: Update the distribution threshold.
Step 2.4.6: Verify whether the termination condition is met. The termination condition is the maximum number of iterations defined in advance. If it is satisfied, go to Step 3. If not, set t = t + 1 and repeat Step 2.
Step 3: Output. Set $P^{t+1}$ equal to K_s and output it.

The flow chart of 1by1EA-F1 is shown in Fig. 1.
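The information feedback part of the update (Step 2.2, repeated for every individual as in Step 2.3) can be sketched as below. Equation (7) is defined in an earlier section and is not restated here, so the weighting is an assumption: the new individual is taken to be a convex combination of the offspring and the individual at the same index in the previous generation, with each weight proportional to the other individual's scalar fitness (smaller fitness is assumed to be better, and fitness values are assumed non-negative). All function and parameter names are hypothetical.

```python
def ifm_f1_update(y_new, x_old, f_new, f_old):
    """Combine offspring y_new with the retained individual x_old.

    Assumed weights: alpha = f_old / (f_new + f_old) for the offspring,
    beta = f_new / (f_new + f_old) for the retained individual,
    so alpha + beta = 1.
    """
    total = f_new + f_old
    if total == 0:
        alpha = beta = 0.5  # no information to prefer either one
    else:
        alpha = f_old / total
        beta = f_new / total
    return [alpha * y + beta * x for y, x in zip(y_new, x_old)]


def update_population(offspring, parents, fit_offspring, fit_parents):
    """Apply the fixed-index update to every individual (the set K^t)."""
    return [
        ifm_f1_update(y, x, fy, fx)
        for y, x, fy, fx in zip(offspring, parents, fit_offspring, fit_parents)
    ]
```

The returned list would then be passed to environmental selection (Step 2.4), which is not part of this sketch.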

4.4 Complexity

In this subsection, the complexity of the algorithm is analyzed. For convenience, assume that the problem to be optimized has m objectives and n decision variables, and that the decision-maker requires n solutions. The complexity is analyzed according to Algorithm 1 and the above steps. In addition, the population size is set to N and the number of generations to t. In the initialization process (Step 1), the computational complexity of generating the ideal point is O(mN), because an aggregation function is used to aggregate all the objectives and each component of the ideal point corresponds to the minimum value of one sub-objective. Furthermore, generating a random population and ranking each solution require O(nN) and O(mN) computations, respectively. As the convergence indicator and the distribution indicator need to be calculated according to the ideal point and the solutions needed by the decision maker, the computational complexity in Step 2.4.2 is O(mN²) and O(nN²), respectively. Generating the offspring (Step 2.1) requires O($\frac {N}{2}(n+n)$), because the size of the mating pool is set to N. In Step 2.2 and Step 2.3, IFM begins to work; because each generation in the iterative process needs information from the historical generation, the computational complexity is about O($\frac {N}{2}nmt$). During environmental selection, owing to the one-by-one selection strategy, the best-case computational complexity is O(N²) and the worst case is O(N³), and N individuals are linearly assigned to the ideal points. Furthermore, it is generally the case that N ≫ m and N ≫ n in MaOPs. Therefore, when the iteration termination condition is reached, the complexity of the proposed algorithm is O(tN³).

5 Experiments

In this section, we describe the verification process of the proposed 1by1EA-IFM in detail. The experiment was divided into three stages. First, we compared the performance of the proposed 1by1EA-IFM with 1by1EA on large-scale multi-objective test problems (LSMOPs) with different numbers of decision variables. Then we focused on the work of previous researchers based on IFM and compared the outstanding algorithms in 1by1EA-IFM (selected in the first stage) with the best-performing algorithms in NSGA-III-IFM and MOEA/D-IFM on the same indicators and test problems. In addition, we focused on MOKP and selected two other comparison algorithms to obtain the verification results. In the last stage, we chose an algorithm specifically designed to solve large-scale problems and five other state-of-the-art MaOEAs to measure our work from a vertical perspective. To avoid chance results, all algorithms were tested on LSMOPs [51] with different numbers of decision variables. In these problems, the parameters that had to be set included the number of objectives (m), the dimension of the decision variable (n), and the number of subcomponents in each variable group (n_k). Here, we took n = m × 100 and n_k = 5. These settings proved valid when LSMOPs were proposed as test functions [51].

As a test suite, LSMOPs comprise linear problems, convex problems, multimodal problems, and problems with a disconnected Pareto optimal front, each defined by specified functions. The different characteristics of these fronts are shown in Table 1. In this paper, the number of objectives m did not exceed 15, and most experiments used m values of 3, 5, 8, 10, and 15. The inverted generational distance (IGD) [52] and the generational distance (GD) [53] were adopted as performance indicators in most experiments. GD was used to evaluate the convergence of the algorithm. It mainly expresses the distance separating the approximate PF from the true PF. Its calculation formula is as follows [54]:

$$ GD = \frac{\left(\sum_{i=1}^{n}{d_{i}^{m}}\right)^{\frac{1}{m}}}{n} $$

(13)

where n is the number of points on the approximate PF, m is the number of objectives, and d_i is the Euclidean distance between the i-th point on the approximate PF and its nearest reference point on the true PF. The smaller the GD value, the better the convergence of the algorithm.
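As a small illustration of (13), the following sketch computes GD for fronts given as lists of objective vectors. The function name is hypothetical, the true PF is assumed to be available as a finite set of reference points, and the exponent defaults to the number of objectives as in the formula above.

```python
import math


def generational_distance(approx_front, true_front, m=None):
    """GD as in Eq. (13): (sum of d_i^m)^(1/m) / n."""
    n = len(approx_front)
    if m is None:
        m = len(approx_front[0])  # number of objectives
    # d_i: distance from each approximate point to its nearest reference point.
    dists = [
        min(math.dist(a, r) for r in true_front)
        for a in approx_front
    ]
    return sum(d ** m for d in dists) ** (1.0 / m) / n
```

For example, a single point (3, 4) measured against the reference point (0, 0) has d₁ = 5, so GD equals 5 for m = 2, and a front that coincides with the reference points yields GD = 0.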

Table 1 The characteristics of LSMOPs
