Keywords

1 Introduction

Multi-objective optimization problems (MOP) exist in various real-world applications such as engineering, financial, and scientific applications. In MOP, each objective function is often in a trade-off relationship, and in such a case, it is impossible to obtain one optimal solution. Therefore, the purpose of MOP is to get a set of solutions called Pareto-optimal (or non-dominated) solutions. Multi-objective optimization requires not only convergence but also the diversity of solutions. In recent years, many kinds of multi-objective evolutionary algorithms (MOEA) have been proposed and shown to demonstrate good performance in real-world optimization problems.

In MOEA, it is known that the performance is changed in different crossover operators. Each crossover operator has some parameters. For example, Simulated Binary crossover(SBX) [1] has a parameter \(\eta \), and Parent Centric crossover (PCX) [10] has two parameters \(\eta \) and \(\zeta \). When we run MOEA, we have to properly determine the values of these parameters in advance. However, in most cases, we use the default values set in each MOEA as they are. Table 1 shows the default crossover operator and the default value of the crossover operator in some major MOEA. The default crossover operator as shown in Table 1 is almost SBX, and \(\eta \) is set to between 15 and 30. It is known that the optimal parameter value is changed by the optimization problem [7, 15]. In [7], the effect of Differential Evolution(DE) operator [13] is investigated with 4 benchmark problems. However, the effect of various crossover operators has not yet been investigated.

In this study, we consider 5 crossover operators such as SBX, DE, simplex crossover (SPX) [14], PCX, and unimodal normal distribution crossover (UNDX) [8] as well as 10 benchmark problems to widely investigate the effects of crossover operators. For the number of parameters in each crossover operator, there is one in SBX and SPX, and two in DE, PCX, and UNDX. The objective of this study is to investigate the effect of changing parameters of each crossover operator. NSGA-II is used as a MOEA. DTLZ2,3,4 [3], WFG1,2,6,8,9 [9] and ZDT1,4 [17] were used as benchmark problems.

Table 1. Default values of crossover operator in MOEA

2 Background

2.1 Related Work

Yuan et al. compared the performance when changing parameter values in the DE operator using several test problems. In Yuan’s paper, cr is fixed to 0.1, and the performance is compared when F is varied. The performance of F is good at about 0.5 [15]. Besides, F is fixed to 0.5, and the performance is compared when cr is varied. The performance of cr is good at about 0.1. Hadka et al. examined the correlation of various parameters in the Borg MOEA, which has adaptive operator selection (AOS). However, in Hadka’s paper, performance on each parameter value of crossover operator has not been reported [6].

Thus, a comparison of performance when fixing one of the parameter values in DE and a comparison of correlation with various parameters in AOS has already been done. However, the case of not fixing one of the parameter values and the case of changing to various parameter values in crossover operators excluding DE has not been compared yet.

In this study, we compare the performance when various parameter values are changed variously in SBX, DE, SPX, PCX, and UNDX. Although DE, PCX, and UNDX each have 2 parameters, in this study, we examine the performance without fixing one of these parameters. In addition, to gain insight into the parameters of each operator for each property of the problem, we used 10 kinds of benchmark problems with different features.

2.2 Crossover Operator

In MOEA, offspring solutions are generated from parent solutions by crossover and mutation in each generation. Crossover and mutation operators are different depending on whether the design variables are handled as binary or real numbers. In this research, we focus on cases where design variables are treated as real numbers. Five crossover operators are used in this research.

SBX imitates one point crossover of a binary number with a real number. The algorithm of SBX is shown below.

$$\begin{aligned} \beta (u)= & {} \left\{ \begin{array}{l} (2u)^{ \frac{1}{\eta }}, \quad \quad if \quad u(0,1) \le 0.5\\ \frac{1}{2(1-u)^{\frac{1}{\eta +1}}}, else \end{array} \right. \end{aligned}$$
(1)
$$\begin{aligned} y^{(1)}= & {} \frac{1}{2}\{(1-\beta )x^{(1)}+(1+\beta )x^{(2)}\} .\end{aligned}$$
(2)
$$\begin{aligned} y^{(2)}= & {} \frac{1}{2}\{(1+\beta )x^{(1)}+(1-\beta )x^{(2)}\} . \end{aligned}$$
(3)

where \(x^{(1)},x^{(2)}\) denote parent solutions, and \(y^{(1)},y^{(2)}\) denote offspring solutions. \(\beta \) denotes a distribution function, and \(\eta \) denotes a SBX parameter.

DE is a crossover operator featuring stochastic linear search. DE randomly selects 3 solutions from the parent population and calculates a vector from the first 2 solutions. It then generates an offspring solution from at last a parent solution to the vector direction. The algorithm of DE is shown below.

$$\begin{aligned} x_i^\prime= & {} x_{p1} + F(x_{p2}-x_{p3}) .\end{aligned}$$
(4)
$$\begin{aligned} y_{i,j}= & {} \left\{ \begin{array}{l} x_{j,i}^\prime \,,~if \quad u(0,1) \le cr\\ x_{j,i} \,,~else \end{array} \right. \end{aligned}$$
(5)

Here, \(\{x_{p1}, x_{p2}, x_{p3}\}\) denote parent solutions, y denotes offspring solution, and j denotes a design variable. Terms cr and F denote DE parameters.

SPX is a crossover operator that generates a child solution from the part that extends past the range surrounded by the centroid. These are parent individuals of multiple parent solutions. The SPX algorithm wich three parents is shown below.

$$\begin{aligned} O= & {} \frac{1}{3} \sum _{i=1}^{3} x^{(i)}. \end{aligned}$$
(6)
$$\begin{aligned} r_i= & {} u^{\frac{1}{i}},~i = 1, 2, 3.\end{aligned}$$
(7)
$$\begin{aligned} v^{(i)}= & {} O + \epsilon (x^{(i)} - O).\end{aligned}$$
(8)
$$\begin{aligned} c^{(i)}= & {} \left\{ \begin{array}{l} 0,~i = 1\\ r_{i-1}(y^{(i-1)}-v^{(i)}+c^{(i-1)}),~i=2 \end{array} \right. \end{aligned}$$
(9)
$$\begin{aligned} y= & {} v^{(3)} + c^{(3)}. \end{aligned}$$
(10)

Here, \(\{x^{i}, i=1,2,3\}\) denotes parent solutions, y denotes offspring solutions, and O denotes centroid parent solutions. Term \(\epsilon \) denotes the SPX parameter.

PCX is a crossover operator that generates offspring solutions around the parent solution via the centroid of multiple parent solutions. The algorithm of PCX with three parents is shown below.

$$\begin{aligned} O= & {} \frac{1}{3} \sum _{i=1}^{3} x^{(i)} .\end{aligned}$$
(11)
$$\begin{aligned} d^{(p)}= & {} x^{(p)} - O \quad p = 1, 2 \; or\; 3 .\end{aligned}$$
(12)
$$\begin{aligned} \overline{D}= & {} \frac{1}{2}\sum _{i=1,i \ne p}^{3} \frac{\sqrt{d(x^{(i)})}}{||d||} .\end{aligned}$$
(13)
$$\begin{aligned} y= & {} x^{(p)} + \omega _\zeta |d^{(p)}| + \sum _{i=1,i \ne p}^{3} \omega _\eta \overline{D}e^{(i)} . \end{aligned}$$
(14)

Here, \(\{x^{i}, i=1,2,3\}\) denotes parent solutions, y denotes offspring solution, and O denotes the centroid of the parent solutions. Term \(e^{(i)}\) is the orthogonal basis vector defined by \(d^{(p)}\), and D denotes the distance of the perpendicular vectors of the remaining 2 individuals and the \(d^{(p)}\). Terms \(\eta \) and \(\zeta \) are parameters of PCX. Term \(\omega _\eta \) denotes a random number generated from a Gaussian distribution with mean 0, and variance \(\sigma _\eta ^2\). Term \(\omega _\zeta \) denotes a random number generated from a Gaussian distribution with mean 0 and variance \(\sigma _\zeta ^2\).

UNDX is a crossover operator creating a centroid from parent solutions. It generates offspring solutions using a normal distribution around the centroid. The algorithm of UNDX with three parents is shown below.

$$\begin{aligned} O= & {} \sum _{i=1}^{2} x^{(i)} .\end{aligned}$$
(15)
$$\begin{aligned} d= & {} x^{(2)} - x^{(1)} .\end{aligned}$$
(16)
$$\begin{aligned} D= & {} \sqrt{|x^{(3)}-x^{(1)}|^2 - \frac{[d(|x^{(3)}-x^{(1)})]^2}{|d|^2}} .\end{aligned}$$
(17)
$$\begin{aligned} y= & {} O + \sigma _\zeta d + D \sum _{i=1}^{n-1} \sigma _{\eta _i} e_i . \end{aligned}$$
(18)
$$\begin{aligned} \sigma _\zeta\sim & {} N(0, \zeta ^2), \quad \quad \sigma _\eta \sim N(0, \eta ^2) . \end{aligned}$$
(19)

Here, \(\{x^{i}, i=1,2,3\}\) denotes parent solutions, y denotes offspring solutions, and O denotes centroid of 2 parent solutions. Term \(e^{(i)}\) is the orthogonal basis vector defined by d. Terms \(\eta \) and \(\zeta \) denote parameters of UNDX.

3 Computational Condition

In this study, we use NSGA-II as MOEA. The crossover operator in NSGA-II is calculated for each of the 5 operators described above. We also use polynomial mutation for mutation operator. Table 2 shows the parameter values of each crossover operator in this study.

Table 2. Parameter values of crossover operators

Therefore, we adopt 10 benchmark problems with different properties (DTLZ2, 3, 4, WFG1, 2, 6, 8, 9, and ZDT1, 4). DTLZ and WFG problems can be scaled with any number of objectives and design variables. Here, we consider that the number of objectives in DTLZ and WFG is 3. The number of the objective functions in ZDT problems is set to 2. For DTLZ and WFG problems, the number of design variables is 12. In the ZDT1 problem, the number of design variables is 30, and in the ZDT4 problem, the number of design variables is 10. Therefore, none of the benchmark problems have a constraint function. Table 3 summarizes the properties of the benchmark problems from the view of separability of design variables, modality, shape of Pareto front, and whether each problem has bias or not. In all problems, both the population size and the number of generations are set to 100. Furthermore, the number of trials is 10.

Table 3. Summary of the problems of the test problems

To evaluate the performance in each pattern, the hypervolume (HV) [18] is used as the performance metric. The HV can measure both convergence and diversity of a optimal solution set and a higher value means better performance. The calculation formula of HV is shown below.

$$\begin{aligned} HV(P) = volume(\bigcup _{y \in S}[y_1,y_1^*]\times .....\times [y_m,y_m^*] ) . \end{aligned}$$
(20)

Here, m denotes the number of objectives, and S denotes the Pareto set. In this study, \(y^*\) is set to 1.5 times the maximum value of each objective function value in the true Pareto front except for DTLZ3. In DTLZ3, \(y^*\) is set to \(\{150,150,150\}\). Furthermore, to avoid changing the value for each problem, divide by the volume surrounded by the origin point and \(y^*\) to convert it to the proportion of the HV value surrounded by the origin point and \(y^*\). Pareto set in all search solutions is used for calculation of HV.

4 Results and Discussion

In all parameters mentioned above in Table 2, the trials are run using different 10 random seeds. The HV calculates the average of 10 trials and compares it to each parameter value. Figures 1, 2, 3, 4 and 5 show the average values of HV wich 10 trials in each crossover operator. A bluer color is preferred. In Figs. 1, 2, 3, 4 and 5, the HV value is converted from 0 to 1 for each problem, and in that range, each cell is colored in 50 hierarchies. The conversion formula is shown below.

$$\begin{aligned} HV_{new} = \frac{HV_{old} - min(HV_{old})}{max(HV_{old})-min(HV_{old})} . \end{aligned}$$
(21)

For each cell in Figs. 1, 2, 3, 4 and 5, each cell shows the proportion of the HV value surrounded by the origin point and \(y^*\).

Fig. 1.
figure 1

Result of SBX

Fig. 2.
figure 2

Result of SPX

Fig. 3.
figure 3

Result of DE

Fig. 4.
figure 4

Result of PCX

Fig. 5.
figure 5

Result of UNDX

Figure 1 shows the result of SBX - the number of parameters is only one. From Fig. 1, it is clear when the value of \(\eta \) is large, each cell is bluer, but there is little difference in the HV value for each parameter except DTLZ3 and ZDT4. We think that \(\eta \) does not change the performance in most test problems. In the case of DTLZ3 and ZDT4, there is a large difference in the \(\eta = 5\) case and other cases. Performance is not good with \(\eta = 5\).

Figure 2 shows the result of SPX. The number of parameters is only one. Figure 2 shows that the value of \(\epsilon = 3, 5, 7\), each cell is bluer. Thus, the performance is considered good with \(\epsilon = 3, 5, 7\) except WFG9. With WFG9, there is almost no difference in the HV value for each parameter, and the value of \(\epsilon \) does not change the performance in WFG9.

Figure 3 shows the result of DE - the number of parameters is two. In DE, PCX, and UNDX, there are two parameters: the horizontal axis and the vertical axis, and these are set as the respective parameters. The performance is compared with each problem. Therefore, in DE, PCX, and UNDX, the performance is compared to 10 figures. Thus, Fig. 3(a)–(j) show the result of DE in each test problem. From Fig. 3, it is understood when the value of cr and F are small, each cell is bluer in most test problems. However, for ZDT4, it is understood that each cell is bluer when the value of cr and F are small. Thus, in most test problems, the performance is good when the value of cr and F are small. For ZDT4, the performance is good when the value of cr and F are large.

Figure 4 shows the result of PCX, and Fig. 4(a)–(j) show the results in each problem. From Fig. 4, the color tone differs for each problem, and the optimal parameter value may be different depending on the problem in PCX. Especially for WFG2, 6, 8, 9, each cell is bluer when the value of \(\eta \) is large. A common characteristic of WFG2, 6, 8, 9 is that design variables are non-separable. Thus in PCX, if the design variables are non-separable, then the performance is good when the value of \(\eta \) in PCX is large.

Figure 5 shows the result of UNDX, and Fig. 5(a)–(j) show the results in each problem. Figure 5 shaws that when the value of \(\eta \) is large, each cell is bluer in most test problems. Thus, the performance is good when the value of \(\eta \) is large. However, in the case of DTLZ3 and WFG1, there is little difference in the HV value for each parameter. Thus, the value of \(\eta \) likely does not change the performance in DTLZ3 and WFG1. On the other hand, in the case of \(\zeta \), it becomes bluer from 0.3 to 0.5 in each problem. Thus, the performance is good when \(\zeta = 0.3, 0.5\).roblem. Thus, the performance is good when \(\zeta = 0.3, 0.5\).

5 Conclusion

In this study, we consider five crossover operators such as SBX, DE, SPX, PCX, and UNDX as well as 10 benchmark problems to widely investigate the effects of crossover operators. For the number of parameters in each crossover operator, there is one in SBX and SPX, and there are two in DE, PCX, and UNDX. The objective of this study is to investigate the effect of changing the parameters in each crossover operator. NSGA-II is used as a MOEA. DTLZ2,3,4 [3], WFG1,2,6,8,9 [9] and ZDT1,4 [17] is used as benchmark problems.

In SBX, a large value of \(\eta \) is preferred. In SPX, values of 3 to 7 are preferred for \(\epsilon \). However, there is little difference in the performance for \(\eta \) and \(\epsilon \). These results indicate that the effect of parameter variation is relatively small.

In DE, small values are preferred for both cr and F. In PCX, appropriate values of \(\eta \) and \(\zeta \) depend on the problem. When the design variable is non-separable, a large value is preferred for \(\eta \). In UNDX, a large value of \(\eta \) is preferred. For \(\zeta \), values about 0.3 to 0.5 are preferred. These results indicate that as for DE, PCX, and UNDX, there are optimum parameters in each benchmark problem. This also indicates that the choice of crossover operator is significantly important in MOEA for achieving good performance.

In future work, we will investigate not only the effect when the number of objectives is more than four but also the effect when the different mutation is considered. In addition, it is also necessary to consider other types of MOEA such as MOEA/D and IBEA.