
1 Introduction

Differential Evolution (DE) [1] is a gradient-free optimization algorithm with good search performance that is applicable to a wide variety of problems. DE is a relatively simple algorithm with the advantage of having only three control parameters: the mutation coefficient F, the crossover rate CR, and the population size N. However, precisely because there are so few parameters, search accuracy depends strongly on their values, and appropriate settings must be chosen for the problem at hand and the state of the search. To address this, adaptive variants such as SaDE [2], jDE [3], SHADE [4], and JADE [5, 6] have been proposed.

JADE adapts the parameters of DE to the environment efficiently by sampling them from a Cauchy or Gaussian distribution that is itself adapted during the search. Specifically, the location parameter of the distribution used to generate the mutation coefficient F and the mean of the distribution used to generate the crossover rate CR are adjusted according to the parameter values that succeeded in updating solutions, so that each parameter tunes itself automatically. This keeps the parameters appropriate and increases the accuracy of the search.

As with other DE variants, JADE pays little attention to searching outside the group of search points, so performance degrades if, after the search has progressed, the solution lies outside the group. In a preliminary experiment [7], we confirmed that the search can be improved by occasionally selecting \(F=1.5\), a value beyond the range normally used in DE that strengthens search outside the group. We therefore expect that adding out-group search to JADE can further improve its performance.

In this study, referring to the Nelder-Mead method [8], we propose an algorithm that adds out-group search to JADE while degrading the search speed as little as possible. If the distance between the best and worst search points exceeds half the end-to-end extent of the search point group, we judge that the group is being updated in a strongly biased direction, and we search outside the solution group. The rest of this paper is organized as follows. Section 2 explains JADE and the Nelder-Mead method, on which this study is based. Section 3 proposes JADE with out-group search. Section 4 discusses the results of numerical experiments with the proposed method, and Sect. 5 summarizes remaining issues.

2 Base Algorithm

2.1 JADE

JADE is one of the improved variants of DE; the mutation coefficient F and the crossover rate CR are sampled from probability distributions. The Cauchy distribution is used for the mutation coefficient and the Gaussian distribution for the crossover rate, and each distribution is adjusted based on the parameter values that succeeded in updating solutions in past generations. In JADE, parameters are generated for each individual, and “DE/current-to-pbest” is used as the mutation strategy. The generation of each parameter is described below.

For each generation, the mutation coefficient \(F_i\) of each individual \(x_i\) is generated according to the Cauchy distribution with location parameter \(\mu _F\) and scale parameter \(\sigma _F=0.1\) as follows:

$$\begin{aligned} F_i \sim C(\mu _F, \sigma _F) \end{aligned}$$

\(F_i\) is regenerated if \(F_i \le 0\) and truncated to 1 if \(F_i \ge 1\). The location parameter \(\mu _F\) is initialized to 0.5 and updated each generation as follows:

$$\begin{aligned} \mu _F = (1-c)\cdot \mu _F+c\cdot \frac{\sum _{F \in S_F}F^2}{\sum _{F \in S_F}F} \end{aligned}$$

c is a constant in (0, 1] and the recommended value is 0.1. \(S_F\) is the set of mutation coefficients that successfully updated a solution in that generation. The fraction in the update is the Lehmer mean of \(S_F\), which weights larger successful values more heavily than the arithmetic mean.
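
As an illustration (not the authors' code), the sampling and update rules for F might be sketched in Python as follows; the helper names and the use of NumPy's random generator are our own choices.

```python
import numpy as np

rng = np.random.default_rng()

def sample_F(mu_F, sigma_F=0.1):
    """Draw F from Cauchy(mu_F, sigma_F); redraw if F <= 0, truncate to 1 if F >= 1."""
    while True:
        F = mu_F + sigma_F * rng.standard_cauchy()
        if F > 0:
            return min(F, 1.0)

def update_mu_F(mu_F, S_F, c=0.1):
    """Shift mu_F toward the Lehmer mean of the successful F values."""
    if len(S_F) == 0:                 # no successful update this generation
        return mu_F
    S_F = np.asarray(S_F)
    return (1 - c) * mu_F + c * (np.sum(S_F**2) / np.sum(S_F))
```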

Similarly, for each generation, the crossover rate \(CR_i\) of each individual \(x_i\) is generated according to a Gaussian distribution with a mean of \(\mu _{CR}\) and a standard deviation of \(\sigma _{CR}=0.1\) as follows:

$$\begin{aligned} CR_i \sim N(\mu _{CR}, \sigma _{CR}^2) \end{aligned}$$

\(CR_i\) is truncated to the interval [0, 1]. The mean \(\mu _{CR}\) is initialized to 0.5 and updated each generation as follows:

$$\begin{aligned} \mu _{CR} = (1-c)\cdot \mu _{CR}+c\cdot \frac{1}{S_N}\sum _{CR \in S_{CR}}CR \end{aligned}$$

\(S_N\) is the number of solutions successfully updated in each generation, and \(S_{CR}\) is the set of crossover rates CR that succeeded in updating a solution.
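
Under the same assumptions, a matching sketch for CR:

```python
import numpy as np

rng = np.random.default_rng()

def sample_CR(mu_CR, sigma_CR=0.1):
    """Draw CR from N(mu_CR, sigma_CR^2), truncated to [0, 1]."""
    return float(np.clip(rng.normal(mu_CR, sigma_CR), 0.0, 1.0))

def update_mu_CR(mu_CR, S_CR, c=0.1):
    """Shift mu_CR toward the arithmetic mean of the successful CR values."""
    if len(S_CR) == 0:                # S_N = 0: keep the current mean
        return mu_CR
    return (1 - c) * mu_CR + c * (sum(S_CR) / len(S_CR))
```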

The following describes the mutation strategy “DE/current-to-pbest” used in JADE. In this strategy, the mutation vector \(\boldsymbol{v}_{i,g}\) for each individual \(x_{i,g}\) of each generation g is generated by:

$$\begin{aligned} \boldsymbol{v}_{i,g}=\boldsymbol{x}_{i,g}+F_i\cdot (\boldsymbol{x}_{best,g}^p-\boldsymbol{x}_{i,g})+F_i\cdot (\boldsymbol{x}_{r1,g}-\boldsymbol{x}_{r2,g}) \end{aligned}$$

\(F_i\) is the mutation coefficient of individual \(x_i\), and \(\boldsymbol{x}_{best, g}^p\) is an individual selected at random from the top 100p% of individuals. \(x_{r1, g}\) and \(x_{r2, g}\) are two distinct points selected at random from the search points other than \(x_i\). Here p is a constant in (0, 1), and the recommended value is 0.1. This strategy can also be used with an archive, in which case the mutation vector \(\boldsymbol{v}_{i,g}\) is generated by the following equation:

$$\begin{aligned} \boldsymbol{v}_{i,g}=\boldsymbol{x}_{i,g}+F_i\cdot (\boldsymbol{x}_{best,g}^p-\boldsymbol{x}_{i,g})+F_i\cdot (\boldsymbol{x}_{r1,g}-\boldsymbol{\tilde{x}}_{r2,g}) \end{aligned}$$

\(\boldsymbol{\tilde{x}}_{r2,g}\) is an individual selected at random from the union of the current population and the archive of past failed individuals (parents that were replaced by their offspring). The archive is initially empty, and failed individuals are added to it at the end of each generation.
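
A minimal sketch of this strategy with the archive, assuming the population is a NumPy array, fitness holds objective values for a minimization problem, and the archive is a list of vectors (the function and variable names are ours):

```python
import numpy as np

rng = np.random.default_rng()

def mutate_current_to_pbest(pop, fitness, i, F_i, archive, p=0.1):
    """DE/current-to-pbest/1 mutation with an external archive (a sketch)."""
    N = len(pop)
    # x_best^p: an individual chosen at random from the top 100p% of the population
    n_top = max(1, int(np.ceil(p * N)))
    top = np.argsort(fitness)[:n_top]
    x_pbest = pop[rng.choice(top)]
    # r1: a random individual from the population, distinct from i
    r1 = rng.choice([j for j in range(N) if j != i])
    # r2: a random individual from the union of population and archive
    union = list(pop) + list(archive)
    while True:
        k = int(rng.integers(len(union)))
        if k >= N or (k != i and k != r1):   # indices < N refer to the population
            break
    x_r2 = union[k]
    return pop[i] + F_i * (x_pbest - pop[i]) + F_i * (pop[r1] - x_r2)
```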

The next search point is generated by crossing over this mutation vector with the parent individual according to the crossover rate:

$$\begin{aligned} x_{i,g+1,j}= {\left\{ \begin{array}{ll} v_{i,g,j} & (w_j \le CR_i)\\ x_{i,g,j} & (\mathrm{otherwise}) \end{array}\right. } \end{aligned}$$

\(w_j\) is a uniform random number in [0, 1] drawn independently for each element, and \(x_{i,g,j}\) denotes the j-th element of the i-th individual in generation g.
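
The crossover can be sketched as follows; note that the classical DE formulation additionally forces at least one element to be taken from the mutant (via an index j_rand), which the formula above omits.

```python
import numpy as np

rng = np.random.default_rng()

def crossover(x_i, v_i, CR_i):
    """Binomial crossover: take element j from the mutant when w_j <= CR_i."""
    w = rng.random(len(x_i))            # one uniform draw per element
    return np.where(w <= CR_i, v_i, x_i)
```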

2.2 Nelder-Mead Method

The Nelder-Mead method is an optimization algorithm that does not use gradient information. Given \(D+1\) search points in D-dimensional space, it searches for the optimum by repeatedly applying reflection, expansion and contraction to them. The function-minimization algorithm of the Nelder-Mead method is shown below.

  • Step 0. Sort the \(D+1\) search points so that \(f(\boldsymbol{x}_1) \le f(\boldsymbol{x}_2) \le \dots \le f(\boldsymbol{x}_{D+1})\) holds for the objective function f. If the termination condition is satisfied, return \(\boldsymbol{x}_1\) as the solution; otherwise go to Step 1.

  • Step 1. Using the centroid \(\boldsymbol{x}_c\) of \(\boldsymbol{x}_1, \dots , \boldsymbol{x}_D\), determine the reflection point \(\boldsymbol{x}_{ref}\) of \(\boldsymbol{x}_{D+1}\) by the following equation:

    $$\begin{aligned} \boldsymbol{x}_{ref}=\boldsymbol{x}_c+\alpha (\boldsymbol{x}_c-\boldsymbol{x}_{D+1}) \end{aligned}$$
  • Step 2.

    • case 1 If \(f(\boldsymbol{x}_1) \le f(\boldsymbol{x}_{ref}) < f(\boldsymbol{x}_D)\), replace \(\boldsymbol{x}_{D+1}\) with \(\boldsymbol{x}_{ref}\) and go to Step 0.

    • case 2 If \(f(\boldsymbol{x}_{ref}) < f(\boldsymbol{x}_1)\), the expansion point, which is the point where the reflection point is further extended, is obtained as follows:

      $$\begin{aligned} \boldsymbol{x}_{exp}=\boldsymbol{x}_{c}+\gamma (\boldsymbol{x}_{ref}-\boldsymbol{x}_c) \end{aligned}$$

      If \(f(\boldsymbol{x}_{exp}) \le f(\boldsymbol{x}_{ref})\), replace \(\boldsymbol{x}_{D+1}\) with \(\boldsymbol{x}_{exp}\) and go to Step 0; otherwise replace \(\boldsymbol{x}_{D+1}\) with \(\boldsymbol{x}_{ref}\) and go to Step 0.

    • case 3 If \(f(\boldsymbol{x}_D) \le f(\boldsymbol{x}_{ref})\)

      • case 3–1 If \(f(\boldsymbol{x}_{ref}) < f(\boldsymbol{x}_{D+1})\), the contraction point is obtained as follows:

        $$\begin{aligned} \boldsymbol{x}_{con}=\boldsymbol{x}_c+\beta (\boldsymbol{x}_{ref}-\boldsymbol{x}_c) \end{aligned}$$

        Then go to Step 3.

      • case 3–2 Otherwise, the contraction point is obtained as follows:

        $$\begin{aligned} \boldsymbol{x}_{con}=\boldsymbol{x}_c+\beta (\boldsymbol{x}_{D+1}-\boldsymbol{x}_c) \end{aligned}$$

        Then go to Step 3.

  • Step 3. If \(f(\boldsymbol{x}_{con}) < \min \{f(\boldsymbol{x}_{ref}), f(\boldsymbol{x}_{D+1})\}\), replace \(\boldsymbol{x}_{D+1}\) with \(\boldsymbol{x}_{con}\) and go to Step 0; otherwise go to Step 4.

  • Step 4. Shrink all points \(\boldsymbol{x}_i\) (\(i = 2, \dots , D+1\)) toward \(\boldsymbol{x}_1\) as follows:

    $$\begin{aligned} \boldsymbol{x}_i=\boldsymbol{x}_1+\delta (\boldsymbol{x}_i-\boldsymbol{x}_1) \end{aligned}$$

    Then go to Step 0.
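
For reference, one iteration of Steps 0 through 4 can be sketched as follows. The coefficient values \(\alpha =1\), \(\gamma =2\), \(\beta =0.5\) and \(\delta =0.5\) are the conventional choices; this paper does not specify them.

```python
import numpy as np

def nelder_mead_step(simplex, f, alpha=1.0, gamma=2.0, beta=0.5, delta=0.5):
    """One Nelder-Mead iteration on a (D+1, D) simplex, minimizing f (a sketch)."""
    # Step 0: sort so that f(x_1) <= ... <= f(x_{D+1})
    simplex = simplex[np.argsort([f(x) for x in simplex])]
    x_best, x_2nd_worst, x_worst = simplex[0], simplex[-2], simplex[-1]
    x_c = simplex[:-1].mean(axis=0)                  # centroid of all but the worst
    # Step 1: reflection
    x_ref = x_c + alpha * (x_c - x_worst)
    if f(x_best) <= f(x_ref) < f(x_2nd_worst):       # case 1
        simplex[-1] = x_ref
    elif f(x_ref) < f(x_best):                       # case 2: expansion
        x_exp = x_c + gamma * (x_ref - x_c)
        simplex[-1] = x_exp if f(x_exp) <= f(x_ref) else x_ref
    else:                                            # case 3: contraction
        if f(x_ref) < f(x_worst):                    # case 3-1
            x_con = x_c + beta * (x_ref - x_c)
        else:                                        # case 3-2
            x_con = x_c + beta * (x_worst - x_c)
        if f(x_con) < min(f(x_ref), f(x_worst)):     # Step 3
            simplex[-1] = x_con
        else:                                        # Step 4: shrink toward x_1
            simplex[1:] = x_best + delta * (simplex[1:] - x_best)
    return simplex
```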

3 Proposed Method

In the proposed method, one of JADE's search points is made to search outside the group depending on the situation. This prevents the solution from being missed when the objective function has multiple local optima, and escape from a local optimum through out-group search can also be expected. Limiting the out-group search to a single point prevents the search speed from dropping due to an excess of out-group searches.

Concretely, if the distance between the search point with the best solution and the search point with the worst solution exceeds half of the end-to-end extent of the search point group, one search point evaluates the point obtained by extending the vector from the worst search point to the best search point by a factor of \(2^n\). The value of n is sampled from a geometric distribution whose success probability p is defined as \(1/G^{p_{update}}\), where \(p_{update}\) is the update rate of the solution and G is the generation number. \(p_{update}\) is updated in the same way as the JADE parameters, with an initial value of 0 (Fig. 1).

$$\begin{aligned} \begin{aligned} \mathrm{if}\ ||&\boldsymbol{x}_\mathrm{best}-\boldsymbol{x}_\mathrm{worst}|| > ||\boldsymbol{x}_\mathrm{max}-\boldsymbol{x}_\mathrm{min}||/2 \\ &n \sim \mathrm{Geometric}(1/G^{p_{update}}) \\ &\boldsymbol{x}_\mathrm{last}=\boldsymbol{x}_\mathrm{worst}+2^n\cdot (\boldsymbol{x}_\mathrm{best}-\boldsymbol{x}_\mathrm{worst}) \end{aligned} \end{aligned}$$
Fig. 1. Out-group search
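
A sketch of the out-group search in Fig. 1, assuming that \(\boldsymbol{x}_\mathrm{max}\) and \(\boldsymbol{x}_\mathrm{min}\) are the per-dimension extremes of the current search point group (our reading of "end to end") and using NumPy's geometric sampler, whose support starts at 1:

```python
import numpy as np

rng = np.random.default_rng()

def out_group_point(pop, fitness, G, p_update):
    """Return the out-group search point, or None if the trigger condition fails."""
    x_best = pop[np.argmin(fitness)]
    x_worst = pop[np.argmax(fitness)]
    # end-to-end extent of the group (per-dimension extremes; our interpretation)
    span = np.linalg.norm(pop.max(axis=0) - pop.min(axis=0))
    if np.linalg.norm(x_best - x_worst) > span / 2:
        n = rng.geometric(1.0 / G**p_update)   # success probability 1 / G^p_update
        return x_worst + 2.0**n * (x_best - x_worst)
    return None
```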

4 Numerical Experiments and Results

JADE and the proposed method (JADE+) are compared on the 16 benchmark functions shown in Table 1. In the comparison experiment, each function was run 1000 times for up to 1000 generations with a population of 10 search points. Figures 2 through 24 plot the best solution per generation averaged over the 1000 runs, with the evaluation value on the vertical axis and the generation number on the horizontal axis. Figures 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 and 17 show the results for dimension 2, and Figs. 18, 19, 20, 21, 22, 23 and 24 show the results for dimension 10. Since these are minimization problems, the lower the curve, the better the solution. Tables 2 and 3 summarize the mean ± standard deviation of the final solutions for each function; Table 2 covers dimension 2 and Table 3 dimension 10. In Tables 2 and 3, the better result for each function is shown in bold.

Table 1. Test functions used in the experiment
Fig. 2. F1 (D = 2)
Fig. 3. F2 (D = 2)
Fig. 4. F3 (D = 2)
Fig. 5. F4 (D = 2)
Fig. 6. F5 (D = 2)
Fig. 7. F6 (D = 2)
Fig. 8. F7 (D = 2)
Fig. 9. F8 (D = 2)
Fig. 10. F9 (D = 2)
Fig. 11. F10 (D = 2)
Fig. 12. F11 (D = 2)
Fig. 13. F12 (D = 2)
Fig. 14. F13 (D = 2)
Fig. 15. F14 (D = 2)
Fig. 16. F15 (D = 2)
Fig. 17. F16 (D = 2)
Fig. 18. F1 (D = 10)
Fig. 19. F2 (D = 10)
Fig. 20. F6 (D = 10)
Fig. 21. F7 (D = 10)
Fig. 22. F14 (D = 10)
Fig. 23. F15 (D = 10)
Fig. 24. F16 (D = 10)

We begin with the two-dimensional case, Figs. 2 through 17. The graphs show that the proposed method reaches a better solution than JADE in Fig. 7 (Rastrigin function), Fig. 8 (Ackley function), Fig. 9 (Levi N. 13 function), Fig. 11 (Beale function), Fig. 12 (Goldstein-Price function), Fig. 13 (Schaffer N. 2 function), Fig. 14 (Five-well potential function), Fig. 16 (Xin-She Yang function), and Fig. 17 (Styblinski-Tang function), while the convergence speed is almost the same. This indicates that performing the out-group search at a single point raises accuracy without lowering the search speed. In Fig. 9 (Levi N. 13 function) and Fig. 13 (Schaffer N. 2 function), the proposed method escapes from a point where it had almost converged and reaches a better solution, suggesting that the out-group search works well for escaping local optima. However, in Fig. 2 (Sphere function), Fig. 3 (Rosenbrock function) and Fig. 5 (Matyas function), convergence is slower than JADE, indicating that unnecessary out-group searches reduce the search speed on simple unimodal functions.

Table 2. Comparison of mean and standard deviation (D = 2)
Table 3. Comparison of mean and standard deviation (D = 10)

We then discuss the 10-dimensional case, Figs. 18 through 24. First, as shown in Fig. 19, the result for F2 was improved, while the results for the other functions did not differ much from those of the conventional method. This is the intended behavior: the frequency of out-group searches does not increase with the dimension, so performance in high dimensions is not degraded. Moreover, since the region near the optimum of F2 lies along a narrow valley, the improvement shows that adding out-group search as in the proposed method is beneficial in such cases.

Table 2 shows that the proposed method improved the accuracy of the solution on 11 of the 16 functions, giving better values for both mean and standard deviation. Since all of the functions whose accuracy improved are multimodal, the proposed method is more stable and finds better solutions on multimodal functions, demonstrating that the out-group search is effective for them. On the Sphere, Rosenbrock and Matyas functions, which are unimodal, the accuracy is inferior to JADE because the out-group search lowers the search speed; there appears to be considerable room for improving the criterion that decides whether to perform the out-group search.

Next, we discuss Table 3. In the proposed method only one of the search points searches outside the group, so as the dimension increases the effect of the out-group search diminishes and the difference between JADE and JADE+ disappears. Indeed, Table 3 shows similar results for JADE and JADE+.

5 Conclusion

In this study, we proposed a method that adds out-group search, inspired by the Nelder-Mead method, to JADE, an adaptive differential evolution algorithm. A comparison between the proposed and conventional methods on 16 benchmark functions showed improved solution accuracy on many of them. In particular, the out-group search works effectively on multimodal functions, and escape from local optima was also observed. However, since the search speed was inferior to JADE on some unimodal functions, how to decide whether to search outside the group also needs further study.