1 Introduction

Optimization is an important technique in real life and is widely used in engineering, agriculture and industry [1]. In general, an unconstrained optimization problem is formulated as a single-objective global optimization problem, which can be described as follows [2]. We search for a vector of decision variables \(\vec{x} = \left\{ {x_{1} ,\;x_{2} ,...,x_{D} } \right\}\) that satisfies the variable bounds, \(x_{j,\text{low}} \le x_{j} \le x_{j,\text{upp}}\), and minimizes or maximizes a fitness function \(f\left( {\vec{x}} \right)\), where D denotes the dimension of the solution space and xj,low and xj,upp are the lower and upper bounds, respectively, j = 1,2,…,D. In practical problems, the decision variables can be discrete, continuous or hybrid, and the objective function can be uni-modal, multi-modal, hybrid or a composition. Therefore, single-objective global optimization problems have various characteristics and complicated mathematical properties, which make it difficult for algorithms to obtain the desired results [3].

Because exhaustive search-based methods are time-consuming when solving complex optimization problems, evolutionary algorithms (EAs) have been widely used in recent years [4]. EAs are population-based search methods that have proved to be a promising approach for finding the best solution in an acceptable time. There are several reasons for this: (1) they are convenient for parallel computation; (2) the framework of the algorithms is simple; and (3) the algorithms have few parameters. Nevertheless, because EAs are stochastic algorithms, it cannot be guaranteed that they will obtain the desired results. In addition, the performance of EAs is sensitive to the parameter settings and the selection of the mutation strategy.

Differential evolution (DE), a classic evolutionary algorithm, has been applied to engineering design, neural networks and artificial intelligence [5]. However, when facing complex optimization problems, it is difficult for DE to obtain satisfactory results. One reason is that the performance of the DE algorithm depends on the mathematical properties of the optimization problem.

As a consequence, many studies have utilized the standard DE algorithm or DE variants to solve optimization problems [6]. Improvements to the DE algorithm mainly focus on self-adaptive control parameters, multi-mutation strategies (using more than a single mutation strategy) and ensembles (adaptive control parameters combined with mutation strategies), whereas how to combine these mutation strategies and control parameters in a more appropriate way remains a difficult task. The mutation strategy of DE greatly affects the performance of the algorithm. In most DE algorithms, the mutation strategy is selected based on experience. Sometimes, criteria are used to choose a suitable mutation strategy, such as the best-performing strategy in the previous generation, roulette selection or machine learning. However, all these methods are concerned only with the optimization algorithm; the characteristics of the optimization problem are not considered in the design of the algorithm. Fitness landscape analysis can be used to analyze an optimization problem and judge its complexity, but the calculation of fitness landscape metrics is complex. Therefore, selecting the mutation strategy in DE according to the fitness landscape is rare, even though it may improve the performance of an algorithm. The algorithms that do exist have some limitations: (1) the fitness landscape is used to analyze static optimization problems, whereas optimization problems are dynamic during the evolution process; (2) the time consumed in calculating the fitness landscape metrics is very high; and (3) a pre-training and testing mechanism is used, which means the fitness landscape is only used to judge the considered test problems, and the improved algorithm may not be suitable for other optimization problems.
It may therefore be a meaningful study to use fitness landscape information to guide the selection of the mutation strategy for every problem at each generation.

In this paper, an improved DE algorithm is proposed in which a problem's local fitness landscape is considered, so that more guidance is given for selecting the favored mutation strategy during the evolutionary process. A novel adaptive control parameter method is adopted in the algorithm. Furthermore, a linear population size reduction mechanism is employed, in which the population size decreases with the number of generations. To evaluate the performance of the proposed algorithm, a total of 29 benchmark test functions derived from the CEC2017 competition are used in the experiments. These benchmark test functions cover several different kinds of mathematical properties. The experimental results indicate that the proposed algorithm is superior to five other classic DE algorithms.

The rest of the paper is organized as follows. In Sect. 2, we review recent advances in DE algorithms. Then, the LFLDE algorithm is described in Sect. 3. In Sect. 4, the experimental results are analyzed and discussed. Some conclusions and future work are given in Sect. 5.

2 Related work

In this section, the standard DE algorithm and some improved DE variants will be discussed.

2.1 DE algorithm

The standard DE, conceived by Storn and Price in 1997, consists of mutation, crossover and selection operators. DE has few control parameters and good performance, which has made it a popular EA. Assume that a population is made up of NP individuals, each of which is a D-dimensional vector. The initial population can be expressed as [7]:

$$ x_{i,G} = \left\{ {x_{i,G}^{1} ,x_{i,G}^{2} ,...,x_{i,G}^{D} } \right\},\quad i = 1,...,NP $$
(1)

where G represents the generation. Each dimension of the individual is constrained by:

$$ x_{\min } = \left\{ {x_{\min }^{1} ,...,x_{\min }^{D} } \right\}\;and\;x_{\max } = \left\{ {x_{\max }^{1} ,...,x_{\max }^{D} } \right\} $$
(2)

Generally, the initial value of the jth component of the ith individual is created by:

$$ x_{i,0}^{j} = x_{\min }^{j} + {\text{rand}}\left( {0,\;1} \right) * \left( {x_{\max }^{j} - x_{\min }^{j} } \right) $$
(3)

where rand(0, 1) denotes a uniformly distributed random variable within the range [0,1].
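The initialization in Eqs. (1)–(3) can be sketched as follows. This is an illustrative NumPy sketch rather than the authors' implementation; the function name and the fixed seed are our own.

```python
import numpy as np

def initialize_population(pop_size, dim, x_min, x_max, seed=0):
    """Create pop_size individuals uniformly at random within the bounds (Eq. 3)."""
    rng = np.random.default_rng(seed)
    # x_{i,0}^j = x_min^j + rand(0, 1) * (x_max^j - x_min^j), one row per individual
    return x_min + rng.random((pop_size, dim)) * (x_max - x_min)
```

For example, `initialize_population(50, 10, -100.0, 100.0)` produces a 50 × 10 matrix whose entries all lie inside the CEC2017 search domain [−100, 100].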

2.1.1 Mutation operation

After initialization, the mutation operator is used to generate the mutant vectors vi,G with respect to each individual xi,G. There are six most frequently used mutation strategies [8]:

DE/rand/1

$$ v_{i,G} = x_{r1,G} + F * \left( {x_{r2,G} - x_{r3,G} } \right) $$
(4)

DE/rand/2

$$ v_{i,G} = x_{r1,G} + F * \left( {x_{r2,G} - x_{r3,G} } \right) + F * \left( {x_{r4,G} - x_{r5,G} } \right) $$
(5)

DE/best/1

$$ v_{i,G} = x_{{{\text{best}},G}} + F * \left( {x_{r1,G} - x_{r2,G} } \right) $$
(6)

DE/best/2

$$ v_{i,G} = x_{{{\text{best}},G}} + F * \left( {x_{r1,G} - x_{r2,G} } \right) + F * \left( {x_{r3,G} - x_{r4,G} } \right) $$
(7)

DE/current-to-rand/1

$$ v_{i,G} = x_{i,G} + F * \left( {x_{r1,G} - x_{i,G} } \right) + F * \left( {x_{r2,G} - x_{r3,G} } \right) $$
(8)

DE/current-to-best/1

$$ v_{i,G} = x_{i,G} + F * \left( {x_{{{\text{best,}}G}} - x_{i,G} } \right) + F * \left( {x_{r1,G} - x_{r2,G} } \right) $$
(9)

The indices r1, r2, r3, r4 and r5 are mutually exclusive integers randomly chosen from the current population, all of which differ from the index i. The scale factor F is a positive control parameter. xbest,G is the globally best individual at generation G.
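Two of the strategies above, DE/rand/1 (Eq. 4) and DE/best/1 (Eq. 6), can be sketched as follows. This is a hypothetical NumPy illustration, not the original code; a minimization problem and 0-based indices are assumed.

```python
import numpy as np

def de_rand_1(pop, i, F, rng):
    """DE/rand/1 mutation (Eq. 4): three distinct random indices, all != i."""
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])

def de_best_1(pop, fitness, i, F, rng):
    """DE/best/1 mutation (Eq. 6): perturb the current best individual."""
    best = pop[np.argmin(fitness)]  # best = lowest fitness (minimization)
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2 = rng.choice(candidates, size=2, replace=False)
    return best + F * (pop[r1] - pop[r2])
```

The other four strategies differ only in the base vector and the number of difference terms, so they follow the same pattern.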

2.1.2 Crossover operation

After the mutation step, the crossover operator is applied to generate the trial vector ui,G from vi,G and xi,G, which can be expressed by the formula below:

$$ u_{i,G}^{j} = \left\{ {\begin{array}{*{20}l} {v_{i,G}^{j} ,} \hfill & {{\text{if}}\;\left( {{\text{rand}}_{j} \le {\text{CR}}\;{\text{or}}\;j = n_{j} } \right)} \hfill \\ {x_{i,G}^{j} ,} \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right.\;j = 1,2,...,D $$
(10)

where randj is a random number generated within the range [0,1]. The crossover rate CR is a preset constant within the range from 0 to 1, and nj is a random integer chosen from {1, 2,…, D}, which guarantees that the trial vector inherits at least one component from the mutant vector.
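The binomial crossover of Eq. (10) can be sketched as follows (a NumPy illustration with 0-based indices; the function name is ours):

```python
import numpy as np

def binomial_crossover(x, v, CR, rng):
    """Binomial crossover (Eq. 10): take v_j with probability CR; the
    randomly chosen index j_rand always comes from v, so u differs from
    x in at least one component."""
    dim = len(x)
    j_rand = rng.integers(dim)          # j_rand in {0, ..., D-1}
    mask = rng.random(dim) <= CR
    mask[j_rand] = True
    return np.where(mask, v, x)
```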

2.1.3 Selection operation

After the crossover operation, the values of the trial vectors should be checked to see whether they exceed the bound constraints; components that do are reinitialized within the predefined range. Next, a selection operation is executed. If the trial vector has an objective function value less than or equal to that of the corresponding target vector (for a minimization problem), the trial vector is chosen for the population of the next generation. Otherwise, the target vector is retained in the population for the next generation. The greedy selection operation is performed as follows:

$$ x_{i,G + 1} = \left\{ {\begin{array}{*{20}l} {u_{i,G} ,} \hfill & {{\text{if}}\;f\left( {u_{i,G} } \right) \le f\left( {x_{i,G} } \right)} \hfill \\ {x_{i,G} ,} \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right. $$
(11)
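The greedy selection of Eq. (11) can be sketched as follows, assuming a minimization problem (an illustrative NumPy sketch, not the authors' code):

```python
import numpy as np

def greedy_selection(pop, trials, f):
    """Greedy selection (Eq. 11): keep each trial vector when it is
    no worse than its target (minimization)."""
    f_pop = np.apply_along_axis(f, 1, pop)
    f_tri = np.apply_along_axis(f, 1, trials)
    better = f_tri <= f_pop
    new_pop = np.where(better[:, None], trials, pop)
    return new_pop, np.where(better, f_tri, f_pop)
```

Returning the surviving fitness values alongside the population avoids re-evaluating the objective function in the next generation.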

2.2 DE variant algorithms

Recently, many DE algorithms for dealing with single-objective global optimization problems have been proposed; some recently proposed DE variants are discussed in this section. Differential evolution is a population-based algorithm suitable for numerical optimization; its excellent performance has been confirmed, and it has been applied in various fields [9]. The classic DE has three vital control parameters: the scaling factor F, the crossover rate CR and the population size NP. The performance of DE largely depends on these three control parameters. Moreover, the three control parameters are fixed and very sensitive to different optimization problems; they must be adjusted by the user when solving different problems, and it can be a difficult task to find appropriate values for them [10]. To address this problem, a large number of adaptive and self-adaptive DE variants have been reported. Islam et al. presented an adaptive DE (ADE), in which the parameters F and CR are updated according to their successful values [11]. A self-adaptive method (jDE) for updating the F and CR values of the DE algorithm was presented by Brest et al. In jDE, each individual is assigned its own F and CR values. Each parameter is randomly regenerated with some probability, and the newly generated parameters are preserved for the next generation only when the corresponding trial individual is selected into the population [12]. Zhang et al. proposed an adaptive DE algorithm (JADE) with an external archive and the "DE/current-to-pbest" mutation strategy. In addition, the control parameters CR and F of each individual are automatically updated according to the individuals that survive to the next generation [13]. An improved version of JADE called SHADE, which uses a historical record of successful parameter settings, was proposed by Tanabe and Fukunaga [14].
Soon afterwards, Tanabe and Fukunaga proposed L-SHADE, in which a linear population size reduction (LPSR) strategy is applied to SHADE: the population size decreases linearly as the number of iterations increases. L-SHADE exhibited good performance in comparison with state-of-the-art DE algorithms on the CEC2014 benchmark test function set [15]. Subsequently, many improved versions of L-SHADE have been placed at the top ranks of CEC competitions, such as jSO [16], LSHADE44 [17], LSHADE-EpSin [18] and LSHADE-RSP [19].

To better tackle various optimization problems, improvements to DE focus not only on the three main control parameters but also on different kinds of mutation strategy selection schemes. Here, we review some of them.

A self-adaptive DE (SaDE) was proposed by Qin et al. In SaDE, four mutation strategies are initially stored in a strategy pool, and a strategy is randomly selected for every individual at each generation. Later, promising strategies are picked for the strategy pool according to the success rate of the trial vectors during the evolution [20]. Wang et al. proposed a composite DE (CoDE), in which three mutation operators are adopted. In each generation, the new trial vector is the best of the three vectors generated simultaneously by the three mutation strategies. The obtained results indicated that CoDE is an effective optimization algorithm for solving the CEC2005 test functions [21]. A self-adaptive multi-operator-based differential evolution (SAMO-DE) was conceived by Elsayed et al. In SAMO-DE, each search operator has its own sub-population. In each generation, the operator selection relies on the reproductive success of the search operators. The experimental results showed that SAMO-DE was superior to other DE algorithms [22]. Mallipeddi et al. presented an ensemble differential evolution algorithm (EPSDE). In EPSDE, strategy pools are designed to store the mutation operators and the control parameters; moreover, it establishes a relationship between mutation strategies and control parameters during the evolutionary process [23]. None of the above-mentioned algorithms uses fitness landscape metrics in the mutation strategy selection phase, so fitness landscape analysis should be taken into account to guide algorithm design.

3 Local fitness landscape-based adaptive mutation strategy differential evolution

This section presents the framework of the LFLDE algorithm, in which a mutation strategy selection method is introduced and a novel adaptive strategy for the control parameters is adopted. In addition, linear population size reduction is used.

3.1 Local fitness landscape

The fitness landscape has been proven effective for analyzing evolutionary algorithms [24]. A fitness landscape is composed of the fitness function values of all the individuals in the search space. Usually, a fitness landscape is a static description of a problem, and the difficulty of a problem is often measured according to the characteristics of the landscape [25]. Several landscape metrics have been developed for analyzing and evaluating the different characteristics of problems, such as the fitness distance correlation [26], the information entropy measure [27], the fitness cloud and the length scale [24]. In this paper, a local fitness landscape method is utilized. First, for a population (x1, x2,…, xμ), find the best individual in the whole population and tag it as xbest. Then calculate the Euclidean distance between each individual xi (i = 1,…,µ) and xbest, which can be expressed as:

$$ d_{i} = \left( {\sum\limits_{j = 1}^{n} {\left( {x_{{i_{j} }} - x_{{best_{j} }} } \right)^{2} } } \right)^{\frac{1}{2}} $$
(12)

Second, sort the individuals by distance, which yields the ascending order k1, k2,…, kμ. After that, calculate the local fitness landscape metric, denoted by χ. Starting from an initial value of 0, χ is incremented by 1 whenever \(f_{{k_{i + 1} }} \le f_{{k_{i} }}\). Finally, normalize the value as:

$$\varphi = \frac{\chi }{\mu } $$
(13)

Because the local fitness landscape is actually an ambiguous concept, the landscape feature φ can be regarded as the roughness of the observed fitness landscape and used to judge the complexity of the optimization problem. If φ = 0, the local fitness landscape resembles a uni-modal landscape, meaning that the optimization problem is easy to solve. On the contrary, if φ = 1, the local fitness landscape is very rough and the optimal solution of the problem is hard to find [28]. It is therefore beneficial to choose different mutation strategies according to the landscape feature. Because optimization is a dynamic process, the landscape feature φ fluctuates from generation to generation. To smooth these fluctuations, we use the dynamic mean value as the landscape feature at each generation, which is calculated as follows:

$$ DM_{\varphi } \left( G \right) = \frac{{\sum\nolimits_{g = 1}^{G} {\varphi \left( g \right)} }}{G} $$
(14)

where DMφ is the mean feature value, and φ(1), φ(2),…, φ(G) are the single values obtained at each generation.
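The landscape metric of Eqs. (12)–(14) can be sketched as follows. This is our own illustrative reading of the procedure, assuming minimization; function names are hypothetical.

```python
import numpy as np

def landscape_roughness(pop, fitness):
    """Local landscape roughness phi (Eqs. 12-13): sort individuals by
    Euclidean distance to the best one, then count how often fitness
    fails to increase along that ordering."""
    best = pop[np.argmin(fitness)]
    dist = np.sqrt(((pop - best) ** 2).sum(axis=1))   # Eq. (12)
    f_sorted = fitness[np.argsort(dist)]              # order k_1, ..., k_mu
    chi = np.count_nonzero(f_sorted[1:] <= f_sorted[:-1])
    return chi / len(pop)                             # Eq. (13)

def dynamic_mean(phi_history):
    """Dynamic mean DM_phi(G) (Eq. 14): running average of phi values."""
    return sum(phi_history) / len(phi_history)
```

On a perfectly uni-modal sample (fitness grows monotonically with distance from the best point) the function returns 0; the rougher the sample, the closer φ gets to 1.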

3.2 Parameter adaptation

Numerous experiments have shown that the performance of the DE algorithm depends on the values of the control parameters (F and CR) during the evolutionary process. In general, an adaptation mechanism is used to generate their values. For each individual in the population, the control parameters CRi and Fi are generated using a Gaussian distribution and a Cauchy distribution, respectively [13]:

$$ {\text{CR}}_{i,G} = {\text{rand}}n_{i} \left( {\mu_{CR} ,\;0.1} \right) $$
(15)
$$ F_{i,G} = {\text{rand}}c_{i} \left( {\mu_{F} ,\;0.1} \right) $$
(16)

where \(\mu_{F}\) and \(\mu_{CR}\) represent the mean values. The standard deviations are preset to the fixed constant 0.1, and the means \(\mu_{F}\) and \(\mu_{CR}\) are initially set to 0.5 and then updated in each generation by the following formulas:

$$ \mu_{{{\text{CR}}}} = 0.95 - 0.05 * {\text{rand}} $$
(17)
$$ \mu_{F} = 0.65 - 0.15 * {\text{rand}} $$
(18)

In addition, an adaptive p-value method is adopted. The p value for the current-to-pbest/1 mutation varies linearly with the number of function evaluations and is updated by the following formula:

$$ p = \left( {\frac{{p_{\max } - p_{\min } }}{{{\text{FES}}_{\max } }}} \right) \cdot {\text{FES}} + p_{\min } $$
(19)

where FES is the current number of objective function evaluations, FESmax is the maximum number of objective function evaluations, and pmax and pmin are predefined constants.
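The parameter sampling of Eqs. (15)–(16) and the p schedule of Eq. (19) can be sketched as follows. Clipping CR and F to [0, 1] is a common convention in JADE-style DE and is our assumption here, as are the endpoint values p_min = 0.05 and p_max = 0.25 used in the defaults.

```python
import numpy as np

def sample_parameters(mu_cr, mu_f, rng):
    """Sample CR_i ~ N(mu_CR, 0.1) and F_i ~ Cauchy(mu_F, 0.1)
    (Eqs. 15-16); clipping to [0, 1] is our assumption."""
    cr = float(np.clip(rng.normal(mu_cr, 0.1), 0.0, 1.0))
    # Cauchy sample via the inverse CDF: mu_F + scale * tan(pi * (u - 1/2))
    f = float(np.clip(mu_f + 0.1 * np.tan(np.pi * (rng.random() - 0.5)), 0.0, 1.0))
    return cr, f

def linear_p(fes, fes_max, p_min=0.05, p_max=0.25):
    """Linear p schedule for current-to-pbest/1 (Eq. 19): moves from
    p_min at FES = 0 toward p_max at FES = FES_max. Endpoints here are
    illustrative, not the paper's settings."""
    return (p_max - p_min) / fes_max * fes + p_min
```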

3.3 Population updating method

To increase the diversity of the population in the early phase of the evolutionary process while reducing the computational cost in later stages, the population size NP is decreased linearly along with the evolution process. NP starts with a large value and shrinks until it reaches NPmin; the corresponding formula is defined as follows [15]:

$$ {\text{NP}}_{G + 1} = {\text{round}}\left( {\frac{{{\text{NP}}_{\min } - {\text{NP}}_{\max } }}{{{\text{FES}}_{\max } }} \cdot {\text{FES}} + {\text{NP}}_{\max } } \right) $$
(20)

where NPmax is the largest population size, used at generation G = 0.
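The LPSR rule of Eq. (20) reduces to a one-line function; the endpoint values used below are hypothetical, chosen only to illustrate the schedule.

```python
def lpsr_size(fes, fes_max, np_min, np_max):
    """Linear population size reduction (Eq. 20): NP shrinks from
    np_max at FES = 0 to np_min at FES = fes_max."""
    return round((np_min - np_max) / fes_max * fes + np_max)
```

After each generation, individuals (typically the worst-ranked ones) are deleted until the population matches the new size.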

3.4 Mutation strategy selection based on local landscape feature

According to the local fitness landscape feature, if the dynamic mean value DMφ is greater than ɛ, which indicates that the local fitness landscape is very rough, the current-to-pbest/1 mutation strategy is used. Otherwise, the DE/best/2 mutation strategy is adopted. Thus, for an optimization problem, the local fitness landscape is used to judge the difficulty of the problem and then to guide the selection of a suitable mutation strategy.

$$ \left\{ {\begin{array}{*{20}l} {v_{i,G} = x_{i,G} + F * \left( {x_{best}^{p} - x_{i,G} } \right) + F * \left( {x_{r1,G} - x_{r2,G} } \right),} \hfill & {if\;\;{\text{DM}}_{\varphi } > \varepsilon } \hfill \\ {v_{i,G} = x_{best} + F * \left( {x_{r1,G} - x_{r2,G} } \right) + F * \left( {x_{r3,G} - x_{r4,G} } \right)} \hfill & {{\text{otherwise}}} \hfill \\ \end{array} } \right. $$
(21)

where ɛ is a predefined constant; in this paper, ɛ = 0.5. The pseudocode of the proposed LFLDE algorithm is presented in Table 1.

Table 1 Pseudocode of the LFLDE algorithm
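The strategy switch of Eq. (21) can be sketched as follows. The handling of pbest (a random individual among the best 100p% of the population) follows the usual current-to-pbest/1 convention and is our assumption, as is minimization.

```python
import numpy as np

def select_mutant(pop, fitness, i, F, p, dm_phi, eps, rng):
    """Landscape-guided mutation (Eq. 21): current-to-pbest/1 when the
    landscape is rough (DM_phi > eps), DE/best/2 otherwise."""
    order = np.argsort(fitness)                        # best first (minimization)
    others = [k for k in range(len(pop)) if k != i]
    if dm_phi > eps:
        # pbest: a random individual among the best 100*p% of the population
        top = order[:max(1, int(round(p * len(pop))))]
        pbest = pop[rng.choice(top)]
        r1, r2 = rng.choice(others, size=2, replace=False)
        return pop[i] + F * (pbest - pop[i]) + F * (pop[r1] - pop[r2])
    best = pop[order[0]]
    r1, r2, r3, r4 = rng.choice(others, size=4, replace=False)
    return best + F * (pop[r1] - pop[r2]) + F * (pop[r3] - pop[r4])
```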

4 Experimental results

4.1 Benchmark test functions and parameters setting

To evaluate the performance of the proposed LFLDE algorithm, a series of single-objective test functions originating from the CEC2017 competition is used in the experiments [29]. This test set contains 30 bounded single-objective functions with a diverse set of characteristics. For all test functions, the domain of definition is [− 100, 100]D. The functions in the test suite can be divided into four types: uni-modal functions f1–f3; basic multimodal functions f4–f10; hybrid functions f11–f20; and composition functions f21–f30. See [30] for details.

The performance of the proposed algorithm is evaluated and compared with five recently proposed DE algorithms: CoDE [21], SaDE [20], EPSDE [23], MPEDE [31] and LSHADE [15]. For these algorithms, the parameter values recommended in the cited original papers were used. The parameters of LFLDE are listed in the pseudocode. All initial population sizes are ten times the dimension of the problem.

4.2 Algorithm complexity

All experiments were performed using MATLAB R2019a on a laptop with an Intel Core i7-7700 (3.60 GHz) CPU and 16 GB RAM running Windows 10. First, Big O notation is used to measure the computational complexity [32]. To find the optimal solutions, the proposed algorithm has an initialization stage followed by a stage of iterations. The computational complexity depends on n, Popsize and Maxiter:

$$ {\text{Computation}}\;{\text{complexity}} = {\text{ O}}(n \times {\text{Popsize}} \times {\text{Maxiter}}) $$
(22)

For the LFLDE algorithm, according to Table 1, the maximum computational complexity in terms of Big O notation is:

$$ {\text{Computation}}\;{\text{complexity}} = {\text{O}}(10,000 \times n \times n) $$
(23)

where Popsize is ten times the dimension n and the maximum number of iterations is equal to 1000. Thus, the computational complexity of the proposed algorithm is directly proportional to the square of the problem dimension n.

Next, an empirical measure of computational complexity is presented. Table 2 shows the time complexity of LFLDE when solving problems of 10, 30, 50 and 100 dimensions in detail. As defined in [33], T0 is the time taken to execute the following loop:

$$ \begin{array}{*{20}c} {for\;i = 1{:}1000000} \\ {x = 0.55 + \left( {{\text{double}}} \right)i;\;x = x + x;\;x = x/2;\;x = x + x;} \\ {x = {\text{sqrt}}\left( x \right);\;x = {\text{log}}\left( x \right);\;x = {\text{exp}}\left( x \right);\;x = x/\left( {x + 2} \right);} \\ {{\text{end}}} \\ \end{array} $$
Table 2 Algorithm complexity

T1 is the time required to run 200,000 evaluations of test function f18 by itself in D dimensions, and T2 is the time required to execute LFLDE with 200,000 evaluations of f18 in D dimensions. T3 is the mean of T2 over 5 runs.

4.3 Algorithm performance

To evaluate the performance of the algorithms, a total of 29 test functions are used in the experiments. We use the solution error measure f(x) − f(x*) as the final result; error values and standard deviations less than 10−8 are regarded as zero [34], where x is the best solution found by each algorithm in each generation and x* is the known global optimum of each test function. Each algorithm is run on each function 51 times independently, and the loop termination condition is a maximum of 10,000 × D function evaluations (FEs). The experimental results of LFLDE on the test functions with 10, 30, 50 and 100 dimensions are listed in Tables 3 and 4, including the best, worst, median and mean error values and the standard deviations of the error from the optimal solution over the 51 runs for the 29 benchmark functions. Note that the function f2 was excluded from the comparison because of its unstable behavior.

Table 3 The results of the LFLDE algorithm for 10D and 30D
Table 4 The results of the LFLDE algorithm for 50D and 100D

The proposed LFLDE algorithm was able to find the optimal solutions of the uni-modal functions f1 and f3 for all dimensions. Among the multi-modal functions f4–f10, the global optimum was achieved only for f6 and f9 on 10D, 30D and 50D. For the hybrid functions f11–f20, the function f12 appears to be relatively difficult: it has high error values. For the composition functions f21–f30, the proposed algorithm appears to fall into local optima quite often and has difficulty obtaining the global optimal solution.

4.4 Statistical test and comparison to other algorithms

In this section, we compare the performance of LFLDE with CoDE, SaDE, EPSDE, MPEDE and LSHADE on the CEC2017 benchmark functions. The results are given in Tables 5, 6, 7 and 8 for each dimension. In these tables, the mean and standard deviation values are shown for the six algorithms. The best result for each problem is shown in boldface, and the outcome of the statistical testing is shown in a separate column. The symbols +, − and ≈ indicate whether a given algorithm performed significantly better (+), significantly worse (−) or not significantly different (≈) compared with LFLDE according to the Wilcoxon rank-sum test at the 0.05 significance level [35].

Table 5 Comparison of results for 10D
Table 6 Comparison of results for 30D
Table 7 Comparison of results for 50D
Table 8 Comparison of results for 100D

The summarized results of the statistical testing against the LFLDE algorithm are presented in Table 9. Comparing the numbers of wins (+) and losses (−), we can see that LFLDE performs better than the state-of-the-art DE algorithms on all dimensions. In addition, based on the average results obtained, the average ranking of all algorithms, as produced by the Friedman rank test, is summarized in Table 10. The results in Table 10 are consistent with those in Table 9: LFLDE has the best rank, and LSHADE takes the second rank. All the experiments show that the proposed algorithmic framework is effective and that the fitness landscape can boost the performance of the algorithm.

Table 9 The summarized results of the Wilcoxon rank-sum test (p = 0.05)
Table 10 Friedman ranks for all algorithms

5 Conclusions and further work

In this paper, we used fitness landscape information to analyze optimization problems and then to guide the selection of the mutation strategy at each generation; a novel adaptive mechanism for the control parameters is utilized to enhance the proposed algorithm. In addition, we adopted a linear population size reduction method. The performance of the algorithm was evaluated on the set of benchmark functions provided for the CEC2017 special session on single-objective real-parameter optimization. The experimental results give evidence that the LFLDE algorithm is highly competitive compared with advanced DE variants such as CoDE, SaDE, EPSDE, MPEDE and LSHADE on 29 benchmark functions with 10, 30, 50 and 100 dimensions. All the results indicate that fitness landscape information is beneficial for improving the performance of DE. For future research, we plan to apply the fitness landscape to composite mutation strategy and control parameter selection in the DE algorithm.