1 Introduction

The differential evolution (DE) algorithm, first proposed by Storn and Price (1997), has become one of the most efficient continuous optimization algorithms. Its simple structure, ease of use, efficiency, speed, and robustness have attracted considerable interest from both researchers and practitioners in the optimization field over the last two decades. Specifically, DE has shown its advantages in a variety of numerical optimization problems and practical applications, such as constrained optimization (Mohamed and Sabry 2012), multi-objective optimization (Qiu et al. 2016), multimodal optimization (Liang et al. 2019), dynamic optimization (Mukherjee et al. 2016), flexible capacity planning (Hu et al. 2018), engineering design (Yi et al. 2018), power systems (Zhu et al. 2018), and neural networks (Arce et al. 2018). Survey articles (Das and Suganthan 2011a; Das et al. 2016; Al-Dabbagh et al. 2018) summarize various DE variants and applications; interested readers can refer to the references therein.

Similar to other evolutionary algorithms, classic DE contains three key operators: mutation, crossover and selection, but its most striking feature is the use of a differential mutation operator to produce the offspring population. Therefore, for DE algorithms, the mutation operator and its related control parameters usually play the foremost role in performance. To enhance DE's performance on different kinds of optimization problems, many researchers have proposed numerous mutation strategies and parameter tuning mechanisms; here, we only introduce some closely related and representative work. In terms of operator improvement, Zhang and Sanderson (2009) used the top \(100\times p\%\) best individuals (denoted by \(\varvec{x}_{p\mathrm{best}}\)) selected from the current population to create the current-to-pbest mutation strategy. Islam et al. (2012) proposed a novel mutation strategy that takes the best individual in a dynamic group, composed of \(100\times q\%\) individuals randomly selected from the current population, as the directional vector. Gong and Cai (2013) applied the fitness ranking of every individual in the current population to determine its selection rate and thereby proposed a ranking-based mutation operator. Gong et al. (2015) further improved this ranking-based mutation operator and introduced an adaptive ranking strategy that responds to the situation of the current population. Wang et al. (2013) designed a novel Gaussian mutation operator (whose mean is set to the average of the current individual and the best individual) and combined it with DE/best/1 to propose a DE variant. Sun et al. (2019) enhanced DE by combining a Gaussian mutation operator that takes the best of three randomly selected individuals as its mean with a modified common mutation operator based on the order of three selected individuals. Mohamed et al. (2018) introduced a less greedy mutation strategy and a more greedy mutation strategy to balance the exploration and exploitation capabilities of DE, both of which are based on the order of three vectors randomly selected from the current generation. Opara and Arabas (2018) provided formulas for the expectation vectors and covariance matrices of the mutants' distribution to evaluate several differential mutation operators. Various techniques that combine different mutation strategies have also been proposed over the past decades to enable DE to solve a wide range of problems and further improve its performance. Qin et al. (2009) proposed a self-adaptive DE in which, at each iteration, a mutation strategy is chosen from a given candidate pool according to a posterior probability calculated from previous successful experience. Han et al. (2013) divided the individuals into a superior group and an inferior group based on their fitness values and assigned two mutation operators with different search features to the two groups, respectively. Fan and Yan (2016) proposed a self-adaptive DE with zoning evolution of parameters and strategies, in which each mutation strategy is randomly allocated to one individual according to its selective probability and cumulative probability.
Zhou and Zhang (2019) proposed an underestimation-based multimutation strategy for DE, in which a set of candidate offspring is simultaneously generated for each target individual by utilizing multiple mutation strategies. Sun et al. (2019) proposed two mutation operators with different characteristics to produce the respective mutant vectors and provided a historical success rate-based mechanism to coordinate the two operators. Many researchers have also focused on parameter tuning mechanisms to enhance DE. Sarker et al. (2014) used a mechanism based on the success rate, together with a reset technique, to dynamically select the best performing parameter combinations during the course of each single run. Yu et al. (2014) simultaneously applied population-level parameters (based on information about the exploration and exploitation status) and individual-level parameters (according to each individual's fitness value and its distance to the global best individual) to improve the performance of DE. Tang et al. (2015) provided an individual-dependent mechanism to determine the parameter settings and mutation operators. Draa et al. (2015) applied two preset sinusoidal formulas to periodically control the scale factor and crossover rate during the search process, naming the method SinDE for short. Draa et al. (2018) further introduced an opposition-based learning method and a restart mechanism to boost SinDE's exploration ability and avoid stagnation. Sun et al. (2018) applied fitness value information and a dynamic fluctuation rule to automatically compute the values of the scale factor and crossover rate for each individual in every run.

Fig. 1 The time-varying characteristic of function \(M_{g}\times P_{g}\) in the optimization process

Fig. 2 The time-varying characteristic of scaling factor \(F_{g}\) in the optimization process

Fig. 3 The time-varying characteristic of crossover rate \(\mathrm{CR}_{i,g}\) for individuals with different status in the optimization process

It is well known that maintaining a proper balance between global exploration and local exploitation during the optimization process is the core guideline for designing evolutionary algorithms (Črepinšek et al. 2013). For DE, although various mutation operators and effective parameter tuning rules have been proposed, building a proper balance model for different kinds of problems is still a challenging task. The trial vector generation strategy (mainly the mutation operator) and the control parameter tuning strategy (scale factor F and crossover rate CR) play a vital role in balancing exploration and exploitation. In order to take full advantage of the partnership between the mutation operator and the control parameters, we first introduce a novel mutation operator that takes a dynamic combination of the current individual and the best individual as the base vector, where the combination depends on a decreasing function and a periodic function. In addition, one decreasing function, one individual-dependent function and two wave functions are used to compute the exact values of the scale factor F and crossover rate CR during the run. Since all three key functions reflect time-varying characteristics, we denote the newly proposed DE variant TVDE for short. To verify and analyze the performance of TVDE, numerical experiments were conducted using the CEC 2014 (Liang et al. 2013) benchmarks, real-life optimization problems and seven efficient DE variants. The experimental results indicate that TVDE outperforms all the competitors in terms of overall performance.

Table 1 Summary of the IEEE CEC 2014 benchmark functions
Table 2 Comparative results on functions \(f_{1}-f_{15}\) with \(D=30\)
Table 3 Comparative results on functions \(f_{16}-f_{30}\) with \(D=30\)
Table 4 Comparative results on functions \(f_{1}-f_{15}\) with \(D=50\)
Table 5 Comparative results on functions \(f_{16}-f_{30}\) with \(D=50\)
Table 6 Comparative results on functions \(f_{1}-f_{15}\) with \(D=100\)
Table 7 Comparative results on functions \(f_{16}-f_{30}\) with \(D=100\)
Table 8 Comparative results on real-world problems \(\textit{rf}_{1}-\textit{rf}_{4}\)
Table 9 Statistical results on all test functions and real-world problems

The remainder of this paper is organized as follows. Section 2 briefly introduces the basic operators of the original DE algorithm. Section 3 describes the proposed TVDE algorithm and provides its overall procedure. Section 4 presents the comparative experiments between TVDE and its seven competitors. Section 5 draws the conclusions.

2 Differential evolution

In the basic DE algorithm, the initial population consists of NP individuals, each represented by a D-dimensional vector \(\varvec{x}_{i}=[x_{i,1},x_{i,2},\ldots ,x_{i,D}]\). The first step is to randomly generate NP vectors from a uniform distribution to form the initial population. The mutation and crossover operators are then applied to generate a trial vector. Finally, a selection operator is executed between each parent and its corresponding trial vector to choose the vector that survives into the next generation. The detailed steps of basic DE are provided below:

2.1 Initialization operation

To start the optimization process, an initial population must be created. The most widespread approach is to use a uniform distribution, i.e., each jth (\(j=1,2,\ldots ,D\)) component of the ith (\(i=1,2,\ldots ,\mathrm{NP}\)) individual in the initial population is obtained as follows:

$$\begin{aligned} x_{i,j}=L_{j}+ \text {rand}_{i,j}\times \big (U_{j}-L_{j}\big ), \end{aligned}$$
(1)

where vectors \(\varvec{L}=\big [L_{1},L_{2},\ldots ,L_{D}\big ]\) and \(\varvec{U}{=}\big [U_{1},U_{2},\ldots ,U_{D}\big ]\), respectively, represent the lower and upper bounds of search space, and \(\text {rand}_{i,j}\) denotes a uniformly distributed random number in the interval [0, 1].
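To make Eq. (1) concrete, a minimal NumPy sketch of the uniform initialization is given below; the function name `initialize_population` and the example bounds are illustrative choices rather than part of the original description.

```python
import numpy as np

def initialize_population(NP, L, U, rng=None):
    """Uniform random initialization following Eq. (1):
    x_{i,j} = L_j + rand_{i,j} * (U_j - L_j)."""
    rng = np.random.default_rng() if rng is None else rng
    L, U = np.asarray(L, float), np.asarray(U, float)
    return L + rng.random((NP, L.size)) * (U - L)

# Example: 20 individuals in the 5-dimensional search space [-100, 100]^5.
pop = initialize_population(20, [-100] * 5, [100] * 5)
```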

2.2 Mutation operation

Following the initialization operation, for each target vector \(\varvec{x}_{i}\), a mutation operator is used to generate its corresponding mutant vector \(\varvec{v}_{i}=\big [v_{i,1},v_{i,2},\ldots ,v_{i,D}\big ]\). The most frequently used mutation operators in various DE variants are listed as follows:

  1. (1)

    DE/rand/1

    $$\begin{aligned} \varvec{v}_{i}=\varvec{x}_{r_{1}}+F\times (\varvec{x}_{r_{2}}-\varvec{x}_{r_{3}}). \end{aligned}$$
    (2)
  2. (2)

    DE/best/1

    $$\begin{aligned} \varvec{v}_{i}=\varvec{x}_\mathrm{best}+F\times (\varvec{x}_{r_{1}}-\varvec{x}_{r_{2}}). \end{aligned}$$
    (3)
  3. (3)

    DE/current-to-best/1

    $$\begin{aligned} \varvec{v}_{i}=\varvec{x}_{i}+F\times (\varvec{x}_\mathrm{best}-\varvec{x}_{r_{1}})+F\times (\varvec{x}_{r_{2}}-\varvec{x}_{r_{3}}). \end{aligned}$$
    (4)
  4. (4)

    DE/best/2

    $$\begin{aligned} \varvec{v}_{i}=\varvec{x}_\mathrm{best}+F\times (\varvec{x}_{r_{1}}-\varvec{x}_{r_{2}})+F\times (\varvec{x}_{r_{3}}-\varvec{x}_{r_{4}}). \end{aligned}$$
    (5)
  5. (5)

    DE/rand/2

    $$\begin{aligned} \varvec{v}_{i}=\varvec{x}_{r_{1}}+F\times (\varvec{x}_{r_{2}}-\varvec{x}_{r_{3}})+F\times (\varvec{x}_{r_{4}}-\varvec{x}_{r_{5}}). \end{aligned}$$
    (6)

The indexes \(r_{1}, r_{2}, r_{3}, r_{4}\) and \(r_{5}\) are mutually different integers randomly generated from the set \(\{1,2,\ldots ,\mathrm{NP}\}\), all of which also differ from i. The control parameter F, called the scaling factor, is a positive value used to scale the difference vector. The vector \(\varvec{x}_\mathrm{best}=(x_{\mathrm{best},1},x_{\mathrm{best},2},\ldots ,x_{\mathrm{best},D})\) denotes the individual with the best fitness value in the current population.

In the aforementioned mutation operators, taking DE/rand/1 as an example, \(\varvec{x}_{r_{1}}\) and \(\varvec{x}_{r_{2}}\) are referred to as the base vector and the directional vector, respectively, and \(\varvec{x}_{r_{2}}-\varvec{x}_{r_{3}}\) is often regarded as the difference vector. In fact, the base individual can be taken as the center of the search area, while the difference vector essentially determines the search direction and search span.
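As an illustration, the following sketch implements the DE/rand/1 mutation of Eq. (2); the function name and the way the mutually distinct indexes are drawn are our own choices.

```python
import numpy as np

def de_rand_1(pop, i, F, rng=None):
    """DE/rand/1 mutant for target index i, Eq. (2):
    v_i = x_{r1} + F * (x_{r2} - x_{r3}), with r1, r2, r3 mutually distinct and != i."""
    rng = np.random.default_rng() if rng is None else rng
    candidates = [k for k in range(pop.shape[0]) if k != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return pop[r1] + F * (pop[r2] - pop[r3])
```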

Fig. 4 Convergence graphs (mean curves) for eight algorithms on functions \(f_{1}, f_{3}, f_{6}, f_{7}, f_{8}\) and \(f_{9}\) with \(D=50\) over 50 independent runs

Fig. 5 Convergence graphs (mean curves) for eight algorithms on functions \(f_{13}, f_{14}, f_{15}, f_{17}, f_{18}\) and \(f_{19}\) with \(D=50\) over 50 independent runs

Fig. 6 Convergence graphs (mean curves) for eight algorithms on functions \(f_{20}, f_{21}, f_{22}, f_{28}, f_{29}\) and \(f_{30}\) with \(D=50\) over 50 independent runs

Table 10 Parameter setting of different TVDEs

2.3 Crossover operation

In the original DE algorithm, there are two main crossover types, binomial and exponential; we elaborate only on the more widely used binomial crossover here. To yield the trial vector \(\varvec{u}_{i}=\big [u_{i,1},u_{i,2},\ldots ,u_{i,D}\big ]\), the binomial crossover between the target vector \(\varvec{x}_{i}\) and the mutant vector \(\varvec{v}_{i}\) is performed as follows:

$$\begin{aligned} u_{i,j} = \left\{ \begin{array}{ll} v_{i,j}, &{} \quad \text {if}~ \big (\text {rand}_{i,j}\le CR~\text {or}~j=j_\mathrm{rand}\big )\\ x_{i,j}, &{} \quad \text {otherwise}, \end{array} \right. \end{aligned}$$
(7)

where \(\mathrm{CR}\in [0,1]\), called the crossover rate, is a parameter that controls how many components of the trial vector are inherited from the mutant vector, and \(j_\mathrm{rand}\) is a random integer selected from the set \(\{1, 2, \ldots , D\}\).
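A possible NumPy sketch of the binomial crossover of Eq. (7) is shown below; forcing the \(j_\mathrm{rand}\) component guarantees that at least one component is inherited from the mutant vector. The function name is illustrative.

```python
import numpy as np

def binomial_crossover(x_i, v_i, CR, rng=None):
    """Binomial crossover, Eq. (7): take v_{i,j} where rand_{i,j} <= CR or j == j_rand."""
    rng = np.random.default_rng() if rng is None else rng
    D = x_i.size
    mask = rng.random(D) <= CR
    mask[rng.integers(D)] = True      # j_rand: at least one component comes from v_i
    return np.where(mask, v_i, x_i)
```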

Fig. 7 Convergence graphs (mean curves) for the TVDE with different values of parameter Freq on functions \(f_{1}, f_{3}, f_{6}, f_{7}, f_{8}\) and \(f_{9}\) with \(D=50\) over 50 independent runs

Fig. 8 Convergence graphs (mean curves) for the TVDE with different values of parameter Freq on functions \(f_{13}, f_{14}, f_{15}, f_{17}, f_{18}\) and \(f_{19}\) with \(D=50\) over 50 independent runs

Fig. 9 Convergence graphs (mean curves) for the TVDE with different values of parameter Freq on functions \(f_{20}, f_{21}, f_{22}, f_{28}, f_{29}\) and \(f_{30}\) with \(D=50\) over 50 independent runs

2.4 Selection operation

The selection operation is based on a greedy strategy: the better of the target vector \(\varvec{x}_{i}\) and the trial vector \(\varvec{u}_{i}\) survives into the next generation. It is widely adopted in DE algorithms, and its specific form (for a minimization problem) can be described as follows:

$$\begin{aligned} \varvec{x}_{i} = \left\{ \begin{array}{ll} \varvec{u}_{i}, &{}\quad \text {if}~ f(\varvec{u}_{i})\le f(\varvec{x}_{i})\\ \varvec{x}_{i}, &{} \quad \text {otherwise}, \end{array} \right. \end{aligned}$$
(8)

where \(f(\cdot )\) denotes the objective function to be minimized.
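The greedy selection of Eq. (8) can be sketched as follows (minimization assumed); the function name and the returned fitness value are illustrative conveniences.

```python
def greedy_selection(x_i, u_i, f):
    """Greedy selection, Eq. (8): keep the trial vector if it is not worse (minimization)."""
    fx, fu = f(x_i), f(u_i)
    return (u_i, fu) if fu <= fx else (x_i, fx)
```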

3 Description of TVDE

In this section, we first provide a detailed description of the newly proposed mutation operator and the adopted parameter control methods for the scale factor F and crossover rate CR, and then summarize the overall procedure of TVDE.

3.1 DE/tvbase-to-rand/1

Given the structure of the mutation operator, it is easy to see that the base vector, an important component of the mutation operator, essentially determines the core region in which mutant vectors are generated and affects the diversity of the population. More specifically, if each individual takes itself as the base vector, the generated mutant vectors are scattered around different individuals, which yields better population diversity, more powerful global exploration ability and a lower probability of being trapped in a local optimum, but the convergence speed and solution accuracy may suffer. Conversely, if all individuals take the best individual as the base vector, the mutant vectors demonstrate powerful local exploitation ability and fast convergence speed, but the low population diversity makes it easy to fall into a local optimum. Therefore, to exploit both the current individual and the best individual as the base vector, we design a dynamic combination of the two so that they are used jointly. The specific representation is expressed as follows:

$$\begin{aligned} \varvec{v}_{i,g}=\varvec{b}_{i,g}+F_{g}\times (\varvec{x}_{r_{1},g}-\varvec{x}_{i,g})+F_{g}\times (\varvec{x}_{r_{2},g}-\varvec{x}_{i,g}). \end{aligned}$$
(9)

The concrete form of base vector \(\varvec{b}_{i,g}\) is described as follows:

$$\begin{aligned} \varvec{b}_{i,g}=M_{g}\times P_{g}\times \varvec{x}_{i,g}+(1-M_{g}\times P_{g})\times \varvec{x}_{\mathrm{best},g}, \end{aligned}$$
(10)

where \(M_{g}=(G-g+1)/G\) is a macro-control function and \(P_{g}=0.5\times (\cos (2\pi \times \mathrm{Freq} \times g)+1)\) is a periodic function, in which the preset parameter Freq represents the frequency of the cosine function, g is the index of the current generation and G denotes the maximum number of generations. The essential feature of the newly proposed mutation operator is that its base vector has a time-varying characteristic, while the difference vectors are composed of two random individuals and the current individual. Following the existing naming conventions, we name the novel mutation operator DE/tvbase-to-rand/1 for short. Furthermore, in order to visually display the time-varying characteristic of the base vector, Fig. 1 depicts the time-varying function \(M_{g}\times P_{g}\) when \(G=10,000\) and \(\mathrm{Freq}=0.01\).

From Fig. 1, we can see that two main principles are used in constructing the base vector. First, at the micro level, the base vector is a dynamic combination of the current individual and the best individual, and its weight exhibits a kind of volatility, which provides an effective balance between global exploration and local exploitation. Second, at the macro level, the base vector focuses on the current individual in the early stage but lays emphasis on the best individual in the later stage, which means the search is gradually transferred from a large global area to a key local area during the optimization process.
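Assuming the scaling factor \(F_{g}\) is supplied externally (it is defined below in Eq. (11)), a minimal sketch of the DE/tvbase-to-rand/1 mutation of Eqs. (9) and (10) could look as follows; the function name and argument layout are our own choices.

```python
import numpy as np

def tvbase_mutant(pop, i, best_idx, g, G, freq, F_g, rng=None):
    """DE/tvbase-to-rand/1 mutant, Eqs. (9)-(10): the base vector is a
    time-varying blend of the current individual and the best individual."""
    rng = np.random.default_rng() if rng is None else rng
    M_g = (G - g + 1) / G                                   # macro-control, decreasing
    P_g = 0.5 * (np.cos(2 * np.pi * freq * g) + 1)          # periodic term in [0, 1]
    w = M_g * P_g
    b_i = w * pop[i] + (1 - w) * pop[best_idx]              # base vector, Eq. (10)
    r1, r2 = rng.choice([k for k in range(pop.shape[0]) if k != i],
                        size=2, replace=False)
    return b_i + F_g * (pop[r1] - pop[i]) + F_g * (pop[r2] - pop[i])   # Eq. (9)
```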

The value of scaling factor \(F_{g}\) can be computed via the following formula,

$$\begin{aligned} F_{g}=\sqrt{0.5\times (1-T_{g})\times (M_{g}+P_{g})}, \end{aligned}$$
(11)

where \(T_{g}=0.5\times (\cos (2\pi \times \mathrm{Freq} \times g)\times g/G+1)\) is another time-varying function. The time-varying characteristic of scaling factor \(F_{g}\) is intuitively represented in Fig. 2.

From Fig. 2, it can be seen that the scaling factor \(F_{g}\) also has a certain degree of volatility and an overall downward trend. A larger scaling factor usually leads to better global exploration but weaker local exploitation, whereas a smaller scaling factor gives rise to the opposite result. Given the role of \(F_{g}\) in balancing exploration and exploitation, this change rule for \(F_{g}\) is in line with the widely accepted general principle of evolutionary algorithms.
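For reference, Eq. (11) together with the auxiliary functions \(M_{g}\), \(P_{g}\) and \(T_{g}\) can be computed as in the following sketch; the function name is illustrative.

```python
import numpy as np

def scale_factor(g, G, freq):
    """Scaling factor F_g of Eq. (11), built from M_g, P_g and T_g."""
    M_g = (G - g + 1) / G
    P_g = 0.5 * (np.cos(2 * np.pi * freq * g) + 1)
    T_g = 0.5 * (np.cos(2 * np.pi * freq * g) * g / G + 1)
    return np.sqrt(0.5 * (1 - T_g) * (M_g + P_g))
```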

After the mutant vectors are generated, the crossover operation is executed as follows:

$$\begin{aligned} u_{i,j} = \left\{ \begin{array}{ll} v_{i,j}, &{}\quad \text {if}~ \big (\text {rand}_{i,j}\le \mathrm{CR}_{i,g}~\text {or}~j=j_\mathrm{rand}\big )\\ x_{i,j}, &{}\quad \text {otherwise}. \end{array} \right. \end{aligned}$$
(12)

The value of crossover rate \(\mathrm{CR}_{i,g}\) is computed by the following formula:

$$\begin{aligned} \mathrm{CR}_{i,g}=M_{g}\times T_{g}+(1-M_{g})\times I_{i,g}, \end{aligned}$$
(13)

where \(I_{i,g}=(f(\varvec{x}_{i,g})-f(\varvec{x}_\mathrm{best}))/(f(\varvec{x}_\mathrm{worst})-f(\varvec{x}_\mathrm{best})+1.0e-99)\) essentially represents the status of individual \(\varvec{x}_{i,g}\) in the current population, and \(\varvec{x}_\mathrm{worst}\) is the worst individual in the current population. Formula (13) reflects the basic fact that different individuals adopt different crossover rates. To visually display how the crossover rate \(\mathrm{CR}_{i,g}\) changes for individuals with different status, Fig. 3 depicts its time-varying characteristic when \(I_{i,g}=0.1\), \(I_{i,g}=0.4\), \(I_{i,g}=0.6\) and \(I_{i,g}=0.9\), respectively.

According to the computation rule of the procedure parameter \(I_{i,g}\), an individual with a smaller value of \(I_{i,g}\) is superior within the population. From Fig. 3, two important principles are used in determining the crossover rate \(\mathrm{CR}_{i,g}\) for each individual. One is that all the crossover rates, regardless of individual status, exhibit volatility, which is conducive to balancing exploration and exploitation. The other is that individuals with different status follow different macro-level trends. More specifically, for a superior individual, i.e., one whose value of \(I_{i,g}\) is smaller than 0.5, the crossover rate \(\mathrm{CR}_{i,g}\) decreases at the macro level, and the better the individual, the more pronounced this trend. For an inferior individual, the trend is just the opposite. This changing rule means that the crossover rates of all individuals show no significant difference in the early stage, whereas in the later stage the better individuals change little and the poorer individuals change substantially. After the crossover operation, the greedy selection operation (8) is applied to determine the individual that survives into the next generation.
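Analogously, the individual-dependent crossover rate of Eq. (13) can be sketched as below; the function name and argument layout are illustrative, and the small constant 1e-99 follows the definition of \(I_{i,g}\) above.

```python
import numpy as np

def crossover_rate(fit_i, fit_best, fit_worst, g, G, freq):
    """Individual crossover rate CR_{i,g} of Eq. (13)."""
    M_g = (G - g + 1) / G
    T_g = 0.5 * (np.cos(2 * np.pi * freq * g) * g / G + 1)
    I_ig = (fit_i - fit_best) / (fit_worst - fit_best + 1e-99)   # status of individual i
    return M_g * T_g + (1 - M_g) * I_ig
```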

3.2 The overall procedure of TVDE

We have provided a detailed description of DE/tvbase-to-rand/1 and the adopted parameter tuning mechanism. Now, we summarize the overall procedure of TVDE in Algorithm 1.

Algorithm 1 The overall procedure of TVDE
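Since Algorithm 1 is not reproduced here, the following self-contained Python sketch shows one possible reading of the overall TVDE procedure, assembled from Eqs. (1) and (8)-(13). It is a sketch under our own assumptions, not a definitive implementation: in particular, out-of-bound mutant components are simply clipped to the search bounds, and the best and worst individuals are identified once per generation.

```python
import numpy as np

def tvde(f, L, U, NP, G, freq=0.05, seed=None):
    """Sketch of the TVDE generational loop combining DE/tvbase-to-rand/1
    (Eqs. 9-10), the time-varying F_g (Eq. 11) and the individual-dependent
    CR_{i,g} (Eq. 13). Minimization is assumed."""
    rng = np.random.default_rng(seed)
    L, U = np.asarray(L, float), np.asarray(U, float)
    D = L.size
    pop = L + rng.random((NP, D)) * (U - L)                 # initialization, Eq. (1)
    fit = np.array([f(x) for x in pop])

    for g in range(1, G + 1):
        M_g = (G - g + 1) / G
        P_g = 0.5 * (np.cos(2 * np.pi * freq * g) + 1)
        T_g = 0.5 * (np.cos(2 * np.pi * freq * g) * g / G + 1)
        F_g = np.sqrt(0.5 * (1 - T_g) * (M_g + P_g))        # Eq. (11)
        b_idx, w_idx = fit.argmin(), fit.argmax()

        for i in range(NP):
            # time-varying base vector, Eq. (10), and mutant vector, Eq. (9)
            w = M_g * P_g
            b_i = w * pop[i] + (1 - w) * pop[b_idx]
            r1, r2 = rng.choice([k for k in range(NP) if k != i],
                                size=2, replace=False)
            v_i = b_i + F_g * (pop[r1] - pop[i]) + F_g * (pop[r2] - pop[i])
            v_i = np.clip(v_i, L, U)                        # bound handling (our assumption)

            # individual-dependent crossover rate, Eq. (13), binomial crossover, Eq. (12)
            I_ig = (fit[i] - fit[b_idx]) / (fit[w_idx] - fit[b_idx] + 1e-99)
            CR_ig = M_g * T_g + (1 - M_g) * I_ig
            mask = rng.random(D) <= CR_ig
            mask[rng.integers(D)] = True
            u_i = np.where(mask, v_i, pop[i])

            # greedy selection, Eq. (8)
            fu = f(u_i)
            if fu <= fit[i]:
                pop[i], fit[i] = u_i, fu

    best = fit.argmin()
    return pop[best], fit[best]

# Example: minimize the sphere function in 10 dimensions.
x_star, f_star = tvde(lambda x: float(np.sum(x * x)), [-100] * 10, [100] * 10,
                      NP=10, G=2000)
```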

4 Numerical experiments and comparisons

In this section, we first provide the experimental setup, then summarize and analyze the comparative results between TVDE and the adopted competitors.

4.1 Experimental setup

The CEC 2014 (Liang et al. 2013) benchmark set, summarized in Table 1, which includes 30 functions with different characteristics, is utilized to demonstrate the performance of the proposed TVDE. In this paper, all the benchmark functions are tested on 30D, 50D and 100D. In addition, four real-life optimization problems that are often used to evaluate the performance of various algorithms are added to enrich the comparative experiment: parameter estimation for frequency-modulated sound waves (Das and Suganthan 2011b), spread spectrum radar poly-phase code design (Das and Suganthan 2011b), systems of linear equations (García-Martínez et al. 2008) and parameter optimization for the polynomial fitting problem (Herrera and Lozano 2000).

As suggested by Liang et al. (2013), all the algorithms involved in the experiments are terminated when the number of function evaluations reaches \(10,000\times D\). In order to provide a convincing comparison, the adopted competitors contain five state-of-the-art DE variants [JADE (Zhang and Sanderson 2009), MGBDE (Wang et al. 2013), SADE (Qin et al. 2009), GDE (Han et al. 2013), SinDE (Draa et al. 2015)] and two up-to-date DE variants [GPDE (Sun et al. 2019), IDDE (Sun et al. 2018)]. For the benchmark functions in CEC 2014 (Liang et al. 2013), the population size of all involved DE algorithms is set to the dimension (D) of the benchmark function to be optimized, and the suggested values of the other control parameters are adopted for the competitors. It should be pointed out that for the real-life optimization problems, owing to their small dimensions, the population size is reset to 5D. In our TVDE, there is only one additional parameter, Freq, which is set to 0.05 for all problems. Moreover, since the population size is \(\mathrm{NP}=D\) or \(\mathrm{NP}=5D\), the \(10,000\times D\) function evaluations are equivalent to 10,000 or 2000 generations, i.e., the parameter G is set to 10,000 when handling the benchmark functions and to \(G=2000\) when solving the real-life optimization problems.

4.2 Comparative results

Each algorithm is run 50 times independently on each benchmark function, and the mean and standard deviation of the error \(f(\varvec{x}_\mathrm{best})-f(\varvec{x}^{*})\) are calculated to evaluate its performance, where \(\varvec{x}_\mathrm{best}\) represents the best solution achieved by the algorithm and \(\varvec{x}^{*}\) denotes the known global optimum of each benchmark function. Wilcoxon's rank-sum test is conducted at the 5% significance level to assess the significance of the differences between TVDE and each corresponding competitor. The compared results are marked with "\(+\)", "\(=\)" and "−" to indicate that TVDE is significantly better than, similar to, and worse than the corresponding competitor, respectively. The means and standard deviations over 50 independent runs are given in Tables 2, 3, 4, 5, 6, 7 and 8, and the summary of the comparative results obtained from Wilcoxon's rank-sum test is collected in Table 9.

A scrutiny of the comparative results summarized in Table 9 shows that TVDE outperforms all the involved DE competitors in terms of overall performance. Specifically, TVDE performs better than SADE, JADE, GDE, MGBDE, SinDE, IDDE and GPDE on 57, 54, 72, 60, 49, 35 and 41 problems, shows a similar performance on 21, 15, 13, 11, 33, 38 and 30 problems, and shows an inferior performance on 16, 25, 9, 23, 12, 21 and 23 problems, respectively. Furthermore, from the dimensional point of view, TVDE wins against all its competitors on the 30D, 50D and 100D benchmark functions, and from the perspective of problem characteristics, TVDE only loses to IDDE (Sun et al. 2018) when handling the hybrid and composition functions. A less satisfying observation is that TVDE loses to almost all its competitors on the real-life problems, which may be caused by the allowed value of G being too small.

To compare the convergence characteristics of TVDE and its competitors, we select 18 benchmark functions with \(D=50\), depict the convergence graphs of all involved algorithms based on their mean values over 50 runs, and exhibit the results in Figs. 4, 5 and 6. From these figures, we can see that TVDE shows no eye-catching performance in the early stage, but it achieves remarkable improvement in the later stage.

It is well known that the population size NP affects the performance of DE algorithms, but the impact is usually small. Therefore, we only evaluate the influence of the parameter Freq on the TVDE algorithm; TVDE with the different parameter settings listed in Table 10 was executed on the test functions with different dimensions. To save space and provide an intuitive display, we only depict the convergence graphs of the TVDE variants with different Freq on 18 benchmark functions with \(D=50\), and the results are plotted in Figs. 7, 8 and 9. These figures show that although the parameter Freq has an effect on the performance of TVDE, the impact is relatively small. As a result, we can say that the parameter Freq is robust in TVDE.

5 Conclusions

In order to enhance the overall performance of the basic DE algorithm, we introduced a novel mutation operator combined with dynamic schemes for the scaling factor and crossover rate. The newly proposed mutation operator is built on a base vector formed by a time-varying combination of the current individual and the best individual, and the values of the scaling factor and crossover rate depend on three time-varying functions and one fitness-based function. The design of the mutation operator and parameter tuning mechanisms takes full account of the balance between exploration and exploitation during the optimization process. To test the effectiveness of the proposed algorithm, 30 benchmark functions with different dimensions from CEC 2014, four real-life optimization problems and seven DE algorithms were used to form the comparative experiments. The obtained results show that TVDE outperforms all seven competitors in terms of overall performance.