1 Introduction

Among contemporary heuristic techniques, evolutionary algorithms (EAs), inspired by biological evolution, have become increasingly important in recent decades. They form the kernel of the field known as evolutionary computation (EC) in artificial intelligence and have succeeded in solving a wide variety of problems. Nowadays, using EAs to solve optimization problems has become common practice because they perform efficiently and effectively on complex problems that cannot be easily solved by conventional techniques.

The differential evolution (DE) algorithm, proposed by Storn and Price (1995), is a popular evolutionary approach and a promising global optimizer in the continuous search domain. It is a population-based stochastic search technique employing mutation, crossover, and selection operations to move a population towards promising solutions. DE is a simple yet powerful search technique showing remarkable performance on a wide variety of problems. It has been successfully utilized in diverse fields such as scheduling and planning problems (Cheng et al. 2014; Damak et al. 2009; Wang and Yeh 2014), data clustering (Dong et al. 2014; Xiang et al. 2015), pattern recognition (Das and Konar 2009; Du et al. 2007; Maulik and Saha 2009), and power dispatch (Chiou 2009; Varadarajan and Swarup 2008).

The performance of DE depends heavily on two parts. One is the trial vector generation strategy, i.e., the mutation and crossover operations; the other is the control parameter setting, i.e., the scaling factor (F), the crossover rate (CR), and the population size (NP). Many encouraging trial vector generation strategies have been reported in DE research, and different strategies can be suitable for different kinds of problems (Qin et al. 2009). However, the performance of a given trial vector generation strategy is still significantly influenced by its control parameter setting. Although there are many important guidelines for choosing an appropriate control parameter setting (Gämperle et al. 2002; Mezura-Montes et al. 2006; Price et al. 2005), the interaction between control parameter settings and DE performance remains complicated and not completely understood.

Therefore, to solve a practical problem with DE, whose characteristics are usually unknown, it is commonly necessary to perform a trial-and-error procedure to choose an appropriate strategy and fine-tune its corresponding parameter setting. Such a trial-and-error search is a time-consuming and error-prone task requiring high computational costs. Moreover, the population of target vectors may move through different kinds of regions during evolution, so using diverse trial vector generation strategies coupled with various control parameter settings at different stages of evolution can be more effective than using a single one (Mallipeddi et al. 2011; Qin et al. 2009; Yong et al. 2011). It is therefore preferable for DE to adaptively determine an appropriate strategy and its parameter setting during evolution, and there has been increasing interest in designing new DE variants with adaptive ability in the recent decade.

In this paper, we propose a novel adaptive DE variant called the time-frame adaptive differential evolution (TFADE) algorithm to avoid the time-consuming trial-and-error search for a suitable strategy and its parameter values. Both the strategy and its parameter values can be adaptively determined to generate promising solutions and dynamically adjusted to cope with premature convergence or stagnation during evolution, according to successful experience over a period of preceding generations called the time frame. The rest of this paper is organized as follows. Section 2 reviews related work on conventional DE and state-of-the-art adaptive DE variants. Section 3 introduces the details of the proposed time-frame adaptive differential evolution algorithm. Numerical experiments, results, and discussions are reported in Sect. 4. Section 5 concludes this paper.

2 Related work

2.1 The differential evolution (DE) algorithm

An initialized population of NP D-dimensional parameter vectors \({{\mathbf x}}_{i,G} =(x_{i1,G}, x_{i2,G}, \ldots , x_{iD,G})\), where \(i = 1, 2, \ldots\), NP and G denotes the generation, called target vectors, encodes candidate solutions towards the global optimum. The target vectors in the population are randomly generated from a uniform distribution between the minimum bounds \({{\mathbf x}}_\mathrm{min} =\{x_{1 \mathrm{min}}, x_{2\mathrm{min}}, \ldots , x_{D \mathrm{min}} \}\) and maximum bounds \({{\mathbf x}}_\mathrm{max} =\{x_{1 \mathrm{max}}, x_{2\mathrm{max}},\ldots , x_{D \mathrm{max}} \}\) of the search space. After initialization, the population undergoes a series of evolutionary operations, which are iterations of mutation, crossover, and selection. The mutation and crossover operations are used to generate new vectors called trial vectors; together they are referred to as the trial vector generation strategy. The selection operation then determines whether each new trial vector survives into the next generation. The common evolutionary process of DE is described as follows.
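
As a minimal illustration of this initialization step, the following Python/NumPy sketch generates a population uniformly within the bounds; the function name, the example bounds, and the use of NumPy are our own choices, not part of the original formulation:

```python
import numpy as np

def initialize_population(NP, x_min, x_max, rng=np.random.default_rng()):
    """Generate NP target vectors uniformly at random within [x_min, x_max]."""
    x_min = np.asarray(x_min, dtype=float)
    x_max = np.asarray(x_max, dtype=float)
    D = x_min.size
    # Each row is one D-dimensional target vector x_{i,0}.
    return x_min + rng.random((NP, D)) * (x_max - x_min)

# Example: 50 target vectors in a 10-dimensional search space bounded by [-100, 100]^D.
population = initialize_population(50, [-100.0] * 10, [100.0] * 10)
```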

1. Mutation operation

For each target vector \({{\mathbf x}}_{i,G}\), DE employs the following mutation operation to produce its corresponding mutant vector \({{\mathbf v}}_{i,G}\) (for the case of DE/rand/1):

$$\begin{aligned} {{\mathbf v}}_{i,G} ={{\mathbf x}}_{r1,G} +F({{\mathbf x}}_{r2,G} -{{\mathbf x}}_{r3,G} ), \end{aligned}$$
(1)

where the indices r1, r2 and r3 are mutually exclusive integers randomly selected from the set \(\{1,2,\ldots , NP\}\backslash \{i\}\). F is a mutation factor, called the scaling factor, that scales the difference vector \(({{\mathbf x}}_{r2,G} - {{\mathbf x}}_{r3,G})\); it is a positive real number, \(F\in (0,1+)\), that controls the rate at which the population evolves. It is generally the mutation operation that separates one DE variant from another. The mutation operation given in (1), which uses a randomly selected base vector \({{\mathbf x}}_{r1,G}\) and one weighted difference vector \(F({{\mathbf x}}_{r2,G} -{{\mathbf x}}_{r3,G} )\), is denoted DE/rand/1; this naming convention is how the different DE variants are distinguished. The other commonly used conventional DE variants (Das et al. 2009; Price et al. 2005; Price 1999) are listed as follows:

$$\begin{aligned} \text{ DE/rand/2 }: {{\mathbf v}}_{i,G} ={{\mathbf x}}_{r1,G} +F({{\mathbf x}}_{r2,G} -{{\mathbf x}}_{r3,G}) +F({{\mathbf x}}_{r4,G} -{{\mathbf x}}_{r5,G}) \end{aligned}$$
(2)
$$\begin{aligned} \text{ DE/best/1 }: {{\mathbf v}}_{i,G} ={{\mathbf x}}_{\mathrm{best},G} +F({{\mathbf x}}_{r1,G} -{{\mathbf x}}_{r2,G}) \end{aligned}$$
(3)
$$\begin{aligned} \text{ DE/best/2 }: {{\mathbf v}}_{i,G} ={{\mathbf x}}_{\mathrm{best},G} +F({{\mathbf x}}_{r1,G} -{{\mathbf x}}_{r2,G}) +F({{\mathbf x}}_{r3,G} -{{\mathbf x}}_{r4,G}) \end{aligned}$$
(4)
$$\begin{aligned} \text{ DE/rand-to-best/1 }: {{\mathbf v}}_{i,G} ={{\mathbf x}}_{i,G} +F({{\mathbf x}}_{\mathrm{best},G} -{{\mathbf x}}_{i,G}) +F({{\mathbf x}}_{r1,G} -{{\mathbf x}}_{r2,G}) \end{aligned}$$
(5)
$$\begin{aligned} \text{ DE/rand-to-best/2 }: {{\mathbf v}}_{i,G} ={{\mathbf x}}_{i,G} +F({{\mathbf x}}_{\mathrm{best},G} -{{\mathbf x}}_{i,G}) +F({{\mathbf x}}_{r1,G} -{{\mathbf x}}_{r2,G}) +F({{\mathbf x}}_{r3,G} -{{\mathbf x}}_{r4,G}) \end{aligned}$$
(6)
$$\begin{aligned} \text{ DE/current-to-rand/1 }: {{\mathbf v}}_{i,G} ={{\mathbf x}}_{i,G} +K({{\mathbf x}}_{r1,G} -{{\mathbf x}}_{i,G}) +F({{\mathbf x}}_{r2,G} -{{\mathbf x}}_{r3,G}) \end{aligned}$$
(7)
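
For concreteness, the DE/rand/1 mutation of Eq. (1) might be sketched as follows; the helper name and the way the mutually exclusive indices are drawn are implementation choices of ours, not prescribed by the original description:

```python
import numpy as np

def mutate_rand_1(population, i, F, rng=np.random.default_rng()):
    """DE/rand/1 mutation (Eq. 1): v_i = x_r1 + F * (x_r2 - x_r3)."""
    NP = population.shape[0]
    # r1, r2, r3 are mutually exclusive indices drawn from {1, ..., NP} \ {i}.
    candidates = [idx for idx in range(NP) if idx != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    return population[r1] + F * (population[r2] - population[r3])
```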

2. Crossover operation

Once the mutation operation is finished, the crossover operation is applied to each pair of the target vector \({{\mathbf x}}_{i,G}\) and its corresponding mutant vector \({{\mathbf v}}_{i,G}\) to generate a new vector \({{\mathbf u}}_{i,G} =(u_{i1,G}, u_{i2,G}, \ldots , u_{iD,G})\) called the trial vector. The binomial crossover operation is the basic version, defined as follows:

$$\begin{aligned} u_{ij,G} =\left\{ {{\begin{array}{l@{\quad }l} {v_{ij,G} ,} &{} {\text{ if } (\mathrm{rand}_j \le CR) \text{ or } (j=j_\mathrm{rand} )}\\ {x_{ij,G} ,} &{} {\text{ otherwise }} \\ \end{array} }} \right. \end{aligned}$$
(8)

where \(j = 1, 2, \ldots , D\) and rand\(_{j}\) is a uniform random number generated within the range [0, 1]. \(CR\in [0,1]\) is the crossover probability, called the crossover rate, a user-defined value that controls the fraction of parameter values copied from the mutant vector. \(j_\mathrm{rand} \in \{1,2,\ldots , D\}\) is a randomly chosen index that ensures the trial vector \({{\mathbf u}}_{i,G}\) does not duplicate the target vector \({{\mathbf x}}_{i,G}\).
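
A minimal sketch of the binomial crossover of Eq. (8), under the same NumPy-based assumptions and naming choices as the sketches above:

```python
import numpy as np

def binomial_crossover(target, mutant, CR, rng=np.random.default_rng()):
    """Binomial crossover (Eq. 8): take the mutant component when rand_j <= CR
    or j == j_rand, otherwise keep the target component."""
    D = target.size
    j_rand = rng.integers(D)          # guarantees u_i differs from x_i in at least one component
    cross = rng.random(D) <= CR       # per-component crossover decisions
    cross[j_rand] = True
    return np.where(cross, mutant, target)
```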

3. Selection operation

After the mutation and crossover operations, the selection operation selects the better of the target vector \({{\mathbf x}}_{i,G}\) and the trial vector \({{\mathbf u}}_{i,G}\) to survive into the next generation, according to the following greedy selection scheme based on the fitness measurement (for a minimization problem):

$$\begin{aligned} {{\mathbf x}}_{i,G+1} =\left\{ {{\begin{array}{l@{\quad }l} {{{\mathbf u}}_{i,G} ,} &{} {\text{ if } f({{\mathbf u}}_{i,G} )\le f({{\mathbf x}}_{i,G} )}\\ {{{\mathbf x}}_{i,G} ,} &{} {\text{ otherwise }} \\ \end{array} }} \right. \end{aligned}$$
(9)
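
The greedy selection of Eq. (9) can be sketched as follows; the sphere function in the usage example is a stand-in fitness measurement of our choosing, used for illustration only:

```python
import numpy as np

def greedy_selection(target, trial, f):
    """Greedy selection (Eq. 9) for minimization: the trial vector survives only
    if it is at least as good as the target vector."""
    return trial if f(trial) <= f(target) else target

# Usage example with the sphere function as the fitness measurement.
sphere = lambda x: float(np.sum(x ** 2))
survivor = greedy_selection(np.array([1.0, 2.0]), np.array([0.5, 0.5]), sphere)  # -> [0.5, 0.5]
```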

2.2 DE with adaptive scheme

The performance of DE is significantly influenced by its control parameter settings, but the interaction between the two is still complicated and not completely understood. Many adaptive DE variants have therefore been proposed to dynamically adjust control parameter values without requiring knowledge of the relationship among strategies, control parameters, and the characteristics of given problems.

Abbass (2002) adapted the crossover rate CR by encoding it into each individual so that it evolves simultaneously with the other parameters for multi-objective optimization problems; the scaling factor F is generated for each variable from a Gaussian distribution N(0, 1). Teo (2006) then proposed a DE variant with a dynamic population sizing strategy called DESAP, based on Abbass' self-adaptive Pareto DE. He implemented two versions of DESAP using absolute and relative encoding methodologies for dynamically self-adapting the population size parameter. Das et al. (2005) linearly reduced the scaling factor F from a maximum to a minimum value with increasing generation count. They also employed a uniform distribution between 0.5 and 1.5, with a mean value of 1, to obtain a new hybrid DE variant. Brest et al. (2006) encoded the control parameters F and CR into the individuals and adjusted them by introducing two new parameters \(\tau _{1} \) and \(\tau _{2} \). In their algorithm (called jDE), a set of F values is assigned to the individuals in the population. Then, a random number rand is uniformly generated in the range [0, 1]. If \(\hbox {rand} < \tau _{1}\), F is re-initialized to a new random value in the range [0.1, 1.0]; otherwise it is kept unchanged. CR is adapted in the same manner but re-initialized within the range [0, 1].
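
As a rough sketch of the jDE update rule described above: the lower bound 0.1 and width 0.9 for F, and the illustrative defaults \(\tau _{1} = \tau _{2} = 0.1\), are assumptions based on the commonly cited jDE settings rather than details given in this paper:

```python
import numpy as np

def jde_update(F_i, CR_i, tau1=0.1, tau2=0.1, rng=np.random.default_rng()):
    """Sketch of jDE-style self-adaptation: with probability tau1, re-initialize F
    within [0.1, 1.0]; with probability tau2, re-initialize CR within [0, 1];
    otherwise keep the values inherited by the individual."""
    if rng.random() < tau1:
        F_i = 0.1 + 0.9 * rng.random()
    if rng.random() < tau2:
        CR_i = rng.random()
    return F_i, CR_i
```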

Later, since adapting not only the parameter values but also the trial vector generation strategies may bring further performance gains for DE, some outstanding contemporary adaptive DE schemes focusing on both adaptations have been proposed. Qin et al. (2009) proposed a self-adaptive DE algorithm named SaDE, in which both the trial vector generation strategies and their associated control parameter values are gradually self-adapted by learning from previous experience in generating promising solutions. As a result, suitable strategies with their control parameter settings can be determined adaptively to match different phases of the search process. The performance of SaDE was extensively evaluated on 26 bound-constrained numerical optimization problems and compared with several conventional DEs and state-of-the-art adaptive DE variants. Zhang and Sanderson (2009) proposed an adaptive differential evolution called JADE to improve the performance of DE by implementing an efficient mutation scheme, “DE/current-to-pbest”, with an optional external archive. “DE/current-to-pbest” is a generalization of the conventional “DE/current-to-best”, and the optional archive saves historical information that has the potential to provide promising search directions for the evolution progress. The simulation results show that JADE can not only diversify the population but also improve convergence performance. Liu and Lampinen (2005) introduced a fuzzy adaptive differential evolution (FADE) that uses fuzzy logic controllers, whose inputs incorporate relative function values and individuals of successive generations, to adapt the parameters of the mutation and crossover operations. The experimental results show that the FADE algorithm outperforms several conventional DEs on higher dimensional problems. Mallipeddi et al. (2011) proposed an ensemble of trial vector generation strategies and control parameters called EPSDE. It contains a pool of distinct trial vector generation strategies and a pool of control parameter values that compete to produce offspring during the evolution process. Yong et al. (2011) combined several effective trial vector generation strategies with several control parameter settings in a method called composite DE (CoDE). It uses three trial vector generation strategies and three control parameter settings, randomly combining them to generate promising solutions.

In recent years, an increasing number of researchers have devoted their efforts to enhancing the efficiency and effectiveness of DE. Zou et al. (2013) introduced a modified differential evolution algorithm (MDE), in which Gaussian and uniform distributions are employed to adjust the scaling factor F and crossover rate CR, and two mutation strategies are adopted to produce better solutions. To guarantee the quality of the population, MDE uses an external archive, and some high-quality solutions in the external archive can be selected as candidate solutions. Lu et al. (2014) proposed a framework called DE with surrogate-assisted self-adaptation (DESSA) to deal with computationally expensive problems. DESSA generates multiple trial vectors using different trial vector generation strategies and parameter settings, and then employs a surrogate model to identify the potentially best trial vector to undergo real fitness evaluation. The surrogate model acts as a strategy and parameter setting selector, aiming to identify the most suitable strategy and parameter setting for each target vector. Yeh et al. (2014) proposed a novel grey adaptive DE algorithm (GADE) that updates the control parameters in an adaptive manner with the help of grey relational analysis. In this algorithm, each target vector has its own values of the scaling factor F and crossover rate CR, depending on the corresponding grey relational grade. Since the relational grade of an individual varies over the generations, those two control parameters are adaptively time-varying. Based on this work coupled with grey evolutionary analysis (GEA), they also proposed a GEA-based DE algorithm (GEA-DE) that enables the control parameters F and CR to adapt during the evolution process by introducing a GEA-based parameter automation approach (Yeh et al. 2015). The simulation results show that GEA-DE with the best mutation strategy (GEA/best) outperforms DE/rand/1, DE/best/1, jDE, and SaDE on most of the test functions in terms of solution accuracy and convergence speed.

3 The time-frame adaptive differential evolution (TFADE) algorithm

To solve a given problem with DE, using diverse trial vector generation strategies coupled with various control parameter settings at different stages of evolution can be more effective than using a single one (Mallipeddi et al. 2011; Qin et al. 2009; Yong et al. 2011). Moreover, the performance of DE depends heavily on the selection of the trial vector generation strategy and its control parameter setting, and a suitable choice for a given problem depends on the characteristics of that problem. It is common to perform a trial-and-error search to choose a trial vector generation strategy and fine-tune its control parameter values, which is a time-consuming and error-prone task. In the last decade, DE researchers have reported many empirical guidelines for selecting suitable trial vector generation strategies and control parameter settings for various problems. However, when solving a practical problem, the characteristics of the problem are usually ambiguous, which easily leads to poor choices. An inappropriate choice of strategy and control parameter setting may lead to premature convergence or stagnation during evolution, as has been extensively surveyed (Gämperle et al. 2002; Lampinen and Zelinka 2000; Price et al. 2005; Zaharie 2003). To reduce the risk of premature convergence during evolution, a stagnation breaking scheme embedded in DE should be able to improve its performance.

Towards this direction, a DE variant with time-frame strategy adaptation, called the time-frame adaptive differential evolution (TFADE) algorithm, is proposed. TFADE embeds several specific trial vector generation strategies together with a restricted yet diverse set of control parameter values. It gradually adapts the selection of suitable strategies to produce promising solutions according to successful experience over a period of preceding generations. In addition, a stagnation breaking scheme based on extreme control parameter settings is introduced to deal with premature convergence or stagnation during evolution. The idea of TFADE is described as follows.

3.1 The time-frame strategy candidate pool

A time-frame strategy candidate pool containing several diverse and effective strategies is maintained. The characteristics of many different trial vector generation strategies have been extensively investigated (Mallipeddi et al. 2011; Qin et al. 2009; Yong et al. 2011). Putting too many strategies into the pool might have an unfavorable and ineffective influence, reducing the chance of generating promising offspring. A good candidate pool should be restricted but exhibit effective and diverse characteristics. In TFADE, several trial vector generation strategies with effective and diverse characteristics are selected into the time-frame strategy candidate pool, as follows.

Table 1 Three trial vector generation strategies in the time-frame strategy candidate pool
Table 2 Guidelines for choosing control parameter values

“DE/rand/1/bin” is the most commonly used trial vector generation strategy in the DE literature. It is robust, has no bias toward any search direction, and explores the search space in a random manner. Its extension, “DE/rand/2/bin”, can lead to a better perturbation, improving diversity and generating a wider variety of trial vectors because two difference vectors are added to the base vector (Gämperle et al. 2002; Storn and Price 1997). Both are robust and diverse and can provide broad exploration capability for any problem; in particular, they are suitable for solving multimodal problems (Qin et al. 2009). For these reasons, both strategies are selected into the time-frame strategy candidate pool.

“DE/current-to-rand/1” is a rotation-invariant strategy, which makes it special among the candidates and more effective for rotated problems. It commonly uses rotation-invariant arithmetic crossover rather than binomial crossover to generate trial vectors (Das et al. 2009; Price 1999). Owing to its effectiveness and robustness, especially on rotated problems, it is selected into the pool.

The strategies “DE/best/1/bin”, “DE/best/2/bin”, “DE/rand-to-best/1/bin”, and “DE/rand-to-best/2/bin” employ the information of the best individual found so far. They are suitable for solving unimodal problems but less effective on highly multimodal problems, where they easily get stuck at local optima. For these reasons, we do not put these strategies into the pool.

All strategies selected into the time-frame strategy candidate pool are shown in Table 1. Owing to its popularity in DE research, the binomial crossover operation is applied to the strategies in the pool, except for “DE/current-to-rand/1”.

Fig. 1 The process of TFADE

Fig. 2 A time-frame container of size TF

Table 3 Algorithmic description of TFADE
Table 4 Experimental results of TFADE with conventional DEs on D10 functions
Table 5 Experimental results of TFADE with conventional DEs on D30 functions

3.2 The parameter candidate pool

Two kinds of parameter candidate pools are maintained in TFADE: the time-frame parameter candidate pool and the extreme parameter candidate pool. Several empirical guidelines for choosing suitable parameter values for various problems have been investigated, and a detailed examination of the most important ones guides the selection of a good combination of parameters for the pools.

Storn and Price (1997) suggested that a reasonable value for the population size NP should be between 5D and 10D. A good initial choice of F is 0.5, and the effective range of F values is suggested to be between 0.4 and 1. If the population converges prematurely, either F or NP can be increased. A first reasonable attempt at choosing CR is 0.1; however, because a larger CR value can speed up convergence, a value of 0.9 for CR may be a good initial choice if the problem is nearly unimodal or fast convergence is desired. Gämperle et al. (2002) examined different parameter settings for DE on the Sphere, Rosenbrock, and Rastrigin functions. Their experimental results showed that the searching capability and convergence speed are very sensitive to the choice of NP, F and CR. They recommended that the population size NP should be between 3D and 8D, the scaling factor F equal to 0.6, and the crossover rate CR between 0.3 and 0.9. Ronkkonen et al. (2005) suggested that the population size NP should be between 2D and 4D, that F values between 0.4 and 0.95 with \(F=0.9\) as an initial choice work well, and that CR values should lie in [0, 0.2] for separable functions and in [0.9, 1] for multimodal and non-separable functions. Runarsson and Xin (2000) showed that although \(F\in \) [0.3, 1] has been found to be effective (Mezura-Montes and Coello Coello 2003), the use of a large population in conjunction with a large F will cause the algorithm to converge too slowly to produce a good result by the end of its run.

Fig. 3 The median convergence characteristics of TFADE with conventional DEs on D10 test functions: (a) the unimodal function \(f_{1}\), (b) the basic multimodal function \(f_{6}\), (c) the expanded multimodal function \(f_{13}\), (d) the hybrid composition function \(f_{15}\)

Fig. 4 The median convergence characteristics of TFADE with conventional DEs on D30 test functions: (a) the unimodal function \(f_{1}\), (b) the basic multimodal function \(f_{6}\), (c) the expanded multimodal function \(f_{13}\), (d) the hybrid composition function \(f_{15}\)

Montgomery and Chen (2010) presented an analysis of DE explaining why low and high values of CR are effective while values around 0.5 are not. At its extremes, CR leads to vastly different search behaviors. Low values of CR result in a search that is not only aligned with a small number of search space axes but also gradual, slow and robust. High values of CR result in a search where fewer of the generated solutions may be improving, but the improvements can be larger. Coello Coello (2002) showed that a key parameter affecting DE's performance is the crossover rate CR; CR = 0.9 has been found to work well across a large range of problem domains and is often used as a default value in many DE implementations. Yong et al. (2011) showed that three trial vector generation strategies, “rand/1/bin”, “rand/2/bin”, and “current-to-rand/1”, and three control parameter settings, [\(F =1.0\), CR = 0.1], [\(F =1.0\), CR = 0.9], and [\(F =0.8\), CR = 0.2], are frequently used in many DE variants, and their properties have been well studied. Mallipeddi et al. (2011) suggested that separable unimodal problems require lower CR values while parameter-linked multimodal problems require higher CR values. These guidelines are summarized in Table 2.

Based on these guidelines, the parameter values should be restricted, to enhance speed and effectiveness, while remaining diverse enough to handle various problems. Therefore, m values of F and CR are chosen in equal steps within the ranges [0.3, 1] and [0.1, 0.9], respectively, which nearly cover the above-mentioned guidelines, and are placed into the time-frame parameter candidate pool.

However, once premature convergence or stagnation occurs during evolution, some extraordinary parameter values need to be employed instead to break up the situation. The guidelines show that F is closely related to convergence speed: a higher F value can help avoid premature convergence or stagnation, while a lower F may increase convergence speed. CR is usually more sensitive to the specific problem: separable unimodal problems need lower CR values while multimodal ones require higher CR values. Therefore, an extreme parameter candidate pool is constructed in which only the extreme values \(F=\left\{ 1 \right\} \) and CR = \(\left\{ 0.1, 0.9 \right\} \) are included, to increase the chance of escaping stagnation. Thus, in the stagnation breaking scheme of TFADE, once premature convergence occurs during evolution, the extreme parameter candidate pool is enabled instead of the time-frame parameter candidate pool to deal with the difficulty. As for the parameter NP, it does not need to be fine-tuned, because just a few typical values can be tried according to the complexity of a given problem (Qin et al. 2009); therefore, it is left as a user-defined parameter.
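
A minimal sketch of the two parameter candidate pools follows; the equal-step construction and the value m = 5 (which matches the setting reported in Sect. 4) reflect our reading of the description above and are not a prescribed implementation:

```python
import numpy as np

m = 5  # number of F and CR values kept in the time-frame parameter candidate pool

# Time-frame parameter candidate pool: m values in equal steps over the
# suggested ranges F in [0.3, 1] and CR in [0.1, 0.9].
time_frame_pool = {
    "F":  np.linspace(0.3, 1.0, m),   # 0.3, 0.475, 0.65, 0.825, 1.0
    "CR": np.linspace(0.1, 0.9, m),   # 0.1, 0.3, 0.5, 0.7, 0.9
}

# Extreme parameter candidate pool, enabled only by the stagnation breaking scheme.
extreme_pool = {"F": [1.0], "CR": [0.1, 0.9]}
```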

3.3 The TFADE algorithm

The process of TFADE is shown in Fig. 1. The first important scheme is the time-frame container, which maintains the crucial information to be monitored during evolution, such as the strategy/parameter candidate pools, the selection probabilities of strategies, and the best fitness of individuals (discussed later), over a given period of preceding generations called the time frame. For each target vector in the current population, one trial vector generation strategy is selected from the time-frame strategy candidate pool in the container, according to the selection probability of the strategy, which is calculated from its success in generating promising solutions within the time frame. The selection probability of a strategy is defined as the normalized success rate with which a trial vector generated by that strategy successfully enters the next generation within the time frame. During evolution, the time frame moves forward along the time horizon of generations, and the selection probability of each strategy and the best fitness of individuals are gradually updated and stored in the container as different generations are experienced. A time-frame container of size TF evolving from generation G to G+t is illustrated in Fig. 2.
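
One possible realization of the time-frame container is a fixed-length sliding window over the last TF generations, sketched below; the record layout (per-generation strategy selections, successes, and best fitness) is our own assumption, since the paper prescribes only what is monitored, not the data structure:

```python
from collections import deque

class TimeFrameContainer:
    """Keeps per-generation records (strategy selections, successes, best fitness)
    for the most recent TF generations; older records are dropped automatically."""

    def __init__(self, TF):
        self.TF = TF
        self.records = deque(maxlen=TF)   # the earliest record is evicted on overflow

    def push(self, selections, successes, best_fitness):
        # selections[i][s] = 1 if strategy s generated the trial vector for target i
        # successes[i][s]  = 1 if that trial vector entered the next generation
        self.records.append(
            {"selections": selections, "successes": successes, "best": best_fitness}
        )

    def clear(self):
        # Used by the stagnation breaking scheme to discard the current records.
        self.records.clear()
```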

Table 6 Experimental results of TFADE with state-of-the-art DEs on D10 functions
Table 7 Experimental results of TFADE with state-of-the-art DEs on D30 functions
Fig. 5 The median convergence characteristics of TFADE with state-of-the-art adaptive DEs on D10 test functions: (a) the unimodal function \(f_{1}\), (b) the basic multimodal function \(f_{6}\), (c) the expanded multimodal function \(f_{13}\), (d) the hybrid composition function \(f_{15}\)

Fig. 6 The median convergence characteristics of TFADE with state-of-the-art adaptive DEs on D30 test functions: (a) the unimodal function \(f_{1}\), (b) the basic multimodal function \(f_{6}\), (c) the expanded multimodal function \(f_{13}\), (d) the hybrid composition function \(f_{15}\)

More specifically, for the selection probability of a strategy, assume a time frame of size TF and S strategies maintained in the time-frame strategy candidate pool. The selection probability of each strategy for a target vector \({{\mathbf x}}_{i}\), where \(i = 1, 2, \ldots\), NP, in the initialized population is 1/S, i.e., all strategies have the same probability of being selected at the beginning.

During evolution, at the current generation G after TF generations have elapsed, the success rate of a strategy s, i.e., the rate at which trial vectors \({{\mathbf u}}_{i}\) generated by that strategy enter the next generation, can be calculated by

$$\begin{aligned} \mathrm{SucRate}_{i,s,G} =\frac{\sum \nolimits _{g=G-\mathrm{TF}}^G {\mathrm{Success}_{i,s,g} } }{\sum \nolimits _{g=G-\mathrm{TF}}^G {\mathrm{Selection}_{i,s,g} } }+\varepsilon , \end{aligned}$$
(10)

where Selection\(_{i,s,g}=\left\{ 0, 1 \right\} \) indicates whether the strategy s is selected to generate \({{\mathbf u}}_{i}\) at generation g, and Success\(_{i,s,g}=\left\{ 0, 1 \right\} \) indicates whether the resulting trial vector successfully entered the next generation at generation g. \(g=(G-\mathrm{TF}), (G-\mathrm{TF}+1), \ldots , G\) denotes the generations within the time frame. Note that a small value \(\varepsilon \) is added to ensure a nonzero success rate, so that all strategies in the time-frame strategy candidate pool always have a chance of being selected.

Therefore, the selection probability of the strategy s at generation G for a target vector \({{\mathbf x}}_{i}\) is the normalized success rate of the strategy s:

$$\begin{aligned} \mathrm{SelProb}_{i,s,G} =\frac{\mathrm{SucRate}_{i,s,G} }{\sum \nolimits _{s=1}^S {\mathrm{SucRate}_{i,s,G} } }, \end{aligned}$$
(11)

where \(s=1,2, \ldots , S\). At this stage, the roulette wheel selection method is employed to select a promising trial vector generation strategy for the target vector \({{\mathbf x}}_{i}\).
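
Equations (10) and (11), together with the roulette wheel selection, might be implemented as sketched below; this reuses the record layout assumed in the container sketch above, and the guard against division by zero as well as the concrete value of \(\varepsilon \) are our additions:

```python
import numpy as np

def select_strategy(records, i, S, eps=0.01, rng=np.random.default_rng()):
    """Compute SucRate (Eq. 10) and SelProb (Eq. 11) for target vector i over the
    records kept in the time frame, then pick a strategy by roulette wheel."""
    selections = np.zeros(S)
    successes = np.zeros(S)
    for rec in records:                       # generations g = G-TF, ..., G
        selections += rec["selections"][i]
        successes += rec["successes"][i]
    # Eq. (10): success rate per strategy, with eps keeping every rate nonzero.
    suc_rate = successes / np.maximum(selections, 1) + eps
    # Eq. (11): normalized success rates become the selection probabilities.
    sel_prob = suc_rate / suc_rate.sum()
    # Roulette wheel selection of one strategy index.
    return rng.choice(S, p=sel_prob)
```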

After a trial vector generation strategy is selected, the values of its control parameters, F and CR, are randomly chosen from the time-frame parameter candidate pool. The selected strategy and control parameter values are then applied to the corresponding target vector to generate a trial vector.

All these dynamic data are maintained in the time-frame container and gradually adapted during evolution. Once the records exceed TF generations, the earliest record is removed, keeping the most up-to-date TF generations of records in the container to be monitored.

The above procedure constitutes the typical evolution process of TFADE. However, if premature convergence or stagnation occurs throughout the time frame of size TF during evolution, the typical evolution process in pursuit of promising solutions may have become dysfunctional at this stage. To escape this situation, the stagnation breaking scheme immediately takes control of the strategy/parameter selection. This mechanism discards the current selection records in the container at once, stops monitoring and recording for TF generations, and randomly selects strategies and parameters from the time-frame strategy candidate pool and the extreme parameter candidate pool, respectively, for TF generations to break out of the difficulty. After TF generations, the typical evolution process automatically switches back on to adapt promising solutions again in the current population. The algorithmic description of TFADE is given in Table 3.
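
The switch between the typical evolution process and the stagnation breaking scheme might look as follows; this sketch reuses the helpers and pools from the earlier sketches, and the stagnation test (no improvement of the recorded best fitness over the whole TF window) is our interpretation of the monitored information, not a criterion stated explicitly above:

```python
import numpy as np

def stagnation_detected(container):
    """Assumed criterion: the best fitness recorded in the time-frame container has
    not improved over the whole TF-generation window."""
    best = [rec["best"] for rec in container.records]
    # Best fitness is non-increasing under greedy selection, so "no improvement"
    # means the latest value is no better than the oldest one in the window.
    return len(best) == container.TF and min(best) >= best[0]

def choose_strategy_and_params(container, i, S, breaking, time_frame_pool,
                               extreme_pool, rng=np.random.default_rng()):
    """If the stagnation breaking scheme is active, pick the strategy and the extreme
    parameters purely at random; otherwise use the adaptive selection probabilities."""
    if breaking:
        s = rng.integers(S)
        F = rng.choice(extreme_pool["F"])
        CR = rng.choice(extreme_pool["CR"])
    else:
        s = select_strategy(container.records, i, S, rng=rng)
        F = rng.choice(time_frame_pool["F"])
        CR = rng.choice(time_frame_pool["CR"])
    return s, F, CR
```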

Table 8 Experimental results of TFADE with non-DE approaches on D10 functions

4 The experimental study

Three kinds of experiments were conducted in the experimental study. First, TFADE was compared with 4 commonly used conventional DEs coupled with widely suggested parameter settings. Next, TFADE was compared with 3 outstanding state-of-the-art adaptive DEs proposed recently, i.e., SaDE (Qin et al. 2009), EPSDE (Mallipeddi et al. 2011), and CoDE (Yong et al. 2011). Last, 2 novel non-DE algorithms, i.e., CLPSO (Liang et al. 2006) and CMA-ES (Hansen and Ostermeier 2001), were also compared. To evaluate the performance of TFADE against these approaches, a test suite of benchmark functions (Suganthan et al. 2005) reported in the 2005 IEEE Congress on Evolutionary Computation (CEC2005) was adopted. The test suite comprises 25 benchmark functions that can be divided into 4 groups:

1. 5 unimodal functions: \(f_1 -f_5 \),
2. 7 basic multimodal functions: \(f_{6} -f_{12} \),
3. 2 expanded multimodal functions: \(f_{13} -f_{14} \),
4. 11 hybrid composition functions: \(f_{15} -f_{25} \).

Both the 10-dimensional (D10) and 30-dimensional (D30) versions of the benchmark functions were considered. 30 independent runs were conducted for each algorithm on each function. The population size NP was set to 50. The maximum number of function evaluations, used as the termination criterion, was set to 10,000 times the function dimension, i.e., 100,000 for the D10 functions and 300,000 for D30. The mean and standard deviation of the function error \(f(x_\mathrm{best} )-f(x^*)\) were calculated to measure the performance of each algorithm on a function f, where \(x_\mathrm{best}\) is the best solution of the function found by an algorithm in an independent run and \(x^{*}\) is the global optimum of the function. The number of preceding generations for the time frame in TFADE was set to 10, and \(m=5\) was chosen as the number of F and CR values placed into the time-frame parameter candidate pool.

Table 9 Experimental results of TFADE with non-DE approaches on D30 functions

To assess the statistical significance of the difference between two algorithms, a t test with a significance level of \(\alpha =0.05\) was performed on each pair of experimental results between TFADE and a compared algorithm on each function. The indicators "\(+\)", "\(=\)", and "\(-\)" denote that the outcome of the corresponding algorithm is statistically better than, similar to, and worse than that of TFADE on a function, respectively. To show the overall degree of superiority of TFADE against a competitor on the test suite of all 25 functions, we define a quality measurement called the advantage percentage of TFADE, denoted by %adv, as given in Eq. (12).

$$\begin{aligned} \% \mathrm{adv}=\frac{\text{ num. } \text{ of } \text{ worse-performance } \text{ func. } (\text{``}-\text{''})-\text{ num. } \text{ of } \text{ better-performance } \text{ func. } (\text{``}+\text{''})}{\text{ total } \text{ func. } (25)-\text{ num. } \text{ of } \text{ similar-performance } \text{ func. } (\text{``}=\text{''})} \end{aligned}$$
(12)

All these performance records are listed in the last 4 rows of the experimental result tables for comparison.
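
For illustration, with hypothetical counts that are not taken from the result tables: if a competitor were statistically worse than TFADE on 20 functions, better on 1, and similar on 4, then

$$\begin{aligned} \% \mathrm{adv}=\frac{20-1}{25-4}=\frac{19}{21}\approx 90\,\% . \end{aligned}$$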

Table 10 The advantage percentages of TFADE

4.1 Comparison with conventional DEs

Five test instances of four conventional DEs with fixed control parameter settings, “Rand/1/bin (\(F=0.9\), CR = 0.1)”, “Rand/1/bin (\(F=0.9\), CR = 0.9)”, “Rand-to-best/1/bin (\(F=0.9\), CR = 0.1)”, “Current-to-rand/1 (\(F=0.9\), CR = 0.1)” and “Rand/2/bin (\(F=0.9\), CR = 0.1)”, were compared with TFADE. The experimental results on the D10 and D30 functions are presented in Tables 4 and 5, respectively. Figures 3 and 4 show the convergence characteristics of each test instance for 4 functions in terms of the median of the best fitness over 30 independent runs on the D10 and D30 functions, respectively. The 4 functions, one chosen from each group, are (a) the unimodal function \(f_{1}\), (b) the basic multimodal function \(f_{6}\), (c) the expanded multimodal function \(f_{13}\), and (d) the hybrid composition function \(f_{15}\).

The experimental results show that TFADE is clearly outstanding among these test instances on most D10 and D30 functions, especially on the groups of unimodal functions (\(f_1 -f_5 )\) and basic multimodal functions (\(f_{6} -f_{12} )\): none of the competitors is significantly better than TFADE on the D10 and D30 test functions in these two groups, except “Current-to-rand/1 (\(F=0.9\), CR = 0.1)” on 2 D10 functions. The advantage percentages of TFADE over each competitor on the test suite are 87, 83, 83, 77, and 83 % for the D10 functions and 90, 95, 90, 100, and 90 % for D30, respectively. Overall, the performance of TFADE is significantly better than that of these competitors; in particular, it markedly outperforms them on both the unimodal and basic multimodal function groups and on the high-dimensional (30-dimensional) functions.

4.2 Comparison with state-of-the-art DEs

TFADE was also compared with three state-of-the-art adaptive DE variants proposed recently, i.e., SaDE, EPSDE, and CoDE. For these approaches, the control parameter settings from their original papers were employed. The experimental results on the D10 and D30 functions are presented in Tables 6 and 7, respectively. Figures 5 and 6 also illustrate the convergence characteristics of each approach for the 4 aforementioned functions in terms of the median of the best fitness over 30 independent runs on the D10 and D30 functions, respectively.

Clearly, the experimental results show that TFADE also outperforms the state-of-the-art adaptive DE variants on all D10 and D30 functions in the groups of unimodal functions (\(f_1 -f_5 )\) and basic multimodal functions (\(f_{6} -f_{12} )\). None of them is significantly better than TFADE on the D10 and D30 test functions, except EPSDE on 2 D30 functions and CoDE on 2 D30 functions. The advantage percentages of TFADE over each approach on the test suite are 78, 95, and 80 % for the D10 functions and 79, 65, and 71 % for D30, respectively. In summary, the performance of TFADE is better than that of SaDE, EPSDE, and CoDE, especially on the groups of unimodal and basic multimodal functions.

4.3 Comparison with non-DE approaches

Two novel non-DE algorithms, CLPSO and CMA-ES, were compared with TFADE in this experiment. CLPSO, proposed by Liang et al. (2006), is an outstanding PSO variant, and CMA-ES, proposed by Hansen and Ostermeier (2001), is a very efficient and well-known evolution strategy. Both algorithms were run with the control parameter settings used in their original papers. The experimental results on the D10 and D30 functions are shown in Tables 8 and 9, respectively.

The experimental results show that TFADE performs better than CLPSO and CMA-ES on 21 and 22 out of 25 D10 functions, and on 20 and 12 out of 25 D30 functions, respectively. The advantage percentages of TFADE over the two are 84 and 88 % for the D10 functions and 83 and 57 % for D30, respectively. In summary, TFADE has a clear advantage over CLPSO and CMA-ES, except that its advantage over CMA-ES on the high-dimensional (30-dimensional) functions is only slight.

5 Conclusion

The performance of DE is largely determined by the trial vector generation strategy and the control parameter setting, and the effectiveness of a strategy is significantly influenced by its control parameter values. In recent years, DE researchers have reported many empirical guidelines for selecting suitable trial vector generation strategies and control parameter settings for problems with various features. A comprehensive exploitation of this experience is a good way to develop an advanced DE variant.

Towards this direction, in this paper, an improved DE variant with time-frame strategy adaptation called the time-frame adaptive differential evolution (TFADE) algorithm is proposed. It employs diverse trial vector generation strategies with various control parameter values that exhibit distinct characteristics. Both the strategies and the control parameter values can be adaptively determined to generate promising solutions and dynamically adjusted to cope with premature convergence or stagnation when it occurs during evolution, according to successful experience over a period of preceding generations called the time frame.

Although TFADE is simple and easy to implement, its performance is outstanding. In the experimental study, the performance of TFADE was compared with that of 5 test instances of 4 commonly used conventional DEs, i.e., “Rand/1/bin (\(F=0.9\), CR = 0.1)”, “Rand/1/bin (\(F=0.9\), CR = 0.9)”, “Rand-to-best/1/bin (\(F=0.9\), CR = 0.1)”, “Current-to-rand/1 (\(F=0.9\), CR = 0.1)” and “Rand/2/bin (\(F=0.9\), CR = 0.1)”, 3 outstanding state-of-the-art adaptive DEs, i.e., SaDE, EPSDE, and CoDE, and 2 novel non-DE approaches, i.e., CLPSO and CMA-ES, evaluated on a test suite of 25 benchmark functions reported in CEC2005. The experimental results show that its comprehensive performance is significantly superior to that of these competitors. The overall degree of superiority of TFADE against these competitors, i.e., the advantage percentage (%adv), also shows that TFADE has an excellent advantage over them. The advantage percentages of TFADE on the experimental results are summarized in Table 10.

Several directions are considered for further research. From the literature review, we conclude that putting too many strategies and parameter values into the strategy/parameter candidate pools can have an unfavorable and ineffective influence, which could reduce the performance of DE. Therefore, the strategies and parameter settings involved in the time-frame container should be restricted while preserving diversity. It is worth studying whether the performance of TFADE can be significantly improved if the number of strategies and parameter values placed into the container is dynamically adapted during evolution: which strategies and parameter values, and how many of them, should be combined to be more effective? Moreover, the practical application of TFADE is also an important issue. In recent years, many distinctive topics have been reported, such as industrial applications on wireless sensor networks and RFID (Zhang et al. 2014a, b, 2015b), web-based mobile applications (Zhang 2012; Zhang and Liang 2013; Zhang et al. 2015a; Zhang and Zhang 2011), image de-noising (Zhang et al. 2012b), and the Internet of Things (Zhang et al. 2012a). The practical application of TFADE to these subjects is well worth investigating in the future.