1 Introduction

Evolutionary algorithms (EAs, including genetic algorithms, evolution strategies, evolutionary programming, and genetic programming) have received much attention as global optimization techniques (Bäck 1996), both in single- and in multi-objective optimization. Inspired by natural evolution and the survival of the fittest, EAs utilize a collective learning process of a population of individuals. Descendants of individuals are generated using randomized operations such as mutation and recombination. Mutation corresponds to a self-replication of individuals, while recombination exchanges information between two or more existing individuals. According to a fitness measure, the selection process favors better individuals, allowing them to reproduce more often than those that are relatively worse.

Differential evolution (DE) (Storn and Price 1997) is a simple yet powerful population-based, direct search algorithm with a generate-and-test feature for global optimization problems with real-valued parameters. DE uses distance and direction information from the current population to guide the further search. It won third place at the first International Contest on Evolutionary Computation on a real-valued function test suite (Storn and Price 2008). Among DE's advantages are its simple structure, ease of use, speed, and robustness. Price and Storn (1997) presented the working principle of DE with a single scheme; later, they suggested ten different schemes of DE (Storn and Price 2008; Price et al. 2005). However, DE has been shown to have certain weaknesses, especially when the global optimum must be located with a limited number of fitness function evaluations (NFFEs). In addition, DE is good at exploring the search space and locating the region of the global minimum, but it is slow at exploiting the solutions (Noman and Iba 2008).

Biogeography-based optimization (BBO), proposed by Simon (2008a, b), is a new global optimization algorithm based on biogeography theory, the study of the geographical distribution of biological organisms. Similar to genetic algorithms (GAs), BBO is a population-based, stochastic global optimizer. In the original BBO algorithm, each solution of the population is a vector of integers. BBO adopts the migration operator to share information between solutions. This feature is similar to other biology-based algorithms, such as GAs and PSO, and it makes BBO applicable to many of the same types of problems. However, BBO also has several features that are unique among biology-based algorithms; for example, it maintains its set of solutions from one iteration to the next (Simon 2008a, b). Simon compared BBO with seven state-of-the-art EAs on 14 benchmark functions and a real-world sensor selection problem, and the results demonstrated the good performance of BBO. With the migration operator, BBO has a good exploitation ability.

Hybridization of EAs is becoming more and more popular due to its capability in handling several real-world problems (Grosan et al. 2009). A good review of hybrid metaheuristics with EAs specializing in intensification and diversification can be found in Lozano and García-Martínez (2010). In order to balance the exploration and the exploitation of DE, in this paper we propose a hybrid of DE with BBO, referred to as DE/BBO, for global numerical optimization problems. In DE/BBO, a hybrid migration operator is proposed, which effectively combines the exploration of DE with the exploitation of BBO. Experiments are conducted on 23 benchmark functions chosen from the literature. In addition, five performance criteria are employed to fairly compare our approach with other algorithms. Furthermore, the influence of the population size, the dimensionality, different mutation schemes, and the self-adaptive control parameters of DE is also investigated.

The rest of this paper is organized as follows. Section 2 briefly describes the function optimization problem, the DE algorithm, and the BBO algorithm. In Sect. 3, related work on DE is presented. Our proposed approach is presented in detail in Sect. 4. In Sect. 5, we verify our approach on 23 benchmark functions, and the experimental results are compared with those of several other approaches. The last section, Sect. 6, is devoted to conclusions and future work.

2 Preliminary

2.1 Problem definition

Global numerical optimization problems arise frequently in almost every field of engineering design, the applied sciences, molecular biology, and other scientific applications. Without loss of generality, the global minimization problem can be formalized as a pair \((S, f)\), where \(S\,\subseteq\,R^D\) is a bounded set on \(R^D\) and \(f:S\,\rightarrow\,R\) is a D-dimensional real-valued function. The problem is to find a point \(X^{\ast}\,\in\,S\) such that \(f(X^{\ast})\) is the global minimum on S (Yao et al. 1999). More specifically, it is required to find an \(X^{\ast}\,\in\,S\) such that

$$ \forall X \in S:f(X^{\ast})\,\leq\,f(X) $$
(1)

where f does not need to be continuous but must be bounded. In this work, we only consider unconstrained function optimization.

In global numerical optimization problems, the major challenge is that an algorithm may be trapped in the local optima of the objective function. This issue is particularly challenging when the dimension is high. Recently, using evolutionary computation (EC) (Bäck 1996) to solve global optimization problems has been a very active research area, producing different kinds of EC methods for optimization in the continuous domain, such as GAs (Zhong et al. 2004; Herrera et al. 1998; Herrera 2000), evolution strategies (Yao and Liu 1997; Hansen and Kern 2004; Auger and Hansen 2004), evolutionary programming (Yao et al. 1999), particle swarm optimization (PSO) (Liang et al. 2006; Langdon and Poli 2007), the immune clonal algorithm (Jiao et al. 2008), DE (Storn and Price 1997), memetic algorithms (Lozano et al. 2004), etc.

[Algorithm 1: pseudocode of the original DE algorithm]

2.2 Differential evolution

Differential evolution (Storn and Price 1997) is a simple EA that creates new candidate solutions by combining the parent individual and several other individuals of the same population. A candidate replaces the parent only if it has better fitness. Among DE’s advantages are its simple structure, ease of use, speed and robustness. Due to these advantages, it has been applied to many real-word applications, such as data mining (Alatas et al. 2008; Das et al. 2008), pattern recognition, digital filter design, neural network training, etc. (Price et al. 2005; Feoktistov 2006; Chakraborty 2008). Most recently, DE has also been used for the global permutation-based combinatorial optimization problems (Onwubolu and Davendra 2009).

The pseudo-code of the original DE algorithm is shown in Algorithm 1, where D is the number of decision variables, NP is the size of the parent population P, F is the mutation scaling factor, CR is the crossover probability, \(X_i(j)\) is the jth variable of the solution \(X_i,\) and \(U_i\) is the offspring. \(\hbox{rndint}(1,D)\) is a uniformly distributed random integer between 1 and D, and \(\hbox{rndreal}_j[0,1)\) is a uniformly distributed random real number in [0, 1). Many schemes for creating a candidate are possible. We use the DE/rand/1/bin scheme (see lines 6–13 of Algorithm 1); more details on DE/rand/1/bin and other DE schemes can be found in Storn and Price (2008) and Price et al. (2005).

From Algorithm 1, we can see that there are only three control parameters in this algorithm: NP, F, and CR. As for the termination conditions, one can either fix the maximum NFFEs (Max_NFFEs) or the precision of a desired solution VTR (value to reach).
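As an illustration, the DE/rand/1/bin trial-vector generation described above can be sketched in Python as follows. The function name and the NumPy-based representation are our own; this is a sketch of the scheme, not the authors' implementation.

```python
import numpy as np

def de_rand_1_bin(pop, i, F=0.5, CR=0.9, rng=None):
    """Generate the trial vector U_i for parent pop[i] (DE/rand/1/bin sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    NP, D = pop.shape
    # Pick three distinct indices r1, r2, r3, all different from i.
    candidates = [r for r in range(NP) if r != i]
    r1, r2, r3 = rng.choice(candidates, size=3, replace=False)
    mutant = pop[r1] + F * (pop[r2] - pop[r3])   # differential mutation
    j_rand = rng.integers(D)                     # rndint(1, D): forced crossover position
    cross = rng.random(D) < CR                   # binomial crossover mask
    cross[j_rand] = True                         # at least one component from the mutant
    return np.where(cross, mutant, pop[i])
```

The greedy selection of DE would then compare \(f(U_i)\) with \(f(X_i)\) and keep the better of the two.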

[Algorithm 2: one option for implementing the migration operator of BBO]

2.3 Biogeography-based optimization

BBO (Simon 2008a, b) is a new population-based, biogeography-inspired global optimization algorithm. In BBO, each individual is considered a “habitat” with a habitat suitability index (HSI), which is similar to the fitness in EAs, to measure the quality of the individual. A good solution is analogous to an island with a high HSI, and a poor solution to an island with a low HSI. High-HSI solutions tend to share their features with low-HSI solutions, and low-HSI solutions accept many new features from high-HSI solutions.

In BBO, each individual has its own immigration rate \(\lambda\) and emigration rate \(\mu.\) A good solution has a higher \(\mu\) and a lower \(\lambda,\) and vice versa. The immigration rate and the emigration rate are functions of the number of species in the habitat. They can be calculated as follows:

$$ \lambda_k = I \left(1-{\frac{k}{n}}\right) $$
(2)
$$ \mu_k = E \left({\frac{k}{n}}\right) $$
(3)

where I is the maximum possible immigration rate, E is the maximum possible emigration rate, and k is the species count of the kth individual in the population ordered according to fitness. For example, for a minimization problem the population is sorted so that the best solution has \(k=n,\) \(\mu_k=E,\) and \(\lambda_k=0;\) this means that the best solution shares more information with other, poorer solutions. n is the maximum number of species. Note that Eqs. 2 and 3 are just one method for calculating \(\lambda\) and \(\mu;\) there are other options based on different species models (Simon 2008a, b). In this work, we use Eqs. 2 and 3 to calculate the immigration and emigration rates, as proposed in Simon (2008a, b).
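Equations 2 and 3 amount to two linear ramps over the species count. A minimal sketch (the function name is ours) makes the end points explicit: the best solution, with \(k=n,\) gets \(\lambda=0\) and \(\mu=E,\) while the worst, with \(k=0,\) gets \(\lambda=I\) and \(\mu=0.\)

```python
def migration_rates(n, I=1.0, E=1.0):
    """Immigration rate lambda_k (Eq. 2) and emigration rate mu_k (Eq. 3)
    for species counts k = 0, 1, ..., n."""
    lambdas = [I * (1.0 - k / n) for k in range(n + 1)]  # decreases with k
    mus = [E * (k / n) for k in range(n + 1)]            # increases with k
    return lambdas, mus
```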

Suppose that we have a global optimization problem and a population of candidate individuals, where each individual is represented by a D-dimensional vector and the population consists of \(NP=n\) parameter vectors. In BBO, there are two main operators: migration and mutation. One option for implementing the migration operator is described in Algorithm 2, where \(\hbox{rndreal}(0,1)\) is a uniformly distributed random real number in (0,1) and \(X_i(j)\) is the jth SIV of the solution \(X_i.\) With the migration operator, BBO can share information among solutions; in particular, poor solutions tend to accept useful information from good solutions. This makes BBO good at exploiting the information of the current population. More details about the two operators can be found in Simon (2008a, b) and in the accompanying Matlab code (Simon 2008a, b).
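The migration option described above can be sketched in Python as follows. Each SIV of habitat \(X_i\) immigrates with probability \(\lambda_i,\) and the source habitat is chosen by roulette-wheel selection proportional to the emigration rates \(\mu.\) The function name and array representation are our own; this is a sketch, not Simon's Matlab code.

```python
import numpy as np

def bbo_migration(pop, lam, mu, i, rng=None):
    """One option for the BBO migration operator applied to habitat pop[i]."""
    rng = np.random.default_rng() if rng is None else rng
    NP, D = pop.shape
    new = pop[i].copy()
    probs = np.asarray(mu, dtype=float)
    probs = probs / probs.sum()            # roulette wheel over emigration rates
    for j in range(D):
        if rng.random() < lam[i]:          # immigrate this SIV?
            k = rng.choice(NP, p=probs)    # emigrating habitat
            new[j] = pop[k, j]
    return new
```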

3 Related work on DE

Previous research has pointed out three main drawbacks of the original DE algorithm. First, the parameters of DE are problem dependent, and their choice is often critical for the performance of DE (Gäperle et al. 2002; Liu and Lampinen 2005). Second, choosing the best among the different mutation schemes available for DE is also not easy for a specific problem (Qin and Suganthan 2005; Qin et al. 2009). Third, DE is good at exploring the search space and locating the region of the global minimum, but it is slow at exploiting the solutions (Noman and Iba 2008). Due to these drawbacks, many researchers are now working on improving DE, and many variants have been presented.

Adapting DE's control parameters is one possible improvement. Liu and Lampinen (2005) proposed a Fuzzy Adaptive DE (FADE), which employs fuzzy logic controllers to adapt the mutation and crossover control parameters. Brest et al. (2006) proposed self-adapting control parameter settings; their approach encodes the F and CR parameters into the chromosome and uses a self-adaptive control mechanism to change them. Salman et al. (2007) proposed a self-adaptive DE (SDE) algorithm that eliminates the need for manual tuning of control parameters. In SDE, the mutation scaling factor F is self-adapted by a mutation strategy similar to the mutation operator of DE. Nobakhti and Wang (2008) proposed a randomized adaptive differential evolution (RADE) method, in which a simple randomized self-adaptive scheme is used for the mutation weighting factor F. Das et al. (2005) proposed two variants of DE, DERSF and DETVSF, that use varying scale factors, and concluded that those variants outperform the original DE. Teo (2006) presented a DE with dynamic self-adaptive populations, in which the population size is self-adapted. On five of De Jong's test functions, they showed that DE with self-adaptive populations produced highly competitive results. Brest and Maučec (2008) proposed an improved DE method in which the population size is gradually reduced, and concluded that their approach improves the efficiency and robustness of DE. Teng et al. (2009) proposed a variant of DE in which two different techniques (absolute encoding and relative encoding) are used to implement a self-adaptive population size; the authors concluded that DE with the self-adaptive population size using relative encoding performed well (Teng et al. 2009).

Qin and Suganthan (2005) proposed a self-adaptive DE algorithm. The aim of their work was to allow DE to switch between two schemes, “DE/rand/1/bin” and “DE/best/2/bin”, and also to adapt the F and CR values. The approach performed well on several benchmark problems. Recently, Qin et al. (2009) extended their previous work (Qin and Suganthan 2005): in their SaDE, four schemes are adopted, and different CR values are used for the different mutation schemes. Their proposed algorithm outperformed the original DE and several other adaptive/self-adaptive DE variants (Qin et al. 2009).

Hybridization with other algorithms is another direction for the improvement of DE. Fan and Lampinen (2003) proposed a new version of DE that uses an additional mutation operation called the trigonometric mutation operation. They showed that the modified DE algorithm can outperform the classic DE algorithm on some benchmarks and real-world problems. Sun et al. (2005) proposed a new hybrid algorithm based on a combination of DE with an estimation of distribution algorithm (EDA); this technique uses a probability model to determine promising regions in order to focus the search process on those areas. Gong et al. (2006) employed two-level orthogonal crossover to improve the performance of DE, and showed that the proposed approach performs better than the classical DE in terms of the quality, speed, and stability of the final solutions. Noman and Iba (2005) proposed fittest individual refinement, a crossover-based local search (LS) method for DE to solve high-dimensional problems. Based on this previous work (Noman and Iba 2005), they incorporated LS into the classical DE in Noman and Iba (2008), presenting an LS technique that adaptively adjusts the length of the search using a hill-climbing heuristic. Through experiments, they showed that the proposed new version of DE performs better than, or at least comparably to, the classic DE algorithm. Kaelo and Ali (2007) adopted the attraction-repulsion concept of the electromagnetism-like algorithm to boost the mutation operation of the original DE. Wang et al. (2007) proposed a dynamic clustering-based DE for global optimization, in which a hierarchical clustering method is dynamically incorporated into DE. Experiments on 28 benchmark problems, including 13 high-dimensional functions, showed that the new method is able to find near-optimal solutions efficiently (Wang et al. 2007). Rahnamayan et al. (2008) proposed a novel initialization approach that employs opposition-based learning to generate the initial population. On a comprehensive set of benchmark functions, they showed that replacing random initialization with opposition-based population initialization in DE can accelerate convergence. Caponio et al. (2009) proposed a memetic DE method, namely SFMDE. In SFMDE, the PSO algorithm is used at the beginning of the evolutionary process to generate super-fit solutions, and two local searchers are adaptively used to refine the individuals of the population.

4 Our approach: DE/BBO

As mentioned above, DE is good at exploring the search space and locating the region of the global minimum, but it is slow at exploiting the solutions (Noman and Iba 2008). On the other hand, BBO has a good exploitation ability for global optimization (Simon 2008a, b). Based on these considerations, in order to balance the exploration and the exploitation of DE, in this work we propose a hybrid DE approach, called DE/BBO, which effectively combines the exploration of DE with the exploitation of BBO. Our proposed DE/BBO approach is described as follows.

4.1 Hybrid migration operator

The main operator of DE/BBO is the hybrid migration operator, which hybridizes the DE operator with the migration operator of BBO, as described in Algorithm 3. From Algorithm 3 we can see that the offspring \(U_i\) may be constituted of three components: the DE mutant, components migrated from other solutions, and its corresponding parent \(X_i.\) The core idea of the proposed hybrid migration operator is based on two considerations. First, good solutions are less likely to be destroyed, while poor solutions can accept many new features from good solutions; in this sense, the current population can be exploited sufficiently. Second, the mutation operator of DE is able to explore new regions of the search space and makes the algorithm more robust. In the terminology of Lozano and García-Martínez (2010), the migration operator in DE/BBO focuses on intensification, while the DE mutation operator emphasizes diversification. From this analysis, it can be seen that the hybrid migration operator can balance exploration and exploitation effectively. It is worth pointing out that although Algorithm 1 illustrates the “DE/rand/1” mutation operator, other DE mutation operators can also be used in our proposed hybrid migration operator. The influence of different mutation operators will be discussed in Sect. 5.6.

[Algorithm 3: the hybrid migration operator of DE/BBO]
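A Python sketch consistent with the textual description of the hybrid migration operator is given below: each component of \(U_i\) is either a DE mutant component, a component migrated from another habitat, or the parent component itself. The function name, the per-component branching order, and the roulette-wheel source selection are our reading of the description, not a verbatim transcription of Algorithm 3.

```python
import numpy as np

def hybrid_migration(pop, lam, mu, i, F=0.5, CR=0.9, rng=None):
    """Sketch of the DE/BBO hybrid migration operator for parent pop[i]."""
    rng = np.random.default_rng() if rng is None else rng
    NP, D = pop.shape
    r1, r2, r3 = rng.choice([r for r in range(NP) if r != i], size=3, replace=False)
    j_rand = rng.integers(D)
    probs = np.asarray(mu, dtype=float)
    probs = probs / probs.sum()
    u = pop[i].copy()
    for j in range(D):
        if rng.random() < lam[i]:                    # immigrate this component
            if rng.random() < CR or j == j_rand:     # DE mutant component
                u[j] = pop[r1, j] + F * (pop[r2, j] - pop[r3, j])
            else:                                    # BBO migration component
                k = rng.choice(NP, p=probs)
                u[j] = pop[k, j]
        # else: keep the parent component pop[i, j]
    return u
```

Note how a good parent (small \(\lambda_i\)) keeps most of its components, while a poor parent (large \(\lambda_i\)) receives many mutant or migrated components.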

4.2 Boundary constraints

In order to keep the solutions of bound-constrained problems feasible, trial parameters that violate the boundary constraints should be repaired. In this work, the following repair rule, which resets a violating parameter to a random value inside the bounds, is applied

$$ X(i)=\left\{ \begin{array}{ll} l_i+\hbox{rndreal}_i[0,1]\times (u_i-l_i) & \hbox{if}\;X(i) < l_i\\ u_i-\hbox{rndreal}_i[0,1]\times (u_i-l_i) & \hbox{if}\;X(i) > u_i\\ \end{array}\right. $$
(4)

where \(\hbox{rndreal}_i[0,1]\) is a uniformly distributed random real number in [0,1], sampled anew for each dimension i.
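The repair rule of Eq. 4 can be sketched as follows (the function name and vector representation are our own):

```python
import numpy as np

def repair(x, lower, upper, rng=None):
    """Repair rule of Eq. 4: a violating parameter is reset to a random
    point inside [lower_i, upper_i], measured from the violated bound."""
    rng = np.random.default_rng() if rng is None else rng
    x = x.copy()
    for i in range(len(x)):
        if x[i] < lower[i]:
            x[i] = lower[i] + rng.random() * (upper[i] - lower[i])
        elif x[i] > upper[i]:
            x[i] = upper[i] - rng.random() * (upper[i] - lower[i])
    return x
```

Parameters already inside the bounds are left untouched.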

4.3 Main procedure of DE/BBO

By incorporating the above-mentioned hybrid migration operator into DE, the DE/BBO approach is developed, as shown in Algorithm 4. Compared with the original DE algorithm described in Algorithm 1, our approach requires only a small extra computational cost for sorting the population and calculating the migration rates. In addition, the structure of DE/BBO is very simple. Moreover, DE/BBO is able to explore new regions of the search space with the mutation operator of DE and to exploit the population information with the migration operator of BBO; this feature overcomes the lack of exploitation of the original DE algorithm. Note that the only difference between DE/BBO and the original DE algorithm is that the hybrid migration operator replaces the original DE mutation operator. The selection method of DE is kept unchanged; therefore, DE/BBO is also an elitist method like the original DE algorithm.

[Algorithm 4: the main procedure of DE/BBO]
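Putting the pieces together, a compact, self-contained sketch of the DE/BBO main loop might look like the following. This is an illustrative reading of the procedure, not the authors' code: the function names are ours, bound handling uses simple clipping rather than the repair rule of Eq. 4, and I = E = 1 is assumed for the migration rates.

```python
import numpy as np

def de_bbo(f, lower, upper, NP=100, F=0.5, CR=0.9, max_nffes=10000, seed=0):
    """Compact sketch of DE/BBO: DE's framework with the hybrid
    migration operator replacing DE's mutation/crossover step."""
    rng = np.random.default_rng(seed)
    D = len(lower)
    pop = lower + rng.random((NP, D)) * (upper - lower)
    fit = np.array([f(x) for x in pop])
    nffes = NP
    while nffes < max_nffes:
        # Species counts: the best solution gets k = NP-1 (mu = 1, lam = 0).
        k = np.empty(NP)
        k[np.argsort(fit)] = np.arange(NP - 1, -1, -1)
        lam = 1.0 - k / (NP - 1)               # immigration rates (Eq. 2)
        mu = k / (NP - 1)                      # emigration rates (Eq. 3)
        probs = mu / mu.sum()
        for i in range(NP):
            r1, r2, r3 = rng.choice([r for r in range(NP) if r != i], 3, replace=False)
            j_rand = rng.integers(D)
            u = pop[i].copy()
            for j in range(D):                 # hybrid migration operator
                if rng.random() < lam[i]:
                    if rng.random() < CR or j == j_rand:
                        u[j] = pop[r1, j] + F * (pop[r2, j] - pop[r3, j])
                    else:
                        u[j] = pop[rng.choice(NP, p=probs), j]
            u = np.clip(u, lower, upper)       # simplified bound handling
            fu = f(u)
            nffes += 1
            if fu <= fit[i]:                   # greedy (elitist) DE selection
                pop[i], fit[i] = u, fu
    best = int(np.argmin(fit))
    return pop[best], float(fit[best])
```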

5 Experimental results

In order to verify the performance of DE/BBO, 23 benchmark functions are chosen from Yao et al. (1999). Since we do not modify these functions, they are only briefly described in Table 1. A more detailed description can be found in Yao et al. (1999), where the functions are divided into three categories: unimodal functions, multimodal functions with many local minima, and multimodal functions with only a few local minima.

Table 1 Benchmark functions used in our experimental studies

Functions f01–f13 are high-dimensional and scalable problems. Functions f01–f05 are unimodal. Function f06 is the step function, which has one minimum and is discontinuous. Function f07 is a noisy quartic function, where random [0,1) is a uniformly distributed random variable in [0,1). Functions f08–f13 are multimodal functions where the number of local minima increases exponentially with the problem dimension. They appear to be the most difficult class of problems for many optimization algorithms. Functions f14–f23 are low-dimensional functions that have only a few local minima.
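For concreteness, two representative members of the suite can be written down directly; the definitions below follow the standard forms in Yao et al. (1999), and the labels f01 (sphere, unimodal) and f09 (Rastrigin, multimodal) reflect the usual ordering in that paper, which we assume matches Table 1.

```python
import numpy as np

def sphere(x):
    """f01 (sphere): unimodal, global minimum f(0) = 0."""
    return float(np.sum(x ** 2))

def rastrigin(x):
    """f09 (Rastrigin): multimodal, the number of local minima grows
    exponentially with the dimension; global minimum f(0) = 0."""
    return float(np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0))
```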

5.1 Experimental setup

For DE/BBO, we have chosen a reasonable set of parameter values and have not made any special effort to find the best settings. For all experiments, we use the parameters shown in Table 2 unless a change is mentioned. The maximum NFFEs (Max_NFFEs) for each function at \(D=30\) are listed in the second column of Table 3.

Table 2 The parameter settings of the DE/BBO method used in this work
Table 3 Best error values of DE/BBO, DE, and BBO on all test functions, where “Mean” indicates the mean best error values found in the last generation, “Std Dev” stands for the standard deviation

Moreover, in our experiments, each function is optimized over 50 independent runs. We also use the same set of initial random populations to evaluate different algorithms in a similar way done in Noman and Iba (2008). All the algorithms are implemented in standard C++. The source code can be obtained from the first author upon request.

5.2 Performance criteria

Five performance criteria are selected from the literature (Rahnamayan et al. 2008; Suganthan et al. 2005) to evaluate the performance of the algorithms. These criteria are described as follows.

  • Error (Suganthan et al. 2005): The error of a solution X is defined as \(f(X)-f(X^{\ast}),\) where \(X^{\ast}\) is the global optimum of the function. The minimum error is recorded when the Max_NFFEs is reached in each of the 50 runs. The average and standard deviation of the error values are also calculated.

  • NFFEs (Suganthan et al. 2005): The number of fitness function evaluations (NFFEs) required to reach the VTR is recorded. The average and standard deviation of the NFFEs are calculated.

  • Number of successful runs (SR) (Suganthan et al. 2005): The number of runs in which the VTR is reached before the Max_NFFEs condition terminates the trial.

  • Convergence graphs (Suganthan et al. 2005): The convergence graphs show the mean error performance of the best solution over the total runs in the respective experiments.

  • Acceleration rate (AR) (Rahnamayan et al. 2008): This criterion is used to compare the convergence speeds of DE/BBO and other algorithms. It is defined as follows:

    $$ {\rm AR}=\frac{\hbox{NFFEs}_{{\rm other}}}{\hbox{NFFEs}_{{\rm DE/BBO}}} $$
    (5)

    where \({\rm AR} > 1\) indicates that DE/BBO converges faster than its competitor.

5.3 General performance of DE/BBO

In order to show the superiority of our proposed DE/BBO approach, we compare it with the original DE algorithm and the BBO algorithm. The parameters used for DE/BBO and DE are the same as described in Sect. 5.1. The parameters of BBO are set as in Simon (2008a, b), and the mutation operator with \(m_{{\rm max}}=0.005\) is also used in our experiments. All functions are tested over 50 independent runs. Table 3 shows the best error values of DE/BBO, DE, and BBO on all test functions. The average and standard deviation of the NFFEs are shown in Table 4. In addition, some representative convergence graphs of DE/BBO, DE, and BBO are shown in Fig. 1.

Table 4 NFFEs required to obtain accuracy levels less than VTR
Fig. 1

Mean error curves of DE/BBO, DE, and BBO for selected functions. a f01, b f03, c f04, d f08, e f09, f f11, g f12, h f15, i f22

5.3.1 When compared with DE

From Table 3 we can see that DE/BBO is significantly better than DE on eight functions, while DE/BBO is outperformed by DE on two functions (f03 and f05). For the remaining 13 functions, there is no significant difference based on the t test. For the multimodal functions with many local minima (f08–f13), DE/BBO reaches \({\rm VTR}=10^{-8}\) in all 50 runs within the Max_NFFEs, whereas DE may be trapped in local minima on five out of these six functions. This indicates that our approach has the ability to escape from poor local optima and locate a good near-global optimum. From Table 4 it can be seen that DE/BBO requires fewer NFFEs than DE to reach the VTR on 18 functions; DE is faster than DE/BBO on the remaining five functions. Additionally, for the majority of the test functions, DE/BBO converges faster than DE, as shown in Fig. 1.

5.3.2 When compared with BBO

From Tables 3 and 4 and Fig. 1, it is obvious that DE/BBO performs significantly better than BBO with respect to all five criteria on all test functions. Looking carefully at Fig. 1, we can see that at the beginning of the evolutionary process BBO converges faster than DE/BBO, while DE/BBO is able to improve its solutions steadily over a long run. The reason might be that BBO has good exploitation but lacks exploration, whereas the hybrid migration operator enables DE/BBO to balance exploration and exploitation effectively.

In general, the performance of DE/BBO is highly competitive with that of DE, especially for the high-dimensional problems, and DE/BBO is significantly better than BBO on all problems. Since for the majority of the low-dimensional functions (f14–f23) DE/BBO and DE show no significant difference, we do not use these functions in the following experiments. In addition, we also do not compare against BBO in the following experiments.

5.4 Influence of population size

The choice of the best population size of DE is always critical for different problems (Gäperle et al. 2002; Teo 2006; Brest and Maučec 2008). Increasing the population size increases the diversity of possible movements, promoting the exploration of the search space; however, the probability of finding the correct search direction decreases considerably (Feoktistov and Janaqi 2004). The influence of the population size is investigated in this section. For both DE/BBO and DE, all parameter settings are the same as in Sect. 5.1, except that \(NP=50,\) \(NP=150,\) and \(NP=200\) are used.

The results for different population sizes are shown in Table 5. It can be seen that: (i) for \(NP=50,\) DE/BBO is significantly better than DE on 10 functions, while it is outperformed by DE on function f03. For f05, DE/BBO is slightly better than DE, and for f13, DE/BBO locates the near-global optimum in all 50 runs, whereas DE is trapped in local minima in 14 out of 50 runs. (ii) When the population size increases to \(NP=100,\) DE obtains a higher overall number of successful runs than with \(NP=50.\) DE/BBO is significantly better than DE on eight functions based on the t-test results, while DE is significantly better than DE/BBO on functions f03 and f05; for the remaining three functions, DE/BBO is better than DE. (iii) For \(NP=150\) and \(NP=200,\) DE/BBO obtains significantly better performance than DE on eight and nine functions, respectively. Similarly, for f03 and f05, DE/BBO is significantly outperformed by DE. In addition, from Fig. 2 we can see that, compared with DE, DE/BBO obtains a higher convergence speed for the different population sizes on the majority of functions.

Table 5 Influence of the performance to different population size for functions f01–f13 \((D=30)\)
Fig. 2

Mean error curves of DE/BBO and DE to different population size for selected functions at D = 30. a f05 \((NP=50).\) b f08 \((NP=50).\) c f02 \((NP=150).\) d f10 \((NP=150).\) e f09 \((NP=200).\) f f13 \((NP=200)\)

In general, the overall performance of DE/BBO is better than that of DE for different population sizes: DE/BBO exhibits a higher overall number of successful runs, higher convergence velocity, and more robustness than DE.

5.5 Effect of dimensionality

In order to investigate the influence of the problem dimension on the performance of DE/BBO, we carry out a scalability study, comparing with the original DE algorithm on the scalable functions of the test suite. For functions f01–f13, \(D=10, 50, 100, \hbox{and}\, 200\) are tested. The results after \(D \times 10{,}000\) NFFEs are recorded in Table 6, and some representative convergence graphs are shown in Fig. 3. From Table 6 we can see that the overall SR decreases for both DE/BBO and DE, since increasing the problem dimension sometimes leaves the algorithms unable to solve the problem before reaching the Max_NFFEs. However, as at \(D=30,\) DE/BBO outperforms DE on the majority of functions at every dimension. Looking carefully at the results, we can see that for f05 DE is better than DE/BBO at \(D=10\) and \(D=30,\) but DE is outperformed by DE/BBO at the higher dimensions \((D=50, 100, \hbox{and} \, 200).\) From the experimental results of this section, we can therefore conclude that the hybrid migration operator has the ability to accelerate DE in general, and that the improvements are more significant at higher dimensionality.

Table 6 Scalability study for functions f01–f13 at \(\hbox{Max}\_\hbox{NFFEs}=D\times10,000\)
Fig. 3

Mean error curves to compare the scalability of DE/BBO and DE for selected functions. a f13 \((D=10).\) b f09 \((D=50).\) c f05 \((D=100).\) d f12 \((D=100).\) e f01 \((D=200).\) f f11 \((D=200)\)

5.6 Influence of different mutation schemes

Ten mutation schemes were proposed for the original DE (Storn and Price 2008; Price et al. 2005), and choosing the best among them is not easy for a specific problem (Qin and Suganthan 2005; Qin et al. 2009). In this section, we perform additional experiments to compare the performance of DE/BBO with that of DE under different schemes. Four schemes, namely DE/best/1/bin, DE/rand/2/bin, DE/rand-to-best/1/bin, and DE/best/2/bin, are chosen for these experiments. All remaining parameters are the same as in Sect. 5.1. Table 7 gives the results of DE/BBO and DE for the four schemes. Based on the t test, the results can be summarized as “\(w/t/l\)”, which means that DE/BBO wins on w functions, ties on t functions, and loses on l functions, compared with DE. From Table 7, the results for DE/best/1/bin, DE/rand/2/bin, DE/rand-to-best/1/bin, and DE/best/2/bin are \(12/1/0,\) \(9/2/2,\) \(8/3/2,\) and \(10/2/1,\) respectively. These results indicate that DE/BBO is more robust than DE across different mutation schemes on the majority of functions.

Table 7 Comparison of DE/BBO and DE to different mutation schemes for functions f01–f13 (D = 30)

5.7 Influence of self-adaptive parameter control

As mentioned above, the choice of the control parameters F and CR is problem-dependent (Gäperle et al. 2002). Researchers have therefore proposed adaptive parameter control for DE (e.g., Liu and Lampinen 2005; Brest et al. 2006). In order to show that DE/BBO can also improve a self-adaptive DE, in this section we adopt the self-adaptive parameter control proposed in Brest et al. (2006) to replace the \(F=\hbox{rndreal}(0.1,1.0)\) and \(CR=0.9\) used in the previous experiments. All other parameter settings are kept unchanged. The results for the self-adaptive DE (SADE) and the self-adaptive DE/BBO (SADE/BBO) are given in Table 8.
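The self-adaptive rule of Brest et al. (2006) regenerates each individual's F and CR with small probabilities \(\tau_1\) and \(\tau_2\) (both 0.1 in the original paper) before the offspring is created, and otherwise inherits them unchanged. A minimal sketch (the function name is ours):

```python
import random

def jde_update(F_i, CR_i, tau1=0.1, tau2=0.1, F_l=0.1, F_u=0.9, rng=None):
    """Self-adaptive F/CR update in the style of Brest et al. (2006)."""
    rng = random.Random() if rng is None else rng
    if rng.random() < tau1:
        F_i = F_l + rng.random() * F_u     # new F in [0.1, 1.0)
    if rng.random() < tau2:
        CR_i = rng.random()                # new CR in [0, 1)
    return F_i, CR_i
```

Because the new parameter values are applied before the offspring is generated, better values tend to produce surviving individuals and thus propagate.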

Table 8 Influence of self-adaptive parameter control to DE/BBO and DE for functions f01–f13 \((D=30)\)

According to Table 8, we can see that: first, with respect to the Error values, both SADE/BBO and SADE obtain the global optimum on five functions (f02, f06, f08, f09, and f11) over 50 runs; SADE/BBO is significantly better than SADE on five functions, while SADE outperforms SADE/BBO on two functions (f03 and f05). Second, with respect to the NFFEs, SADE/BBO is significantly better than SADE on 11 functions, and the AR values are greater than one for these functions, which means that SADE/BBO is faster than SADE. For functions f03 and f05, SADE/BBO fails to solve the two functions in all 50 runs. Overall, integrating the hybrid migration operator improves the performance of SADE.

5.8 Comparison with other DE hybrids

In this section, we compare DE/BBO with other DE hybrids. Since there are many variants of DE, we only compare our approach with DEahcSPX, proposed in Noman and Iba (2008), ODE, proposed in Rahnamayan et al. (2008), and DE/EDA, proposed in Sun et al. (2005).

5.8.1 Comparison with DEahcSPX and ODE

Firstly, we compare our approach with DEahcSPX and ODE. In DEahcSPX, a crossover-based adaptive local search operation is used to accelerate the original DE. The authors concluded that DEahcSPX outperforms the original DE in terms of convergence rate in all of their experimental studies. In ODE, opposition-based learning is used for population initialization and generation jumping. Here, we compare our proposed DE/BBO with the original DE, DEahcSPX, and ODE. All parameter settings are the same as mentioned in Sect. 5.1. For DEahcSPX, the number of parents in SPX is set to \(n_{\rm p} = 3\) (Noman and Iba 2008). For ODE, the jump rate is \(J_{\rm r} = 0.3\) (Rahnamayan et al. 2008). The results are given in Table 9, and selected representative convergence graphs are shown in Fig. 4.
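The opposition-based learning used by ODE maps each point to its opposite within per-dimension bounds and keeps the fitter of the two populations. A sketch under our reading of the ODE description (the bounds for generation jumping are taken from the current population's per-dimension extremes; `fitness` is a hypothetical callable, lower is better):

```python
import random

def opposite(x, lo, hi):
    """Opposite point of x within bounds [lo_j, hi_j]:
    x_opp_j = lo_j + hi_j - x_j (the core of opposition-based learning)."""
    return [l + h - xj for xj, l, h in zip(x, lo, hi)]

def generation_jump(pop, fitness, jr=0.3):
    """With probability jr, evaluate the opposite population and keep the
    fittest len(pop) individuals from the union of both populations."""
    if random.random() >= jr:
        return pop  # no jump this generation
    dims = range(len(pop[0]))
    lo = [min(x[j] for x in pop) for j in dims]
    hi = [max(x[j] for x in pop) for j in dims]
    union = pop + [opposite(x, lo, hi) for x in pop]
    return sorted(union, key=fitness)[:len(pop)]
```

With \(J_{\rm r}=0.3,\) roughly every third generation gets a chance to jump toward the opposite region of the shrinking search bounds.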

Table 9 Comparison of the performance of DE/BBO with DE, DEahcSPX, and ODE for functions f01–f13 \((D=30)\)
Fig. 4

Mean error curves of DE/BBO, DE, DEahcSPX, and ODE for selected functions. a f02, b f03, c f07, d f10, e f11, f f13

It can be seen from Table 9 that DE/BBO is significantly better than DEahcSPX on eight functions, while it is outperformed by DEahcSPX on two functions (f03 and f05). For the remaining three functions, there is no significant difference between DE/BBO and DEahcSPX. However, for f12 and f13, DE/BBO obtains the near-global optimum over all 50 runs, while DEahcSPX is trapped in a local minimum on one run for each of these functions. In addition, Fig. 4 shows that DE/BBO converges faster than DEahcSPX on the majority of functions.

With respect to ODE, the t-test result in Table 9 is 9/2/2, which means that DE/BBO significantly outperforms ODE on nine out of 13 functions, while ODE is significantly better than DE/BBO on two functions (f03 and f07). For f01 and f10, there is no significant difference between DE/BBO and ODE; however, DE/BBO is slightly better than ODE. Moreover, on the majority of functions, DE/BBO exhibits a higher convergence rate than ODE.

5.8.2 Comparison with DE/EDA

Secondly, we compare DE/BBO with DE/EDA. The reason is that DE/BBO is similar to DE/EDA: both algorithms combine DE with another global optimization algorithm to improve the performance of DE. DE/EDA combines global information extracted by an EDA with differential information obtained by DE to create promising solutions (Sun et al. 2005), whereas DE/BBO integrates the migration operator of BBO into DE to balance exploration and exploitation. In the original DE/EDA algorithm, the DE mutation scheme is described as follows

$$ \begin{aligned} U_i(j) & = {\left[X_{r_1}(j)+X_{i}(j)\right]}/{2} \\ &\quad +F \times \left[\left(X_{r_1}(j)-X_{i}(j)\right)+\left(X_{r_2}(j)-X_{r_3}(j)\right)\right]\\ \end{aligned} $$
(6)

where \(X_i\) is the target vector of DE. In order to make a fair comparison, all compared algorithms (DE, DE/BBO, and DE/EDA) adopt the mutation scheme shown in Eq. 6 in place of the DE/rand/1/bin scheme. All other parameters are the same as mentioned in Sect. 5.1. The results are presented in Table 10, and selected representative convergence graphs are shown in Fig. 5.
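Eq. 6 averages the base vectors \(X_{r_1}\) and \(X_i\) before adding the scaled difference terms; a direct component-wise transcription:

```python
def de_eda_mutation(x_i, x_r1, x_r2, x_r3, F):
    """DE/EDA mutation scheme (Eq. 6): each trial component is the mean of
    X_r1 and X_i plus F times the sum of two difference terms."""
    return [(x_r1[j] + x_i[j]) / 2.0
            + F * ((x_r1[j] - x_i[j]) + (x_r2[j] - x_r3[j]))
            for j in range(len(x_i))]
```

For example, with \(x_i=1,\) \(x_{r_1}=3,\) \(x_{r_2}=2,\) \(x_{r_3}=0,\) and \(F=0.5\) in one dimension, the trial component is \((3+1)/2 + 0.5\,[(3-1)+(2-0)] = 4.\)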

Table 10 Comparison of the performance of DE/BBO with DE and DE/EDA for functions f01–f13 (D = 30)
Fig. 5

Mean error curves of DE/BBO, DE, and DE/EDA for selected functions. a f03, b f05, c f06, d f08, e f10, f f12

5.8.3 When compared with DE

DE/BBO is significantly better than DE on 11 functions. For the remaining two functions (f06 and f13), DE/BBO is also better than DE. Additionally, DE/BBO obtains a faster convergence rate than DE on all functions.

5.8.4 When compared with DE/EDA

The overall SR of DE/BBO is better than that of DE/EDA. DE/BBO is significantly better than DE/EDA on nine functions, while DE/EDA is significantly better than DE/BBO on only one function (f03). For the remaining three functions, there is no significant difference. Looking carefully at the results in Table 10, we can see that DE/BBO is substantially better than DE/EDA on all multimodal functions (f08–f13): DE/BBO locates the near-global optimum over all 50 runs for all of these functions, whereas DE/EDA is frequently trapped in local minima and, in particular, fails to solve f08 and f09. In addition, Fig. 5 shows that DE/BBO converges faster than DE/EDA on the majority of functions.

5.9 Non-parametric statistical tests

In the previous sections, we presented the experimental results with the paired t test, which is a parametric statistical test. However, recent studies indicate that parametric statistical analysis is not appropriate, especially when tackling multiple-problem results (Demsar 2006; García and Herrera 2008; García et al. 2009a, b). In order to further establish the statistical significance of the results, in this section we adopt non-parametric statistical tests to compare the DE/BBO method with the other algorithms. There are many non-parametric statistical methods (García et al. 2009a, b), such as the Bonferroni–Dunn, Holm, and Hochberg procedures, Wilcoxon's test, and so on. In this study, Wilcoxon's test is employed since it is included in well-known software packages (e.g., SPSS, SAS, OriginPro, Matlab).

To apply Wilcoxon's test, we use the results obtained in the previous experiments at \(D=30\) for functions f01–f13. The results of DE/BBO are compared with those of BBO, DE, DEahcSPX, ODE, and DE/EDA. First, we present the results of the single-problem Wilcoxon's test at \(\alpha=0.05\) in Table 11. From Table 11, we can see that DE/BBO is significantly better than the other algorithms in 51 out of 65 cases. In seven cases, DE/BBO is outperformed by the other algorithms, and in the remaining seven cases there is no significant difference. In summary, the DE/BBO approach outperforms the other algorithms in 78.5% of the cases and is outperformed in only 10.8% of the cases.
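The \(R^+\) and \(R^-\) values reported below are the Wilcoxon signed-rank sums of the positive and negative paired differences. A plain-Python sketch of their computation (zero differences discarded, tied absolute differences given the average of their ranks, as in the standard test; the sign convention for "in favor of" depends on how the differences are defined):

```python
def wilcoxon_rank_sums(a, b):
    """Return (R_plus, R_minus): rank sums of positive and negative paired
    differences a_i - b_i, as used in Wilcoxon's signed-rank test."""
    diffs = [x - y for x, y in zip(a, b) if x != y]   # discard zero diffs
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):                             # average tied ranks
        j = i
        while j < len(order) and abs(diffs[order[j]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + 1 + j) / 2.0                       # mean of ranks i+1..j
        for k in range(i, j):
            ranks[order[k]] = avg
        i = j
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return r_plus, r_minus
```

The test statistic is the smaller of the two sums, which is then compared against the critical value (or converted to a p value) for the chosen \(\alpha.\)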

Table 11 Results of the single-problem non-parametric statistical Wilcoxon's test at \(\alpha=0.05\) for functions f01–f13 \((D=30)\)

Secondly, we present the results of the multiple-problem Wilcoxon's test at \(\alpha=0.05\) in Table 12. It is clear that DE/BBO obtains higher \(R^+\) values than \(R^-\) values in all cases. According to the p values, DE/BBO is significantly better than BBO and DE/EDA at \(\alpha=0.05.\) Moreover, at \(\alpha=0.10,\) DE/BBO provides significantly different results compared with BBO, DEahcSPX, ODE, and DE/EDA.

Table 12 Results of the multiple-problem non-parametric statistical Wilcoxon's test at \(\alpha=0.05\) for functions f01–f13 \((D=30)\)

5.10 Experimental results on CEC’05 test functions

In order to further evaluate the performance of DE/BBO, in this section we report the results of SADE/BBO and SADE on the CEC'05 test functions (Suganthan et al. 2005) at \(D=10.\) The experimental results described in Sect. 5.7 showed that the parameter self-adaptation proposed in Brest et al. (2006) is able to enhance the performance of both DE/BBO and DE; therefore, this parameter self-adaptation is used here. In addition, since most of the CEC'05 test functions are rotated, \(\lambda_i=1.0\) is set for each individual in SADE/BBO. All other parameter settings are kept unchanged as in Sect. 5.1. The results are summarized in Tables 13 and 14. All results are averaged over 50 independent runs with Max_NFFEs = 100,000.

Table 13 Best error values achieved when Max_NFFEs = 100,000 for CEC’05 test functions f01–f25 \((D=10)\)
Table 14 Successful NFFEs obtained by SADE/BBO and SADE for CEC’05 test functions f01–f25 \((D=10)\)

From Table 13 we can see that SADE/BBO is significantly better than SADE on 13 out of 25 functions. On two functions (f15 and f23), SADE/BBO is outperformed by SADE. On the remaining 10 functions, there is no significant difference between SADE/BBO and SADE in terms of the best error values. The results shown in Table 14 indicate that SADE/BBO obtains a faster convergence rate on the successfully solved functions, because the exploitation enhanced by the hybrid migration operator accelerates convergence. However, due to this enhanced exploitation, SADE/BBO gets stuck in the local minima of some very complex functions, e.g., f15 and f23. A population restart technique (Auger and Hansen 2004) might be used to remedy the premature convergence of SADE/BBO; we will verify this in our future work. In general, our proposed SADE/BBO obtains better results than SADE on the majority of the CEC'05 test functions at \(D=10\) in terms of both the quality of the final results and the convergence speed.

5.11 Discussions

The DE algorithm is a fast, robust, and simple global optimization algorithm; however, it may lack exploitation ability. BBO is a novel global optimization algorithm with good exploitation thanks to its migration operator. Therefore, in this work we hybridize DE with BBO and propose a hybrid migration operator to generate promising candidate solutions, and the DE/BBO algorithm is built on this operator. From the experimental results, we can summarize that:

  • Our proposed DE/BBO approach is effective and efficient. It can obtain the global, or near-global, optimum for the test functions.

  • The overall performance of DE/BBO is superior to or highly competitive with BBO and other compared state-of-the-art DE algorithms.

  • DE/BBO and DE were compared for different population sizes. On the majority of functions, DE/BBO is substantially better than DE.

  • The scalability studies show that DE/BBO is able to accelerate DE in general, and the improvements are more significant at higher dimensionality.

  • When DE/BBO and DE are compared under different mutation schemes, the overall performance of DE/BBO is more robust than that of DE.

  • The self-adaptive parameter control can enhance the performance of DE/BBO and DE. Our proposed hybrid migration operator shows the potential to accelerate the self-adaptive variants of DE.

  • According to the non-parametric single/multiple-problem Wilcoxon's test, our approach is better than, or at least highly competitive with, the compared algorithms on the test functions.

  • Owing to the hybrid migration operator proposed in this paper, DE/BBO is able to balance exploration and exploitation. In addition, the hybrid migration operator makes the good solutions share more information with the poor ones, while preventing the good solutions from being destroyed during evolution. This might be the reason why DE/BBO is significantly better than the original DE algorithm on all tested multimodal functions.

  • For function f03, DE/BBO is worse than DE with the DE/rand/1/bin scheme. However, Tables 7 and 10 show that DE/BBO is better than DE for DE/best/1/bin and the scheme described in Eq. 6. We therefore expect that the strategy adaptation proposed in Qin et al. (2009) could make DE/BBO more robust.
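The hybrid migration operator summarized in the points above can be condensed into the following sketch. This is our reading of the description in the text (per-component choice between DE mutation and BBO migration, gated by the immigration rate), not the authors' exact pseudocode; the rate vectors `lam` and `mu` and the constants F and CR are assumptions:

```python
import random

def hybrid_migration(pop, i, lam, mu, F=0.5, CR=0.9):
    """Sketch of a DE/BBO-style hybrid migration operator. `lam[i]` and
    `mu[k]` are BBO immigration/emigration rates; pop is assumed ordered
    so that the rates reflect solution quality (lower lam = better)."""
    n, d = len(pop), len(pop[i])
    r1, r2, r3 = random.sample([r for r in range(n) if r != i], 3)
    j_rand = random.randrange(d)
    u = list(pop[i])
    for j in range(d):
        if random.random() < lam[i]:      # component accepts immigration
            if random.random() < CR or j == j_rand:
                # DE mutation supplies the immigrating value (exploration)
                u[j] = pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
            else:
                # BBO migration: roulette-wheel choice on emigration rates
                pick, acc = random.random() * sum(mu), 0.0
                for k in range(n):
                    acc += mu[k]
                    if acc >= pick:
                        u[j] = pop[k][j]
                        break
        # else: keep pop[i][j] -- good components survive unchanged
    return u
```

The `else` branch is what protects good solutions: an individual with a low immigration rate keeps most of its components, while poor individuals import information either from the population (migration) or from DE's difference vectors.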

6 Conclusions and future work

In order to balance the exploration and the exploitation of DE, in this paper we propose a hybrid DE approach, called DE/BBO, which combines the exploration of DE with the exploitation of BBO. In DE/BBO, a new hybrid migration operator is proposed to generate promising solutions. Since the hybrid migration operator strikes a good trade-off between exploration and exploitation, it makes our proposed DE/BBO approach very effective and efficient. To verify the performance of DE/BBO, 23 benchmark functions chosen from the literature are employed. Experimental results demonstrate the good performance of our approach: compared with BBO, DE, DEahcSPX, ODE, and DE/EDA, DE/BBO is superior to or at least highly competitive with these algorithms. Moreover, the influence of the population size, the dimensionality, different mutation schemes, and self-adaptive control parameters on DE/BBO and DE is also investigated, and the results confirm that DE/BBO exhibits a higher convergence rate and greater robustness than DE. In addition, on the CEC'05 test functions at \(D=10,\) SADE/BBO obtains better results than SADE on the majority of the functions.

In this work, 23 benchmark functions are used to evaluate the performance of our approach; we will test it on more problems, such as the higher-dimensional \((D\,\ge\,30)\) CEC'05 test suite (Suganthan et al. 2005) and real-world problems. Moreover, we will compare DE/BBO with other EAs, as in García et al. (2009a, b). In addition, only unconstrained function optimization is considered in this work; our future work includes adding diversity rules into DE/BBO for constrained optimization problems.