
1 Introduction

Evolutionary algorithms (EAs) are a broad class of stochastic search heuristics used for optimization and learning tasks. While they have been successfully applied to various science and engineering problems, the performance of EAs depends critically on the setting of various parameters [1], among which the population size is perhaps the most important. Generally, a population that is too small leads to premature convergence, while one that is too large reduces the EA's efficiency. In traditional EAs, the population size is typically fixed by the user and remains constant throughout the entire evolution. However, it is well recognized that the optimal population size is problem-specific [2,3,4,5,6], and it has further been demonstrated that different population sizes may be required at different stages of evolution [7,8,9,10]. Specifying the population size in such a static manner may therefore significantly limit the performance of EAs.

To deal with this issue, dynamic population control has become an active and important research topic, and many different schemes have been proposed in the literature [9, 11,12,13,14]. Several works have reviewed or compared existing methods. In 1999, Costa [2] gave a comparative study of population control methods, in which five dynamic population control schemes were reviewed and compared. More than a decade later, Karafotias [15] provided a survey of parameter control in EAs; although a few dynamic population control methods are reviewed there, that work focuses mainly on parameter setting schemes for the variation operators of EAs. Recently, Piotrowski [16] presented another review, along with comparison results, of several dynamic population control schemes; its scope, however, is limited to differential evolution (DE).

In this paper, we give a comprehensive review of the population control methods proposed in the EA literature. Further, several representative methods are selected and compared to gain a deeper understanding of their pros and cons.

2 Population Control in Evolutionary Algorithms

Much work has been carried out on the control of population size in EAs. In this section, we first provide an overview of theoretical studies in the field and then review existing methods, which are divided into three categories, i.e., deterministic, adaptive and self-adaptive methods.

2.1 Theoretical Studies

In [17], Holdener recognized that the optimal allocation of trials in a genetic algorithm (GA) is characterized by the conjunction of substantial complexity and initial uncertainty, as well as by the requirement to employ new information in the population to rapidly reduce this uncertainty. This area remained largely untouched for many years until Goldberg studied the population size as a parameter from a theoretical point of view, based on the decomposition approach for designing competent GAs. In this work, Goldberg [18] considers the evolution of a GA as the growth of building blocks (BBs) and points out that a successful GA should ensure an adequate supply of BBs in the initial population. In [19], Reeves calculated the minimum population size required for at least one instance of every allele to be present at each locus, for binary strings as well as n-ary representations. While having all alleles present in the initial population is important, this work suggests that the presence of the desired building blocks in the initial population is crucial for the GA's performance. Later, in [20], Goldberg determined the probability of the actual building blocks being present in a population and developed models to calculate the population size required for the success probability of the GA to asymptotically reach 100%. Apart from building block theory, the population size has also been studied from the viewpoint of statistical decision theory. In [21], De Jong presented several equations to determine the population size based on noise in statistical decision making. In [7], Goldberg extended these equations to calculate the population size based on the permissible error between building blocks and the variation and complexity of the problem. In [22], Harik developed an alternative equation to predict the solution quality achievable with a given population size, based on the theory of random walks and the gambler's ruin problem. Unfortunately, the above theoretical studies are generally not practical for real applications. However, they do reveal that the population size is crucial for the performance of EAs.

2.2 Deterministic Methods

To deal with the issue of population size, many methods seek to change it dynamically during evolution based on deterministic rules. In [12, 23,24,25], Fernández et al. presented a method based on the phenomenon of plague [12], in which a fixed number of individuals (i.e., the worst ones) are removed at every generation. Instead of removing individuals at each generation, Brest [13] proposed a method that gradually reduces the population size by half each time a specific condition is met during the evolution. This reduction mechanism has been successfully used in EAs for solving various optimization problems [26,27,28,29,30,31,32]. Extending the above mechanism, Brest et al. [33] designed a method that starts with a small population; during the evolution, the population is first increased by a specific amount determined by a constant value and then reduced by half. Although it has been shown to be promising [33], Neri [34] pointed out that such a method does not work well in DE. Rather than increasing or decreasing a fixed number of individuals after a fixed number of generations, a few methods adjust the population size automatically according to predefined functions. For example, Koumousis et al. introduced functions with a saw-tooth shape [35] and sinusoidal oscillation [36] for adjusting the population size. Other functions, such as linear functions [30, 37, 38], have also been used for population control and applied to various optimization problems [39,40,41,42].
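
To make the idea concrete, the following minimal Python sketch expresses such deterministic schedules as pure functions of the generation counter. The shapes (linear reduction, halving, and saw-tooth) follow the descriptions above, but the parameter names and exact formulas are illustrative assumptions, not the published definitions of the cited methods.

```python
# Hedged sketch: deterministic population-size schedules as pure functions of
# the generation counter. Parameter names and formulas are illustrative only.

def linear_schedule(gen, max_gen, n_init=100, n_final=20):
    """Linearly shrink the population from n_init to n_final."""
    frac = gen / max_gen
    return round(n_init + frac * (n_final - n_init))

def halving_schedule(gen, max_gen, n_init=100, n_min=10, periods=4):
    """Halve the population size at evenly spaced points during the run."""
    stage = min(periods - 1, int(gen / (max_gen / periods)))
    return max(n_min, n_init // (2 ** stage))

def sawtooth_schedule(gen, n_mean=50, amplitude=30, period=40):
    """Saw-tooth oscillation: ramp down within each period, then jump back up."""
    t = gen % period
    return round(n_mean + amplitude - 2 * amplitude * t / (period - 1))

if __name__ == "__main__":
    for g in (0, 25, 50, 75, 99):
        print(g, linear_schedule(g, 100), halving_schedule(g, 100),
              sawtooth_schedule(g))
```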

2.3 Adaptive Methods

Adaptive methods utilize feedback from the search to determine the direction and magnitude of changes to the population size. Based on the feedback information used, we divide the existing methods into four categories and review them accordingly.

Fitness Based Methods. The methods falling into this category adaptively change the population size based on the fitness of individuals. In [43,44,45,46,47], the size of the population is modified if the fitness of the best individual does not change for a period of time. Specifically, Montiel et al. [43] calculated the number of individuals that should be inserted into or deleted from the population based on two variables, “Cycling” and “Variance”. “Cycling” counts the number of times the fitness of the best individual does not change significantly, and “Variance” is the variance of the fitness of the best individuals during the “Cycling” period. In [44], new individuals are inserted into the population when the fitness of the best individual does not improve for m generations. In [45, 48], the population is first ordered by fitness and a predefined percentage of individuals is inserted into or deleted from the population. In [46], a predefined number of individuals are added if the best individual does not change for m generations, while the population is resized to its initial size if the best individual improves its fitness. In [47], a new population is created by selecting good individuals from an extinguished population when extinction occurs, and at every generation individuals are eliminated when they reach their lifetime. In [49,50,51,52,53], subpopulations are used for competitive evolution and are resized based on their fitness. Typically, when the average fitness of a subpopulation falls below a predefined significance level, it is removed. Specifically, Smorodkina [50] provided a competition scheme between two parallel subpopulations, while Hinterding [51] and Harik [52] do not restrict the number of subpopulations. Schlierkamp-Voosen [49] employs a migration scheme in which the best individual is migrated to all other subpopulations, whereas Zhan [53] suggested that all the poor subpopulations randomly migrate one individual to the good subpopulations.
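
A common thread in these fitness-based schemes is that stagnation of the best fitness triggers the resizing. The sketch below illustrates one hypothetical variant of this idea: the population grows by a fixed number of random immigrants after m stagnant generations and is trimmed back to its initial size once an improvement occurs. The threshold, growth step and trimming rule are assumptions chosen for illustration, not the exact rules of any of the cited methods.

```python
import random

# Hedged sketch of a stagnation-triggered resizing rule (illustrative only).

def resize_on_stagnation(population, fitness, best_history,
                         m=10, grow_by=5, n_init=20, dim=10):
    """Grow the population after m stagnant generations; shrink on improvement.

    population   : list of real-valued vectors
    fitness      : callable, lower is better
    best_history : list of best fitness values per generation, most recent last
    """
    stagnant = (len(best_history) > m and
                min(best_history[-m:]) >= min(best_history[:-m]))
    if stagnant:
        # Inject random immigrants to restore exploration.
        population += [[random.uniform(-5, 5) for _ in range(dim)]
                       for _ in range(grow_by)]
    elif len(population) > n_init:
        # After an improvement, keep only the n_init fittest individuals.
        population.sort(key=fitness)
        del population[n_init:]
    return population
```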

Apart from the above methods, some approaches employ a lifetime or survival probability, derived from the fitness, to control the population size. The classical method is the Genetic Algorithm with Varying Population Size (GAVaPS) proposed by Arabas et al. [11], which introduced the concept of age for an individual. In this method, the population size is dictated by the birth and death of individuals, which are governed by lifetimes determined from their fitness. Proceeding in a generational manner, each individual's age is increased at each time step, and the individual is removed when its age exceeds its lifetime. The lifetime mechanism was extended with non-random mating to prevent incest by Fernández et al. [54, 55], and applied to multimodal optimization [56], vehicle detection [57] and distribution network reconfiguration [10]. The main drawback of GAVaPS is that, in the worst case, the population size doubles in successive generations, which may lead to population explosion. To address this issue, Bäck [58] proposed to always preserve the fittest individuals and to generate and insert only two new individuals into the population at each generation. This scheme has been applied to co-operative co-evolution by Lorio and Li [59]. In [47], Zhang developed an extinction scheme for population control, in which individuals are eliminated if no improvements are made during the evolution. In [60], Vellev introduced the concept of survival probability, a collective probability used to control the size of the population, and Cook [61] described another formula to calculate the survival probability for population size control.
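
The age/lifetime mechanism can be summarized compactly. The sketch below is a hedged illustration of a GAVaPS-style loop: each new individual receives a lifetime that interpolates linearly between a minimum and a maximum value according to its relative fitness (maximization assumed), ages by one each generation, and is removed once its age exceeds its lifetime. The linear allocation shown is only one possible lifetime strategy, and the constants are illustrative rather than taken from [11].

```python
# Hedged sketch of a GAVaPS-style lifetime mechanism (maximization assumed).
# The linear lifetime allocation and the constants below are illustrative.

MIN_LT, MAX_LT = 1, 11  # minimum and maximum allowed lifetimes

def allocate_lifetime(fit, worst_fit, best_fit):
    """Linear lifetime allocation: fitter individuals live longer."""
    if best_fit == worst_fit:
        return (MIN_LT + MAX_LT) // 2
    rel = (fit - worst_fit) / (best_fit - worst_fit)
    return round(MIN_LT + rel * (MAX_LT - MIN_LT))

def step(population):
    """One generation: age every individual, then remove the expired ones."""
    fits = [ind["fitness"] for ind in population]
    worst, best = min(fits), max(fits)
    for ind in population:
        ind["age"] += 1
    # Remove individuals whose age exceeds their lifetime.
    population[:] = [ind for ind in population if ind["age"] <= ind["lifetime"]]
    # Offspring creation is omitted; a new child would enter with age 0 and
    # a fresh lifetime, e.g.:
    #   child = {"fitness": f, "age": 0,
    #            "lifetime": allocate_lifetime(f, worst, best)}
    return population
```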

Distribution Based Methods. These methods control the population size based on the distribution of the population. In [7], Goldberg first proposed a micro GA in which the population is reinitialized when nominal convergence (e.g., when all individuals have identical or very similar genotypes) is detected. In [62], a similar re-initialization scheme is introduced with an elitist strategy, along with a competition scheme that chooses the winning individuals to be included in the population. The micro GA has been successfully applied in various applications [63,64,65], showing better performance than the standard GA. Another way to measure the distribution of the population is to calculate the distance between individuals. In [66, 67], Khor and Tan suggested that the desired population size should be bounded within a limit measured by the maximum distance between individuals in the population. Liang [68] combined the relative ascending directions and the distance between two individuals to determine whether they are located on the same peak; this information is then used for population control. In [69], Yang calculated the sum of Euclidean distances and generated a new population if it does not change for a period of time, while in [70] the individual with the smallest Euclidean distance is removed at every generation as a mechanism for population control. In [71], Auger introduced a dispersion degree for population control: a population re-initialization is launched when the calculated dispersion degree meets the stopping criterion, and the population size is increased by a factor of 2. In addition, Shi [72] assumed that the center of the population moves slowly when the variance of the population is large; in this case, the population size is decreased to accelerate convergence, and vice versa.
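
The restart-with-a-larger-population idea mentioned above can be captured in a few lines. The sketch below is a hypothetical illustration: a dispersion measure (here, the mean pairwise Euclidean distance) is compared against a threshold, and once the population has collapsed, a fresh population of twice the size is drawn. The dispersion measure, threshold and bounds are stand-ins for the criteria used in the cited works.

```python
import math
import random

# Hedged sketch: restart with a doubled population once dispersion collapses.
# The mean pairwise distance and the threshold are illustrative stand-ins.

def mean_pairwise_distance(pop):
    """Average Euclidean distance over all pairs (assumes len(pop) >= 2)."""
    n = len(pop)
    dists = [math.dist(pop[i], pop[j])
             for i in range(n) for j in range(i + 1, n)]
    return sum(dists) / len(dists)

def maybe_restart(pop, bounds=(-5.0, 5.0), eps=1e-3):
    """If the population has converged, reinitialize it at twice the size."""
    if mean_pairwise_distance(pop) < eps:
        lo, hi = bounds
        dim = len(pop[0])
        pop = [[random.uniform(lo, hi) for _ in range(dim)]
               for _ in range(2 * len(pop))]
    return pop
```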

Fitness and Distribution Based Methods. The methods in this category use both fitness and distribution information for population control. Smith [73] presented such a method, in which the mean fitness and variance of the population are calculated and compared to predefined target values to decide whether to assign more (or fewer) individuals to the population. This method was extended by Harik [22], in which the distance between the best and second-best BBs is also considered. In [74], Tirronen provided a method that controls the population size based on fitness diversity, measured by the distances between pairs of individuals along with their fitness values.

Population Status Based Methods. These methods change the population size when the current population status meets a certain condition. They have been used for population control in genetic programming (GP), where solutions are represented as trees of variable size and depth. In [75], Wagner proposed to employ node information for population control. Two kinds of nodes (i.e., soft and hard) are used: the soft node controls the addition of new individuals to the population, while the hard node controls which individuals may be added. This method has been used in GP to address forecasting problems in [76, 77]. In [70], Ding presented an artificial bee colony algorithm in which the colony size is used to control the reproduction and death rates of the population: if the colony size is larger than the initial size, the reproduction rate threshold is increased while the death threshold is decreased.

2.4 Self-adaptive Methods

This approach encodes the population size into the individuals, where it undergoes mutation and recombination along with the rest of the genome. Such a method was first proposed by Teo [78] to adapt the population size in differential evolution (DE). The method, however, shows no significant advantage over DE with a fixed population size. In [79], Eiben discussed such an approach and proposed a hybrid variant of a self-adaptive method for GAs. That work shows that it is possible to define the population size at the individual level, and the results show that the proposed method performs better than GAs with fixed population sizes.
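
As a rough illustration of this encoding idea (not the specific scheme of [78] or [79]), the sketch below attaches a population-size gene to each individual, mutates it along with the decision variables, and sets the next generation's size from the aggregated genes of the surviving parents.

```python
import random

# Hedged sketch of self-adaptive population sizing: each individual carries a
# population-size gene that is mutated with the rest of its genome, and the
# next population size is the average of the genes of the surviving parents.
# This is an illustrative scheme, not the exact method of [78] or [79].

def mutate(ind, sigma=0.1, size_sigma=2.0, n_min=4, n_max=200):
    """Gaussian mutation of both the decision variables and the size gene."""
    return {
        "x": [g + random.gauss(0, sigma) for g in ind["x"]],
        "n": min(n_max, max(n_min, ind["n"] + random.gauss(0, size_sigma))),
    }

def next_population_size(survivors):
    """Aggregate the encoded size genes into the next generation's size."""
    return round(sum(ind["n"] for ind in survivors) / len(survivors))
```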

Table 1. Parameter setting of the six algorithms

3 Experimental Comparison

In this section, we experimentally compare several representative methods for dynamic population control to show their advantages and disadvantages. In the following subsections, after describing the test suite in Subsect. 3.1 and the parameter settings of the algorithms in Subsect. 3.2, we report the comparison results in Subsect. 3.3.

3.1 Test Suite

To assess the performance of different population control strategies, a range of well-known benchmark test functions has been considered. For the unimodal case, we select four functions presented in [21] with a minimum value of zero: the Sphere Model, Griewangk's Function, the Step Function and the Quartic Function. For the multimodal case, Rastrigin's Function [29], the Six-Hump Camel Back Function, the Branin Function and the Goldstein-Price Function have been considered. In addition, Schwefel's Function [71], which is a deceptive function, and Rosenbrock's Valley Function [16], which has a very complex search space, have also been considered.

Table 2. Comparison results of six methods on 10 benchmark functions

3.2 Algorithms and Parameter Settings

The following six population control algorithms have been selected for comparison. GAVaPS [11], in which each newly born individual is allocated a lifetime determined by its fitness; proceeding in a generational manner, an individual's age is increased at each generation and the individual is removed when its age exceeds its lifetime. APAGA [58], which is similar to GAVaPS except that the fittest member is always kept in the population and only two new individuals are generated and inserted into the population at each generation. PL-GA [17], which starts with a small population and doubles the population size whenever the GA converges. GPS-GA [50], in which two subpopulations are evolved in parallel. dynNP-DE [13], in which the population size is periodically reduced during the evolution, each new population size being half the previous one. SAMDE [45], in which the population diversity is used to increase or decrease the population size. These algorithms are based on different rationales for population control and have been widely applied in various applications.

The parameter settings for the six algorithms are taken from the original papers. However, to make the comparison fair, a few parameters that are common to these algorithms are set to the same values. Specifically, an initial population size N of 20, a crossover rate of 0.65 and a bit mutation rate of 0.015 are used. All compared algorithms adopt tournament selection. The other parameter settings for the six algorithms are shown in Table 1.

3.3 Results and Discussion

The comparison results of the six algorithms in terms of solution quality are shown in Table 2. Each algorithm is run on each function 10 times, and the best results over the ten trials are averaged. From the results, it can be seen that the dynNP-DE and SAMDE algorithms outperform the other four algorithms on functions \(f_1\), \(f_3\), \(f_5\), \(f_7\) and \(f_9\). Between the two, dynNP-DE shows better performance on the multimodal functions, while SAMDE is better at searching the unimodal functions. From the function point of view, for function \(f_3\), dynNP-DE, SAMDE, PL-GA and GPS-GA can identify the global optimum, while GAVaPS and APAGA fail to do so. This result suggests that frequently removing individuals from the population may decrease the robustness of the algorithm. For function \(f_6\), only PL-GA and GPS-GA are able to find promising solutions. For function \(f_8\), which is the most difficult of the ten functions, all algorithms can become trapped in local optima. Overall, dynNP-DE performs the best, as it delivers the best results on most of the test functions, followed by SAMDE, PL-GA and GPS-GA.

Fig. 1. Comparing the convergence properties of the six algorithms on different benchmark functions

Figure 1 shows the convergence properties of the six strategies, in which the average best fitness values are plotted against the number of function evaluations of the six algorithms. It can be observed that GAVaPS and APAGA converge much faster than dynNP-DE, SAMDE, PL-GA and GPS-GA. For example, on test function \(f_{10}\), GAVaPS and APAGA are able to identify the optimal solution within around \(4\times10^{3}\) function evaluations. On the unimodal functions, for instance \(f_1\) and \(f_2\), PL-GA and GPS-GA perform the worst in terms of convergence and can take more than \(1\times10^{4}\) more function evaluations to converge than the other four algorithms. Generally, PL-GA is the slowest of the six methods, while GPS-GA is the fastest strategy.

Considering both solution quality and convergence, it can generally be concluded that GAVaPS and APAGA perform well on unimodal functions, and that the extended version, APAGA, performs even better. The main drawback of GAVaPS is the excessive growth of the number of individuals, which results in an expensive evolution, whereas APAGA slows down the growth rate of the population and improves its efficiency. PL-GA and GPS-GA perform well on multimodal functions, especially in terms of convergence speed, while dynNP-DE and SAMDE perform the best on most of the test functions on average.

4 Conclusions

In this paper, we first presented an extensive review of population control methods in evolutionary algorithms. We then experimentally examined and analyzed a few representative methods on a set of well-known benchmark functions. The results show that dynNP-DE performs the best on most of the test functions on average. This is mainly because dynNP-DE is able to maintain a proper balance between exploration and exploitation, thus achieving good performance.