1 Introduction

Evolutionary algorithms (EAs) are widely used as global optimizers, such as genetic algorithms (GAs) (De Jong 1975; Holland 1975) and simulated annealing (Dekkers and Aarts 1991). Population diversity control is important to obtain desired solutions in each EA. However, how to rationally inject randomness to control population diversity is still a difficult problem. In the above problem, there are two sub-problems: (1) How to determine which subpopulations are more suitable for injecting randomness (i.e. first sub-problem)? (2) How much randomness is injected is suitable for population diversity control (i.e. second sub-problem)?

We propose balanced-evolution genetic algorithm (BEGA). BEGA realizes the balanced evolution of the population, which is controlled by population diversity measures. BEGA is characterized by this balanced-evolution strategy. From a coding-space perspective, this is a case study for solving first and second sub-problems. Note that, coding space is also called genotypic space that is related to coding type and coding position, and solution space is also called phenotypic space that is related to the individual fitness. Our contributions are as follows.

About first sub-problem. Similarity guide matrix (SGM) is defined by us to express the population (or subpopulation) distribution in coding space, and two SGMs (i.e. the population SGM SG and the elitism population SGM SE) are used as reference points. Diversity subpopulation searches in opposite-direction region (od-Region), which is obtained by using SG and negative perturbation. This behavior of diversity subpopulation in Fig. 7 always tries to increase Hamming distances between themselves and the current population. Conversely, intensification subpopulation searches in forward-direction region (fd-Region), which is obtained by using SE and positive perturbation. This behavior of intensification subpopulation in Fig. 8 always tries to decrease Hamming distances between themselves and the current elitism population. Therefore, the subpopulation (i.e. diversity subpopulation), which always tries to increase Hamming distances between themselves and the current population, is more suitable for injecting randomness.

About second sub-problem. Linear diversity index (LDI) is defined by us to measure the individual density in coding space. According to LDI, we control the search efforts of diversity and intensification subpopulations in od-Region and fd-Region, respectively. Therefore, by using population diversity measure (i.e. LDI), we decide how much randomness is injected to different behavior-based subpopulations (i.e. diversity and intensification subpopulations).

The rest of the paper is organized as follows. In Sect. 2, we review related works. In Sect. 3, we introduce BEGA. In Sect. 4, we provide experimental results. In Sect. 5, we give our findings.

2 Related works

2.1 Population diversity measures

Burke et al. (2002, 2004) make a detailed survey of population diversity measures, and analyze the correlated relationships among different measures. Črepinšek et al. (2013) categorize population diversity measures into genotype level, phenotype level, composite genotype–phenotype level measures. In essence, high genotype-level diversity of a population maybe not lead to high behavioral diversity, and behavioral diversity is more important than genotype-level diversity (Darwen and Yao 2001).

2.1.1 Widely used diversity measures

Widely used diversity measures are based on the distances of individual coding. This kind of population diversity measures are based on summing (or averaging) Euclidean distances from every individual coding to the center-point, or based on summing (or averaging) Euclidean distances between all pairs of individual coding (Morrison and De Jong 2002; Ursem 2002). Their linearity features are based on the distance’s linearity in coding space, and this is also their widely used reason. When using these measures, computation cost should be considered.

2.1.2 Recently proposed diversity measures

Mattiussi et al. (2004) propose a diversity measure that is able to be used for the variable length or highly reorganizable coding. Lacevic et al. propose a measure based on Euclidean minimum spanning trees, a class of power-mean-based measures, and three measures based on discrepancy (Lacevic et al. 2007; Lacevic and Amaldi 2010). McGinley et al. (2011) propose healthy population diversity that expresses the population diversity in solution space. These measures are able to improve the performances of one EA, when their features are consistent with specific problems. These measure definitions are generally based on nonlinear functions. This has an effect on their generalities.

2.2 Population diversity controls

Exploration (i.e. diversity-based search in the entirely new regions) and exploitation (i.e. intensification-based search in the neighborhoods of the previously visited regions) are two cornerstones of problem solving by search (Eiben and Schippers 1998; Črepinšek et al. 2013). De Jong (2007) gives us a historical overview of parameter setting, which includes parameter tuning before the run and parameter control during the run (Eiben et al. 1999). In this paper, parameter-tuning-based, parameter-control-based, and strategy-based controls are reviewed as follows.

2.2.1 Parameter-tuning-based controls

Smit and Eiben discuss three layers (i.e. design, algorithm, and application layers) of parameter tuning (Smit and Eiben 2009; Eiben and Smit 2011). Generally, parameter setting of one operator is easy to be set, because this operator function is single. For instance, the function of each operator is single in simple GA (SGA). Thus, to find a satisfactory parameter setting is easy in SGA. Choosing random number generators (RNGs) is important to the robustness of EAs, and the study between RNGs and the robustness of EAs is a neglected research (Eiben and Smit 2011).

2.2.2 Parameter-control-based controls

CHC (Eshelman 1991) uses convergence threshold to decide whether to reinitialize the population. According to saw-tooth function, saw-tooth GA reinitializes a part of populations (Koumousis and Katsaras 2006). According to the numbers of successful matings and failed matings, variable dissortative mating GA (VDMGA) adjusts convergence threshold (Fernandes and Rosa 2006, 2008). Instead of the fixed crossover probability and mutation probability, zhang et al. (2007) propose the fuzzy-controlled crossover probability and mutation probability. Alba and Dorronsoro (2005) study the influence of the ratio for structured dynamical populations in cellular GA. In dual-population GA (DPGA), according to the distance fitness function, reserve population adjusts the distance from main population that is evaluated by using the fitness function (Park and Ryu 2010). Through crossbreeding, DPGA protects population diversity. According to different parameter-control-based rules, these researches realize population behavior diversities. In addition, population behavior diversities are able to be evaluated by using multi criteria analysis to improve the robustness of EAs.

2.2.3 Strategy-based controls

Multi optimized rules for tournament selection are proposed by Chen et al. (2009), and these rules are characterized by their dynamic properties. The individuals, with the poor fitness and lower contribution to population diversity, are replaced with offspring in contribution of diversity/replace worst strategy (Lozano et al. 2008). Brain storm optimizations (BSOs) are proposed as a new swarm-based algorithm (Shi 2015). Finding the search regions with the promising solutions in BSOs is worthwhile work. But, it is difficult in real coding space. For instance, one dimension in binary coding space only contains two coding types: ‘0’ and ‘1’. However, one dimension in real coding space contains infinitely many real numbers. BSOs are characterized by grouping strategies, such as K-means-clustering, K-medians-clustering, simple, random, and objective-space-based grouping strategies (Zhu and Shi 2015; Zhan et al. 2012; Cao et al. 2015; Shi 2015). In addition, BSOs are inspired by brain storm process, and brain storm process includes the thinking process. In some sense, the thinking process is a highlight of BSOs. Instead of Gaussian distribution, Lee and Yao (2004) propose a mutation based on Levy distribution, and this distribution has an infinite second moment. Thus, this mutation is more likely to generate offspring, which are far away from parents. Scatter search (Glover 1997) explores solution space by using a set of reference individuals to generate offspring, which simultaneously uses diversification and intensification strategies (Resende et al. 2010). Based on this systematic set of reference individuals, offspring generally obtain better performances.

These strategies are important to balance exploration and exploitation in EAs. To find optimal solutions, there are three obstacles: the huge size of coding space, the huge size of solution space, and various fitness-function-mapping relationships (FRs). If one FR is one-to-one linear mapping, finding optimal solutions is easy by using uniformly-spaced sampling method, even for the huge sizes of coding and solution spaces. Of the three obstacles, various FRs are the most difficult. Dynamic varying rules for population diversity strategies are the interesting research to solve various FRs.

2.2.4 Discussions

In essence, population diversity measures indirectly or directly express the population distribution in coding, solution, or composite coding-solution spaces. Population diversity controls are different in many respects, especially for balancing exploration and exploitation.

How to control population diversity by using diversity measures is an interesting problem. Generally, feedback control is a good scheme. According to diversity measures, feedback control schemes do population diversity controls in real time. In feedback control schemes, realizing population behavior diversities is the key to success. Each behavior-based method should only search a single region. Moreover, we should avoid that two or more behavior-based methods search a same region.

3 Balanced-evolution genetic algorithm

In BEGA, SG and SE are reference points for diversity and intensification subpopulations, respectively.

3.1 Similarity guide matrix

3.1.1 Motivation

Population similarity researches focus on the similarity of individual coding. Schema is a template that identifies a subset of strings with similarities at certain string positions (Holland 1975). Probability vector in compact GA expresses coding similarity in binary-coding problems (Harik et al. 1999). As a kind of estimation of distribution algorithms, quantum-inspired evolutionary algorithm (QEA) uses Q-bit to represent the population in binary-coding problems (Han and Kim 2002; Platel et al. 2009).

The motivation of SGM is to express the population distribution in coding space. Different from the above researches, we use SGM as the reference point in coding space. According to different reference points, we define different search regions. Different from probability vector and Q-bit, SGM is able to be suitable for binary-coding and symbol-coding problems, simultaneously.

3.1.2 Definition

SGM is a two-dimensional matrix to express the population (or subpopulation) distribution in coding space. The horizontal and vertical axes of this matrix correspond to coding position (also called gene location, i.e. j in Eq. (1)) and coding type (also called gene type, i.e. i in Eq. (2)), respectively. The definition of SGM is given by

$$S_{G} = \{ \begin{array}{*{20}c} {s_{v1} } & {s_{v2} } & \ldots & {s_{vj} } & \ldots & {s_{vN} } \\ \end{array} \} \begin{array}{*{20}c} {} & {1 \le j \le N} \\ \end{array}$$
(1)
$$s_{vj} = [s_{0,j} \, \cdots \, s_{i,j} \, \cdots \, s_{{b_{j} ,j}} ]^{T} \begin{array}{*{20}c} {} & {0 \le i \le b_{j} } \\ \end{array}$$
(2)
$$B = \{ \begin{array}{*{20}c} {b_{1} } & {b_{2} } & \ldots & {b_{j} } & \ldots & {b_{N} } \\ \end{array} \} \begin{array}{*{20}c} {} & {1 \le } \\ \end{array} b_{j} \le b_{\hbox{max} }$$
(3)
$$X = \{ \begin{array}{*{20}c} {x_{1} } & {x_{2} } & \ldots & {x_{j} } & \ldots & {x_{N} } \\ \end{array} \}$$
(4)
$$s_{i,j} = p_{i,j} /P\begin{array}{*{20}c} {} & {0 \le i \le b_{j} } \\ \end{array}$$
(5)
$$\begin{array}{*{20}c} {\forall j} & {\begin{array}{*{20}c} {} & {\sum\limits_{i = 0}^{{b_{j} }} {s_{i,j} } = 1} \\ \end{array} } \\ \end{array} \begin{array}{*{20}c} {} & {0 \le i \le b_{j} } \\ \end{array}$$
(6)

where N is coding length, and svj is similarity guide vector (SGV). bj is the upper bound of each genome xj, and bmax is the upper bound of each bj (1 ≤ bj ≤ bmax). In symbol-coding problems, bmax is greater than 1, and bj may be any integer from 1 to bmax. In binary-coding problems, bmax and bj are always equal to 1. X is individual coding, and xj is the value of each genome (0 ≤ xj ≤ bj). pi,j is the number of individuals with i’s at the jth position, and P is population size.

In Fig. 1, a symbol-coding example of SGM is given to demonstrate the relationships among coding position, coding type, individuals, and SGM. In addition, a binary-coding example of SGM is given in Fig. 2. In a global view, Fig. 2 demonstrates the relationship between coding space and its uniform distribution. Conversely, in a local view, Fig. 2 explains the relationship between individuals and their SGM. By comparing this global view with this local view, SGM is suitable for using as a reference point to express the population distribution in coding space.

Fig. 1
figure 1

A symbol-coding example to demonstrate the relationships among coding position, coding type, individuals, and similarity guide matrix (N = 3, P = 4, bmax = 2)

Fig. 2
figure 2

A binary-coding example to demonstrate that similarity guide matrix is suitable for using as a reference point to express the population distribution in coding space (N = 3, P = 4, bmax = 1)

3.1.3 Principle analysis

The principle of SGM is analyzed as follows. First, SGM indicates the search directions of the promising solutions (i.e. solutions with development potential), especially for SE. In unimodal problems, the promising solutions may be near the local/global optimal solutions. In multimodal problems, the promising solutions may be near the local/global optimal solutions, or may be development-potential solutions that are able to make populations escape from the local optimal solutions. Secondly, SGMs are used as reference points to define different search regions. This is feasible. In coding space, we only know individual coding. By comparing with the huge size of coding space, these information is very little. But, based on SGMs in Eq. (1), this kind of reference point is sensitive to reflect the coding change of each individual. Therefore, by using this kind of reference point, the search regions of different subpopulations are able to be defined.

To further demonstrate the principle of SGM, a unimodal problem case is analyzed as follows. Figure 3a shows the relationships among coding position, coding type, coding space, its uniform distribution, individual coding, fitness, and the global optimal solution. With generations increasing (from Fig. 3b–c), SGM becomes nonuniform. In Fig. 3c, each SGV gives the value judgment at each position of the promising solution. In addition, these indicate that SGM is sensitive to the coding change of each individual.

Fig. 3
figure 3figure 3

A unimodal problem case analysis to demonstrate the principle of similarity guide matrix. a Three-dimensional graph about coding position, fitness, and individuals in coding space. b The population distribution and similarity guide matrix in the early-evolution generations. c The population distribution and similarity guide matrix in the later-evolution generations

The process of breeding new individuals, based on SGM, is given in Algorithm 1.

figure a

3.2 Linear diversity index

3.2.1 Motivation

In biology, biological diversity measures are important to understand the degree of species richness, such as Jaccard Index (Jaccard 1912). Diversity-related concepts in biology are also suitable for EAs. We make some modifications to Jaccard Index for its applications in EAs. These modifications mainly focus on the differences between biology and EAs. Inspired by Jaccard Index, we propose LDI. The motivation of LDI is to measure the individual density in coding space. From a coding-space perspective, LDI is able to be used for population diversity control.

3.2.2 Definition

First, we define the center-point individual Xcp of the population by

$$X_{cp} = \{ IndexMax(s_{v1} ), \ldots IndexMax(s_{vj} ), \ldots IndexMax(s_{vN} )\}$$
(9)

where IndexMax() returns the index of the maximum value in SGV. Xcp includes the index of the maximum value in each SGV. The reason is that the index of the maximum value in each SGV is equal to the majority value for all of coding values in the corresponding coding position. For instance, the index of the maximum value in SGV Sv1 of Fig. 1 is equal to 0, and the majority value in coding position No. 1 of Fig. 1 is also equal to 0.

LDI Dl expresses the individual density around Xcp in coding space, which is defined by

$$D_{l} = \sum\limits_{p = 1}^{P} {D_{p} } /P\begin{array}{*{20}c} {} & {1 \le p \le P} \\ \end{array}$$
(10)
$$D_{p} = Hamdis(X_{p} ,X_{cp} )/N$$
(11)

where P is population size, N is coding length, and Hamdis() returns Hamming distance between Xcp and the pth individual Xp.

3.2.3 Principle analysis

The principle of LDI is analyzed as follows. First, Xcp always exists in coding space, and does not always exist in solution space. In Eq. (9), Xcp may be regarded as individual coding. However, this individual may not be a feasible solution in the solution space of constrained optimization problems. The reason is that this Xcp is not satisfied with constraints, and we cannot obtain the fitness value of Xcp. However, Xcp always exists in coding space, because coding space is not related with constraints and fitness. Secondly, in coding space, linearity is an important condition to control population diversity. In Eq. (9), the approximation between Xcp and SG is acceptable. From a coding-position estimation perspective, each element in Xcp is the majority value in each coding position. Using the majority value as the center point in each coding position is suitable for the center-point meaning to measure diversity. In addition, Hamming distance between two individuals is linear, and LDI is also linear. Thus, LDI is suitable for population diversity control.

To further demonstrate the principle of LDI, we give a case analysis. In Fig. 4a, individual-loose distribution means long Hamming distances between Xcp and each individual, and indicates good diversity. In Fig. 4b, individual-close-together distribution means short Hamming distances between Xcp and each individual, and indicates poor diversity. With generations increasing, Xcp changes continually. Meanwhile, population diversity gradually decreases from individual-loose distribution to individual-close-together distribution.

Fig. 4
figure 4

A case analysis of linear diversity index to demonstrate the individual density around the center-point individual Xcp. a Individual-loose distribution. b Individual-close-together distribution

3.3 Algorithm design

According to SG and SE as reference points, BEGA use negative and positive perturbations to control the search directions of diversity and intensification subpopulations, respectively. In addition, we explain the balanced-evolution strategy in Sects. 3.3.2 and 3.3.3.

3.3.1 Framework

The procedure of BEGA is given in Algorithm 2. Balanced-evolution procedure in Fig. 5 corresponds to Step 3Step 7 in Algorithm 2.

figure b
Fig. 5
figure 5

The balanced-evolution procedure of the population in each generation

We define the first and second stages by using LDI Dl, control amplitude CA, and shift limit of population diversity Dsl. The first stage is the generations for LDI Dl > Dsl, and the second stage is the generations for LDI Dl ≤ Dsl. In Fig. 6a, LDI Dl decreases gradually in BEGA, because this is a general law of population diversity in EAs. In addition, CA is used to control the amplitudes of negative and positive perturbations in diversity and intensification methods, respectively. In Fig. 6b, CA is equal to LDI Dl in the first stage, and is equal to Dsl in the second stage.

Fig. 6
figure 6

An example of linear diversity index Dl, control amplitude CA, and shift limit of population diversity Dsl in the first and second stages

Intensification and diversity subpopulations after updating are 50% of the best individuals in the population. In addition, the procedure of updating elitism population is as follows. In the first stage, by using elite selection, elitism population after updating (EPAU) is obtained for elitism population and intensification subpopulation after updating. In the second stage, EPAU is also obtained by using elite selection. Moreover, Hamming distance between two fitness-adjacent individuals in EPAU is required to be greater than the minimal distance md. If the size of the chosen individuals is less than the size of elitism population Pe, we use the best fitness individuals as other individuals in EPAU.

3.3.2 Diversity method

figure c

od-Region is determined by MN. In Fig. 7a, negative perturbation changes the coding distribution of temporary individuals. In Fig. 7b, comparing with Hamming distances between diversity subpopulation and diversity subpopulation, Hamming distances between temporary individuals and diversity subpopulation increase. Moreover, the new diversity subpopulation is obtained by using temporary individuals and diversity subpopulation. Therefore, Hamming distances between the new diversity subpopulation and diversity subpopulation also increase. This realizes the exploration of the new diversity subpopulation, and improves the robustness of BEGA.

Fig. 7
figure 7

An example to demonstrate that Hamming distances between temporary individuals and diversity subpopulation increase (N = 4, bmax = 1). a The relationship between negative perturbation and temporary individuals. b The sum of all-pair-individual Hamming distances between temporary individuals and diversity-subpopulation individuals is 50. Conversely, the sum of all-pair-individual Hamming distances between diversity-subpopulation individuals and diversity-subpopulation individuals is 36

To explain the behavior features of diversity subpopulation about the balanced-evolution strategy, we use the examples in Figs. 6 and 7. The behavior features of diversity subpopulation are as follows. CA is important to control the amplitude of negative perturbation. In Fig. 6b, with LDI Dl decreasing, CA is decreasing in the first stage, and then is equal to Dsl in the second stage. This kind of CA setting is suitable for improving the exploration abilities of diversity subpopulation, when the population diversity is decreasing in Fig. 6a. In Fig. 7, CA setting is also suitable for increasing Hamming distances between temporary individuals and diversity subpopulation.

3.3.3 Intensification method

figure d

fd-Region is determined by MP. Similarly to Fig. 7, an example of intensification method is given in Fig. 8. Positive perturbation is the reason that Hamming distances between the new intensification subpopulation and elitism population decrease. This realizes the exploitation of the new intensification subpopulation, and improves the efficiency of BEGA.

Fig. 8
figure 8

An example to demonstrate that Hamming distances between temporary individuals and elitism population individuals decrease (N = 4, bmax = 1). a The relationship between positive perturbation and temporary individuals. b The sum of all-pair-individual Hamming distances between temporary individuals and elitism-population individuals is 44. Conversely, the sum of all-pair-individual Hamming distances between elitism-population individuals and elitism-population individuals is 48

To explain the behavior features of intensification subpopulation about the balanced-evolution strategy, we use the examples in Figs. 6 and 8. The behavior features of intensification subpopulation are as follows. CA is important to control the amplitude of positive perturbation. In Fig. 6b, with LDI Dl decreasing, CA is decreasing in the first stage, and then is equal to Dsl in the second stage. This kind of CA setting is suitable for improving the exploitation abilities of intensification subpopulation, when the population diversity is decreasing in Fig. 6a. In Fig. 8, CA setting in Fig. 6a is also suitable for reducing Hamming distance between temporary individuals and elitism population.

3.4 Discussions

3.4.1 Cooperation

The relationship between diversity subpopulation and intensification subpopulation is a kind of cooperation. First, the definitions of od-Region and fd-Region are the foundation of the cooperation. To search od-Region and fd-Region, diversity and intensification subpopulations cooperate with each other. The diversity-subpopulation size is equal to the intensification-subpopulation size. Thus, their search efforts are the same. Secondly, another kind of cooperation is the individual exchanges between diversity subpopulation and intensification subpopulation. In the second stage, there are more and more individual exchanges, which should be called individual shuffling. This kind of individual shuffling is important to find the new promising solutions.

3.4.2 Efficiency

The no-free-lunch-theorem is always there. Although SGM and LDI are useful, their computation costs have an effect on the efficiency of BEGA. Thus, each kind of computation must promote the population evolution. To realize this goal, we pay more attention to selecting the reasonable search region of each individual. Details are as follows. First, we use SE in intensification method, and use positive perturbation to reduce Hamming distances between temporary individuals and elitism population. This method is effective to improve the convergence speed of BEGA, especially in the first stage. Secondly, we use SG in diversity method, and use negative perturbation to increase Hamming distances between temporary individuals and diversity subpopulation. In addition, we use the multiplier factor of mutation operator ms to scatter diversity subpopulation. These methods of scattering diversity subpopulation are good to explore much more search regions in the first stage. Thus, this strategy is able to improve the convergence speed of BEGA, especially in the second stage.

3.4.3 Adaptability

To avoid falling into the local optimal solution, we use SG and SE. The multimodal problems lead to the differences between SG and SE. These differences are useful for making the population escape from the local optimal solution. To demonstrate the abilities of escaping from the local optimal solution, a multimodal problem example is given in Fig. 9. First, Fig. 9a presents a multimodal problem about the relationships among coding position, fitness, and individuals in coding space. Secondly, in Fig. 9b, most of individuals in the population (i.e. individual No. 1–No. 4) are close to the local optimal solution, and fall into the local optimal solution. However, two SGMs are different in Fig. 9b. SG is based on the population, which includes individual No. 1–No. 5. SE is based on the elitism population, which includes individual No. 2 and No. 5. These differences between SG and SE are useful for escaping from the local optimal solution. Thirdly, with the population fitness increasing in Fig. 9c, the fitness threshold of the elitism population also increases. Therefore, SE gradually changes the population distribution from Fig. 9b to Fig. 9c. Thus, the population escapes from the local optimal solution.

Fig. 9
figure 9figure 9

A multimodal problem example to demonstrate how population escapes from the local optimal solution by using similarity guide matrices of the population and the elitism population. a Three-dimensional graph about coding position, fitness, and individuals in coding space. b Most of individuals in the population fall into the local optimal solution. c Population gradually escapes from the local optimal solution by using similarity guide matrix of the elitism population

4 Experiments and discussions

4.1 Experimental scheme

4.1.1 Benchmarks

We used 12 combinatorial optimization problems as benchmarks in Appendix 1.

4.1.2 Compared algorithms

We used CHC, DPGA, VDMGA, and QEA as compared algorithms for binary-coding problems, and used greedy GA (GGA) as the compared algorithm for symbol-coding problems in Appendix 2.

4.1.3 Used parameter settings

See Table 1.

Table 1 Used parameter settings in this paper

4.2 Experiments on convergence

To demonstrate the effectiveness of BEGA, the effectiveness results were given in Table 2. BEGA were acceptable. In Fig. 10, we provided the mean of the best-individual-fitness ranks in binary-coding problems. This indicates that the convergence performances of BEGA are consistent for 100 runs.

Table 2 Effectiveness results: MEAN and STD are the mean and standard deviation of the best individual fitness for 100 runs, respectively.  Efficiency results: ACT is the average computation time for 20 runs
Fig. 10
figure 10

The mean of the best-individual-fitness ranks for 100 runs in binary-coding problems

Because SGM is an estimation parameter, the computation-time test is necessary to demonstrate the efficiency of BEGA. In Table 2, BEGA was first in 8 benchmarks, and second in other 4 benchmarks. This implies that intensification method is effective to improve the convergence speed.

The objective of BEGA is only to improve the best individual fitness, and BEGA does not consider other fitness-related indexes. This is the difference of BEGA. As shown in Fig. 11, the fitness-mean and fitness-minimum curves of BEGA were not better than others. The reason is that diversity subpopulation always tries to increase Hamming distances between themselves and the current population. This indicates that the balanced-evolution strategy is effective.

Fig. 11
figure 11

The fitness curves of PPeaks problem for 100 runs to demonstrate that the objective of BEGA is only to improve the best individual fitness

4.3 Experiments on different parameters

4.3.1 Different scale problems

In Fig. 12, we demonstrated the robustness of BEGA for different scale problems. With coding length increasing, the best individual fitness of BEGA became much better than others. This indicates that the advantage of SGM as an estimation parameter becomes obvious, when coding length increases.

Fig. 12
figure 12

100-run results to demonstrate the robustness of BEGA for different scale problems (N is the coding length)

4.3.2 Different population sizes

In Fig. 13, we demonstrated the robustness of BEGA for different population sizes. Different population sizes had a little effect on the robustness of BEGA. For instance, there were 3.9% differences of BEGA between population size P = 90 and population size P = 40 in Fig. 13a.

Fig. 13
figure 13

100-run results to demonstrate the robustness of BEGA for different population sizes (P is population size)

4.3.3 Different shift limit of population diversities

In Fig. 14, we demonstrated the search-region-control function of Dsl for diversity subpopulation in the second stage. When Dsl was respectively set to 0.075 and 0.35, the fitness and LDI curves simultaneously shifted in 50th generation. Fitness-curve and LDI-curve gaps indicate that the ranges of two od-Regions for Dsl = 0.075 and Dsl = 0.35 are entirely different. This implies that the search-region-control function of Dsl is effective for diversity subpopulation.

Fig. 14
figure 14

5-run results to demonstrate the search-region-control function of Dsl for diversity subpopulation

5 Conclusions

In BEGA, the balanced-evolution strategy realizes the population-evolution balance between the exploration of diversity subpopulation and the exploitation of intensification subpopulation. From a coding-space perspective, this case study solves first and second sub-problems in Sect. 1, which also demonstrates the feasibility of the feedback control scheme. It is worth noting that BEGA is able to directly solve bounded knapsack problem (i.e. symbol-coding problem) as one EA-based solver, and does not transform bounded knapsack problem into an equivalent binary knapsack problem.