1 Introduction

Optimization problems are frequently encountered in numerous science and engineering fields. Traditional problems characterized as continuous, unimodal, differentiable, and linear were widely researched prior to the 1960s. However, real-world optimization problems tend to be nonlinear, discontinuous, non-differentiable, and multimodal. Hence, traditional optimization methods, such as Newton’s method (Roy and Sevick-Muraca 1999) and quasi-Newton methods (Setiono and Hui 1995), cannot be used. The need for effective optimization algorithms with the ability to solve complex real-world optimization problems led to the development of many evolutionary algorithms (EAs), such as genetic algorithms (GAs) (Holland 1975), particle swarm optimization (PSO) (Kennedy and Eberhart 1995), ant colony optimization (ACO) (Dorigo and Gambardella 1997), differential evolution (DE) (Storn and Price 1997), and the artificial bee colony (ABC) algorithm (Karaboga and Basturk 2007). EAs can avoid becoming trapped in local optima when solving many real-world optimization problems that traditional methods cannot solve. However, EAs may also exhibit poor exploitation on some complex, high-dimensional optimization problems, which motivates efforts to improve them. These EAs have been used to solve many practical problems (Li and Pan 2015; Yi et al. 2016).

EAs inspired by biogenetics, natural phenomena, physical phenomena, and social phenomena have been proposed. These are known as global optimization algorithms, as they avoid becoming trapped in local optima in order to find globally optimal solutions. Generally speaking, an EA starts with an initial population of candidate solutions. New solutions are then generated from solutions in the previous population. Next, the quality of each new solution is evaluated by a fitness function. Finally, a selection process is used to produce a new population of solutions. This iterative process is repeated until the optimum or near-optimum solution is reached.
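The generic iterative process described above can be sketched in Python. This is an illustrative skeleton, not the authors' implementation; the population size, mutation operator, and toy objective are assumptions for demonstration only.

```python
import random

def evolve(fitness, init_pop, variation, generations=100):
    """Generic EA skeleton: initialize, vary, evaluate, select, repeat."""
    pop = init_pop()
    for _ in range(generations):
        # Generate new candidate solutions from the previous population.
        offspring = [variation(pop) for _ in pop]
        # Evaluate and select: keep the better of parent and offspring (minimization).
        pop = [o if fitness(o) < fitness(p) else p for p, o in zip(pop, offspring)]
    return min(pop, key=fitness)

# Toy usage: minimize f(x) = x^2 over real-valued scalars.
best = evolve(
    fitness=lambda x: x * x,
    init_pop=lambda: [random.uniform(-5, 5) for _ in range(20)],
    variation=lambda pop: random.choice(pop) + random.gauss(0, 0.5),
)
```

Because selection is greedy per slot, the best fitness in the population never worsens between generations, matching the "repeat until the optimum or near-optimum is reached" loop described above.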

The ABC algorithm is a relatively new approach that was proposed by Karaboga in 2005 (Karaboga and Basturk 2007). Since then, the interest in ABC algorithms has increased rapidly. Experimental results show that the performance of ABC is better than that of other EAs in many problems, because it has fewer control parameters and is easier to apply (Karaboga and Basturk 2007). ABC has been widely used to solve many real-world problems. For example, ABC has been applied to a loudspeaker design problem (Zhang et al. 2014), and for the design of two-channel quadrature mirror filter banks (Agrawal and Sahu 2015). ABC was also used to minimize the makespan for single machine batch processing with nonidentical job sizes (Al-Salamah 2015). Horng adopted ABC for a stochastic economic lot scheduling problem (Horng 2015), and Pan solved the large-scale hybrid flow shop scheduling problem with ABC (Li and Pan 2015). ABC has also been employed to solve an interest-based forwarding problem (Xia et al. 2015).

The ABC algorithm is a simple, efficient, effective, and robust evolutionary optimization method. As a result, ABC has emerged as a potential tool for solving local and global optimization problems (Gu and Sheng 2013; Wen et al. 2015). An increasing number of numerical benchmark functions are being employed to evaluate the performance of ABC. However, there is no specific algorithm that can achieve the best solution for all optimization problems. The ABC algorithm also has some disadvantages. For instance, it may occasionally stop proceeding toward the global optimum even though the population has not converged to a local optimum; it can struggle with certain classes of optimization problems; and it suffers from long computation times because of its stochastic nature.

It is well known that EAs include both exploration and exploitation strategies. However, as exploration and exploitation are inherently contradictory, they should be well balanced to ensure good performance. Thus, many new and self-adaptive ABC algorithms have been proposed (Babaoglu 2015; Bansal et al. 2013; Gao et al. 2013, 2014; He et al. 2013; Kang et al. 2013; Li and Yin 2014; Rajasekhar and Pant 2014; Liu et al. 2015). For example, in the algorithm of Liu et al. (2015), the variation equations were inspired by DE/rand/1 and DE/current-to-rand/1 (Epitropakis et al. 2011), and a self-adaptive modification rate (MR) based on a successful update probability was proposed to further enhance the convergence rate; however, this issue was not investigated further. In order to tune the parameters of ABC during the evolutionary process, Bansal et al. (2013) proposed an adaptive version of ABC where the step size in solution modification and the ABC parameter limit are determined adaptively based on the current fitness values. These self-adaptive algorithms have mainly focused on determining the self-adaptive parameters, whereas the focus of this article is on the adaptive search strategies themselves. Usually, there is only one candidate solution generating strategy (CSGS) in each algorithm. The CSGS tends to be either good at exploration or good at exploitation. Thus, it is difficult to simultaneously achieve the two goals of exploration and exploitation using one CSGS. Hence, there is a need to search for an improved optimization method. In this paper, we propose a self-adaptive artificial bee colony algorithm based on the global best (SABC-GB) for global optimization. In SABC-GB, several different CSGSs are employed simultaneously. This modification allows us to tune the balance between the convergence rate and the robustness of the algorithm.
As a result, it is expected that the convergence speed of SABC-GB can be accelerated to enable better solutions to be obtained within an acceptable convergence time.

The K-means algorithm, proposed by Macqueen (1967), is an important clustering technique (Macqueen 1967). Its goal is to divide data sets into several clusters, where the data within the same cluster are similar and those in different clusters are as dissimilar as possible. Because of its simple description and high efficiency, K-means has been widely used since the 1970s. However, K-means also has some shortcomings: It is difficult to determine the value of K in advance, it can become stuck at local optima, and it is sensitive to the initial centers. Thus, SABC-GB is applied to this real problem to overcome these shortcomings.

In this paper, the SABC-GB algorithm, which is based on self-adaptive strategies, is proposed. We modify the employed bee phase to improve the global optimization capability of the SABC-GB algorithm and use a novel probabilistic method to enhance the search ability of the onlooker bee phase. Furthermore, we change the initialization phase to avoid local minima, i.e., SABC-GB adopts chaotic systems and an opposition-based learning method to initialize the population. The remainder of this paper is organized as follows. Section 2 summarizes the conventional ABC and K-means algorithms. The SABC-GB algorithm is presented and analyzed in Sect. 3. In Sect. 4, we present and discuss the results of a series of experiments. The paper concludes with a discussion in Sect. 5.

2 Related work

2.1 Artificial bee colony algorithm

In the ABC algorithm, the location of a food source represents a feasible solution of the optimization problem, and the quality of solution is referred to as the fitness. There are three kinds of bees in the ABC algorithm: employed bees, onlooker bees, and scout bees. In the algorithmic model, the number of employed bees is always equal to the number of onlooker bees. It is important to note that the three types of bees can transmute into each other. In the initial stage, the population P is produced according to Eq. (1).

$$\begin{aligned} P = Lbound + rand_1*(Ubound - Lbound) \end{aligned}$$
(1)

where Lbound and Ubound are the lower and upper bounds, respectively, and \(rand_1\) is a random number from 0 to 1.

Each solution of P is a D-dimensional vector. The employed bees search around all the food sources for a maximum of MaxFES iterations. The search equation is shown in Eq. (2).

$$\begin{aligned} {V_{i,j}} = {X_{i,j}} + rand_2*({X_{i,j}} - {X_{k,j}}) \end{aligned}$$
(2)

where i represents the current individual, \(k \in \{ 1,2,\ldots ,SN\}\) with \({k \ne i}\) is a randomly chosen neighbor index, \(j \in \{ 1,2,\ldots ,D\}\) is a randomly chosen dimension index, \({rand_2 \in ( - 1,1)}\) is a uniform random number, and \({V_{i,j}}\) is the new solution in the next generation.
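Equations (1) and (2) can be illustrated with a short Python sketch. This is a minimal illustration of the initialization and employed-bee search steps, not the authors' code; the bound values and population size are arbitrary.

```python
import random

def init_population(sn, d, lbound, ubound):
    # Eq. (1): P = Lbound + rand1 * (Ubound - Lbound), rand1 in (0, 1)
    return [[lbound + random.random() * (ubound - lbound) for _ in range(d)]
            for _ in range(sn)]

def employed_bee_candidate(pop, i):
    # Eq. (2): perturb one randomly chosen dimension j of X_i using a
    # randomly chosen neighbour X_k (k != i), with rand2 in (-1, 1).
    sn, d = len(pop), len(pop[0])
    k = random.choice([idx for idx in range(sn) if idx != i])
    j = random.randrange(d)
    v = pop[i][:]
    v[j] = pop[i][j] + random.uniform(-1, 1) * (pop[i][j] - pop[k][j])
    return v

pop = init_population(sn=10, d=5, lbound=-5.0, ubound=5.0)
cand = employed_bee_candidate(pop, 0)
```

Note that only a single dimension of the solution is modified per search step, which is the standard ABC behaviour.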

After gathering honey, the employed bees compare the quality of the former food source with that of the new food. If the quality of the new food is higher than that of the previous one, the bees memorize the location of the new food source; otherwise, the old one is retained. When the search stage is finished, the employed bees go back to a dance area and transmit information about the food sources to the onlooker bees. According to this information, the onlooker bees choose good food sources from which to gather honey. The richer the source is, the higher the probability it will be selected. The computational formula is shown as (3).

$$\begin{aligned} {P_i} = fi{t_i}/\sum \limits _{k = 1}^{SN} {fi{t_k}} \end{aligned}$$
(3)

where \(fit_i\) is the fitness value of solution i, SN is the number of individuals, and \(P_i\) is the selection probability of the current solution.

Each individual corresponds to one trial counter. A food source that could not be improved through a set number of trials, referred to as the limit, is abandoned. If the trial counter of a solution exceeds the limit, the scout bee will abandon it and generate a new one.
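The onlooker-bee selection of Eq. (3) is a fitness-proportional (roulette wheel) rule. The following sketch, an illustration rather than the authors' implementation, shows both the probability computation and the wheel spin:

```python
import random

def selection_probabilities(fitnesses):
    # Eq. (3): P_i = fit_i / sum_k fit_k (fitness-proportional selection).
    total = sum(fitnesses)
    return [f / total for f in fitnesses]

def roulette_pick(probs):
    # Onlooker bees choose food source i with probability P_i.
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off

probs = selection_probabilities([4.0, 3.0, 2.0, 1.0])
```

With these example fitness values, the richest source (fitness 4.0) receives a selection probability of 0.4, four times that of the poorest.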

2.2 Parameter optimization in the K-means algorithm

The K-means clustering algorithm has become one of the most frequently used clustering algorithms (Ji et al. 2015; Macqueen 1967). However, K-means is strongly affected by the initial centers. Thus, we use the proposed algorithm to determine the clustering centers.

The idea of the K-means algorithm is as follows: Data are classified into clusters according to the Euclidean distance to the center of the cluster. Assume the number of data objects is n. The sample set is a set of numeric objects \({\text {X = }}\left\{ {{{{x}}_1},{{{x}}_2}, \ldots ,{{{x}}_n}} \right\} \), and the number of clusters is \(k (< n)\). The aim of the K-means algorithm is to search for a partition of X into k clusters such that objects belonging to the same cluster are as similar to each other as possible, whereas objects belonging to different clusters are as dissimilar as possible. The process of the K-means algorithm is as follows:

Step 1 Choose k data objects randomly from the original dataset X as initial cluster centers

$$\begin{aligned} C = \left\{ {{c_1},{c_2}, \ldots ,{c_k}} \right\} \end{aligned}$$
(4)

Step 2 According to the distance calculation Eq. (5), calculate the distances between each data object \({{{x}}_j}\left( {j = 1,2, \ldots ,n} \right) \) and the selected k cluster centers, defined as \({\text {d}}\left( {{{{x}}_j},{{\text {c}}_i}} \right) \left( {i = 1,2, \ldots ,k} \right) \). Each data object is thus classified into a cluster with the least Euclidean distance to the center of the cluster;

Step 3 Recalculate the average value of the data objects in each cluster as a new cluster center;

Step 4 Repeat Steps 2 and 3 until the center points no longer change or the objective function converges.

$$\begin{aligned} {\text {d}}\left( {j,i} \right) = \sqrt{{{\left( {{x_{j1}} - {x_{i1}}} \right) }^2} + {{\left( {{x_{j2}} - {x_{i2}}} \right) }^2} + \cdots + {{\left( {{x_{js}} - {x_{is}}} \right) }^2}} \end{aligned}$$
(5)

where both \(j = \left( {{x_{j1}},{x_{j2}}, \ldots ,{x_{js}}} \right) \) and \(i = \left( {{x_{i1}},{x_{i2}}, \ldots ,{x_{is}}} \right) \) are s-dimensional data objects. The aim of the K-means algorithm is to minimize the following objective function:

$$\begin{aligned} {\text {J}} = \sum \limits _{i = 1}^k {{{\sum \limits _{{x_j} \in {c_i}} {\left\| {{x_j} - {c_i}} \right\| } }^2}} \end{aligned}$$
(6)

where J represents the sum of squares of the distances between all objects in a cluster and the cluster center. \({c_i}\) represents the value of the i th cluster center, and \({x_j}\) represents the j th data object belonging to \({c_i}\).
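Steps 1–4 and the objective of Eq. (6) can be sketched compactly in Python. This is an illustrative implementation of standard K-means, not code from the paper; the toy data points are assumptions.

```python
import math
import random

def kmeans(points, k, iters=50):
    # Step 1: choose k data objects randomly as initial cluster centers.
    centers = random.sample(points, k)
    for _ in range(iters):
        # Step 2: assign each point to its nearest center (Eq. (5)).
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        # Step 3: recompute each center as the mean of its cluster.
        new = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centers[i]
               for i, cl in enumerate(clusters)]
        # Step 4: stop when the centers no longer change.
        if new == centers:
            break
        centers = new
    return centers

def objective(points, centers):
    # Eq. (6): sum of squared distances of each point to its nearest center.
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)

pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
centers = kmeans(pts, k=2)
```

On these two tight clusters the objective J drops well below 1, while a poor choice of initial centers for harder data is exactly the weakness SABC-GB is later used to address.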

Table 1 Test functions
Table 2 Test functions
Table 3 Test functions
Table 4 Statistical results obtained by SABC-GB algorithm on 25 independent runs with 30-D
Table 5 Statistical results obtained by SABC-GB algorithm on 25 independent runs with 50-D

3 Self-adaptive artificial bee colony algorithm

3.1 Population initialization

Population initialization is an essential phase, as it influences the convergence of the final solution. To reduce the deviation caused by different initial populations, this paper uses a fixed initial population. In the first iteration of the algorithm, we use Eq. (7) to generate a population. This population is saved for use in later operations.

$$\begin{aligned} population = {X_\mathrm{min}} + rand*({X_\mathrm{max}} - {X_\mathrm{min}}) \end{aligned}$$
(7)

where \(X_\mathrm{min}\) and \(X_\mathrm{max}\) are the lower and upper bounds of the search space, respectively, and rand is a uniform random number in the range (0, 1).

Inspired by Gao and Liu (2012), we also designed a novel initialization algorithm that uses chaotic systems (Alatas 2010) and an opposition-based learning method (Rahnamayan et al. 2008). In contrast to the method proposed by Gao and Liu (2012), the cosine function is replaced by a sine function and the range of values is halved, which avoids repeated initialization and yields better results. As shown in Algorithm 1, the detailed steps are as follows. First, the chaotic method is used to generate an initial population. Next, another population of the same size is produced by the opposition-based learning method. Both populations are then evaluated, and the solutions with the better fitness values are retained, giving the final initial population.
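The two-stage initialization can be sketched as follows. This is a hedged illustration: the paper does not fully specify its chaotic map, so a sine map \(ch \leftarrow |\sin(\pi \cdot ch)|\) is assumed here for demonstration; the opposition rule \(x_{opp} = lb + ub - x\) follows Rahnamayan et al. (2008).

```python
import math
import random

def chaotic_opposition_init(sn, d, lb, ub):
    """Two-stage initialization sketch: a chaotic sequence seeds one
    population, opposition-based learning mirrors it."""
    # Chaotic population via a sine map (assumed; the paper's map may differ).
    ch = random.random()
    chaotic = []
    for _ in range(sn):
        row = []
        for _ in range(d):
            ch = abs(math.sin(math.pi * ch))   # chaotic iterate in (0, 1]
            row.append(lb + ch * (ub - lb))
        chaotic.append(row)
    # Opposition-based learning: x_opp = lb + ub - x.
    opposite = [[lb + ub - x for x in row] for row in chaotic]
    return chaotic, opposite

def select_better(pop_a, pop_b, fitness):
    # Position by position, keep whichever candidate has the lower fitness.
    return [a if fitness(a) < fitness(b) else b for a, b in zip(pop_a, pop_b)]

sphere = lambda v: sum(x * x for x in v)
chaotic, opposite = chaotic_opposition_init(5, 3, -10.0, 10.0)
init_pop = select_better(chaotic, opposite, sphere)
```

Evaluating both a point and its opposite doubles the chance that at least one initial candidate lies near the optimum, which is the rationale for opposition-based initialization.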

3.2 Self-adaptive mechanism

During different stages of evolution, different CSGSs may be more effective. We develop the SABC-GB algorithm by introducing a self-adaptive mechanism into the ABC algorithm. We maintain a candidate strategy pool that includes several effective CSGSs with diverse characteristics. These CSGSs are used adaptively according to their previous performance in generating promising solutions. The core idea of the self-adaptive mechanism is as follows: During the evolution process, each CSGS is assigned a probability value. Each CSGS is selected according to the probability for each solution through roulette wheel selection. The new individuals are then generated by the selected CSGS.

Fig. 1

Convergence characteristics of SABC-GB and SABC-GB2 on fun1 and fun2. Note: because the range of possible fitness values (y-axis) is not known in advance, the fitness values are rescaled in this paper so that the curves are displayed clearly

Table 6 Optimization results of GABC, ABC/best/1, ABC/best/2 and SABC-GB on 25 test functions with 30-D
Table 7 Optimization results of GABC, ABC/best/1, ABC/best/2 and SABC-GB on 25 test functions with 50-D
Fig. 2

Convergence performance of SABC-GB, GABC, ABC/best/1 and ABC/best/2. a Fun2 with D \(=\) 30. b Fun2 with D \(=\) 50. c Fun13 with D \(=\) 30. d Fun13 with D \(=\) 50

In the initialization stage, each CSGS is given an equal selection probability. The selection probability is the reciprocal of the number of CSGSs. Flag matrices for successful and failed evolutions are denoted as nsFlagSABC and nfFlagSABC, respectively. LP is defined as the fixed number of previous generations. The total success and total failure flag matrices of each CSGS in LP are termed \(S_{kg}\) and \(F_{kg}\), respectively. These two matrices are initially null.

Algorithm 1

First, for each individual, one strategy is selected from the CSGS pool through roulette wheel selection. Next, with the selected CSGS, a new individual is generated and its fitness value is calculated. If the new fitness value is better than the previous one, the matrix nsFlagSABC is updated. Otherwise, the matrix nfFlagSABC is updated. After the evolution of all individuals, statistical information about the flag matrices nsFlagSABC and nfFlagSABC is recorded in \(S_{kg}\) and \(F_{kg}\). Once the iteration number reaches LP, new selection probabilities for all strategies, \(P_q'\), are calculated through \(S_{kg}\) and \(F_{kg}\). This process is described in Eq. (8), and the probabilities are normalized according to Eq. (9).

$$\begin{aligned} P_q' = \frac{{{S_{kg}}}}{{{S_{kg}} + {F_{kg}}}} + \varepsilon \end{aligned}$$
(8)
$$\begin{aligned} {P_q} = P_q'/\sum \limits _{q = 1}^Q {P_q'} \end{aligned}$$
(9)

where LP is a fixed integer, set to 10 in this paper, and Q is the number of CSGSs in the pool. \(P_q\) represents the rate at which candidate solutions generated by strategy q successfully entered the next generation within the previous LP generations. The small constant value \(\varepsilon \) is used to avoid division by zero and guarantees every strategy a nonzero selection probability.
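The probability update can be sketched in Python. Since Eq. (8) is reconstructed here from the surrounding text (success and failure counts \(S_{kg}\), \(F_{kg}\) over the last LP generations, with \(\varepsilon\) as a safeguard), treat this as an interpretation rather than the paper's exact formula:

```python
def update_probabilities(successes, failures, eps=0.01):
    """Eqs. (8)-(9) sketch: per-strategy success rates over the last LP
    generations, normalized into selection probabilities."""
    # Eq. (8): success rate plus a small eps; eps also covers strategies
    # that were never tried (zero successes and zero failures).
    raw = [s / (s + f) + eps if (s + f) > 0 else eps
           for s, f in zip(successes, failures)]
    # Eq. (9): normalize so the probabilities sum to 1.
    total = sum(raw)
    return [p / total for p in raw]

# Strategy 1 succeeded 8/10 times, strategy 2 only 3/10, strategy 3 1/10.
probs = update_probabilities(successes=[8, 3, 1], failures=[2, 7, 9])
```

After the update, the historically successful strategy dominates the roulette wheel while the weaker strategies retain a small, nonzero chance of being picked.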

3.3 Self-adaptive candidate strategies

We now investigate several effective CSGSs from the relevant literature and choose from among them to construct the strategy candidate pool. Theoretical studies of the optimal pool size and the selection of strategies used in the pool are attractive research issues and deserve further investigation.

The best solutions in the current population are very useful sources that can be used to improve the convergence performance. Gao proposed the ABC/best/1 and ABC/best/2 strategies (Gao et al. 2012), whereby the best solutions which have been explored are used to direct the movement of the current population. The corresponding strategies are devised as follows:

$$\begin{aligned} {V_{i,j}} = {X_{best,j}} + {\phi _{i,j}}({X_{r1,j}} - {X_{r2,j}}) \end{aligned}$$
(10)
$$\begin{aligned} {V_{i,j}} = {X_{best,j}} + {\phi _{i,j}}({X_{r1,j}} - {X_{r2,j}} + {X_{r3,j}} - {X_{r4,j}}) \end{aligned}$$
(11)

where the indices i, \(r_1\), \(r_2\), \(r_3\), and \(r_4\) are distinct integers chosen at random from \(\{ 1,2,\ldots ,SN\}\). \(X_{best}\) is the individual vector with the best fitness in the current population, and j is a random integer chosen from \(\{ 1,2,\ldots ,D\}\). \({\phi _{i,j}}\) is a random number in the range \([-1,1]\).

In addition, Zhu and Kwong proposed the GABC algorithm (Zhu and Kwong 2010), which can be written as follows:

$$\begin{aligned} {V_{i,j}} = {X_{i,j}} + {\phi _{i,j}}({X_{i,j}} - {X_{k,j}}) + {\psi _{i,j}}({X_{best,j}} - {X_{i,j}}) \end{aligned}$$
(12)

where \(X_{k,j}\) is a random individual in the population and \(X_{best,j}\) is the individual with the best fitness in the current population. The index k is a random integer chosen from \(\{ 1,2,\ldots ,SN\} \) with \(k \ne i\), and j is a random integer chosen from \(\{ 1,2,\ldots ,D\} \). \({\phi _{i,j}}\) is a random number in the range \([-1,1]\), and \({\psi _{i,j}}\) is a uniform random number in the range [0, C], where C is a nonnegative constant. In this paper, C is set to 1.5.
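The three candidate strategies in Eqs. (10)–(12) can be sketched as single-dimension update rules. This is an illustrative rendering (ABC/best/2 is written here with one shared \(\phi\) factoring the two difference vectors, which matches the common formulation but should be checked against Gao et al. 2012):

```python
import random

def abc_best_1(pop, best, i, j):
    # Eq. (10): V_ij = X_best,j + phi * (X_r1,j - X_r2,j)
    r1, r2 = random.sample([r for r in range(len(pop)) if r != i], 2)
    return best[j] + random.uniform(-1, 1) * (pop[r1][j] - pop[r2][j])

def abc_best_2(pop, best, i, j):
    # Eq. (11): V_ij = X_best,j + phi * (X_r1,j - X_r2,j + X_r3,j - X_r4,j)
    r1, r2, r3, r4 = random.sample([r for r in range(len(pop)) if r != i], 4)
    return best[j] + random.uniform(-1, 1) * (
        pop[r1][j] - pop[r2][j] + pop[r3][j] - pop[r4][j])

def gabc(pop, best, i, j, c=1.5):
    # Eq. (12): V_ij = X_ij + phi*(X_ij - X_kj) + psi*(X_best,j - X_ij)
    k = random.choice([r for r in range(len(pop)) if r != i])
    phi, psi = random.uniform(-1, 1), random.uniform(0, c)
    return pop[i][j] + phi * (pop[i][j] - pop[k][j]) + psi * (best[j] - pop[i][j])

pop = [[float(i + j) for j in range(3)] for i in range(6)]
best = pop[0]
v1, v2, v3 = (abc_best_1(pop, best, 0, 1),
              abc_best_2(pop, best, 0, 1),
              gabc(pop, best, 0, 1))
```

All three rules pull candidates toward \(X_{best}\), which is what gives the pool its exploitation bias; the random difference vectors supply the exploration.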

We use GABC, ABC/best/1, and ABC/best/2 as the CSGSs of the SABC-GB algorithm. For a certain optimization problem, the selection probability of a good CSGS should be higher than that of the others. As the generation number increases, the probabilities will evolve until the best CSGS for the problem has been determined. A description of this framework is presented in Algorithm 2.

Algorithm 2

4 Experiments

4.1 Benchmark functions and parameter settings

We employed 25 benchmark functions to test the performance of the SABC-GB algorithm. A detailed description of these test functions can be found in Suganthan et al. (2005). The functions are numbered from \(f_1\) to \(f_{25}\). The specific functions are listed in Tables 1, 2 and 3.

The number of food sources is equal to the number of employed or onlooker bees (SN). In the experiments, we considered solution dimensions of 30 and 50. To ensure a fair comparison, the populations for these dimensions were initialized using the same random seeds, and the number of employed bees was set to half the population size. The trial limit of the algorithm is related to the dimension D and the number of individuals ps: \(limit = 0.6 \times (ps/2) \times D\). The fixed number of previous generations is \(LP = 10\). The number of decision variables was the same for all 25 test functions. For each algorithm on each function, 25 independent runs were conducted, with 500,000 function evaluations (FEs) as the termination criterion.
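As a quick numerical check of the limit formula stated above (the population size of 100 is an assumed example, not a value from the paper):

```python
def trial_limit(ps, d):
    # limit = 0.6 * (ps / 2) * D, tying the abandonment threshold to both
    # the population size ps and the problem dimension D.
    return 0.6 * (ps / 2) * d

# For example, a population of 100 bees on a 30-dimensional problem
# gives limit = 0.6 * 50 * 30 = 900.
limit = trial_limit(100, 30)
```

Scaling the limit with D means higher-dimensional problems allow each food source more unsuccessful trials before a scout abandons it.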

4.2 Experimental results

We compared the SABC-GB algorithm with GABC (Zhu and Kwong 2010), ABC/best/1 (Gao et al. 2012), and ABC/best/2 (Gao et al. 2012). The experimental results are presented in the following tables. Through the experiments, we found that the performance of SABC-GB, which combines three self-adaptive candidate strategies, is better than that of the other algorithms. This section examines the performance of each of the three strategies employed in SABC-GB in comparison with a number of previous algorithms. The statistical results of SABC-GB are listed in Tables 4 and 5, which present the optimum, best, worst, mean, and standard deviation for each benchmark function. As shown in Tables 4 and 5, SABC-GB found the optimum solution in both 30 and 50 dimensions. From Table 4, it is apparent that SABC-GB reached the same values (best, worst, and mean) for Fun1, Fun7, Fun9, Fun21, Fun23, and Fun25 in the 30-D case, and similarly for Fun7, Fun8, Fun9, and Fun23 in the 50-D case. SABC-GB found near-optimal results for most functions. Overall, the efficiency of SABC-GB decreased as the number of dimensions increased.

4.2.1 Performance comparison between SABC-GB and SABC-GB2

As all of the test functions are minimization problems, smaller fitness values correspond to better solutions. For example, Fig. 1 illustrates the convergence characteristics of Fun1 and Fun2 in the 25 independent runs using SABC-GB and SABC-GB2 (SABC-GB2 is SABC-GB without Algorithm 1). It can be seen that, in the later stages of evolution, SABC-GB2 became trapped earlier than SABC-GB. Moreover, the convergence rate of SABC-GB2 was slower. The primary reason is that the distribution of the initial population has a significant influence on the exploitation ability of this algorithm, and good exploration directly affects the convergence rate and quality of the solution, especially in the later phases of evolution. Obviously, the novel initialization approach employing chaotic systems and opposition-based learning reduces the effects of these deficiencies. We thus reach the conclusion that initialization with Algorithm 1 is better than random initialization for almost all of the test functions.

Table 8 Optimization results of SABC-GB, SACABC, ISABC, QAABC and SSABC on 25 test functions with 30-D
Table 9 Optimization results of SABC-GB, SACABC, ISABC, QAABC and SSABC on 25 test functions with 50-D

4.2.2 Performance comparison between different strategies

The performance of SABC-GB was compared with that of GABC (Zhu and Kwong 2010), ABC/best/1 (Gao et al. 2012), and ABC/best/2 (Gao et al. 2012). Tables 6 and 7 present the mean and standard deviation of the solutions obtained by each algorithm over the 25 independent runs. Figure 2 presents a comparison of the convergence characteristics of the evolutionary processes in solving the three different kinds of problems.

Interestingly, SABC-GB found the exact minimum of Fun1 and Fun9 in all dimensions. We can also observe that GABC and ABC/best/2 do not find the precise solutions. Hence, these kinds of problems are not easy, but can generally be solved by SABC-GB with a high degree of accuracy. The solutions and the convergence rates of the different algorithms are shown in Tables 6, 7 and Fig. 2. From these results, it is clear that SABC-GB finds the most accurate solutions for most test functions and has the fastest convergence rate. In particular, SABC-GB gives highly accurate results for all test functions except Fun5 with D = 30, Fun7 with D = 30, Fun24 with D = 50 and Fun25 with D = 50. Moreover, the experimental results demonstrate that the efficiency of the four algorithms is similar on Fun8. The standard deviation of the solutions given by SABC-GB is worse in the 50-D case than for 30-D. Nonetheless, SABC-GB outperforms GABC, ABC/best/1, and ABC/best/2 significantly, with better mean values in all dimensions. In short, SABC-GB has better exploration and exploitation abilities.

It can also be seen from these results that the proposed algorithm finds better solutions for most functions. Figure 2 shows the change in fitness of SABC-GB, GABC, ABC/best/1, and ABC/best/2 for Fun2, Fun13, and Fun25, respectively. These functions are from different classes. Obviously, the convergence speed of SABC-GB is faster than that of the other algorithms. Although the optimization process is similar in the intermediate phase of some problems, SABC-GB eventually gives better solutions than the other algorithms. From these tables, we can conclude that the SABC-GB algorithm is better for global optimization problems than the other algorithms. The proposed SABC-GB avoids becoming trapped in local optima while improving the global search ability.

4.2.3 Performance comparisons between the proposed algorithm with other self-adaptive algorithms

To further verify the efficiency and superiority of SABC-GB, its performance was compared with that of some recent excellent self-adaptive algorithms, namely SACABC (Li and Yin 2014), ISABC (Rajasekhar and Pant 2014), QAABC (He et al. 2013), and SSABC (Liu et al. 2015). These algorithms were applied to unimodal, multimodal, and composite benchmark functions. The optimization results of SABC-GB, SACABC, ISABC, QAABC, and SSABC on 25 test functions with different dimensions are shown in Tables 8 and 9. Overall, the final solutions generated by SABC-GB are better than those given by the other self-adaptive algorithms for all benchmark functions. This is because SABC-GB employs a self-adaptive strategy selection mechanism and solves each function using the best strategy. Additionally, SABC-GB finds the most accurate solutions for Fun1 and Fun9 in all dimensions, whereas SACABC, ISABC, QAABC, and SSABC cannot find precise solutions for Fun1 and Fun9 in all dimensions. This implies that these two problems are not easy to solve, but can be handled by SABC-GB. It is interesting to note that SABC-GB, SACABC, QAABC, and SSABC found the best solution for Fun21. The optima given by SABC-GB for Fun7, Fun9, and Fun23 are very stable for all dimensions. As shown in Tables 8 and 9, the solutions for f15–f25, which are composite functions, are relatively poor. However, SABC-GB outperformed the other algorithms in terms of accuracy.

Figure 3 shows the convergence performance of SABC-GB, SACABC, ISABC, QAABC and SSABC in solving three different types of functions. The convergence performance of SABC-GB is similar to that of SACABC, QAABC, and SSABC on Fun12 and Fun25 with D = 30, but SABC-GB outperforms these self-adaptive algorithms significantly on Fun12 and Fun25 with D = 50. The efficiency of SABC-GB is analogous to that of the other algorithms at the beginning of Fun6 and Fun12. However, the other approaches fall into local optima, and so the performance of SABC-GB is superior at the end of the evolution process. Thus, the convergence performance of SABC-GB is better than that of the other algorithms in general, and the results demonstrate that the modified self-adaptive strategies are very effective.

Fig. 3

Convergence performance of SABC-GB, SACABC, ISABC, QAABC and SSABC. a Fun6 with D \(=\) 30. b Fun6 with D \(=\) 50. c Fun12 with D \(=\) 30. d Fun12 with D \(=\) 50. e Fun25 with D \(=\) 30. f Fun25 with D \(=\) 50

4.2.4 Application of SABC-GB in a real-world problem

To validate the feasibility of SABC-GB, we conducted experiments on real-life problems. This paper employs two standard datasets: Wine and Ionosphere (Frank and Asuncion 2010). Information about the datasets is shown in Table 10, and fitness comparisons between K-means and SABC-GB are shown in Table 11. In Table 11, the best results are in bold.

Table 10 Datasets involved in experiments
Table 11 Fitness comparisons between K-means and SABC-GB

First, the initial centers were generated randomly. K-means was used to calculate the distance from each point to the clustering centers. SABC-GB was then used to generate better centers according to previous experience in the evolutionary process. Previous experimental results show that the fitness is greatly influenced by the centers, and better centers would produce better results.
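In this hybrid setting, a candidate solution for SABC-GB is a flat vector encoding all k cluster centers, and its fitness is the K-means objective of Eq. (6). The following sketch shows one plausible encoding and fitness function (an illustration of the idea, not the paper's implementation; the toy data are assumptions):

```python
import math

def centers_fitness(points, flat, k, d):
    """Decode a flat solution vector into k d-dimensional cluster centers
    and return the K-means objective (Eq. (6)); this is the fitness that
    SABC-GB minimizes when searching for good initial centers."""
    centers = [flat[i * d:(i + 1) * d] for i in range(k)]
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)

pts = [(0.0, 0.0), (0.2, 0.1), (4.0, 4.0), (4.1, 3.9)]
good = [0.1, 0.05, 4.05, 3.95]   # one center near each cluster
bad = [2.0, 2.0, 2.0, 2.0]       # both centers between the clusters
```

Well-placed centers yield a much lower fitness than poorly placed ones, so minimizing this function with SABC-GB directly rewards better center placement before K-means refines the result.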

It can be seen from Table 11 that the variation in the fitness values of SABC-GB is small, and that the standard deviation of the SABC-GB results is less than that of K-means on each dataset. SABC-GB generates better fitness values and finds better centers, mainly thanks to the self-adaptive candidate strategies, which adjust the search step length adaptively and find optimal solutions. The standard deviation of SABC-GB is relatively small because K-means finds better solutions when given better initial centers. The initialization scheme and self-adaptive strategies of SABC-GB improve the clustering accuracy. From the above results, we conclude that the clustering precision and problem-solving efficiency are greatly improved by the proposed SABC-GB algorithm.

5 Conclusion

In this paper, we have proposed a novel algorithm called SABC-GB to solve global optimization problems. The initial population of SABC-GB is generated by a dedicated algorithm, and the solution search strategy is selected self-adaptively. Experimental results on 25 benchmark functions show that SABC-GB outperforms GABC, ABC/best/1, and ABC/best/2. The SABC-GB algorithm has excellent optimization ability: it avoids falling into local optima and is worthy of further research. The application of SABC-GB also indicates that K-means clustering can be improved using SABC-GB.

However, the proposed approach also has some shortcomings. SABC-GB exhibited worse performance than previous algorithms on a few test functions, and several open problems remain; nevertheless, it is still a very effective probabilistic algorithm.

The present study can be extended in various directions. To accelerate the search process and find better solutions to global optimization problems, an optimal means of combining local search with global search would clearly be beneficial. As the integration of self-adaptive candidate strategies outperformed previous algorithms, we intend to analyze candidate strategies and include the best strategies for each problem. We will also research the candidate solution pool size and the selection of strategies for this pool.