1 Introduction

Global optimization has been an active research area in science and engineering for several decades. Since many real-world optimization problems can be formulated as global optimization problems, good optimization algorithms are always needed to find the global optimal solution. Traditional methods often fail when solving complex global optimization problems (Floudas and Gounaris 2009), and various meta-heuristics have been developed to improve performance on such problems. Among these methods, the particle swarm optimization (PSO) algorithm and its variants play an important role because they are easy to implement. The inertia weight PSO (WPSO) was introduced to balance the global and local search abilities, and the linearly decreasing inertia weight PSO (LDWPSO) was proposed to improve the performance of WPSO (Shi and Eberhart 1998). PSO with a constriction factor (PSO-cf) was introduced by Clerc and Kennedy (2002). Other methods improve PSO by designing different types of topologies. Kennedy and Mendes (2002) and Suganthan (1999) concluded that PSO with a small neighborhood might perform better on complex problems, while PSO with a large neighborhood would perform better on simple problems. Suganthan (1999) proposed a PSO with dynamically adjusted neighborhoods, in which the neighborhood of a particle gradually increases until it includes all particles. The unified particle swarm optimizer (UPSO) combines the global version and the local version (Parsopoulos and Vrahatis 2004). Peram and Veeramachaneni developed the fitness-distance-ratio-based PSO (FDR-PSO) with near neighbor interactions (Peram et al. 2003). Liang et al. (2006) proposed the comprehensive learning particle swarm optimizer (CLPSO), in which all other particles' historical best information is used to update the velocity of a particle. The self-learning particle swarm optimizer (SLPSO), in which each particle has a set of four strategies to cope with different situations in the search space and the cooperation of the four strategies is implemented by an adaptive learning framework at the individual level, was proposed in Li et al. (2012). In addition, PSOs with several swarms have also been studied to improve the performance of PSO algorithms. The cooperative particle swarm optimizer (CPSO), which uses multiple swarms to optimize different components of the solution vector cooperatively, was introduced in Bergh and Engelbrecht (2004). The multi-swarm cooperative particle swarm optimizer (MCCPSO) based on a master–slave model was presented in Niu et al. (2007); in this method, the slave swarms execute a single PSO and the master swarm evolves based on its own knowledge and the knowledge of the slave swarms. A multi-swarm PSO using charged particles in a partitioned search space was introduced in Dor et al. (2012); two kinds of swarms are used in this algorithm: the main swarm gathers the best particles of the auxiliary swarms and is initialized several times, while the auxiliary swarms are initialized in different areas and an electrostatic repulsion heuristic is applied in each area to increase diversity. A multi-swarm self-adaptive and cooperative particle swarm optimization (MSCPSO) using four sub-swarms to prevent the algorithm from falling into local optima was introduced in Zhang and Ding (2011). Moreover, other EAs such as jDE (Brest et al. 2006), CMA-ES (Arnold and Hansen 2012) and JADE (Zhang and Sanderson 2009) have also been used to solve global optimization problems.

In recent years, a new meta-heuristic algorithm called teaching–learning-based optimization (TLBO) has been proposed for global optimization problems (Rao et al. 2012). The algorithm simulates the teaching–learning process in a classroom, and each student represents a possible solution of the optimization problem. Details of the conceptual basis of TLBO were given by Waghmare (2013). Reported results indicate that TLBO outperforms some meta-heuristics on constrained benchmark functions and non-linear numerical optimization problems (Rao et al. 2011, 2012). It has been extended to engineering optimization (Yu et al. 2014), such as constrained mechanical design optimization problems (Rao et al. 2011), engineering structure optimization problems (Vedat 2012) and job-shop scheduling problems (Adil et al. 2014; Xu et al. 2015). It has also been used for multi-objective optimization (Niknam et al. 2012; Rao and Patel 2011, 2013; Rao and Waghmare 2014; Zou et al. 2013), clustering problems (Naik et al. 2012; Suresh and Anima 2011), etc. Several variants of the TLBO algorithm have been developed to improve its performance (Hossein et al. 2011; Rao and Patel 2012). Similar to other natural computation methods, however, TLBO may become trapped in a local optimum when solving complex problems with multiple local optimal solutions.

To improve the performance of the TLBO algorithm for global optimization problems, a multi-class cooperative teaching–learning-based optimization algorithm with a simulated annealing operator (SAMCCTLBO) is proposed in this paper. The class is divided into several sub-classes; in the teacher phase, the students in each sub-class update their positions according to the teacher and the mean of their own sub-class, and in the learner phase they learn knowledge from other students in their sub-class. To improve the diversity of the sub-classes, all individuals are regrouped to form new sub-classes after some evolutionary generations. The simulated annealing method is used to accept some worse learners into the new population so as to increase the diversity of the whole class. The proposed algorithm is tested on several benchmark functions, and the results are compared with those of other algorithms.

The rest of the paper is organized as follows: Sect. 2 briefly introduces the original TLBO algorithm. SAMCCTLBO is proposed in Sect. 3. Section 4 presents the test functions and the discussion of the experimental results. Conclusions and future research directions are given in Sect. 5.

2 Teaching–learning-based optimization (TLBO) algorithm

The teaching–learning-based optimization (TLBO) algorithm was originally developed by Rao et al. (2012). It is based on the influence of a teacher on the output of learners in a class in terms of results or grades, and it is a population-based algorithm that simulates the teaching–learning process in a classroom. The teacher of the class is generally considered a highly learned person who shares his or her knowledge with the learners, and a good teacher helps improve the marks or grades of the students in the class. Moreover, learners also learn knowledge from one another to improve their marks and grades. For optimization, the solutions of the problem are represented by learners, and the learner with the best fitness in the current generation is chosen as the current teacher of the class. The learning process of TLBO is divided into two stages: the teacher phase and the learner phase. In the teacher phase, learners learn knowledge from the teacher to improve the average score of the class. In the learner phase, learners learn knowledge from another randomly chosen learner to improve their performance. The main stages of the original TLBO are briefly introduced as follows.

2.1 Teacher phase

In the teacher phase, all students learn knowledge from the teacher. The learner with the best fitness is chosen as the current teacher, represented by \(X_{\text {teacher}} \). Consider an objective function \(f(x)\) with \(n\)-dimensional variables; the \(i\)th learner can be represented as \(X_i =[x_{i1} ,x_{i2} ,\ldots ,x_{in} ]\). At any iteration \(g\), the mean position of all learners in the current iteration is calculated as \(X_{g{\text {mean}}} =\frac{1}{m}\Big [\sum \nolimits _{i=1}^m {x_{i1} } ,\sum \nolimits _{i=1}^m {x_{i2} } ,\ldots ,\sum \nolimits _{i=1}^m {x_{in} } \Big ]\), where \(m \) is the number of learners in the class. The \(i\)th learner updates its position as follows.

$$\begin{aligned} X_{i,\text {new}} =X_i +r_1 (X_{\text {teacher}} -T_F X_{g\text {mean}} ) \end{aligned}$$
(1)

where \(X_{i,\text {new}}\) and \(X_{i}\) are the new and the old positions of the \(i\)th learner, and \(r_{1}\) is a random number uniformly distributed in [0, 1]. The teaching factor \(T_F =\text {round}[1+\text {rand}(0,1)\{2-1\}]\) can be either 1 or 2. If the new solution of the \(i\)th learner is better than the old one, the old position of the student is replaced by the new one; otherwise, the position of the \(i\)th learner is not changed.
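To make this update concrete, the following short Python sketch implements the teacher-phase update of Eq. (1) with the usual greedy acceptance for a minimization problem; the function name teacher_phase, the NumPy arrays and the objective-function argument are illustrative assumptions rather than part of the original description.

```python
import numpy as np

def teacher_phase(pop, fitness, objective):
    """One TLBO teacher phase (Eq. 1) with greedy acceptance (minimization assumed)."""
    teacher = pop[np.argmin(fitness)]       # learner with the best fitness
    x_mean = pop.mean(axis=0)               # X_gmean: mean position of the whole class
    for i in range(len(pop)):
        r1 = np.random.rand()               # r_1 uniformly distributed in [0, 1]
        t_f = np.random.randint(1, 3)       # teaching factor T_F, either 1 or 2
        x_new = pop[i] + r1 * (teacher - t_f * x_mean)
        f_new = objective(x_new)
        if f_new < fitness[i]:              # accept only if the new position is better
            pop[i], fitness[i] = x_new, f_new
    return pop, fitness
```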

2.2 Learner phase

In the learner phase, learners learn knowledge from one another. A learner randomly selects another learner, different from himself or herself, as the learning object. For the \(j\)th learner \( X_{j}\), a \(k\)th learner \( X_{k}\) with \(k \ne j\) is randomly selected, and \(X_{j}\) learns knowledge from \(X_{k}\) as follows.

If the fitness of \(X_{j}\) is better than that of \( X_{k}\), then

$$\begin{aligned} X_{j,\text {new}} =X_j +r_j (X_j -X_k ) \end{aligned}$$
(2)

Else

$$\begin{aligned} X_{j,\text {new}} =X_j +r_j (X_k -X_j ) \end{aligned}$$
(3)

If the new position \(X_{j,\text {new}}\) is better than the old one \(X_{j}\), the old position \(X_{j}\) is replaced by the new position \(X_{j,\text {new}}\); otherwise, the position of the \(j\)th learner is not changed. \(r_{j}\) is a random number uniformly distributed in [0, 1]. The detailed algorithm can be found in Rao et al. (2012).
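A matching sketch of the learner phase (Eqs. 2 and 3) is given below under the same illustrative assumptions (NumPy arrays, minimization, greedy acceptance).

```python
import numpy as np

def learner_phase(pop, fitness, objective):
    """One TLBO learner phase (Eqs. 2-3) with greedy acceptance (minimization assumed)."""
    m = len(pop)
    for j in range(m):
        k = np.random.choice([idx for idx in range(m) if idx != j])  # a different learner
        r_j = np.random.rand()               # r_j uniformly distributed in [0, 1]
        if fitness[j] < fitness[k]:          # X_j is better: move away from X_k (Eq. 2)
            x_new = pop[j] + r_j * (pop[j] - pop[k])
        else:                                # X_k is better: move towards X_k (Eq. 3)
            x_new = pop[j] + r_j * (pop[k] - pop[j])
        f_new = objective(x_new)
        if f_new < fitness[j]:
            pop[j], fitness[j] = x_new, f_new
    return pop, fitness
```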

3 Multi-class cooperative TLBO with simulated annealing operator (SAMCCTLBO)

3.1 The main framework of SAMCCTLBO

Just like other population-based algorithms, due to its intrinsic randomness, TLBO suffers from premature convergence when solving complex optimization problems. A proper tradeoff between exploration and exploitation is necessary for the efficient and effective operation of a population-based stochastic optimization algorithm. The main motivation of our method is to use the cooperation of multiple sub-classes and the simulated annealing method to prevent the whole population from getting trapped in a local optimum and to improve the global performance of TLBO. The main framework of SAMCCTLBO is given as follows.

Step 1. Initialize the population of the classroom and randomly group the population into different sub-classes. Set the initial parameters of the algorithm.

Step 2. Calculate the mean position of each sub-class and choose the best learner among all sub-classes as the teacher.

Step 3. Execute the teacher phase of SAMCCTLBO with the simulated annealing operator.

Step 4. Execute the learner phase of SAMCCTLBO with the simulated annealing operator.

Step 5. Randomly regroup the population to form new sub-classes.

Step 6. If the termination condition is not satisfied, go to Step 2; otherwise, output the best solution.

The main parts of the framework are introduced as follows.

3.2 The need for using multi-class cooperation and SA operator

In the teaching–learning process of a real classroom, a teacher can go to any place in the classroom to impart his or her knowledge, for example, solving the problem of a particular learner or mentoring individuals in the class. A smaller class benefits from quicker sharing of the teacher's knowledge, so the average grade of a small class might improve faster than that of a big class; such microteaching might increase the convergence speed of the algorithm. Moreover, a learner easily learns knowledge from the other learners around himself or herself, and good diversity in this local neighborhood benefits local search. Based on this idea, a multi-swarm TLBO is designed in our algorithm: the population of the class is divided into several sub-classes. In the teacher phase, the positions of learners in the different sub-classes are updated using the mean positions of their sub-classes and the position of the teacher of the current generation. In the learner phase, the learners learn knowledge from other learners in their sub-classes.

According to the basic operators of TLBO, good learners are always retained in the population in both the teacher phase and the learner phase, so the search process is greedy. These operators can make the diversity of the class decrease quickly as the evolution progresses. When all learners are almost similar to the teacher, it is difficult to update their positions, and the algorithm is easily trapped in local convergence. The regrouping operator can change the diversity of the sub-classes, but it cannot change the diversity of the whole class. Simulated annealing is a very simple method for changing the diversity of the population by introducing some worse learners according to a certain probability, and it is easily implemented; in this paper, it is chosen to change the diversity of the population. The mean position of the population has a large effect on the original TLBO, and a severe change to it might cause large oscillations in the algorithm. To maintain stable performance, only some bits (dimensions) of a worse learner are introduced into the new population.

3.3 Teacher phase of SAMCCTLBO

During the teacher phase of SAMCCTLBO, the individual with the best fitness is chosen as the teacher (\(X_{\text {teacher}})\) of the current generation. The learners in the different sub-classes improve their performance by moving their positions towards the position of the teacher (\(X_{\text {teacher}})\), taking into account the current mean position \(X_{C\text {mean}}\) of the learners in their own sub-class. Equation (4) shows how the learners in the different sub-classes update their positions.

$$\begin{aligned} X_{i,\text {new}} =X_i +r_1 (X_{\text {teacher}} -T_F X_{C\text {mean}} ) \end{aligned}$$
(4)

where \(X_{i,\text {new}}\) and \(X_{i}\) are the new and old positions of the \(i\)th learner, respectively, and \(r_{1}\) is a random number uniformly distributed in [0, 1]. \(X_{\text {teacher}}\) is the best position of the current generation, \(X_{C\text {mean}}\) is the mean position of the \(C\)th sub-class, and \(T_{F}\) is the teaching factor defined in Sect. 2.1. If the new position \(X_{i,\text {new}}\) is better than the old one \(X_{i}\), \(X_{i,\text {new}}\) is accepted; otherwise, if \(X_{i,\text {new}}\) is accepted according to the probability determined by the simulated annealing operator, a bit of the old position \(X_{i}\) is randomly selected to take the place of the corresponding bit of \(X_{i,\text {new}}\), and \(X_{i,\text {new}}\) with the old bit flows to the next phase. This operation prevents the population from being severely damaged. The operation is not applied to the teacher of the current generation.
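The following sketch illustrates this sub-class teacher phase; sub-classes are represented as lists of learner indices, and the acceptance of worse positions is delegated to a helper sa_accept_bit that is sketched in Sect. 3.6. All names and data layouts are illustrative assumptions, not part of the original description.

```python
import numpy as np

def subclass_teacher_phase(pop, fitness, subclasses, objective, temperature):
    """Teacher phase of SAMCCTLBO (Eq. 4): each learner uses its own sub-class mean."""
    teacher_id = np.argmin(fitness)          # best learner of the whole class
    teacher = pop[teacher_id].copy()
    for members in subclasses:               # each sub-class C is a list of learner indices
        c_mean = pop[members].mean(axis=0)   # X_Cmean of this sub-class
        for i in members:
            r1 = np.random.rand()
            t_f = np.random.randint(1, 3)
            x_new = pop[i] + r1 * (teacher - t_f * c_mean)
            f_new = objective(x_new)
            if f_new < fitness[i]:           # better: accept directly
                pop[i], fitness[i] = x_new, f_new
            elif i != teacher_id:            # worse: SA bit replacement (see Sect. 3.6)
                accepted = sa_accept_bit(pop[i], fitness[i], x_new, f_new,
                                         temperature, objective)
                if accepted is not None:
                    pop[i], fitness[i] = accepted
    return pop, fitness
```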

3.4 Learner phase of SAMCCTLBO

In the learner phase of SAMCCTLBO, a learner most easily learns knowledge from the other learners around him or her, so the learners learn knowledge from other students in their own sub-classes. For the \(i\)th learner \( X_{Ci}\) in the \(C\)th sub-class, a \(k\)th learner \( X_{Ck}\) in the \(C\)th sub-class, different from \(X_{Ci}\), is randomly selected, and \(X_{Ci}\) updates its position as follows.

If \(X_{Ci}\) is better than \(X_{Ck}\) according to their fitness, then

$$\begin{aligned} X_{Ci,\text {new}} =X_{Ci} +r_i (X_{Ci} -X_{Ck} ) \end{aligned}$$
(5)

Else

$$\begin{aligned} X_{Ci,\text {new}} =X_{Ci} +r_i (X_{Ck} -X_{Ci} ) \end{aligned}$$
(6)

where \(r_{i}\) is a random number uniformly distributed in [0, 1]. If the new position \(X_{Ci,\text {new}}\) is better than \(X_{Ci}\), the new position \(X_{Ci,\text {new}}\) is accepted; otherwise, the same SA operator as in the teacher phase is used to increase the diversity of the whole class.
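Under the same illustrative assumptions, a sketch of the sub-class learner phase is given below; the only differences from the plain TLBO learner phase are that the learning partner is drawn from the same sub-class and that worse positions pass through the SA bit-replacement helper (the exemption of the current teacher described in Sect. 3.6 is omitted here for brevity).

```python
import numpy as np

def subclass_learner_phase(pop, fitness, subclasses, objective, temperature):
    """Learner phase of SAMCCTLBO (Eqs. 5-6): partners are chosen within the sub-class."""
    for members in subclasses:
        for i in members:
            k = np.random.choice([m for m in members if m != i])  # partner in the same sub-class
            r_i = np.random.rand()
            if fitness[i] < fitness[k]:
                x_new = pop[i] + r_i * (pop[i] - pop[k])           # Eq. (5)
            else:
                x_new = pop[i] + r_i * (pop[k] - pop[i])           # Eq. (6)
            f_new = objective(x_new)
            if f_new < fitness[i]:
                pop[i], fitness[i] = x_new, f_new
            else:                                                  # SA bit replacement (Sect. 3.6)
                accepted = sa_accept_bit(pop[i], fitness[i], x_new, f_new,
                                         temperature, objective)
                if accepted is not None:
                    pop[i], fitness[i] = accepted
    return pop, fitness
```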

3.5 Produce the new sub-classes (regroup)

When all learners have completed the teacher and learner phases, the new sub-classes of the next generation should be generated. To improve the diversity of the sub-classes, a regrouping operator is used in the proposed algorithm. In a real class, the seats of the students are often changed after a period of time to improve the effectiveness of teaching. Simulating this phenomenon, the learners in the class are regrouped after a given number of generations. This operator helps improve the diversity of the sub-classes, and its process is simple. A classroom with nine learners divided into three sub-classes is shown in Fig. 1 to illustrate the process of forming new sub-classes.

Fig. 1

The regroup process of SAMCCTLBO

In Fig. 1, the initial population is composed of nine learners. In the first generation, the learners are grouped randomly: the first group contains learner 1, learner 3 and learner 5; the second group contains learner 2, learner 4 and learner 7; and the third group contains learner 6, learner 8 and learner 9. After some generations, the learners of the different sub-classes are merged into a whole and randomly regrouped again. As shown in Fig. 1, in the second generation, learner 1, learner 4 and learner 7 are grouped in the first sub-class, learner 2, learner 5 and learner 8 are grouped in the second sub-class, and learner 3, learner 6 and learner 9 are grouped in the third sub-class. The diversity of the sub-classes changes as the learners around each individual change.
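A possible implementation of the regrouping step is a simple random permutation of the learner indices split into equally sized groups; the sketch below, including the function name regroup, is an illustrative assumption.

```python
import numpy as np

def regroup(popsize, n_subclasses):
    """Randomly regroup learner indices into equally sized sub-classes (cf. Fig. 1)."""
    order = np.random.permutation(popsize)       # shuffle all learners
    return np.array_split(order, n_subclasses)   # index groups, one per sub-class

# Illustrative use with the nine-learner example of Fig. 1 (three sub-classes of three learners):
subclasses = regroup(9, 3)
```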

3.6 Simulated annealing operator

The detailed process of simulated annealing can be found in Dekkers and Aarts (1991). In our method, to prevent an individual from being seriously damaged, a random bit-selection method is designed. The operator is simple and is introduced as follows.

For the teacher, only a better solution is accepted. For the other learners, if the new solutions are better than the old ones, the new solutions are accepted. When the fitness value of the new learner is not better than the old one, the probability \(p\) is calculated according to Eq. (7).

$$\begin{aligned} p=\exp (-(\text {fitness}(X_{\text {new}} )-\text {fitness}(X_{\text {old}} ))/T_k ) \end{aligned}$$
(7)

where \(\text {fitness}(X_{\text {new}} )\) and \(\text {fitness}(X_{\text {old}} )\) are the fitness values of the new and the old positions of the individual, respectively, and \(T_{k}\) is the temperature of the \(k\)th simulated annealing operation. If rand(.) \(<p\), a bit is randomly selected from the old learner, the corresponding bit of the new learner is replaced by it, and the new learner with the old bit flows to the next phase.
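A sketch of this acceptance rule is given below; it returns the hybrid learner (and its re-evaluated fitness) when the worse position is accepted, and None otherwise. The function name sa_accept_bit and the decision to re-evaluate the hybrid are illustrative assumptions, the latter being consistent with the extra evaluations counted in Sect. 3.9.

```python
import numpy as np

def sa_accept_bit(x_old, f_old, x_new, f_new, temperature, objective):
    """SA acceptance with random bit replacement (Eq. 7), minimization assumed."""
    p = np.exp(-(f_new - f_old) / temperature)   # acceptance probability of Eq. (7)
    if np.random.rand() < p:
        hybrid = x_new.copy()
        d = np.random.randint(len(x_old))        # randomly chosen bit (dimension)
        hybrid[d] = x_old[d]                     # keep one bit of the old learner
        return hybrid, objective(hybrid)         # this extra evaluation counts towards the FEs
    return None
```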

3.7 The steps of SAMCCTLBO algorithm

The detailed steps of the SAMCCTLBO algorithm are described as follows.

Step 1: Set the maximal value \( X_{\text {max}}\) and minimal value \(X_{\text {min }}\) of the variables, the maximal evolutionary generation genmax, the population size popsize, the class size \(C\), the initial temperature \(T_{0}\), the number of simulated annealing operations \(K\), and the temperature reduction coefficient \(\lambda \). Initialize the initial population pop as follows.

$$\begin{aligned} {\text {pop}}=X_{\min } +r \times (X_{\max } -X_{\min } ) \end{aligned}$$
(8)

where \(r\) is a random number, uniformly distributed in [0, 1].

Step 2: Calculate the fitness values of all learners, and choose the best learner as the teacher of the current generation.

Step 3: Divide the population into sub-classes; the number of learners in each sub-class is given by Eq. (9).

$$\begin{aligned} {\text {Csize}}={\text {popsize}}/c \end{aligned}$$
(9)

In general, popsize is an integer multiple of \(C\), and all sub-classes have the same size.

Step 4: Calculate the mean value \(X_{C\text {mean}}\) of each sub-class, and let the learners in each sub-class carry out the teacher phase according to Eq. (4). For the teacher, only a better solution is accepted; for the other learners, the simulated annealing operator is executed as described in Sect. 3.6.

Step 5: For the learners in each sub-class, randomly select a different learner in the same sub-class and carry out the learner phase according to Eqs. (5) and (6). The method of accepting new individuals is the same as that used in Step 4.

Step 6: When the generation satisfies the following condition, the learners in the class are regrouped.

$$\begin{aligned} \mathrm{mod}(\mathrm{gen},M)=0 \end{aligned}$$
(10)

where gen is the current evolutionary generation and \(M\) is the regrouping period.

Step 7: If the number of simulated annealing operations \(K\) has not been reached, update the temperature as \(T_k =\lambda \times T_{k-1} \). If the termination condition of the algorithm is not satisfied, go back to Step 4; otherwise, stop and output the best solution.
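The steps above can be tied together in a short driver loop. The sketch below uses the illustrative helpers from the previous subsections and the parameter names from Step 1; it simplifies Step 7 by decreasing the temperature once per generation, which is an assumption on our part.

```python
import numpy as np

def samcctlbo(objective, dim, x_min, x_max, popsize=20, n_subclasses=5,
              gen_max=500, t0=1000.0, lam=0.9, regroup_period=10):
    """Illustrative SAMCCTLBO driver following Steps 1-7 (minimization assumed)."""
    pop = x_min + np.random.rand(popsize, dim) * (x_max - x_min)        # Eq. (8)
    fitness = np.array([objective(x) for x in pop])                     # Step 2
    subclasses = regroup(popsize, n_subclasses)                         # Step 3, Csize = popsize / C
    temperature = t0
    for gen in range(1, gen_max + 1):
        pop, fitness = subclass_teacher_phase(pop, fitness, subclasses,
                                              objective, temperature)   # Step 4, Eq. (4)
        pop, fitness = subclass_learner_phase(pop, fitness, subclasses,
                                              objective, temperature)   # Step 5, Eqs. (5)-(6)
        if gen % regroup_period == 0:                                    # Step 6, Eq. (10)
            subclasses = regroup(popsize, n_subclasses)
        temperature *= lam                                               # Step 7: T_k = lambda * T_{k-1}
    best = np.argmin(fitness)
    return pop[best], fitness[best]

# Example call on a 30-dimensional sphere function (illustrative settings only):
# best_x, best_f = samcctlbo(lambda x: np.sum(x ** 2), dim=30, x_min=-100.0, x_max=100.0)
```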

3.8 The analysis of diversity and the number of class size

In this paper, two methods are designed to improve the diversity of the algorithm: the regrouping method is used to improve the diversity of the sub-classes, and the simulated annealing operator is used to improve the diversity of the whole class. To show their effectiveness, the Rosenbrock function is simulated in this part. In the example, the size of the class is 4, the maximal evolutionary generation is 500, \(\lambda \) is 0.9, the initial temperature is 1,000, the population size is 20, the dimension of the function is 30, and the regrouping period is 10. The first example compares SAMCCTLBO with and without the regrouping operator for the first sub-class. The second example compares the multi-class cooperative teaching–learning-based optimization algorithm (MCCTLBO) with and without the simulated annealing operator. A simple measure based on absolute distance is used to express the diversity of the population; it is shown in Eq. (11).

$$\begin{aligned} \mathrm{Div}(\mathrm{gen})=m \times (m-1)/2\sum \limits _{i=1}^{m-1} {\sum \limits _{j=i+1}^m {\sum \limits _{l=1}^D {\left| {X(i,l)-X(j,l)} \right| } } } \end{aligned}$$
(11)

where Div(gen) is the diversity at generation gen, \(m\) is the population size, \(D \) is the dimension of the individuals, and \(X(i,l)\) and \(X(j,l)\) are the \(l\)th dimensional variables of the \(i\)th and the \(j\)th individuals, respectively. The diversity curves of the first and the second examples are shown in Fig. 2.
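The diversity measure can be computed directly from Eq. (11); the sketch below follows the equation as printed, including its prefactor, and the function name diversity is an illustrative assumption.

```python
import numpy as np

def diversity(pop):
    """Absolute-distance diversity of a population, following Eq. (11) as printed."""
    m = pop.shape[0]
    total = 0.0
    for i in range(m - 1):
        for j in range(i + 1, m):
            total += np.sum(np.abs(pop[i] - pop[j]))   # sum over dimensions of |X(i,l) - X(j,l)|
    return m * (m - 1) / 2 * total                     # prefactor m(m-1)/2 as written in Eq. (11)
```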

Fig. 2

The diversity examples for the Rosenbrock function with different operators. a The result of SAMCCTLBO with and without the regrouping operator, b the result of MCCTLBO and SAMCCTLBO

Figure 2a shows that the diversity of SAMCCTLBO with the regrouping operator is better than that without the regrouping operator for the first sub-class most of the time. Figure 2b shows that the diversity of SAMCCTLBO is better than that of MCCTLBO. Figure 2 also shows that the diversity of the improved algorithm changes frequently.

The sub-class size is also an important parameter of the proposed algorithm. The size of a sub-class should be larger than 2 and should satisfy Eq. (9). To show the influence of the sub-class size on the algorithm, the Rosenbrock function is again simulated with different sub-class sizes. In this example, the population size is 30 and the other parameters are the same as those used in the above examples, except that the maximal evolutionary generation is 10,000. The results with different sub-class sizes (2, 3, 5, 6, 10, 15) are shown in Table 1.

Table 1 The fitness value of the Rosenbrock function with different sub-class sizes

Table 1 shows that the fitness value decreases when the size is smaller than 5 and then increases when the sub-class size is larger than 6. In our experiments, the median value is chosen: for example, if the population size is 30, the sub-class size is 5; if the population size is 20, the sub-class size is 4.

3.9 The computation cost of algorithm

The number of fitness evaluations (FEs) largely determines the computation cost of the algorithm. For most EAs, each individual is evaluated once per generation, but in TLBO each individual is evaluated twice per generation. For example, if the population size is \(m\) and the maximal generation is genmax, the FEs of TLBO is \(2 \times m \times \mathrm{genmax}\). For the SAMCCTLBO algorithm, some worse individuals accepted with probability \(p\) must also be evaluated, so the FEs is \(2 \times m \times {\text{ genmax }} +2 \times p \times m \times {\text {genmax}}\). The complexity of the proposed algorithm is therefore larger than that of the original TLBO if their population sizes are the same. To balance the comparison, in our experiments the population size of SAMCCTLBO is smaller than that used in TLBO and less than half the population size of the other EAs that use only one FE per individual per generation.
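As a quick check of these budgets, the small helper below computes the FEs from the formulas above; the function name and the example numbers are illustrative assumptions, not settings from the experiments.

```python
def fe_budget(m, gen_max, p_extra=0.0):
    """FEs = 2*m*genmax (TLBO) plus 2*p*m*genmax extra evaluations for SAMCCTLBO."""
    return 2 * m * gen_max + 2 * p_extra * m * gen_max

# Example: TLBO with m = 25 versus SAMCCTLBO with m = 20 and an assumed p of 0.1
print(fe_budget(25, 1000))               # 50000
print(fe_budget(20, 1000, p_extra=0.1))  # 44000.0
```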

4 Simulation results and discussions

Twenty-four well-known benchmark functions are used to evaluate the performance of SAMCCTLBO in this paper. These benchmark functions have also been used in other references (Liang et al. 2006; Sabat et al. 2011; Tang et al. 2007; Yao et al. 1999). To compare the performance of SAMCCTLBO with other methods, UPSO (Parsopoulos and Vrahatis 2004), jDE (Brest et al. 2006), CMA_ES (Arnold and Hansen 2012; Wang et al. 2011), JADE (Zhang and Sanderson 2009), SaDE (Qin et al. 2009), the fully informed particle swarm (FIPS) (Mendes et al. 2004), FDR-PSO (Peram et al. 2003), CLPSO (Liang et al. 2006), TLBO (Rao et al. 2012) and ETLBO (Rao and Patel 2012) are also simulated.

4.1 Parameter settings

To reduce statistical errors, each function is independently run 50 times, and the mean results are used in the comparison. The value of the function is defined as the fitness function. All experiments are carried out on the same machine with a Celeron 2.26 GHz CPU and 512 MB of memory using Matlab. All functions are simulated in 10 and 30 dimensions. The 24 functions are summarized in Table 2. "Range" is the lower and upper bounds of the variables, "Optima" is the theoretical global minimum solution, and "Acceptance" is the acceptable solution of each function. For all PSOs and DEs, the evolutionary parameters are the same as those used in the corresponding references, except that the population size is 50. Because the FEs of TLBO per generation are more than twice those of the PSOs and DEs, the population size of TLBO is 25 and the population size of SAMCCTLBO is 20. The maximal numbers of fitness evaluations are 50,000 and 100,000 for the 10- and 30-dimensional functions, respectively. The sub-class size of SAMCCTLBO is 4.

Table 2 24 Benchmark functions

4.2 Experimental results and comparisons

4.2.1 The results and the analysis of ten-dimensional functions

Table 3 displays the best solutions, the mean solutions and the standard deviations of the 50 independent runs of the ten algorithms on the 24 test functions. The best results among the ten algorithms are shown in bold.

Table 3 Search result comparisons (the best solutions, average best solutions, standard deviation) among ten algorithms on 24 functions with 10 dimensions

Table 3 shows that CMA_ES has the smallest best solutions, means and standard deviations for the 10-D functions \(f_{1}, f_{2}, f_{3}, f_{4}\). For functions \(f_{1},f_{2},f_{3}\), the best solutions, means and standard deviations of SAMCCTLBO are better than those of the other algorithms except CMA_ES. For function \(f_{4}\), the performance of TLBO in terms of the three measures is better than that of the other algorithms except CMA_ES. For function \(f_{5}\), JADE has the smallest solutions, followed by CMA_ES, jDE, ETLBO and some other algorithms. For functions \(f_{6},f_{7},f_{8},f_{9}\), the solutions of SAMCCTLBO converge to the theoretical optima. For function \(f_{7}\), the three DEs obtain the same solutions as SAMCCTLBO, and the performances of the three TLBOs, SaDE and FDR-PSO are the same. For function \(f_{9}\), jDE and SAMCCTLBO have the same performance. The three DEs have the best performance and their solutions are the same; the solutions of SAMCCTLBO are better than those of the other two TLBOs. For functions \(f_{11}\) and \(f_{12}\), the solutions of CMA_ES are the best, and SAMCCTLBO ranks second. CMA_ES has the best performance for function \(f_{13}\), followed by JADE. For functions \(f_{14}\) and \(f_{15}\), the performance of SAMCCTLBO in terms of the three measures is the best among all ten algorithms. The three TLBOs and SaDE have the same best solutions and all converge to the theoretical optima for function \(f_{16}\). CMA_ES and SaDE show the best performance for functions \(f_{17}\) and \(f_{18}\), respectively. For functions \(f_{19}\) and \(f_{20}\), all algorithms converge to the theoretical optima except FDR-PSO. For function \(f_{21}\), the best solutions of most algorithms equal the theoretical optima, but their means do not reach the theoretical optima; jDE has the smallest standard deviations. For functions \(f_{22}\) and \(f_{24}\), the three DEs have the same best performance. Almost all algorithms converge to the theoretical optima for function \(f_{23}\). Table 3 also shows that CMA_ES outperforms the other algorithms on 12 of the 24 test functions. In terms of the three measures, SAMCCTLBO generally outperforms most of the other algorithms except CMA_ES. The results also show that the performance of TLBO is globally improved by the proposed modifications.

To compare the computation cost and robustness of the different algorithms, the average number of fitness evaluations (mFEs) and the successful ratios of the algorithms are shown in Table 4. "mFEs" is the average FEs needed by an algorithm to converge to the acceptable solution over all runs; if the algorithm does not converge in all runs, the mFEs is reported as "NaN".

Table 4 Search result comparisons (the mFEs and successful ratios) among ten algorithms on 24 functions with 10 dimensions

Table 4 shows that CMA_ES has the smallest mFEs for most functions except \(f_{7}\), \(f_{8}\), \( f_{10}\) and \(f_{18}\). For functions \(f_{7}\), \(f_{8}\) and \( f_{10}\), SAMCCTLBO has the smallest mFEs, and JADE has the smallest mFEs for function \(f_{18}\). Except for CMA_ES, the mFEs of SAMCCTLBO are smaller than those of the other algorithms for 17 functions. For functions \(f_{4}\) and \( f_{24}\), the mFEs of TLBO are smaller than those of SAMCCTLBO. For functions \(f_{5}\), \(f_{13}\), \( f_{18}\) and \( f_{21}\), the mFEs of JADE are smaller than those of SAMCCTLBO. For functions \( f_{22}\) and \(f_{23}\), jDE and ETLBO have better performance in terms of mFEs than SAMCCTLBO. Table 4 also indicates that the successful ratios of SAMCCTLBO are 100 % for 15 functions, while for nine functions they are lower than those of some other algorithms. The three DEs have higher successful ratios for the five shifted functions. The best results are shown in Table 4 in bold.

To determine whether the results obtained by SAMCCTLBO are statistically different from those of the other algorithms, \(t\) tests are conducted between the best results of SAMCCTLBO and those achieved by each of the other algorithms for all functions. The \(t\) values and \(p\) values of this two-tailed test with a significance level of 0.05 are shown in Table 5 for every function. Rows "Better", "Same" and "Worse" give the number of functions on which SAMCCTLBO performs significantly better than, almost the same as, and significantly worse than the compared algorithm, respectively; the "Better" entries are shown in bold. Table 5 indicates that the average excellent ratio \(\left( \sum \nolimits _{i=1}^9 {\text {better}(i)} /(24\times 9)\right) \) of SAMCCTLBO is 51.39 % for the ten-dimensional functions. Table 5 also shows that the number of "Better" is less than the number of "Worse" in the comparison between SAMCCTLBO and CMA_ES.

Table 5 Comparisons between SAMCCTLBO and other algorithms on \(t\) tests for ten-dimensional functions

4.2.2 The results and analysis of 30-dimensional functions

The experiments conducted on the ten-dimensional functions are repeated on the 30-dimensional functions. The best solutions, the mean solutions and the standard deviations of the 50 independent runs of the ten algorithms on the 24 test functions are shown in Table 6. The mean FEs (mFEs) and the successful ratios are shown in Table 7, and the \(t\) test results are shown in Table 8. To display the convergence process of the different algorithms, the results of the first four functions are given in Fig. 3, which shows the convergence of the ten algorithms for functions \(f_{1}\), \(f_{2}\), \( f_{3}\) and \(f_{4}\). The figures are only intended to show the convergence process; the detailed solutions can be found in the tables. Figure 3 shows that SAMCCTLBO has the fastest convergence speed for functions \(f_{1}\) and \(f_{3}\), while CMA_ES has the fastest convergence speed for functions \(f_{2}\) and \(f_{4}\).

Table 6 Search result comparisons (the best solutions, average best solutions, standard deviation) among ten algorithms on 24 functions with 30 dimensions
Table 7 Search result comparisons (the mFEs and successful ratios) among ten algorithms on 24 functions with 30 dimensions
Table 8 Comparisons between SAMCCTLBO and other algorithms on \(t\) tests for 30-dimensional functions
Fig. 3

The representative convergence curves of the ten algorithms

Table 6 shows that SAMCCTLBO finds the theoretical optima for 12 functions: \(f_{1},f_{3}, f_{6},f_{7},f_{8}\), \( f_{9},f_{11}, f_{14}, f_{15},f_{16}\), \( f_{17}\) and \( f_{19}\). For functions \(f_{1},f_{3}, f_{8}\), \( f_{9},f_{16}\) and \( f_{17}\), the best solutions, mean solutions and standard deviations of the three TLBOs are the same, and their solutions are better than those of the other algorithms. CMA_ES has the best performance in terms of the three measures for functions \(f_{2},f_{4}, f_{5}\), \( f_{12}\) and \( f_{13}\). For functions \(f_{10},f_{22}, f_{23}\) and \( f_{24}\), the three DEs have almost the same performance, except that the standard deviations of SaDE are larger than those of the other DEs for function \(f_{22}\) and smaller for functions \(f_{23}\) and \( f_{24}\). For function \(f_{18}\), none of the algorithms converges to the acceptable solution, and JADE performs better than the other algorithms. The table also indicates that the performance of the DEs on the shifted functions is generally better than that of the other algorithms used in the paper.

Table 7 shows the mFEs and the successful ratios of the different algorithms. The table indicates that the mFEs of SAMCCTLBO are smaller than those of the other algorithms for functions \(f_{1},f_{2}, f_{3},f_{6},f_{7}\), \( f_{8},f_{11}, f_{14},f_{15},f_{16}\) and \( f_{17}\). CMA_ES performs better than the other algorithms in terms of mFEs for functions \(f_{4},f_{5}, f_{9},f_{12},f_{13}\), \( f_{19}\) and \(f_{21}\). JADE has the smallest mFEs for functions \(f_{10},f_{22}\) and \( f_{23}\), and SaDE has the smallest mFEs for function \(f_{20}\). SAMCCTLBO and CMA_ES converge to acceptable solutions with 100 % successful ratios for 15 of the 24 test functions, and JADE reaches the acceptable solutions with 100 % successful ratios for 17 functions. The successful ratios of the other algorithms are relatively lower than those of these three algorithms. Table 8 shows that the average excellent ratio of SAMCCTLBO is 55.99 % for the 30-dimensional functions.

4.2.3 The results and the analysis of 200-dimensional functions

To test the performance of the proposed algorithm on higher-dimensional problems, the 200-dimensional functions \(f_{2},f_{4}, f_{5},f_{12}\) and \(f_{13}\) are simulated in this section. The parameters of these experiments are the same as those for the 30-dimensional functions. Because the algorithms cannot converge to acceptable solutions for most functions, only the average solutions and the standard deviations of the 50 independent runs are listed in Table 9.

Table 9 Search result comparisons (the average best solutions, standard deviation) among ten algorithms on 5 functions with 200 dimensions

Table 9 shows that SAMCCTLBO has better performance in terms of the means and the standard deviations than the other algorithms for functions \(f_{2}, f_{4}, f_{5}\) and \( f_{12}\). For function \(f_{13}\), the performance of CMA_ES is better than that of the other algorithms, and the performances of the TLBOs are worse than those of the DEs.

To take full advantage of microteaching, the population of the SAMCCTLBO algorithm is divided into several sub-classes; the learning of each learner is restricted to a certain sub-class, which can improve the mean solutions quickly, fully utilize the search space of the sub-classes and avoid over-congestion around local optima. Considering the limited learning ability of learners within the same sub-class, all learners are regrouped randomly after a certain number of generations to improve the diversity of the sub-classes, and the simulated annealing operator is used to improve the diversity of the whole class. By analyzing the results of the different algorithms on the 24 benchmark functions considered, one may conclude that SAMCCTLBO has some advantages compared with other algorithms. First, for the 10-, 30- and 200-dimensional functions, the solution accuracy of SAMCCTLBO is generally superior, or at least not worse, compared with all variants of TLBO, and it outperforms most of the other algorithms except CMA_ES. Second, the mFEs of SAMCCTLBO are smaller than those of the other algorithms for 17 of the 10-dimensional functions and 11 of the 30-dimensional functions, indicating faster convergence. Third, when the algorithm converges to the acceptable solutions in all runs for the 10- and 30-dimensional functions, the successful ratios of SAMCCTLBO are 100 % for 15 of the 24 test functions and lower than those of some other algorithms for the remaining nine functions. Finally, the statistical analysis indicates that SAMCCTLBO performs significantly better than the compared algorithms for most of the 24 test functions, especially for the 30-dimensional functions. Although the proposed SAMCCTLBO algorithm has good performance on most of the functions, it does not perform best on all benchmark functions in all dimensions. According to the "no free lunch" theorem (Wolpert and Macready 1997), one algorithm cannot offer better performance than all the others on every aspect or on every kind of problem, and this is also observed in our experimental results: some algorithms have good performance in terms of the means and the standard deviations for a large part of the functions, but their mFEs or successful ratios are worse. In summary, the proposed SAMCCTLBO algorithm has good performance and is especially superior, or at least not worse, compared with all variants of TLBO on rotated benchmark functions and high-dimensional benchmark functions.

5 Conclusions

This paper presents a multi-class cooperative TLBO algorithm combined with a simulated annealing operator; the mean value of each sub-class is used in the teacher phase to update the positions of the students in that sub-class, and all learners also learn knowledge from others in their sub-class. Compared with the original TLBO, there is no duplicate-removal process in SAMCCTLBO: the diversity of the different sub-classes is maintained by regrouping the learners, and the diversity of the whole population is improved by the simulated annealing operator. The experimental results indicate that the performance of SAMCCTLBO is not the best on all benchmark functions, but it shows good solution accuracy and convergence speed on rotated benchmark functions and high-dimensional benchmark functions. Because TLBO is a young algorithm, it has not yet been widely applied to practical problems. Future work will emphasize the application of SAMCCTLBO to more complex practical optimization problems so as to fully investigate its properties and evaluate its performance.