1 Introduction

Global optimization problems are unavoidable in modern engineering and science. Mathematically, an optimization problem can be expressed as

$$\underset{x \in \Re^{n}}{\text{Minimize}}\; f_{i}(x),\quad (i = 1,2, \ldots ,M)$$
(1)
$$\text{s.t.}\quad h_{j}(x) = 0,\quad (j = 1,2, \ldots ,J)$$
(2)
$$g_{k}(x) \le 0,\quad (k = 1,2, \ldots ,K)$$
(3)

Here \(f_{i}(x)\), \(g_{k}(x)\) and \(h_{j}(x)\) are functions of the design vector

$$x = \left( x_{1}, x_{2}, \ldots, x_{n} \right)^{\text{T}}.$$
(4)

The components \(x_{i}\) of x are termed decision or design variables, and they may be discrete, continuous, or a combination of the two.

The functions fi(x), where i = 1, 2,…, M, are termed the cost or objective functions; when M = 1, there is only a single objective. The space spanned by the design variables is called the search space or design space ℜn, while the space formed by the objective function values is termed the response or solution space. The inequalities on gk and the equalities on hj are termed constraints.

Optimization methods can be grouped in several ways. A simple way is to consider the nature of the method, which separates algorithms into two groups: deterministic algorithms and stochastic algorithms. A deterministic algorithm follows a rigorous procedure, and the values and paths of both its decision variables and its functions are repeatable. Hill climbing [1] is an example of a deterministic algorithm; it follows the same path every time it is executed from the same starting point. Stochastic algorithms, in contrast, always involve some randomness. Genetic algorithms (GAs) [2] are a prime example: the population changes every time the algorithm is executed, since different pseudorandom numbers are used, and yet there may not be any vast difference in the final solutions.

There is also a third group of algorithms, hybrids that mix stochastic and deterministic algorithms. The key idea is to restart a deterministic algorithm from different initial points. Hill climbing with random restart is the best-known example of this group.

Stochastic optimization algorithms are grouped into three main classes: evolutionary-based [3], physics-based [4] and swarm intelligence-based [5] algorithms. These methods can further be divided into two categories: population-based and individual-based algorithms. Evolutionary-based methods are motivated by the laws of natural evolution. The search begins with a randomly generated population, which evolves over successive generations. The strength of these algorithms is that the best individuals are always combined to form the next generation, which allows the population to improve over the course of generations. GA is the most prominent evolution-inspired method and replicates Darwinian evolution. Evolution strategy (ES) [6], genetic programming (GP) [7], probability-based incremental learning (PBIL) [8] and biogeography-based optimizer (BBO) [9] are other evolution-based algorithms.

Physics-based algorithms imitate physical laws of the universe. Gravitational search algorithm (GSA) [10], simulated annealing (SA) [11, 12], small world optimization algorithm (SWOA) [13], ray optimization (RO) [14] algorithm, central force optimization (CFO) [15], artificial chemical reaction optimization algorithm (ACROA) [16], galaxy-based search algorithm (GbSA) [17], colliding bodies optimization (CBO) [18] and charged system search (CSS) [19] are examples of physics-based algorithms.

The third class of nature-inspired algorithms comprises swarm-based techniques that mimic the social behavior of groups of animals. Particle swarm optimization (PSO) [20], ant colony optimization (ACO) [21] and animal migration optimization (AMO) [22] are some of the swarm-based algorithms.

In the literature, there are also metaheuristic algorithms motivated by human behaviors. Harmony search [23] (HS), teaching–learning-based optimization [24, 25] (TLBO), group search optimizer [26] (GSO), tabu (taboo) search [27] (TS), imperialist competitive algorithm [28] (ICA), group counseling optimization [29] (GCO) algorithm, colliding bodies optimization [18] (CBO), league championship algorithm (LCA) [30], soccer league competition [31] (SLC) algorithm, exchange market algorithm [32] (EMA), mine blast algorithm [33] (MBA), social-based algorithm [34] (SBA), firework algorithm [35] and seeker optimization algorithm [36] (SOA) are some of the popular algorithms.

Research in this area may broadly be divided into three key directions: devising novel algorithms, improving existing algorithms and hybridizing different algorithms. The third direction deals with hybridizing or combining different algorithms to improve performance or to solve particular problems [37]. Recently, hybridization of algorithms has become more popular due to its efficiency in handling many real-world problems involving uncertainty, intricacy, imprecision, noisy environments and ambiguity.

Despite the significant number of newly proposed algorithms in the optimization field, an essential question remains: why do we need yet more optimization algorithms? This question can be answered by referring to the no free lunch (NFL) theorem [38]. It logically proves that no single method can be recommended for solving all optimization problems. In other words, an algorithm's success on a particular set of problems does not guarantee that it can solve every optimization problem of a different nature and type.

A novel optimization algorithm, the hybrid sine–cosine algorithm with teaching–learning-based optimization (SCA–TLBO), is proposed in this paper for solving optimization problems and for visual tracking. The performance of hybrid SCA–TLBO is benchmarked in the following three test phases. In the first phase, a set of well-known test functions comprising unimodal, multimodal and fixed-dimension multimodal functions is used to verify the convergence, exploitation, exploration and local optima avoidance of hybrid SCA–TLBO. In the second phase, performance metrics (the best solutions and the average fitness of solutions during optimization) are employed to qualitatively observe and substantiate the performance of hybrid SCA–TLBO on shifted two-dimensional test functions. In the final phase, the hybrid SCA–TLBO algorithm is applied to visual tracking as a challenging real-world case study to demonstrate and verify the performance of the algorithm in practice. The hybrid SCA–TLBO-based tracking framework is used to experimentally measure object tracking error, absolute error, tracking detection rate, root mean square error and time cost. A comparison of the hybrid SCA–TLBO-based tracking framework with three probability-based trackers, viz. the alpha–beta (αβ) filter, linear Kalman filter (LKF) and extended Kalman filter (EKF), as well as with the particle filter (PF), scale-invariant feature transform (SIFT), particle swarm optimization (PSO) and bat algorithm (BA), is presented to demonstrate the capability of the proposed algorithm.

The organization of this paper is as follows. The basics of SCA and TLBO are briefly introduced in Sect. 2. The proposed algorithm is presented in Sect. 3. Section 4 deals with the evaluation of proposed algorithm using twenty-three well-known benchmark functions. Application of hybrid SCA–TLBO for visual tracking is proposed in Sect. 5. Finally, the conclusion is given in Sect. 6.

2 SCA and TLBO

In this section, basic theories of SCA and TLBO are introduced, followed by a detailed introduction of hybrid SCA–TLBO algorithm in the next section.

2.1 Basics of SCA

The sine–cosine algorithm (SCA) is a recent metaheuristic algorithm [39] in which the solutions are updated based on the sine or cosine function, as in Eq. (5) or (6), respectively:

$$X_{i}^{g + 1} = X_{i}^{g} + r_{1} \times \sin \left( r_{2} \right) \times \left| r_{3} P_{i}^{g} - X_{i}^{g} \right|$$
(5)
$$X_{i}^{g + 1} = X_{i}^{g} + r_{1} \times \cos \left( {r_{2} } \right) \times \left| {r_{3} P_{i}^{g} - X_{i}^{g} } \right|$$
(6)

In general, the above two functions are combined into one function as in the following equation [39]:

$$X_{i}^{g + 1} = \begin{cases} X_{i}^{g} + r_{1} \times \sin \left( r_{2} \right) \times \left| r_{3} P_{i}^{g} - X_{i}^{g} \right|; & r_{4} < 0.5 \\ X_{i}^{g} + r_{1} \times \cos \left( r_{2} \right) \times \left| r_{3} P_{i}^{g} - X_{i}^{g} \right|; & r_{4} \ge 0.5 \end{cases}$$
(7)

where \(X_{i}^{g+1}\) is the location of the present solution in the ith dimension at generation (iteration) g + 1, \(P_{i}\) is the destination solution in the ith dimension, |·| denotes the absolute value and \(r_{1}\), \(r_{2}\), \(r_{3}\) and \(r_{4}\) are random variables. The effects of the sine and cosine terms in Eqs. (5) and (6) are shown in Fig. 1, which illustrates how the equations define a space between two solutions in the search area.

Fig. 1
figure 1

Effects of sine and cosine in Eqs. (5) and (6) on the next position

The parameter r1 determines the region of the next solution; this region may lie either outside or inside the space between Xi and Pi. In [39], the authors update the parameter r1 using Eq. (8) to balance exploration and exploitation [40].

$$r_{1} = a - a\left( {\frac{g}{G}} \right)$$
(8)

where a is a constant defined by the user, G is the maximum number of generations and g is the current generation.

The parameter r2 is a random variable that defines the direction of movement of the next solution (i.e., whether it moves toward or away from Pi). The parameter r3 is a random variable that assigns a random weight to Pi in order to stochastically de-emphasize (r3 < 1) or emphasize (r3 > 1) the effect of the destination in defining the distance. The parameter r4 switches between the sine and cosine functions as in Eq. (7). A conceptual model of the effects of the sine and cosine functions with range in [− 2, 2] is shown in Fig. 2.

Fig. 2
figure 2

Sine and cosine functions with range in [− 2, 2] allow a solution to go beyond (outside the space between them) or around (inside the space between them) the destination
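The update rules in Eqs. (5)–(8) can be sketched as follows. The sampling ranges r2 ∈ [0, 2π], r3 ∈ [0, 2] and r4 ∈ [0, 1] follow the original SCA paper [39]; the per-dimension random draws and the default a = 2 are implementation assumptions.

```python
import numpy as np

def sca_step(X, P, g, G, a=2.0, rng=None):
    """One SCA generation: move every agent relative to the destination P (Eq. 7).

    X : (N, dim) array of agent positions; P : (dim,) destination solution.
    `a` is the user-defined constant of Eq. (8); a = 2 is a common choice.
    """
    rng = rng or np.random.default_rng()
    N, dim = X.shape
    r1 = a - a * (g / G)                      # Eq. (8): linear decay balances exploration/exploitation
    for i in range(N):
        r2 = rng.uniform(0, 2 * np.pi, dim)   # direction of movement
        r3 = rng.uniform(0, 2, dim)           # random weight on the destination
        r4 = rng.uniform(0, 1, dim)           # sine/cosine switch of Eq. (7)
        step = np.where(r4 < 0.5,
                        r1 * np.sin(r2) * np.abs(r3 * P - X[i]),
                        r1 * np.cos(r2) * np.abs(r3 * P - X[i]))
        X[i] = X[i] + step
    return X
```

Note that at g = G the factor r1 reaches zero, so the agents stop moving, which is the intended exploitation end point of the decay schedule.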

2.2 Basics of TLBO

Rao et al. [24, 25] proposed a new population-based optimization algorithm called TLBO, inspired by the philosophy of teaching and learning, for the optimization of mechanical design problems. The main idea behind TLBO is the imitation of a traditional learning process consisting of a teacher phase and a learner phase. In the teacher phase, the teacher distributes his knowledge to all students (i.e., students learn from the teacher), whereas in the learner phase, students learn with the help of fellow students (i.e., students learn through interaction with other students). Under TLBO, the population is considered to be a group or class of learners. In optimization algorithms, the population consists of different design variables. A detailed description of TLBO can be found in [24, 25]. In this paper, only the two most important phases of the algorithm are considered.

2.2.1 Teacher phase

The teacher can be considered to be the most educated individual in the society. Hence, the student with the highest marks acts as a teacher during the teacher phase. The teacher tries to build up the mean of the class to her level. This, however, depends on the learning efficiency of the class. This is expressed as

$$X_{i_{\text{temp}}} = X_{i} + {\text{random}} \times \left( {\text{Teacher}} - {\text{TF}} \times {\text{mean}} \right)$$
(9)

where TF is the teaching factor that decides the amount by which the mean is changed and mean is the average of the class. TF can be either 1 or 2 (1 means the student learns nothing from the teacher, and 2 means the student learns everything from the teacher); it is again a heuristic step, decided randomly with equal probability as \({\text{TF}} = {\text{ceil}}[0.5 + {\text{random}}]\). The new solution \(X_{i_{\text{temp}}}\) is accepted only if it is better than the previous solution, that is,

$$X_{i} = \begin{cases} X_{i_{\text{temp}}}; & f\left( X_{i_{\text{temp}}} \right) > f\left( X_{i} \right) \\ X_{i}; & \text{otherwise} \end{cases}.$$
(10)

2.2.2 Learner phase

Teaching is not the only education students receive; they also learn by interacting with each other. This is simulated in the learner phase. In each generation, two students Xm and Xn interact with each other, with the smarter one enhancing the other's marks. This can be formulated as

$$X_{m_{\text{temp}}} = \begin{cases} X_{m} + {\text{random}} \times \left( X_{m} - X_{n} \right); & f\left( X_{m} \right) > f\left( X_{n} \right) \\ X_{m} + {\text{random}} \times \left( X_{n} - X_{m} \right); & f\left( X_{n} \right) > f\left( X_{m} \right) \end{cases}$$
(11)

In Eq. (11), (Xm − Xn) is taken as a step. The temporary solution is accepted only if it is better than the previous solution, that is,

$$X_{m} = \begin{cases} X_{m_{\text{temp}}}; & f\left( X_{m_{\text{temp}}} \right) > f\left( X_{m} \right) \\ X_{m}; & \text{otherwise} \end{cases}.$$
(12)
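The teacher and learner phases above can be sketched as follows. The sketch is written for minimization, so "better" means a lower objective value; the population layout and the random pairing of learners are implementation assumptions.

```python
import numpy as np

def tlbo_generation(X, f, rng=None):
    """One TLBO generation: teacher phase (Eqs. 9-10) then learner phase (Eqs. 11-12).

    X : (N, dim) population of learners; f : objective to minimize.
    """
    rng = rng or np.random.default_rng()
    N, dim = X.shape
    fitness = np.array([f(x) for x in X])

    # Teacher phase: the best learner acts as the teacher.
    teacher = X[np.argmin(fitness)].copy()
    mean = X.mean(axis=0)
    TF = np.ceil(0.5 + rng.random())                    # teaching factor: 1 or 2, equal probability
    for i in range(N):
        X_temp = X[i] + rng.random(dim) * (teacher - TF * mean)   # Eq. (9)
        f_temp = f(X_temp)
        if f_temp < fitness[i]:                         # greedy acceptance, Eq. (10)
            X[i], fitness[i] = X_temp, f_temp

    # Learner phase: each learner interacts with a random partner.
    for m in range(N):
        n = rng.integers(N)
        while n == m:
            n = rng.integers(N)
        if fitness[m] < fitness[n]:                     # step toward/away per Eq. (11)
            X_temp = X[m] + rng.random(dim) * (X[m] - X[n])
        else:
            X_temp = X[m] + rng.random(dim) * (X[n] - X[m])
        f_temp = f(X_temp)
        if f_temp < fitness[m]:                         # greedy acceptance, Eq. (12)
            X[m], fitness[m] = X_temp, f_temp
    return X
```

Because each move is accepted only when it improves the objective, the best solution in the population can never get worse from one generation to the next.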

3 Proposed algorithm

The hybridization of SCA with TLBO (hybrid SCA–TLBO) is introduced in detail in this section. Both SCA and TLBO are population-based algorithms, with individuals considered as positions in SCA and as learners in TLBO. In SCA, the way the algorithm moves toward the next position is based on random and adaptive variables; therefore, a satisfactory solution cannot always be found. In some cases, SCA makes full use of the population's knowledge and has shown good performance on some benchmark functions. However, SCA sometimes fails to reach the optimal solutions of complex problems. To overcome these disadvantages, SCA is hybridized with the TLBO algorithm to form the novel hybrid SCA–TLBO method. In general, a hybrid metaheuristic attempts to combine two or more metaheuristic methods so as to fully exploit the useful features of the original algorithms. In the proposed hybrid SCA–TLBO, the idea of TLBO is introduced into SCA in order to improve its search ability.

In hybrid SCA–TLBO, the standard SCA is used to search globally, with the purpose of moving most solutions toward a more promising area. After this exploration stage, the TLBO algorithm is used to search locally with small steps to obtain the final best solution. In the framework of hybrid SCA–TLBO, SCA emphasizes diversification at the beginning of the search, taking large steps to explore the search space extensively and avoid becoming trapped in local optima, while TLBO focuses on intensification and lets the individuals move toward the best solution at the later stages of the optimization process. In this way, the method effectively resolves the conflict between global and local search.

Initially, the fitness of each learner is evaluated and the best solution (teacher) is obtained. The best solution is assigned to SCA as the position of the destination point in Eq. (7). The process may be better understood from the pseudocode and flow chart of hybrid SCA–TLBO shown in Figs. 3 and 4.

Fig. 3
figure 3

Pseudocode of hybrid SCA–TLBO

Fig. 4
figure 4

Flow chart of proposed hybrid SCA–TLBO algorithm

3.1 Steps of hybrid SCA–TLBO algorithm

The clear steps of the proposed hybrid SCA–TLBO algorithm are presented below. Moreover, the detailed pseudocode and flow chart of hybrid SCA–TLBO are also outlined in Figs. 3 and 4.

  • Step 1 Initialize the parameters of the algorithm, such as the maximum number of iterations or generations (G), the number of search agents (N), and a, r2, r3, r4. The initial population is defined as follows:

    $$X_{i} = X_{\text{Min}} + {\text{rand}} \cdot \left( X_{\text{Max}} - X_{\text{Min}} \right)$$
    (13)

    where XMin and XMax are the lower and upper bounds of the variables and rand generates random numbers distributed uniformly between zero and one.

  • Step 2 Evaluate the fitness of all search agents (learners) X, i.e., calculate the function values of X for all N and choose the best fitness value to be the teacher (destination position) of the present generation.

  • Step 3 Generate r1 and teaching factor (TF) as follows:

    $$r_{1} = a - a\left( {\frac{g}{G}} \right)$$
    (14)
    $${\text{TF}} = {\text{ceil}}[0.5 + {\text{random}}]$$
    (15)

    where g is the current generation.

  • Step 4 During the SCA, all the search agents are updated according to Eq. (7), and the search agent solutions which are accepted are passed to the teacher phase.

  • Step 5 For each search agent Xi, the teacher phase is implemented as follows:

    $$X_{i,{\text{new}}} = X_{i,{\text{old}}} + {\text{rand}} \times \left( X_{\text{Teacher}} - {\text{TF}} \cdot X_{\text{mean}}^{g} \right)$$
    (16)

    where Xi,old and Xi,new are the old and new positions of the ith learner, rand is a random number, and the search agent (learner) with the best fitness value in the present generation is chosen as XTeacher. If the ith search agent's new solution (position) is better, it is accepted; otherwise, the old one remains unchanged.

  • Step 6 In the learner phase, each learner or search agent can learn from its neighbors or from the entire class. The former ensures good local search ability, while the latter assures a good global search characteristic. In the case of neighborhood search, choose the best search agent in the neighborhood (Teacher(i)), select a neighboring search agent (Xm) randomly and update the search agents according to the following equation:

    $$X_{i,{\text{new}}} = X_{i,{\text{old}}} + {\text{rand}} \times \left( {\text{Teacher}}(i) - X_{i,{\text{old}}} \right) + {\text{rand}} \times \left( X_{m} - X_{i,{\text{old}}} \right)$$
    (17)

    Accept Xi,new if its fitness value is improved; otherwise, the learner's position remains unchanged.

  • Step 7 Otherwise, select search agents randomly from the entire class, and update the search agents according to the following equations:

  • If Xm is better than Xn, then

  • $$X_{i,{\text{new}}} = X_{i,{\text{old}}} + {\text{rand}} \times \left( X_{m} - X_{n} \right)$$
    (18)
  • Else

  • $$X_{i,{\text{new}}} = X_{i,{\text{old}}} + {\text{rand}} \times \left( X_{n} - X_{m} \right)$$
    (19)
  • Step 8 If the new position or solution Xi,new is better than the old one Xi,old, Xi,new will be accepted; otherwise, the position of the ith search agent is unchanged. The accepted search agent solutions are passed to the next generation.

  • Step 9 If the stopping criteria are satisfied, then terminate the process and print the final value of the solution. Otherwise, go to Step 3.
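The steps above can be condensed into the following sketch, written under stated assumptions: minimization, clipping to the variable bounds after each move, and only the whole-class learner phase of Step 7 (the neighborhood variant of Step 6 is omitted for brevity).

```python
import numpy as np

def hybrid_sca_tlbo(f, dim, x_min, x_max, N=30, G=100, a=2.0, seed=0):
    """Condensed sketch of Steps 1-9 of hybrid SCA-TLBO (minimization)."""
    rng = np.random.default_rng(seed)
    X = x_min + rng.random((N, dim)) * (x_max - x_min)         # Step 1, Eq. (13)
    fit = np.array([f(x) for x in X])

    def try_accept(i, cand):
        """Greedy acceptance used throughout: keep a move only if it improves fitness."""
        cand = np.clip(cand, x_min, x_max)                     # bound handling (assumption)
        fc = f(cand)
        if fc < fit[i]:
            X[i], fit[i] = cand, fc

    for g in range(G):
        teacher = X[np.argmin(fit)].copy()                     # Step 2: best agent is the teacher/destination
        r1 = a - a * g / G                                     # Step 3, Eq. (14)
        TF = np.ceil(0.5 + rng.random())                       # Eq. (15)

        for i in range(N):                                     # Step 4: SCA update, Eq. (7)
            r2 = rng.uniform(0, 2 * np.pi, dim)
            r3, r4 = rng.uniform(0, 2, dim), rng.random(dim)
            step = np.where(r4 < 0.5,
                            r1 * np.sin(r2) * np.abs(r3 * teacher - X[i]),
                            r1 * np.cos(r2) * np.abs(r3 * teacher - X[i]))
            try_accept(i, X[i] + step)

        mean = X.mean(axis=0)
        for i in range(N):                                     # Step 5: teacher phase, Eq. (16)
            try_accept(i, X[i] + rng.random(dim) * (teacher - TF * mean))

        for i in range(N):                                     # Step 7: learner phase, Eqs. (18)-(19)
            m, n = rng.choice(N, size=2, replace=False)
            d = X[m] - X[n] if fit[m] < fit[n] else X[n] - X[m]
            try_accept(i, X[i] + rng.random(dim) * d)          # Step 8: keep only improving moves

    best = np.argmin(fit)                                      # Step 9: report the final best solution
    return X[best], fit[best]
```

For example, `hybrid_sca_tlbo(lambda x: float(np.sum(x**2)), dim=2, x_min=-5.0, x_max=5.0)` minimizes the 2-D sphere function within the box [−5, 5]².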

4 Experimental results and discussion

To check the efficiency of the proposed algorithm, the hybrid SCA–TLBO algorithm is benchmarked with twenty-three popular and classical benchmark functions employed by many researchers [39, 41, 42]. The twenty-three benchmark functions (shown in Table 1) can be classified into three groups: unimodal, multimodal and fixed-dimension multimodal benchmark functions. Hybrid SCA–TLBO, SCA, PSO, DE and TLBO algorithms have several parameters that should be initialized before running. Table 2 shows the initial values of the basic parameters for these algorithms. For SCA, TLBO and hybrid SCA–TLBO, the same parameters are used.

Table 1 Twenty-three benchmark functions used in the experimental studies
Table 2 Initial parameters for hybrid SCA–TLBO, SCA, PSO, DE and TLBO (\({\text{rand}}\): random number between [0, 1])

The hybrid SCA–TLBO algorithm and the other algorithms (SCA, PSO, DE and TLBO) are simulated with initial population sizes NP = 30, 50 and 100 on each of the twenty-three benchmark functions for comparison. Several statistical parameters of the best-so-far solution in the last generation, such as the average, standard deviation, median and worst value, are reported. These results are presented in Tables 3, 4, 5, 6, 7 and 8. To ensure a fair comparison, the same maximum number of generations and the same population size are set for all algorithms. The outcomes are averaged for each initial population size, and the best outcomes are shown in bold type in the tables.

Table 3 Minimization results of twenty-three benchmark functions with initial population size NP = 30 (Ave: average, SD: standard deviation)
Table 4 Minimization results of twenty-three benchmark functions with initial population size NP = 30 (med: median)
Table 5 Minimization results of twenty-three benchmark functions with initial population size NP = 50
Table 6 Minimization results of twenty-three benchmark functions with initial population size NP = 50
Table 7 Minimization results of twenty-three benchmark functions with initial population size NP = 100
Table 8 Minimization results of twenty-three benchmark functions with initial population size NP = 100

According to Derrac et al. [43], statistical tests should be conducted to improve the evaluation of evolutionary algorithms' performance. A statistical analysis is essential to verify that a proposed novel algorithm offers a significant advancement over existing algorithms for a particular problem.

4.1 Unimodal functions

The unimodal functions have no local optima and only one global optimum. Consequently, they are used to examine heuristic optimization algorithms in terms of convergence rate. Functions F1–F7 are unimodal, and the results for these functions are shown in Tables 3, 4, 5, 6, 7 and 8. As given in these tables, hybrid SCA–TLBO outperforms the other algorithms on the unimodal benchmark functions regarding the mean, standard deviation, median and worst value of the results. This is evidence that the proposed algorithm performs well in finding the global solution of unimodal benchmark functions. Figure 5 shows the graphical results of the ANOVA tests.

Fig. 5
figure 5

ANOVA tests of the global minimum values, which are computed by using the SCA–TLBO, SCA, PSO, DE and TLBO for functions F1, F4, respectively

4.2 Multimodal high-dimensional functions

For the multimodal functions F8–F13 with many local minima, the final results are the most important, because these functions reflect an algorithm's ability to escape from poor local optima and obtain a near-global optimum. We have run experiments on F8–F13, where the number of local minima increases exponentially with the dimension of the function. As the mean, standard deviation, median and worst values in Tables 3, 4, 5, 6, 7 and 8 show, hybrid SCA–TLBO performs better than the other algorithms on the multimodal high-dimensional benchmark functions.

4.3 Multimodal low-dimensional functions

Functions F14–F23 have only a few local minima, and their dimensions are also small. The significant difference from functions F8–F13 is that F14–F23 appear simpler due to their low dimensionality and smaller number of local minima. As the mean, standard deviation, median and worst values in Tables 3, 4, 5, 6, 7 and 8 show, hybrid SCA–TLBO performs better than the other algorithms on the multimodal low-dimensional benchmark functions.

4.4 Convergence behavior analysis

To confirm the convergence of the hybrid SCA–TLBO algorithm, two metrics are employed: convergence rate and average fitness of all search agents.

Figure 6 shows the convergence rate of the hybrid SCA–TLBO, SCA, PSO, DE and TLBO algorithms for functions F1–F23. The fitness of the best solution in each generation is saved and drawn as the convergence curves in Fig. 6. As evident from the figure, the proposed hybrid SCA–TLBO algorithm reaches an optimum solution for functions F1, F2, F3 and F4 by the end of the first iteration itself. The descending trend is quite evident in the convergence curves of hybrid SCA–TLBO on many of the test functions investigated. This strongly supports the ability of the hybrid SCA–TLBO algorithm to obtain a better approximation of the global optimum over the course of generations.

Fig. 6
figure 6

Comparison between convergence curves of the SCA–TLBO, SCA, PSO, DE and TLBO algorithms for functions F1–F23, respectively

Figure 7a, b shows a quantitative measure: the average fitness of all search agents in each generation. If an algorithm improves its candidate solutions, the average fitness should improve over the course of generations. As the average fitness curves in Fig. 7a, b suggest, the hybrid SCA–TLBO algorithm shows decreasing average fitness on all of the test functions. Another fact worth mentioning is the accelerated decrease in the average fitness curves, which shows that the improvement of the candidate solutions becomes faster over the course of generations.

Fig. 7
figure 7figure 7

Comparison of average fitness of search agents during optimization of the SCA–TLBO, SCA, PSO, DE and TLBO algorithms

5 Application of hybrid SCA–TLBO for visual tracking

Computer vision is a field of science and technology that includes methods for acquiring, processing, analyzing and understanding images. As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images. The image data can take various forms, such as video sequences, views from multiple cameras or multi-dimensional data from a medical scanner. Computer vision also seeks to apply its theories and models to the design of computer vision systems. Applications include security and surveillance, video communication and compression, navigation, display technology, high-level video analysis, traffic control, metrology, video editing, augmented reality, human–computer interfaces and medical imaging [44, 45]. Given the most important state (e.g., position and velocity) of a target object in the first image, the objective of visual tracking is to estimate the states of the target in the subsequent frames. Even though visual tracking has been studied for more than a few decades and significant progress has been made in recent years [46,47,48,49,50,51], it remains a challenging problem.

5.1 Optimization-based tracking system

Essentially, tracking an object in video sequences, i.e., the problem of locating the target in every frame, can be cast as an optimization problem. The observation distance between the candidate and the target forms a similarity (fitness) function, and locating the target amounts to maximizing or minimizing this similarity function over the candidate solutions. In this view, visual tracking, as an optimization problem, can be accomplished using optimization methods.

In order to compare the tracking performance of optimization-based trackers, a general optimization-based tracking framework is designed, to which other optimizers can also be applied. The tracking framework using the proposed hybrid SCA–TLBO algorithm is shown in Fig. 8.

Fig. 8
figure 8

Proposed hybrid SCA–TLBO-based tracking framework

The target is selected by the user or detected by some object detector in the first frame. Then, the state vector is initialized. The state vector in our work is defined as X = [x, y, s], where (x, y) is the target position in pixel coordinates and s represents the scale parameter that constrains the size of the object. After the target is selected in the first frame, the state vector is initialized as X0 = [x0, y0, 1], where (x0, y0) is the target's initial location and s = 1 specifies that there is no scale variation in the first frame.

After selection of the target and initialization of the state vector, a dynamic model produces new candidate state vectors. Here, the random walk model is applied, considering that there is very little movement of the object between frames. [Note: the vector can also be initialized using one of the motion models, such as the constant velocity (CV) model or the constant acceleration (CA) model, in which case the state vector becomes a 5D problem.] In this work, a simple motion model is preferred in order to compare the novel method with the KF, αβ filter, EKF and BA. The correlation between the appearance and the state of the object is captured by an observation model. A good observation model is essential to a tracker, but it is not easy to choose a single observation model for all tracking scenarios. In this work, we are more concerned with the search performance, so we selected a standard kernel-based spatial color histogram [47] as the observation model.
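The random-walk dynamic model can be sketched as follows; the noise standard deviations and the candidate count are illustrative assumptions, not values from the paper.

```python
import numpy as np

def propose_candidates(state, n_candidates=30, sigma=(8.0, 8.0, 0.02), rng=None):
    """Random-walk dynamic model: perturb the current state X = [x, y, s]
    with zero-mean Gaussian noise to generate candidate states for the
    optimizer. The standard deviations (pixels for x and y; unitless
    for the scale s) are illustrative, hypothetical values."""
    rng = rng or np.random.default_rng()
    sigma = np.asarray(sigma)
    return state + rng.normal(0.0, 1.0, (n_candidates, 3)) * sigma
```

Each candidate is then scored by the observation model, and the optimizer searches over these candidates for the best match.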

Let {ci}i=1,…,M be the normalized pixel locations of the target candidate, centered at c in the present frame, and let r = (hx, hy), where hx and hy denote the width and the height of the target's rectangle, respectively. The kernel-based spatial color histogram is given by

$$p_{c}^{(u)} \left( X_{k} \right) = C\sum\limits_{i = 1}^{M} {k\left( \left\| \frac{c - c_{i}}{r} \right\|^{2} \right)} \delta \left[ b\left( c_{i} \right) - u \right],\quad u = 1, \ldots ,m$$
(20)

where δ is the Kronecker delta function. The function b: R2 → {1, …, m} maps the pixel at location ci to the index b(ci) of its bin in the quantized feature space. k(x) is an isotropic kernel assigning a smaller weight to pixels farther from the center, and \(C = 1/\sum\nolimits_{i = 1}^{M} k\left( \left\| \frac{c - c_{i}}{r} \right\|^{2} \right)\) is the normalization constant.

When the state vector is described, a similarity (fitness) function is employed to calculate the observation distance between the candidate and target. Generally, the Bhattacharyya coefficient is employed to calculate the similarity between two histograms [47, 52]. It is given by,

$$B\left( h_{1}, h_{2} \right) = 1 - \sum\limits_{i = 1}^{N} {\sqrt{h_{1}(i)\, h_{2}(i)}}$$
(21)

where N is the number of bins in the histograms and h1 and h2 are the histograms being compared. The coefficient B(h1, h2) is large when the histograms are similar and small when they are very different.
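The coefficient can be sketched in a few lines of Python, assuming both histograms are normalized to sum to one (so that identical histograms give a coefficient of exactly 1):

```python
import numpy as np

def bhattacharyya(h1, h2):
    """Bhattacharyya coefficient between two normalized histograms.

    Values lie in [0, 1]: 1 for identical histograms, values near 0
    for histograms with little overlapping mass.
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    return float(np.sum(np.sqrt(h1 * h2)))

h = np.array([0.25, 0.25, 0.25, 0.25])
print(bhattacharyya(h, h))            # 1.0 (identical histograms)
print(bhattacharyya([1, 0], [0, 1]))  # 0.0 (disjoint histograms)
```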

The dashed box in Fig. 8 represents the optimization process, the core of the optimization-based visual tracking algorithm. In this part, an optimizer is adopted to select the candidate solution, which can be done by maximizing or minimizing the similarity function. In each frame, the optimizer is queried for the target position, and the frame is displayed with the target position indicated. The loop continues until no further frames are available.
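This per-frame loop can be illustrated with a toy stand-in for the optimizer: candidates are sampled around the previous state (the random-walk motion model) and the fittest one is kept. The fitness function below is a synthetic placeholder for the histogram similarity, not the paper's actual hybrid SCA–TLBO search; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_walk_search(fitness, center, n_candidates=50, step=5.0):
    """Sample candidate states around the previous one (random-walk
    motion model) and return the candidate with the best fitness."""
    cands = center + rng.normal(0.0, step, size=(n_candidates, len(center)))
    scores = [fitness(c) for c in cands]
    return cands[int(np.argmax(scores))]

# Toy fitness that peaks at the true target position (60, 40); in the
# tracker this would be the histogram similarity of Eq. (21).
true_pos = np.array([60.0, 40.0])
fitness = lambda s: -np.linalg.norm(s - true_pos)

state = np.array([50.0, 50.0])   # initial target position
for _ in range(20):              # one search per video frame
    state = random_walk_search(fitness, state)
```

After a handful of frames the estimated state settles close to the fitness peak, which is the behavior the optimizer block in Fig. 8 is responsible for.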

5.2 Experiments and discussion

5.2.1 Experimental setup details

In this work, we implemented our tracker in MATLAB R2013a on a PC with an Intel i7-3770 CPU (3.4 GHz) and 2 GB of memory, on which it runs at 29 fps. The self-made video for the experiment consists of a JPEG image sequence with a resolution of 720 × 1280 pixels per frame. The bad light condition is created by switching off all the lights in the room (improper illumination), while the normal light condition corresponds to a properly illuminated room. The tracking distance is approximately 6 m, with the target objects being balls of three sizes (small, medium and big) whose radii are 3.325, 4.9 and 8.5 cm, respectively.

5.2.2 Performance measures

5.2.2.1 Absolute error (AE)

AE is the magnitude of the difference between the true value and the tracked value of the object.

$${\text{AE}} = \left| x_{\text{true}} - x_{\text{tracked}} \right|$$
(22)

where xtrue is the true value of object parameters and xtracked is the tracked value of the object parameters.

5.2.2.2 Root mean square error (RMSE)

RMSE is one of the most widely used full-reference quality assessment metrics. It is computed as the square root of the average of the squared differences between the tracked values xtracked and the true values xtrue

$${\text{RMSE}} = \sqrt{\frac{1}{NM}\sum_{n = 1}^{N}\sum_{m = 1}^{M}\left( x_{\text{tracked}} - x_{\text{true}} \right)^{2}}$$
(23)

where N and M are the image dimensions.

5.2.2.3 Tracking detection rate (TDR)

Tracking detection rate is the ratio of the number of frames in which the object is detected to the total number of frames in which the object is present

$${\text{TDR}} = \frac{\text{Number of frames in which the object is detected}}{\text{Number of frames in which the object is present}} \times 100.$$
(24)
5.2.2.4 Object tracking error (OTE)

Object tracking error is the average deviation of the centroid of the tracked object from its true value. It is given by

$${\text{OTE}} = \frac{1}{N}\sum_{i = 1}^{N}\sqrt{\left( x_{\text{true}} - x_{\text{tracked}} \right)^{2} + \left( y_{\text{true}} - y_{\text{tracked}} \right)^{2}}$$
(25)

where xtrue and ytrue are the actual 2D coordinates of the object, and xtracked and ytracked are the tracked 2D coordinates of the object.
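The four measures above can be sketched together in Python. This is a minimal illustration assuming the trajectories are given as N × 2 arrays of centroid coordinates and that OTE is the mean per-frame Euclidean centroid distance; the function name and interface are hypothetical.

```python
import numpy as np

def tracking_metrics(true_xy, tracked_xy, detected):
    """AE, RMSE, TDR and OTE for a track.

    `true_xy` and `tracked_xy` are N x 2 arrays of centroids;
    `detected` is a boolean array marking frames where the object
    was found.
    """
    err = true_xy - tracked_xy
    ae = np.abs(err)                                # Eq. (22), per frame/axis
    rmse = np.sqrt(np.mean(err ** 2))               # Eq. (23)
    tdr = 100.0 * detected.sum() / len(detected)    # Eq. (24)
    ote = np.mean(np.sqrt((err ** 2).sum(axis=1)))  # Eq. (25)
    return ae, rmse, tdr, ote

true_xy = np.array([[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]])
tracked_xy = np.array([[10.0, 10.0], [23.0, 24.0], [30.0, 30.0]])
detected = np.array([True, True, True])
ae, rmse, tdr, ote = tracking_metrics(true_xy, tracked_xy, detected)
# The middle frame is off by the (3, 4, 5) triangle, so the per-frame
# distances are [0, 5, 0] and OTE = 5/3.
```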

The proposed hybrid SCA–TLBO-based tracking framework is evaluated experimentally using object tracking error, absolute error, tracking detection rate, root mean square error and time cost as performance measures. To reveal the capability of the tracker proposed in this work, a comparison of the hybrid SCA–TLBO-based tracker with four other trackers is presented: three probability-based trackers, viz. the alpha–beta (αβ) filter [53], linear Kalman filter (LKF) [54] and extended Kalman filter (EKF) [54], and the bat algorithm (BA) [52]. The αβ filter, LKF, EKF and BA have several parameters that must be initialized before tracking; Table 9 shows the initial values of the basic parameters for these algorithms.

Table 9 Simulation parameters for trackers

The RMSE, AE and OTE of the hybrid SCA–TLBO, αβ filter, LKF, EKF and BA on the different-size balls' datasets under the normal light condition are given in Table 10. As observed in Table 10, applying the hybrid SCA–TLBO reduced the average RMSE, AE and OTE for the small-, medium- and big-size balls' data. On average, under normal light conditions, the hybrid SCA–TLBO achieved minimum RMSE values of 0.01, 0.01 and 0.02, minimum AE values of 0.06, 0.10 and 0.14, and minimum OTE values of 0.01, 0.02 and 0.04 for the small-, medium- and big-size balls' data, respectively.

Table 10 Average parameters comparison for different sizes of balls using five trackers under normal and bad light conditions

The RMSE, AE and OTE of the hybrid SCA–TLBO, αβ filter, LKF, EKF and BA on the small-size ball dataset under the bad light condition are also given in Table 10. As observed there, applying the hybrid SCA–TLBO reduced the average AE, RMSE and OTE for the small ball data; on average, the minimum AE, RMSE and OTE were 0.08, 0.01 and 0.01, respectively. The best average values are highlighted in Table 10.

Figure 9a–c shows five frames each of the tracking results of the proposed hybrid SCA–TLBO algorithm for the small-, medium- and big-size balls' data under the normal light condition, and Fig. 9d shows results under the bad light condition. The approach was evaluated on a self-made database with a frame size of 720 × 1280. The frames shown were selected from the video while the object was tracked continuously during its movement, and the target object is marked by a bounding box (red circle). Under the normal light condition, in frames 25 and 31 of Fig. 9a, the target tracked by hybrid SCA–TLBO is dominated by the red bounding box because the camera is static and the apparent size of the object shrinks with distance. Even though the target object is small, the proposed tracking algorithm can still detect it (the small ball).

Fig. 9

a Tracking result of hybrid SCA–TLBO for the small-size ball (6, 10, 21, 25 and 31 frames) under normal light condition. b Tracking result of hybrid SCA–TLBO for the medium-size ball (11, 15, 31, 36 and 52 frames) under normal light condition. c Tracking result of hybrid SCA–TLBO for the big-size ball (16, 25, 39, 44 and 55 frames) under normal light condition. d Tracking result of hybrid SCA–TLBO for the small-size ball (24, 28, 36, 40 and 52 frames) under bad light condition

Under the bad light condition, in frames 40 and 52 of Fig. 9d, the target tracked by hybrid SCA–TLBO is likewise dominated by the red bounding box because the camera is static and the apparent size of the object shrinks with distance. Even though the target object is small, the proposed tracking algorithm can still detect it under the bad light condition.

To analyze the time complexity, the average time costs of the five trackers during the tracking process are calculated, and the comparative results are shown in Table 11. In all tracking examples (small-, medium- and big-size balls' data), the average time cost of hybrid SCA–TLBO is less than that of the alpha–beta (αβ) filter, linear Kalman filter, extended Kalman filter and bat algorithm.

Table 11 Average time cost of the five trackers for different sizes of balls under normal and bad light conditions

To illustrate the efficiency of hybrid SCA–TLBO more clearly, the proposed algorithm is also compared with four other tracking algorithms: particle filter (PF) [55], scale-invariant feature transform (SIFT) [56], particle swarm optimization (PSO) [57] and bat algorithm (BA) [52]. For this comparison, videos [58] covering all challenging factors are selected and used for evaluation. (Note: the tracking examples are available on the Web site http://visual-tracking.net.) The targets in these sequences suffer from various challenging factors, as described in Table 12.

Table 12 Description of the tracking examples

In hybrid SCA–TLBO, we tried population sizes (NP) from 5 to 50 and found that 15–20 is sufficient for most tracking problems. The performance is evaluated using two measures: the number of lost targets, i.e., the number of frames where the overlap between the tracker's bounding box and the ground truth is less than 50%, and the time complexity, i.e., the average time cost (ms). Figure 10 shows the performance comparison for different values of NP. There is a general trend that as NP increases, the number of lost targets decreases and the time cost increases; that is, a larger population improves tracking accuracy but makes the tracker more time-consuming. Therefore, a fixed population size of NP = 20 is used in all our experiments. As in [47, 52], the optimization process is ended by three termination conditions:

Fig. 10

Comparison of tracking performance with different population size NP
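The lost-target count used in Fig. 10 can be sketched as follows, assuming boxes are given as (x, y, w, h) tuples and interpreting the 50% overlap criterion as intersection-over-union (the exact overlap definition is not spelled out in the text).

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def lost_targets(pred_boxes, gt_boxes, threshold=0.5):
    """Number of frames where the tracked box overlaps the ground
    truth by less than the threshold."""
    return sum(1 for p, g in zip(pred_boxes, gt_boxes)
               if iou(p, g) < threshold)

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))        # 1.0
print(lost_targets([(0, 0, 10, 10)], [(20, 20, 5, 5)]))  # 1
```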

  • The fitness fworst of the worst solution is good enough.

  • The fitness fbest of the best solution is good enough and the Euclidean distance d between the best and the worst solution is below a certain threshold.

  • A maximum number of iterations or generations is reached; this maximum is set to 100 in this work.

It is worth mentioning that the values used in the first two conditions rely heavily on the fitness function, in this case the Bhattacharyya coefficient as in [47, 52]. Experimental results showed a general tendency that a region with a Bhattacharyya coefficient higher than 0.6 mostly represents the target. Therefore, to be on the safe side, the threshold on fworst in the first termination condition is set to 0.6 to guarantee that all candidate solutions are close to the actual target. To ensure that all candidates cluster together with the best candidate near the target, the threshold on fbest is set to 0.8 and the distance threshold on d to 5. The same target model and motion model are used for all trackers to ensure a fair comparison.
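The three conditions with the stated thresholds (fworst ≥ 0.6; fbest ≥ 0.8 with d ≤ 5; at most 100 iterations) can be combined into a single check. The function name and interface below are illustrative, not the paper's actual implementation:

```python
def should_terminate(f_best, f_worst, d, iteration,
                     tau_worst=0.6, tau_best=0.8, d_max=5.0, max_iter=100):
    """Termination test combining the three conditions above."""
    if f_worst >= tau_worst:                 # all candidates near the target
        return True
    if f_best >= tau_best and d <= d_max:    # swarm collapsed on a good spot
        return True
    if iteration >= max_iter:                # iteration budget exhausted
        return True
    return False

print(should_terminate(0.85, 0.4, 3.0, 10))   # True (second condition)
print(should_terminate(0.5, 0.4, 10.0, 10))   # False (keep searching)
```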

Tracking results of hybrid SCA–TLBO for the David, Panda, Subway, Mountain Bike, Boy, Crossing, Car and Couple video sequences are shown in Fig. 11, and the experimental results show that the hybrid SCA–TLBO-based tracker is capable of tracking an arbitrary target in various challenging situations. To evaluate tracker accuracy, the videos were manually labeled by visually identifying the center of the tracked target in each frame. The Euclidean distance between the center estimated by the trackers and the actual center is then calculated and used as the accuracy measure. The comparison results are shown in Fig. 12, which shows that the hybrid SCA–TLBO-based tracker performs best under all challenging conditions and successfully tracks the target throughout the whole tracking process.

Fig. 11

Tracking result of PF, SIFT, PSO, BA and hybrid SCA–TLBO trackers for David, Panda, Subway, Mountain Bike, Boy, Crossing, Car and Couple Video sequences (from top to bottom)

Fig. 12

Tracking accuracy comparisons of different trackers for David, Panda, Subway, Mountain Bike, Boy, Crossing, Car and Couple Video sequences

6 Conclusion

In this paper, a novel optimization algorithm called hybrid SCA–TLBO is proposed for solving optimization problems and visual tracking. The algorithm is tested on twenty-three well-known benchmark functions. The experimental results show that the proposed algorithm is superior to the other existing algorithms in exploiting the optimum and also has advantages in exploration. The performance measures show that the hybrid SCA–TLBO algorithm tracks an object better than the other trackers considered.