1 Introduction

Visual tracking has been applied in domains such as intelligent monitoring, vehicle-assisted driving, early warning systems for crime purposes, precision guidance and other cutting-edge technologies. Visual tracker is a challenging and valuable research. However, the following main issues still need to be addressed: (i) the presence of various interference factors and (ii) the low efficiency, accuracy and stability of these systems [18]. In this study, we focus on finding an effective solution to the above problems.

A visual target tracking system needs to deal with a variety of complex aspects, such as fast speeds, changes in illumination, complex backgrounds, similar target disturbances and occlusions. Much research has been done to deal with the factors affecting tracking interference, and several improved mean shift tracking algorithms have been proposed [2, 33]. Although these methods have overcome the problems of tracking in a working environment with low lighting and interactions between similar objects, they fail to give effective tracking results when the external illumination changes significantly. A particle filter tracker has been proposed that uses binary labels to describe each image block; this approach can process the tracking of fast-moving objects with high efficiency [37], but it cannot effectively handle interference between similar targets or the presence of complex and variable backgrounds. A new tracker using a foreground probability and candidate model weight histogram has been proposed to suppress background interference and increase the foreground weight at the same time [41]. This tracker is suitable for variations in background and lighting, but cannot adapt to high-speed movements of the target. Another visual tracker combining the PF and mean shift has been developed, in which the mean shift was introduced to guide particles to move to the local peak of probability [11, 15]. The position estimation of particles is improved by this operation, meaning that the candidate area is closer to the real position. This tracker gives excellent real-time performance and can deal with both occlusion and interference between similar targets; however, it often fails in poor lighting environments, and its efficiency needs to be improved. Ref. [16] put forward SAMF: a frame size adaptive adjustment strategy on the basis of the correlation filter (CF). This tracker can effectively deal with target size change and deformation. Ref. [8] brought forward the discriminative scale space tracker (DSST), which is realized by the discriminant CF on the basis of the scale pyramid representation. Ref. [9] presented the learning ccontinuous convolution operators (C-COT), which creates a new method to calculate the convolution operator in continuous space domain and realizes the effective fusion of multi-resolution feature mapping in joint formula. Overall, most current trackers behave well in certain tracking scenarios, but fail to adapt to other conditions. The main aim of this study is to optimize the robustness and intelligence of the traditional PF algorithm under different tracking conditions.

Visual target tracking is a nonlinear state estimation process, as the motion of the tracked target has a strong random component. PF is suitable for dealing with visual target tracking due to its advantages in terms of handling nonlinear state estimation [17, 38], but the traditional PF algorithm still suffers from sample degeneracy and impoverishment. In addition, when solving complex nonlinear problems, there is room for improvement in its stability, robustness, accuracy and efficiency [18, 26]. There have been many improvements in particle filter tracking algorithms. Ref. [3]proposed a PF tracker modified by mean-shift (MSPF). This method can effectively deal with fast movement. A PF has been proposed based on particle swarm optimization (PSO-PF) that takes advantage of the excellent iterative capabilities of PSO [6, 32]. PSO-PF significantly improves the computational efficiency of the PF algorithm, but does not handle the problem of particle impoverishment. An improved PF based on the genetic algorithm (GA-PF) has also been presented [5, 40] that solves the problem of particle impoverishment and integrates the global searching and optimizing ability of genetic algorithm (GA) with PF. Although it gives significant improvements in computational accuracy and numerical stability, this approach imposes large computational costs, and its efficiency therefore needs to be improved.

In a word, PF still suffer from particle degeneracy and problems with sample impoverishment. While the recent improvement of transitional PF such as MSPF, GA-PF and PSO-PF can not take into account the efficiency and accuracy of PF at the same time. To deal with the above issues, in this paper, we put forward the QAPF algorithm, which combines the particular advantages of efficiency of PSO-PF and the precision of GA-PF to further improve the PF algorithm. Moreover, this paper introduces the tracking frame adaptive adjustment model, which is the result of our previous research [34], to solve the problem that the tracking frame is easy to mix invalid features in the traditional target tracking process, so as to improve the effective feature proportion, the success rate of feature matching and the tracking accuracy. At the same time, on the basis of this achievement, we abandon the complex feature selection strategy and focus the feature selection on the color feature to reduce the computational burden of feature matching and tracking. At the same time, QPSO optimization process is introduced into PF algorithm to further improve the calculation efficiency. The results of nonlinear system simulation and a large number of visual experiments show that the efficiency, accuracy and stability of our improved pf tracking algorithm have been greatly improved.

The main contributions of this study include:

  1. 1)

    We propose an improved PF algorithm based on QPSO and AGA (called QAPF) that can improve the efficiency, sample diversity and precision of the traditional PF algorithm.

  2. 2)

    We develop a tracking method based on QAPF, and carry out several simulation and tracking experiments to prove that the tracking efficiency, accuracy and stability of QAPF are significantly superior to those of similar tracking algorithms.

  3. 3)

    A frame scale adaptive adjustment method is applied that automatically adjusts the frame size based on the actual size of the object. This effectively reduces the number of useless features and improves the accuracy of target positioning.

Through the comparison of quantitative and qualitative tests on different challenging video sequences, GAPF performs well in robustness, precision and efficiency, and can well adapt to various typical tracking scenarios.

2 Particle filter based on QPSO and AGA

2.1 Quantum behaved particle swarm optimization based on wave function

In QPSO, the individual state is generated by the wave function ψ(X,t) [19, 39]. The square of ψ(X,t) is |ψ|2, denoted as Q. Q is the probability density of the individual location and refers to the probability density of the particle in position X at time t [12, 36]. The expression of QPSO is:

$$ \left\{\begin{array}{c}\psi \left({\mathrm{x}}_{ij}\left(\mathrm{t}+1\right)\right)=\frac{1}{\sqrt{L_{ij}\left(\mathrm{t}\right)}}\exp \left(-\left|{\mathrm{x}}_{ij}\left(\mathrm{t}+1\right)-{p}_{ij}\left(\mathrm{t}\right)\right|/{L}_{ij}\left(\mathrm{t}\right)\right)\\ {}{p}_{ij}=\frac{c_1{r}_1 pbes{t}_{ij}(t)+{c}_2{r}_2 gbes{t}_{ij}(t)}{c_1{r}_1+{c}_2{r}_2}\\ {}\begin{array}{c}{L}_{ij}\left(\mathrm{t}\right)=2\alpha \cdotp \left|{M}_j(t)-{\mathrm{x}}_{ij}\left(\mathrm{t}\right)\right|\\ {}Q\left({\mathrm{x}}_{ij}\left(\mathrm{t}+1\right)\right)={\left|\psi \left({\mathrm{x}}_{ij}\left(\mathrm{t}+1\right)\right)\right|}^2\\ {}\begin{array}{c}{\mathrm{x}}_{ij}\left(\mathrm{t}+1\right)={p}_{ij}(t)\pm \frac{L_{ij}\left(\mathrm{t}\right)}{2}\ln \left(1/{\mathrm{u}}_{ij}\left(\mathrm{t}\right)\right)\\ {}\begin{array}{c}{\int}_{-\infty}^{+\infty }{\left|\psi \right|}^2 dxdydz={\int}_{-\infty}^{+\infty } Qdxdydz=1\\ {}{r}_1,{r}_2,{c}_1,{c}_2,{\mathrm{u}}_{ij}\left(\mathrm{t}\right)\in U\left(0,1\right)\end{array}\end{array}\end{array}\end{array}\right. $$
(1)

where Mj(t) represents the mean value of the optimal positions of each individual; pij is the j-th component of individual i; and c1 and c2 are learning or acceleration coefficients, which are used to regulate the maximum step size for the sample with the global or individual best position. The formula for updating the individual’s positions is:

$$ {x}_{ij}\left(t+1\right)={p}_{ij}(t)\pm \alpha \cdotp \left|{M}_j(t)-{x}_{ij}(t)\right|\cdotp \ln \left(1/{u}_{ij}(t)\right) $$
(2)

where α is the compression expansion coefficient, which is used to adjust the particle convergence rate and is the only parameter in this method, except for the overall size and the total number of iterations. Sun et al. [29] has proved that it converges when α < 1.78. The updating formula of the best population location Pg and the optimal position of particle Pi are shown below:

$$ \left\{\begin{array}{c}{P}_{ij}\left(t+1\right)=\left\{\begin{array}{c}{P}_{ij}(t)\kern2.2em if\kern1.25em f\Big({x}_{ij}\left(t+1\right)\ge f\left({P}_{ij}(t)\right)\\ {}{x}_{ij}\left(t+1\right)\ if\kern1.25em f\Big({x}_{ij}\left(t+1\right)<f\left({P}_{ij}(t)\right)\end{array}\right.\\ {}{P}_{ij}\left(t+1\right)=\mathrm{argmin}f\left({P}_g\left(t+1\right)\right)\end{array}\right. $$
(3)

2.2 Adaptive genetic algorithm based on degree of convergence

Definition 1

The fitness of generation t is expressed as \( {f}_t^1 \), \( {f}_t^2\cdots \), \( {f}_t^S \), where the average fitness is expressed as \( {f}_t=\frac{1}{S}\sum \limits_{i=1}^S{f}_t^i \), S is the scale of the population; and fmax is the best individual fitness. f is the average fitness of all individuals whose fitness are greater than ft. Define the population convergence as △ = fmax − f. Here, we calculate the difference between fmax and f, rather than the difference between fmax and ft. In this way, we can eliminate individuals with a fitness that is lower than the average value, and estimate the degree of premature convergence more accurately. The probability equation for the genetic operation of AGA can be expressed as follows:

$$ \left\{\begin{array}{c}{P}_c=1/1+\exp \left(-{k}_1\cdotp \Delta \right)+1.5\\ {}{P}_m=-1/1+\exp \left(-{k}_2\cdotp \Delta \right)\end{array}\right. $$
(4)

where k1 and k2 are positive constants, and △ is non-negative. Then, the range of the crossover probability Pg and the mutation probability Pm are [0.5,1] and [0,0.5], respectively. Traditionally, the probability of the genetic operation is constant in the whole iterative process, which leads to premature convergence [7, 21]. In our algorithm, the probabilities Pg and Pm of the GA operation are automatically controlled via the population convergence △. When △ is large, Pc and Pm are automatically increased to prevent premature convergence. When the particles tend towards differentiation, Pc and Pm will automatically decrease, driving the sample to converge.

The genetic operation of AGA involves a non-uniform mutation and arithmetic crossover [1], which are expressed as in Eqs. (5) and (6):

$$ \left\{\begin{array}{c}{\left({x}_t^i\right)}^{\prime }=\left\{\begin{array}{c}{x}_t^j+f\left(t,{m}_t-{x}_t^j\right),p<0.5\\ {}{x}_t^j-f\left(t,{m}_t-{l}_t\right),p\ge 0.5\end{array}\right.,{x}_t^i\in \left[{l}_t,{m}_t\right]\\ {}f\left(t,y\right)=y\cdotp \left(1-{p}^{{\left(1-t/T\right)}^b}\right)\in \left(0,y\right),p\in U\left(0,1\right)\end{array}\right. $$
(5)
$$ \left\{\begin{array}{c}{\left({x}_t^i\right)}^{\prime }=\alpha \cdotp {x}_t^i+\left(1-\alpha \right)\cdotp {x}_t^j\\ {}{\left({x}_t^j\right)}^{\prime }=\alpha \cdotp {x}_t^j+\left(1-\alpha \right)\cdotp {x}_t^i\end{array}\right. $$
(6)

where α has a random distribution within (0,1); T represents the upper limit on the iterations; b determines the non-uniformity of Pm; and f is an adaptive mutation operator, which controls the step scale in an adaptive way and searches for potential solutions in the entire domain. At the end of each iteration, it searches only a tiny adjacent region of the current solution to guarantee the positioning accuracy and lock in the optimal solution.

2.3 QAPF algorithm

The optimization ability and computational efficiency of QPSO is superior to the traditional PSO algorithm, but it tends to be precocious and is subject to low precision and sample diversity loss problems [30]. Although the computational accuracy and optimization capability of AGA is better than GA, AGA still has a poor local search ability and low computational efficiency [13]. On the basis of the above analysis, we present the QAPF algorithm, which combines the strong global search capability of GA with the local search capability and efficiency of PSO.

More specifically, the position updating equation of QPSO is introduced into PF to optimize the particle distribution and efficiency. We then replace particles with lower fitness with individuals with higher fitness. The genetic operation of AGA is also applied to increase the particle diversity and computational efficiency. The basic calculation process of QAPF is as follows

  1. Step 1.

    Initialization of the sample

On the basis of the prior probability distribution function of the particle states p(x0) ∈ U(0, 1), a set of particles \( {x}_0^i{x}_0^i{x}_0^i \)(i = 1, 2, ⋯, S)is obtained. The initial weights are \( {w}_0^i=1/S \), and the initialization function of the probability density is denoted as p(x0| y0) = p(x0).

  1. Step 2.

    Importance density sampling

    1. a)

      Computation of importance density

$$ {x}_t\sim q\left({x}_t^i|\ |{y}_t\right)=p\left({x}_t^i|\ |{x}_{t-1}^i\right)p\left({x}_{t-1}^i|\ |{y}_{t-1}\right) $$
(7)
  1. b)

    Updating of particle weights

$$ {w}_t^i={w}_{t-1}^i\frac{p\left({y}_t\left|{x}_t^i\right.\right)p\left({x}_t^i\left|{X}_{t-1}^i\right.\right)}{q\left({x}_t^i\left|{X}_{t-1}^i\right.,{Y}_t\right)} $$
(8)
  1. c)

    Updating of particle probability density

$$ p\left({x}_t\left|{y}_{1:t}\right.\right)=\sum \limits_{i=1}^S{w}_t^i\delta \left({x}_t-{x}_t^i\right) $$
(9)

where δ means the Dirac function. We first generate S particles using Eq. (7), and then update the weights and probability density using Eqs. (8) and (9).

  1. Step 3.

    Determination of the level of sample degradation

$$ {N}_{eff}=\frac{1}{\sum \limits_{i=1}^S{\left({w}_t^i\right)}^2} $$
(10)

where Neff is the level of sample impoverishment. If Neff is higher than a threshold Nth, the sample impoverishment is not significant, then directly move to step 5; otherwise, serious particle impoverishment may exist and resampling must be carried out before the locations are updated.

  1. Step 4.

    Resampling

Reset all the weights to \( {w}_t^i=1/S \).

  1. Step 5.

    Updating of the particle positions using QPSO

Particles are updated using the position update equation of QPSO. The purpose of this procedure is to improve the particle distribution and accelerate the optimization process to improve the efficiency.

  1. Step 6.

    Genetic manipulation

Individuals with lower fitness are replaced by others with greater fitness. We then perform a non-uniform mutation and arithmetic crossover, where the probability is automatically adjusted based on the sample convergence, to reduce the sample impoverishment and increase the computational precision.

Step 7: Output results.

If the condition for terminating iteration has not been reached, we return to Step 2; otherwise, the algorithm ends and the final value is output, as follows:

$$ {x}_t=\sum \limits_{i=1}^S{w}_t^i{x}_t^i $$
(11)

3 Tracking algorithm based on QAPF

3.1 Mechanism model of the target state

The mechanism model of the target state is as follows:

$$ \left\{\begin{array}{c}{X}_t={AX}_{t-1}+{Bu}_{t-1}+{Cf}_{t-1}\\ {}A=\left[\begin{array}{cccc}1& T& 0& 0\\ {}0& 1& 0& 0\\ {}0& 0& 1& T\\ {}0& 0& 0& 1\end{array}\right],B=\left[\begin{array}{cc}{T}^2/2& 0\\ {}T& 0\\ {}0& {T}^2/2\\ {}0& T\end{array}\right],C=\left[\begin{array}{cc}{T}^2/2& 0\\ {}T& 0\\ {}0& {T}^2/2\\ {}0& T\end{array}\right]\end{array}\right. $$
(12)

where \( {X}_t=\left[{x}_t,{x}_t^{\prime },{y}_t,{y}_t^{\prime}\right] \) are the coordinates of the motion state at time t; x and y are the coordinates of the particle position; x and y represent the components of velocity along the X and Y axes [20, 31]; A represents the transition matrix of the target state; T represents the sampling period; B and C correspond to the input matrices; \( {u}_{t-1}={\left[{u}_{t-1}^x,{u}_{t-1}^y\right]}^T \) is the acceleration at time (t − 1); and ft − 1 is the white Gaussian noise at time (t − 1), the expression is shown below [27]:

$$ \left\{\begin{array}{c}{f}_{t-1}={\left[{f}_{t-1}^x,{f}_{t-1}^y\right]}^T\in N\left(0,Q\right)\\ {}Q=\mathit{\operatorname{diag}}\left({\sigma}_x^2,{\sigma}_y^2\right)\end{array}\right. $$
(13)

The target observation model is:

$$ \left\{\begin{array}{c}{Y}_t=h\left({X}_t\right)+{v}_t,\\ {}h\left({X}_t\right)={\left[{r}_t,{\theta}_t\right]}^T\\ {}{r}_t=\sqrt{x_t^2+{y}_t^2},\\ {}{\theta}_t=\mathit{\arctan}\left(\frac{y_t}{x_t}\right)\end{array}\right. $$
(14)

In Eq. (17), Yt = [rt, θt]T refers to the phase and distance information of the tracked object, and vt is white Gaussian noise, as expressed below:

$$ \left\{\begin{array}{c}{v}_t={\left[{v}_t^r,{v}_t^{\theta}\right]}^T\in N\left(0,R\right)\\ {}R=\mathit{\operatorname{diag}}\left({\sigma}_r^2,{\sigma}_{\theta}^2\right)\end{array}\right. $$
(15)

The state model Eq. (14) and the observation model Eq. (15) are essential in estimating the real target state.

3.2 Target and candidate models

The area of the tracked object in the former frame is defined as the target region, whereas the area in which the target has the highest probability of existence is defined as the candidate area. Both the target model q(x0) and the observation model p(y) are expressed as characteristic probability density functions of the eigenvalues of the corresponding areas, as follows [24]:

$$ q\left({x}_0\right)=C\sum \limits_{i=1}^sk\left(\left\Vert {\left.\frac{x_i-{x}_0}{r}\right\Vert}^2\right.\right)\delta \left[b\left({x}_i\right)-u\right] $$
(16)
$$ p(y)=C\sum \limits_{i=1}^sk\left(\left\Vert {\left.\frac{x_i-y}{r}\right\Vert}^2\right.\right)\delta \left[b\left({x}_i\right)-u\right] $$
(17)

where C represents the normalized constant of the feature probability density, r refers to the bandwidth of the kernel function, S represents the number of feature points, x0 represents the center of the target area, y is the center of the candidate region, xi is the i-th feature point, k(x) is the outer contour function of the kernel function K (in which higher weights are assigned to the feature points closer to the center of the target area). The feature index interval mapping function b and the Dirac function δ are used to evaluate whether the target feature belongs to the u-th feature index interval (if so, then δ = 1; otherwise, δ = 0). The formulae for k(x) and C are as follows:

$$ k(x)=\frac{3}{4}\left(1-{x}^2\right) $$
(18)
$$ C=1/\sum \limits_{i=1}^Sk\left(\left\Vert {\left.\frac{x_0-{x}_i}{r}\right\Vert}^2\right.\right) $$
(19)

3.3 Similarity measure

The visual target tracking process involves finding the item with the highest similarity to the target area determined in the former frame from each candidate region of the current image frame. The Bhattacharyya coefficient can be used as a similarity function and is expressed as follows [28]:

$$ \rho (y)\equiv \rho \left(p(y),q\left({x}_0\right)\right)=\sum \limits_{u=1}^m\sqrt{p(y)q\left({x}_0\right)} $$
(20)

The greater the similarity ρ(y), the greater the chance of matching between the target model q(x0) and the observation model p(y).

3.4 Updating the particle weights and target model

The measured probability density function is the fitness function, and is expressed as follows:

$$ p\left({y}_t|{x}_t^i\right)=\frac{1}{\sqrt{2\pi}\sigma}\exp \left(-\frac{1-\rho \left({y}_t\right)}{2{\sigma}^2}\right) $$
(21)

where σ is the standard deviation of the Gaussian density [35]. Then update the weights with the fitness formula:

$$ {w}_t^i={w}_{t-1}^ip\left({y}_t\left|{x}_t^i\right.\right) $$
(22)

where \( {w}_0^i=1/\mathrm{S} \) are the initial particle weights. The formula for the normalized particle weights is:

$$ {\hat{w}}_t^i={w}_t^i/\sum \limits_{i=1}^S{w}_t^i $$
(23)

The equation for the average particle weight is as follows:

$$ \overline{w_t}=\sum \limits_{i=1}^S{w}_i/S $$
(24)

The mean value of the feature probability density of the best candidate area is generated by the normalized weight:

$$ {\overline{p}}_t(y)=\sum \limits_{i=1}^s{\hat{w}}_t^i\times {p}_t(y) $$
(25)

To better integrate the information on the best candidate model and the existing target model, the target model is updated based on \( {\overline{p}}_t(y) \), qt − 1(x) and \( \overline{w_t} \):

$$ {q}_t(x)={\overline{w}}_t\times {\overline{p}}_t(y)+\left(1-{\overline{w}}_t\right){q}_{t-1}(x) $$
(26)

where qt − 1(x) is to the target model of the former frame.

3.5 Target localization and sample updating

The location of the target is generated based on the weighted sum of all positions, as follows:

$$ {x}_t=\sum \limits_{i=1}^S{w}_t^i{x}_t^i $$
(27)

After positioning the tracked object, the extent of the particle impoverishment is calculated using Eq. (28):

$$ {N}_{eff}=1/\sum \limits_{i=1}^S{\left({w}_t^i\right)}^2<{N}_{th} $$
(28)

where Neff means the amount of the effective individuals and Nth refers to the threshold. If Neff is smaller than Nth, we resample the particles, and the weighted particles \( {\left\{{x}_t^i|{w}_t^i\right\}}_{i=1}^S \) are then reset as a new template set with equal weights \( {\left\{{x}_t^i|1/N\right\}}_{i=1}^S \). Otherwise, the particle position update step of QPSO and the optimization scheme of AGA are applied to generate a new population.

4 Adaptive adjustment model for the tracking frame

In traditional trackers, the frame scale is determined by the first frame and remains unchanged throughout the entire tracking process. In practice, however, the target size is variable, and a frame with fixed size cannot adapt to the optimal selection of target features, leading to low positioning accuracy or a high frequency of tracking failure. As a result, it is important to apply automatic adjustment of the tracking box.

$$ \left\{\begin{array}{c}d=\frac{1}{M}\sum \limits_{i=1}^M\sqrt{{\left(x-{x}_i\right)}^2+{\left(y-{y}_i\right)}^2}\\ {}{d}_x=\frac{1}{M}\sum \limits_{i=1}^M\sqrt{{\left(x-{x}_i\right)}^2}\\ {}{d}_y=\frac{1}{M}\sum \limits_{i=1}^M\sqrt{{\left(y-{y}_i\right)}^2}\end{array}\right.{w}^i>{w}_T $$
(29)

We denote the average value of the distance between the target center and the particles in the candidate area as d. The value of d is positively correlated with the target scale. More specifically, d increases or decreases synchronously with the scale variation of the tracked object. The frame size can then be automatically adjusted in real time based on d, as shown in formula (29).

where (x, y) refers to the target center, (xi, yi) represents the position of the i-th particle, and the normalized weight of the i-th particle is wi(i = 1, 2, …, M). M is the number of individuals with weight larger than the threshold wT (which is set to 0.6 times the average of the sample weights in this research). The proposed adjustment model for the size of the target frame can be expressed as follows:

$$ \left\{\begin{array}{c}{h}_{x,t}={h}_{x,t-1}\cdotp \left(\alpha \cdotp {d}_t/{d}_{t-1}+\beta \cdotp {d}_{x,t}/{d}_{x,t-1}\right)\\ {}{h}_{y,t}={h}_{y,t-1}\cdotp \left(\alpha \cdotp {d}_t/{d}_{t-1}+\beta \cdotp {d}_{y,t}/{d}_{y,t-1}\right)\end{array}\right. $$
(30)

where the weight coefficients α and β are set to 0.25 and 0.75, respectively, in this case. Based on d and the ratio between the sizes of the former and current frames, the current frame scale h can be obtained using Eq. (30).

5 Simulation and experiment

5.1 Simulation analysis of QAPF in a nonlinear system

We carry out simulations to compare QAPF against PSO-PF, GA-PF and the traditional PF algorithm for a nonlinear dynamic system. The state transfer formula and the observation formula of the nonlinear dynamic system are given in Eq. (31):

$$ \left\{\begin{array}{c}{x}_t=1+\mathit{\sin}\left(0.04\pi t\right)+\frac{x_{t-1}}{6}+{u}_{t-1}\\ {}{y}_t=\frac{x_t^2}{6}+{v}_t\end{array}\right. $$
(31)

where xt is the state value, yt is the measurement value, vt and ut are the observation and process noise respectively, ut is a random noise term with a Gamma distribution ut~G(3, 2), and vt is a random noise term with a Gauss distribution vt~N(0, 0.00001). We set the number of sample particles to 100 and the initial state value to one. The output result is the mean value of all particle sets:

$$ {\hat{x}}_t=\frac{1}{S}\sum \limits_{i=1}^S{x}_t^i $$
(32)

The mean square error for each independent experiment is:

$$ RMSE={\left(\frac{1}{T}\sum \limits_{t=1}^T{\left({x}_t-{\hat{x}}_t\right)}^2\right)}^{1/2} $$
(33)

The simulation results from each algorithm for this system are shown in Figs. 1, 2, 3 and 4 below.

Fig. 1
figure 1

Simulation of a nonlinear system with the traditional PF algorithm

Fig. 2
figure 2

Simulation of a nonlinear system with the PSO-PF algorithm

Fig. 3
figure 3

Simulation of a nonlinear system with the GA-PF algorithm

Fig. 4
figure 4

Simulation of a nonlinear system with the QAPF algorithm

Figures 1, 2, 3 and 4 show that the traditional PF, PSO-PF and GA-PF algorithms give severe deviations, moderate deviations, and slight oscillations, respectively, between the state estimation curve and the true state distribution curve. In contrast, the state estimation curve of QPSO fits the true state distribution curve almost exactly.

Fifty independent experiments were carried out. The mean value of the RMSE, its variance, and the average running time of each algorithm are shown below.

It can be observed from Table 1 that PSO-PF and GA-PF are superior to the traditional PF algorithm, although the mean value and variance of the RMSE are greater than for QAPF. It is also obvious that the accuracy and numerical stability of QAPF are significantly improved compared with PF, PSO-PF and GA-PF. From the running times, it can be seen that the computational efficiency of QAPF is significantly better than for PF and GA-PF.

Table 1 Comparison of results for different PF state estimation methods

5.2 Experimental analysis of QAPF tracking

To realize real-time tracking of fast-moving targets, a tracking algorithm needs to have high accuracy and a strong anti-interference ability. Ensuring the accuracy of tracking is a difficult task, especially when the illumination and background vary significantly. To verify the actual tracking effectiveness of our QAPF tracking algorithm and to compare it with similar algorithms, the license plate of a car on the highway was used as a tracking target. Parameters such as the focal length of the camera and the exposure were fixed throughout the entire video recording process. Several frames suffering from illumination variation were selected to analyze and compare the performance of each method. These real sequences were captured from a video with 720 × 576 pixels and a frame rate of 30 fps.

The tracking process used in Experiment 1 was as follows: the target was the license plate of a black car, which was driven along a highway and through a viaduct at a high speed. When the car was outside the viaduct, the license plate was brightly lit and the background was also variable and complex. When entering or leaving the viaduct, there was a rapid shift between strong and weak illumination. The unevenness of the road surface and the high speed of the vehicle also led to frequent fluctuations and lateral displacements of the license plate.

Experiment 1–1: The PF (white tracking frame) and PSO-PF (green tracking frame) algorithms were used to track the license plate of the car. The tracking results are shown in Fig. 5 and Tables 2 and 3.

Fig. 5
figure 5

Experimental results from the traditional PF and PSO-PF trackers

Table 2 Tracking results from the traditional PF tracking algorithm
Table 3 Tracking results from the PSO-PF tracking algorithm

Experiment 1–2: GAPF (red tracking frame) was used to track the license plate. The tracking results are shown in Fig. 6 and Table 4.

Fig. 6
figure 6

Experimental results from the GAPF tracker

Table 4 Tracking results from the GAPF tracking algorithm

Experiment 1–3: QAPF (blue tracking frame) was used to track the license plate. The tracking results are shown in Fig. 7 and Table 5.

Fig. 7
figure 7

Experimental results from the QAPF tracker

Table 5 Tracking results from the QAPF tracking algorithm

Analysis of results for Experiment 1–1: Before frame 184, the license plate was brightly lit. Figure 5 shows that the tracking boxes of PF and PSO-PF both captured the rear license plate accurately in this case. However, in frame 185, the car suddenly moved into a dark environment, and the tracking boxes of PF and PSO-PF began to lose the target. After frame 205, the tracking boxes of PF and PSO-PF completely deviated from the target vehicle, and were unable to locate the target again. Through a comparison of Tables 2 and 3, it can be seen that the performance of the PF and PSO-PF tracking algorithms was similar in terms of the effective feature ratio and tracking speed. In weak light, and especially between frames 185 and 205, the number of targets missed by PSO-PF tracking was slightly lower than for traditional PF.

Analysis of results for Experiment 1–2: As shown in Fig. 6 and Table 4, the tracking box of GAPF accurately captured the rear license plate when it experienced a rapid shift between strong and weak light (between frames 184 and 185). After frame 185, the car was weakly lit, and there was a reduction in the effective feature rate for the GAPF algorithm. Although the GAPF tracking box did not miss the plate, it still showed a positioning error and partially deviated from the target (between frames 226 and 233). The proportion of effective features increased when the plate returned from weak light to bright light (frames 233 to 234), and the tracking box then accurately captured the entire plate. From a comparison of Tables 2, 3 and 4, it can be seen that the ratio of the effective features extracted by GAPF was significantly higher than for PF and PSO-PF, but the total tracking time for each frame was higher for the GAPF algorithm than for the first two algorithms.

Analysis of results for Experiment 1–3: Figs. 6 and 7 and Tables 4 and 5 show that the average processing time and tracking failure rate of QAPF are much lower than that of GAPF. The tracking box of QAPF can accurately capture the license plate, regardless of whether the car is experiencing rapidly alternating environmental light intensity, is driving in weak light, or is driving rapidly against a complex background with strong illumination.

5.3 Experimental analysis of scale adaptive tracking

To test and compare the tracking performance of PF, PSO-PF, GAPF, and QAPF with the proposed adaptive adjustment model for the tracking frame, we carried out tracking experiments using eight challenging sequences. Other similar tracking methods such as MSPF [39], SAMF [18], DSST [26] and C-COT [3] are also introduced for comparison, as shown in Fig. 8. The average success scores and precision rates for these trackers in different attribute on the OTB-100 are shown in Tables 6 and 7, respectively. There are 11 attributes, including illumination variation (IV), scale variation (SV), occlusion (OCC), deformation (DEF), motion blur (MB), fast motion (FM), in plane rotation (IPR), out of plane rotation (OPR), out of view (OV), complex background (BC) and low resolusion (LR). Among all of these 11 attributes, no matter in the success rates or in the precision rates, our tracker ranks the top two in most of the attributes and the top three in all attributes. It can be obtained from these attribute-based data that our tracker achieves a good performance in terms of success rate and accuracy rate in most tracking interference scenarios. To evaluate the stability, robustness and efficiency of QAPF, we compared it statistically with the typical advanced methods in recent years in the VOT2018 [14] database, including EACOFT [25], ECO [10], D2CIP [22], MSOAPF [4], C-COT, DCPF-Likelihood [23] and MSPF. In the statistical comparison, we compare these trackers in terms of robustness (R. for short, measured by the failure proportion), accuracy (Acc.) and average overlap (EAO). As shown in Table 8, QAPF ranked fourth in ACC, third in R.Fail., and second in EAO and FPS, which prove that QAPF has strong competitiveness in robustness, stability and efficiency.

Fig. 8
figure 8

Comparison of eight different methods on each challenging sequence (from top to bottom: MotorRolling, Dancer, Football1, Freeman4, BlurOw, Shaking, Skater, Singer1 and Bolt, which contain fast moving objects, complex background, deformation, low resolution, motion blur, illumination change, rotation, size change and occlusion)

Table 6 Average success rates for different trackers for each individual attribute on the OTB-100 dataset
Table 7 Average precision scores for different trackers for each individual attribute on the OTB-100 dataset
Table 8 Statistical comparison on VOT2018

Our scale-adaptive adjustment model was used in these experiments to automatically adjust the frame size based on the actual object size. This was shown to effectively reduce the number of useless features and to improve the accuracy of the target positioning. Compared with the PF, PSO-PF and GA-PF tracking algorithms, QAPF gives better results. As can be seen from Fig. 8 and Tables 6 and 7, compared with the typical generative target tracking algorithm, QAPF algorithm achieves good performance in these eight typical tracking scenarios in three respects: tracking accuracy, success rate and the actual tracking application effect. Moreover, compared with C-COT, which obtains the optimal overall performance, QAPF behaves better in several specific tracking scenes. As shown in Tables 6 and 7, C-COT tracker has advantages over the algorithm in this paper in more than 7 sub-scenes. However, QAPF has better tracking accuracy in three tracking scenes including illumination change, scale change and complex background. Therefore, C-COT is recommended to be applied in other tracking scenes except the above three.

6 Conclusion

An improved PF tracker based on QPSO and AGA is presented in this paper. The nonlinear filtering capability of the PF algorithm forms the basis of our QAPF algorithm. The position updating equation of the QPSO algorithm is used for local optimization of the particle distributions after resampling, which greatly improves the efficiency. In each iteration, particles with lower fitness values are removed after resampling and are replaced by a corresponding number of optimal particles. This ensures an effective particle number and improves the precision of the QAPF algorithm. Meanwhile, the introduction of the arithmetic crossover and non-uniform mutation from AGA means that the particles are updated based on the degree of convergence of the sample, which helps to guarantee the diversity of the sample. A simulation analysis using a nonlinear system indicated that the accuracy, efficiency and numerical stability of our approach was considerably improved compared to PF, PSO-PF and GA-PF. Multiple tracking experiments were carried out, and the tracking results from QAPF were compared with those of PF, PSO-PF, GAPF and other tracking algorithms. The results demonstrated the robustness and efficiency of the proposed algorithm using eight challenging tracking sequences, such as a fast-moving target, variations in illumination, occlusion and motion blurring. Furthermore, an analysis and comparison of the experimental data from amultiple independent experiments prove that our QAPF tracking algorithm is superior to other PF tracking methods and typical alternative generative target trackers in terms of the effective feature ratio, feature extraction efficiency, tracking speed, tracking stability and accuracy. Our proposed frame size adaptive adjustment strategy was implemented, which automatically adjusts the frame size based on the actual size of the object; this effectively reduces the number of useless features and improves the accuracy of target positioning. Sufficient tracking experiments in various tracking scenes show that our tracker performs great in terms of efficiency, robustness and accuracy.